From infroma at gmail.com  Fri May  2 21:34:12 2014
From: infroma at gmail.com (Roman Inflianskas)
Date: Fri, 02 May 2014 23:34:12 +0400
Subject: [Python-ideas] Allow using symbols from Unicode block "Superscripts and Subscripts" in identifiers
Message-ID: <19168724.RU9dWeh3or@romas-x230-suse.lan>

It's really useful that Python 3 allows me to use some Unicode symbols (as specified in https://docs.python.org/3.4/reference/lexical_analysis.html#identifiers [1]), especially Greek symbols, for mathematical programs. But when I write a mathematical program with lots of indices, I would like to use symbols from the block "Superscripts and Subscripts" (as id_continue), for example:

????

I don't see any problems with allowing yet another subset of Unicode symbols. In Julia, for example, I can use them without problems.

--
Regards, Roman Inflianskas

--------
[1] https://docs.python.org/3.4/reference/lexical_analysis.html#identifiers

From bruce at leapyear.org  Fri May  2 22:48:49 2014
From: bruce at leapyear.org (Bruce Leban)
Date: Fri, 2 May 2014 13:48:49 -0700
Subject: [Python-ideas] Allow using symbols from Unicode block "Superscripts and Subscripts" in identifiers
In-Reply-To: <19168724.RU9dWeh3or@romas-x230-suse.lan>
References: <19168724.RU9dWeh3or@romas-x230-suse.lan>

This block includes non-alphanumeric characters. You wouldn't want to allow variables named x⁺¹ (~ x+1).

Some of the characters in this block are already allowed (the letters in category Lm). The characters you want are in the No (other numbers) category. Unfortunately, adding that category would be problematic as it includes characters like ½, and you surely don't want a variable named x½ or x⑴. That's x1/2 and x(1) for those without Unicode fonts.

--- Bruce
Learn how hackers think: http://j.mp/gruyere-security
https://www.linkedin.com/in/bruceleban

On Fri, May 2, 2014 at 12:34 PM, Roman Inflianskas wrote:
> It's really useful that Python 3 allows me to use some Unicode symbols (as specified in https://docs.python.org/3.4/reference/lexical_analysis.html#identifiers), especially Greek symbols, for mathematical programs. But when I write a mathematical program with lots of indices, I would like to use symbols from the block "Superscripts and Subscripts" (as id_continue), for example:
>
> ????
>
> I don't see any problems with allowing yet another subset of Unicode symbols. In Julia, for example, I can use them without problems.
>
> --
> Regards, Roman Inflianskas
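Bruce's category breakdown is easy to check directly against the stdlib. A minimal sketch, using only unicodedata and str.isidentifier (the sample characters are illustrative):

    import unicodedata

    # Two Lm (modifier letter) superscripts, which Python 3 already accepts
    # in identifiers, and three No (other number) characters, which it rejects.
    for ch in "\u2071\u207f\u00b2\u2082\u00bd":
        print("U+%04X %-35s %s  identifier: %s" % (
            ord(ch),
            unicodedata.name(ch),
            unicodedata.category(ch),
            ("x" + ch).isidentifier()))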
From alexander.belopolsky at gmail.com  Fri May  2 23:17:33 2014
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Fri, 2 May 2014 17:17:33 -0400
Subject: [Python-ideas] Allow using symbols from Unicode block "Superscripts and Subscripts" in identifiers
In-Reply-To: <19168724.RU9dWeh3or@romas-x230-suse.lan>
References: <19168724.RU9dWeh3or@romas-x230-suse.lan>

On Fri, May 2, 2014 at 3:34 PM, Roman Inflianskas wrote:
> I would like to use symbols from block "Superscripts and Subscripts"

-1

Python uses the ** operator for what is superscript in math and the [] operator for what is subscript. Allowing sub/superscripts in identifiers will create confusion. (It is not uncommon to mix typeset math with Python code in generated documentation.)

If you have many identifiers with subscripts, I would recommend using a list or a dictionary and calling them a[1], a[2], etc. instead of a1, a2.
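A small illustration of that recommendation (the values and names here are made up):

    # Instead of a1, a2, ... keep indexed values in a list or a dict:
    a = [0.5, 1.25, 3.0]       # accessed as a[0], a[1], a[2]
    b = {1: 0.5, 2: 1.25}      # accessed as b[1], b[2], keyed to match the math
    print(a[1], b[2])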
From tjreedy at udel.edu  Sat May  3 00:29:52 2014
From: tjreedy at udel.edu (Terry Reedy)
Date: Fri, 02 May 2014 18:29:52 -0400
Subject: [Python-ideas] Allow using symbols from Unicode block "Superscripts and Subscripts" in identifiers
In-Reply-To: <19168724.RU9dWeh3or@romas-x230-suse.lan>
References: <19168724.RU9dWeh3or@romas-x230-suse.lan>

On 5/2/2014 3:34 PM, Roman Inflianskas wrote:
> It's really useful that Python 3 allows me to use some Unicode
> symbols (as specified in
> https://docs.python.org/3.4/reference/lexical_analysis.html#identifiers),
> especially Greek symbols, for mathematical programs. But when I write
> a mathematical program with lots of indices, I would like to use symbols
> from the block "Superscripts and Subscripts" (as id_continue), for
> example:
>
> ????
>
> I don't see any problems with allowing yet another subset of Unicode
> symbols. In Julia, for example, I can use them without problems.

From 2.3. Identifiers and keywords:
"The syntax of identifiers in Python is based on the Unicode standard annex UAX-31, with elaboration and changes as defined below; see also PEP 3131 for further details."

-- Terry Jan Reedy

From tjreedy at udel.edu  Sat May  3 04:27:56 2014
From: tjreedy at udel.edu (Terry Reedy)
Date: Fri, 02 May 2014 22:27:56 -0400
Subject: [Python-ideas] Allow using symbols from Unicode block "Superscripts and Subscripts" in identifiers
References: <19168724.RU9dWeh3or@romas-x230-suse.lan>

On 5/2/2014 6:29 PM, Terry Reedy wrote:
> On 5/2/2014 3:34 PM, Roman Inflianskas wrote:
>> But when I write a mathematical program with lots of indices, I would
>> like to use symbols from the block "Superscripts and Subscripts" (as
>> id_continue), for example:
>>
>> ????

I believe 'other numbers' are intentionally omitted.

>> I don't see any problems with allowing yet another subset of Unicode
>> symbols. In Julia, for example, I can use them without problems.

If the rules for identifiers are expanded, any code that uses newly allowed names cannot be backported or run on previous versions. If contracted, the opposite problem occurs. I do not think they should be changed either way without a strong cause.

> From 2.3. Identifiers and keywords
> "The syntax of identifiers in Python is based on the Unicode standard
> annex UAX-31, with elaboration and changes as defined below; see also
> PEP 3131 for further details."

In other words, we use the standard with a few intentional modifications. The 2.x ascii rules were the same or very similar as in other languages (such as C). The 3.x rules are similar to other languages that follow the same standard. There is a benefit to this.

-- Terry Jan Reedy

From steve at pearwood.info  Sat May  3 06:50:23 2014
From: steve at pearwood.info (Steven D'Aprano)
Date: Sat, 3 May 2014 14:50:23 +1000
Subject: [Python-ideas] Allow using symbols from Unicode block "Superscripts and Subscripts" in identifiers
References: <19168724.RU9dWeh3or@romas-x230-suse.lan>
Message-ID: <20140503045022.GA4273@ando>

On Fri, May 02, 2014 at 10:27:56PM -0400, Terry Reedy wrote:

> If the rules for identifiers are expanded, any code that uses newly
> allowed names cannot be backported or run on previous versions. If
> contracted, the opposite problem occurs. I do not think they should be
> changed either way without a strong cause.

That applies to any new feature -- code using that feature cannot be easily backported. In this case, it's actually quite simple to backport code using the new rules for identifiers: just change the identifiers. The algorithm used by the code remains the same.

>> From 2.3. Identifiers and keywords
>> "The syntax of identifiers in Python is based on the Unicode standard
>> annex UAX-31, with elaboration and changes as defined below; see also
>> PEP 3131 for further details."
>
> In other words, we use the standard with a few intentional
> modifications.

Playing Devil's Advocate, perhaps we could add a few more intentional modifications. While there are advantages to following a standard just for the sake of following a standard, once you allow any changes, you're no longer following the standard. So the argument becomes, why should we allow that change but not this change?

Particularly for mathematically-focused code, I think it would be useful to be able to use identifiers like (say) σ² for variance, g₁ for sample skewness, or γ₁ for Pearson's skewness, to give a few real-world examples. Regular digits may be ambiguous: compare s₁² for the sample variance with Bessel's correction, versus s12. (s twelve?)

I'm going to give a tentative +1 vote to allowing superscript and subscript letters and digits in identifiers, if it can be done without excessive cost in complexity or performance. Anything else, like (say) ⑤ (CIRCLED DIGIT FIVE), I will give a firm -1.

-- Steven

From greg.ewing at canterbury.ac.nz  Sat May  3 08:38:21 2014
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sat, 03 May 2014 18:38:21 +1200
Subject: [Python-ideas] Allow using symbols from Unicode block "Superscripts and Subscripts" in identifiers
In-Reply-To: <20140503045022.GA4273@ando>
References: <19168724.RU9dWeh3or@romas-x230-suse.lan> <20140503045022.GA4273@ando>
Message-ID: <53648EDD.8060201@canterbury.ac.nz>

Steven D'Aprano wrote:
> Particularly for mathematically-focused code, I think it would be useful
> to be able to use identifiers like (say) σ² for variance,
Having σ² be a variable name could be confusing. To a mathematician, it's not a distinct variable, it's just σ ** 2.

-- Greg

From rosuav at gmail.com  Sat May  3 08:49:16 2014
From: rosuav at gmail.com (Chris Angelico)
Date: Sat, 3 May 2014 16:49:16 +1000
Subject: [Python-ideas] Allow using symbols from Unicode block "Superscripts and Subscripts" in identifiers
In-Reply-To: <53648EDD.8060201@canterbury.ac.nz>
References: <19168724.RU9dWeh3or@romas-x230-suse.lan> <20140503045022.GA4273@ando> <53648EDD.8060201@canterbury.ac.nz>

On Sat, May 3, 2014 at 4:38 PM, Greg Ewing wrote:
> Steven D'Aprano wrote:
>> Particularly for mathematically-focused code, I think it would be useful
>> to be able to use identifiers like (say) σ² for variance,
>
> Having σ² be a variable name could be confusing. To a
> mathematician, it's not a distinct variable, it's
> just σ ** 2.

Maybe, but subscripts can be useful. Recently we were discussing linear acceleration on python-list, and the way I learned the principle (other people learned it with different letters) was:

Vₜ = V₀t + at²/2

which should translate into Python as:

Vₜ = V₀*t + a*t*t/2

(Not sure if people's fonts have all those characters; that's read "V-t equals V-0 t plus a t squared over two".)

Being able to use subscripts in identifiers wouldn't be *often* useful, but it would make direct translation from math to code a bit easier.

ChrisA

From bruce at leapyear.org  Sat May  3 09:19:52 2014
From: bruce at leapyear.org (Bruce Leban)
Date: Sat, 3 May 2014 00:19:52 -0700
Subject: [Python-ideas] Allow using symbols from Unicode block "Superscripts and Subscripts" in identifiers
References: <19168724.RU9dWeh3or@romas-x230-suse.lan> <20140503045022.GA4273@ando> <53648EDD.8060201@canterbury.ac.nz>

I've actually written programs like that and honestly names like 'sigma' and 'beta' and 'v_t' worked just fine. Many of us have used (x1, y1) and (x2, y2) without confusing anyone because the digits weren't subscripted. The ability to use Unicode in identifiers I'm sure is appreciated by non-English writers, but that's a decidedly different issue. This is a solution without an actual problem.

--- Bruce (from my phone)

From stefan_ml at behnel.de  Sat May  3 09:29:07 2014
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Sat, 03 May 2014 09:29:07 +0200
Subject: [Python-ideas] Allow using symbols from Unicode block "Superscripts and Subscripts" in identifiers
References: <19168724.RU9dWeh3or@romas-x230-suse.lan> <20140503045022.GA4273@ando> <53648EDD.8060201@canterbury.ac.nz>

Bruce Leban, 03.05.2014 09:19:
> I've actually written programs like that and honestly names like 'sigma'
> and 'beta' and 'v_t' worked just fine. Many of us have used (x1, y1) and
> (x2, y2) without confusing anyone because the digits weren't subscripted.

Plus, the numbers are much easier to read that way than in tiny subscripts.

Stefan

From rosuav at gmail.com  Sat May  3 09:30:17 2014
From: rosuav at gmail.com (Chris Angelico)
Date: Sat, 3 May 2014 17:30:17 +1000
Subject: [Python-ideas] Allow using symbols from Unicode block "Superscripts and Subscripts" in identifiers
References: <19168724.RU9dWeh3or@romas-x230-suse.lan> <20140503045022.GA4273@ando> <53648EDD.8060201@canterbury.ac.nz>

On Sat, May 3, 2014 at 5:19 PM, Bruce Leban wrote:
> I've actually written programs like that and honestly names like 'sigma' and
> 'beta' and 'v_t' worked just fine. Many of us have used (x1, y1) and (x2,
> y2) without confusing anyone because the digits weren't subscripted.

Yeah; like I said, it's not a big thing. I certainly wouldn't choose a language on the basis of subscript-digit-support-in-identifiers. But when I'm working with maths I'm not overly familiar with (stuff a lot more complicated than simple linear acceleration), and I'm trying to translate a not-quite-perfect set of handwritten scribbles into code, every little bit helps. That's why WYSIWYG music editing software is so much more popular with novices than GNU Lilypond is - if you're not *really* familiar with what you're working with, the difference between "dot on the page that looks like this" and "c'8." slows you down.
Not insurmountable, but the mind glitches across the gap.

ChrisA

From steve at pearwood.info  Sat May  3 11:05:24 2014
From: steve at pearwood.info (Steven D'Aprano)
Date: Sat, 3 May 2014 19:05:24 +1000
Subject: [Python-ideas] Allow using symbols from Unicode block "Superscripts and Subscripts" in identifiers
In-Reply-To: <53648EDD.8060201@canterbury.ac.nz>
References: <19168724.RU9dWeh3or@romas-x230-suse.lan> <20140503045022.GA4273@ando> <53648EDD.8060201@canterbury.ac.nz>
Message-ID: <20140503090523.GB4273@ando>

On Sat, May 03, 2014 at 06:38:21PM +1200, Greg Ewing wrote:
> Steven D'Aprano wrote:
>> Particularly for mathematically-focused code, I think it would be useful
>> to be able to use identifiers like (say) σ² for variance,
>
> Having σ² be a variable name could be confusing. To a
> mathematician, it's not a distinct variable, it's
> just σ ** 2.

Actually, not really. A better way of putting it is that the standard deviation is "just" the square root of σ². Variance comes first (it's defined from first principles), and then the standard deviation is defined by taking the square root. But really, it doesn't matter which is derived from which. To a mathematician, x² is just as much a legitimate variable as x. One can say that f is a function of x² just as well as saying that it is a function of y, where y happens to equal x².

But regardless of philosophical differences regarding the nature of what is or isn't a variable, versus something derived from a variable, it simply is useful to have a one-to-one correspondence between variables in Python code and notation used in mathematics. Is it useful enough to make up for the (minor) issues that others have already mentioned? I think so, but I will understand if others disagree. I think that the ability to distinguish between x₂ and x² can be important, and both x2 and x_2 are poor substitutes. (Of the two, I prefer x2.) But I'm also aware that this is very dependent on the problem domain. I wouldn't use x₂ and x² outside of a mathematical context.

-- Steven

From stephen at xemacs.org  Sat May  3 14:27:31 2014
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Sat, 03 May 2014 21:27:31 +0900
Subject: [Python-ideas] Allow using symbols from Unicode block "Superscripts and Subscripts" in identifiers
In-Reply-To: <20140503090523.GB4273@ando>
References: <19168724.RU9dWeh3or@romas-x230-suse.lan> <20140503045022.GA4273@ando> <53648EDD.8060201@canterbury.ac.nz> <20140503090523.GB4273@ando>
Message-ID: <87ha57kpxo.fsf@uwakimon.sk.tsukuba.ac.jp>

Steven D'Aprano writes:
> On Sat, May 03, 2014 at 06:38:21PM +1200, Greg Ewing wrote:
>> Steven D'Aprano wrote:
>>> Particularly for mathematically-focused code, I think it would be useful
>>> to be able to use identifiers like (say) σ² for variance,
>>
>> Having σ² be a variable name could be confusing. To a
>> mathematician, it's not a distinct variable, it's
>> just σ ** 2.
>
> Actually, not really. A better way of putting it is that the standard
> deviation is "just" the square root of σ². Variance comes first (it's
> defined from first principles), and then the standard deviation is
> defined by taking the square root.

Thank you for writing that better than I could have. :-)

> But really, it doesn't matter which is derived from which. To a
> mathematician, x² is just as much a legitimate variable as x. One can
> say that f is a function of x² just as well as saying that it is a
> function of y, where y happens to equal x².

We part company here.
(in the usage "function of x?") is not a variable, it's an expression. I don't think I've even seen the usage "f(x?) = ..." in a *definition* of "f", with the single exception of the use of "f(?,??) = ..." in defining the distribution of a random variable, and even then that's unusual (? is almost always more convenient, even for test statistics). I'd consider that the exception that proves the rule.... Especially in a case like z(x,?,??) = (x - ?)/?! To put it another way, I suspect you would get rather upset if I used both x and x? in such a context and treated them as I would x and y. Or, if in real analysis I ignored the fact that x? is necessarily non-negative. I could go on, but I think the point is clear: *linguistically* these are expressions, not variables -- they are constructed syntactically, and their semantics can be deduced from the syntax. Of course in mathematics you can treat them as variables (as statisticians do ??), but that works because in mathematics no symbols or syntax have fixed semantics, not ?, not even 0. If you can get a version of Python that has "where ..." clauses in it that can define semantics for sub- and superscript syntax past Guido, I'd be all for this. But I really don't think that's going to happen. > Is it useful enough to make up for the (minor) issues that others > have already mentioned? I think so, but I will understand if others > disagree. I think that the ability to distinguish between x? and > x? can be important, Which, I suspect, means these notations don't pass the "generalized grit on Tim's monitor" test. > and both x2 and x_2 are poor substitutes. In programming (as opposed to the chemistry of nuclear fusion), if you need to distinguish x? from x?, and x**2 and x[2] don't do the trick, I suspect your notation has real readability problems no matter how you arrange things spatially. I guess that use cases where such usage is in good taste are way too rare to justify this. From ron3200 at gmail.com Sat May 3 17:39:23 2014 From: ron3200 at gmail.com (Ron Adam) Date: Sat, 03 May 2014 11:39:23 -0400 Subject: [Python-ideas] Allow using symbols from Unicode block "Superscripts and Subscripts" in identifiers In-Reply-To: <20140503090523.GB4273@ando> References: <19168724.RU9dWeh3or@romas-x230-suse.lan> <20140503045022.GA4273@ando> <53648EDD.8060201@canterbury.ac.nz> <20140503090523.GB4273@ando> Message-ID: On 05/03/2014 05:05 AM, Steven D'Aprano wrote: > On Sat, May 03, 2014 at 06:38:21PM +1200, Greg Ewing wrote: >> >Steven D'Aprano wrote: >>> > >Particularly for mathematically-focused code, I think it would be useful >>> > >to be able to use identifiers like (say) ?? for variance, >> >Having ?? be a variable name could be confusing. To a >> >mathematician, it's not a distinct variable, it's >> >just ? ** 2. > Actually, not really. A better way of putting it is that the standard > deviation is "just" the square root of ??. Variance comes first (it's > defined from first principles), and then the standard deviation is > defined by taking the square root. The main problem I see is that many possible questions come to mind rather than one simple or obvious interpretation. 
Cheers,
Ron

From steve at pearwood.info  Sat May  3 19:57:03 2014
From: steve at pearwood.info (Steven D'Aprano)
Date: Sun, 4 May 2014 03:57:03 +1000
Subject: [Python-ideas] Allow using symbols from Unicode block "Superscripts and Subscripts" in identifiers
References: <19168724.RU9dWeh3or@romas-x230-suse.lan> <20140503045022.GA4273@ando> <53648EDD.8060201@canterbury.ac.nz> <20140503090523.GB4273@ando>
Message-ID: <20140503175702.GF4273@ando>

On Sat, May 03, 2014 at 11:39:23AM -0400, Ron Adam wrote:
> On 05/03/2014 05:05 AM, Steven D'Aprano wrote:
>> [...]
>> Actually, not really. A better way of putting it is that the standard
>> deviation is "just" the square root of σ². Variance comes first (it's
>> defined from first principles), and then the standard deviation is
>> defined by taking the square root.
>
> The main problem I see is that many possible questions come to mind rather
> than one simple or obvious interpretation.

If I name a variable "x2", what is the "one simple or obvious interpretation" that such an identifier presumably has? If standard, ASCII-only identifiers don't have a single interpretation, why should identifiers like σ² be held to that requirement?

Like any other identifier, one needs to interpret the name in context. Identifiers can be idiomatic ("i" for a loop variable, "c" for a character), more or less descriptive ("number_of_pages", "npages"), or obfuscated ("e382702"). They can be written in English, or in some other language. They can be ordinary words, or jargon that only means something to those who understand the problem domain. None of this will be different if sub/superscript digits and letters are allowed.

One of the frustrations on this list is how often people hold new proposals to a higher standard than existing features. Particularly *impossible* standards. It simply isn't possible for characters like superscript-two to be given a *single* interpretation (although there is an obvious one, namely "squared") any more than it is possible for the letter "a" to be given a *single* interpretation.

There are valid objections to this proposal. It may be that the effort needed to allow code points like ² in identifiers without also allowing ½ or ⑴ may be too great. Or the performance cost is too high. Or the benefit for mathematical-style code doesn't justify adding additional language complexity.

Or even a purely aesthetic judgement: "I just don't like it". (I don't like identifiers written in cyrillic, because I can't read them, but I'm not the target audience for such identifiers and I will never need to read them. Consequently I don't object if other people use cyrillic identifiers in their personal code.)

Holding this proposal up to an impossible standard which plain ASCII identifiers don't even meet is simply not cricket.

Thank you all for letting me get that off my chest, and apologies to Ron for singling him out.
-- Steven

From rosuav at gmail.com  Sat May  3 20:11:39 2014
From: rosuav at gmail.com (Chris Angelico)
Date: Sun, 4 May 2014 04:11:39 +1000
Subject: [Python-ideas] Allow using symbols from Unicode block "Superscripts and Subscripts" in identifiers
In-Reply-To: <20140503175702.GF4273@ando>
References: <19168724.RU9dWeh3or@romas-x230-suse.lan> <20140503045022.GA4273@ando> <53648EDD.8060201@canterbury.ac.nz> <20140503090523.GB4273@ando> <20140503175702.GF4273@ando>

On Sun, May 4, 2014 at 3:57 AM, Steven D'Aprano wrote:
> One of the frustrations on this list is how often people hold new
> proposals to a higher standard than existing features. ...
>
> Holding this proposal up to an impossible standard which plain ASCII
> identifiers don't even meet is simply not cricket.
>
> Thank you all for letting me get that off my chest, and apologies to Ron
> for singling him out.

A fair point in this case, and yet there is such a thing as the grandfather clause. Adding something to the language has a much higher bar than merely retaining something (because *removing* something from the language has an even higher bar), so a proposal can't simply say "It's no worse than what we have already" to get acceptance. Impossible standard? A bit unfair. Higher than existing features? Quite possibly has its place.

ChrisA

From stephen at xemacs.org  Sat May  3 20:34:32 2014
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Sun, 04 May 2014 03:34:32 +0900
Subject: [Python-ideas] Allow using symbols from Unicode block "Superscripts and Subscripts" in identifiers
In-Reply-To: <20140503175702.GF4273@ando>
References: <19168724.RU9dWeh3or@romas-x230-suse.lan> <20140503045022.GA4273@ando> <53648EDD.8060201@canterbury.ac.nz> <20140503090523.GB4273@ando> <20140503175702.GF4273@ando>
Message-ID: <8761lmlnif.fsf@uwakimon.sk.tsukuba.ac.jp>

Steven D'Aprano writes:

> If I name a variable "x2", what is the "one simple or obvious
> interpretation" that such an identifier presumably has? If standard,
> ASCII-only identifiers don't have a single interpretation, why should
> identifiers like σ² be held to that requirement?

Because subscripts and superscripts are syntactic constructs, and naturally decompose into two identifiers in a specific relationship (even if that relationship cannot be further specified without going deep into some domain of discourse) -- and that is much of the motivation for wanting to use them. "x2" does not carry that load.

Note that Unicode itself considers them *compatibility* characters and says:

    Superscripts and subscripts have been included in the Unicode
    Standard only to provide compatibility with existing character
    sets. In general, the Unicode character encoding does not attempt
    to describe the positioning of a character above or below the
    baseline in typographical layout.

In other words, Unicode is reluctant to guarantee that x2, x², and x₂ are actually different identifiers! It's considered bad practice to treat them as the same, but not actually forbidden. At least 2 technical reports (#20 and #25) discourage their use except in the case where they are letter-like (phonetic transcriptions use several such letters, where they have different meaning from their compatibility equivalents).

The more I look into this, the more I think it is really problematic.
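That compatibility status is visible directly from the stdlib; a quick check (the character choices are illustrative):

    import unicodedata

    # Superscript/subscript digits carry <super>/<sub> compatibility
    # decompositions to the plain digit, which is what NFKC folds on.
    print(unicodedata.decomposition("\u00b2"))   # '<super> 0032'
    print(unicodedata.decomposition("\u2082"))   # '<sub> 0032'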
From tjreedy at udel.edu  Sat May  3 23:48:51 2014
From: tjreedy at udel.edu (Terry Reedy)
Date: Sat, 03 May 2014 17:48:51 -0400
Subject: [Python-ideas] Allow using symbols from Unicode block "Superscripts and Subscripts" in identifiers
In-Reply-To: <20140503045022.GA4273@ando>
References: <19168724.RU9dWeh3or@romas-x230-suse.lan> <20140503045022.GA4273@ando>

On 5/3/2014 12:50 AM, Steven D'Aprano wrote:
> On Fri, May 02, 2014 at 10:27:56PM -0400, Terry Reedy wrote:
>
>> If the rules for identifiers are expanded, any code that uses newly
>> allowed names cannot be backported or run on previous versions. If
>> contracted, the opposite problem occurs. I do not think they should be
>> changed either way without a strong cause.
>
> That applies to any new feature -- code using that feature cannot be
> easily backported. In this case, it's actually quite simple to backport
> code using the new rules for identifiers: just change the identifiers.
> The algorithm used by the code remains the same.

It appears that I consider lexicography more 'fundamental' in some sense than you do. But let's skip over this.

>>> From 2.3. Identifiers and keywords
>>> "The syntax of identifiers in Python is based on the Unicode standard
>>> annex UAX-31, with elaboration and changes as defined below; see also
>>> PEP 3131 for further details."

Without reading the annex, I cannot tell which part of the 'below' actually defines a 'change', as opposed to an 'elaboration' (explanation). I have no idea whether the unknown changes are additions, deletions, or merely selections of options.

>> In other words, we use the standard with a few intentional
>> modifications.
>
> Playing Devil's Advocate, perhaps we could add a few more intentional
> modifications.

Or perhaps not, depending on what the modifications actually are and what the reasons were.

> While there are advantages to following a standard just for the sake of
> following a standard, once you allow any changes, you're no longer
> following the standard. So the argument becomes, why should we allow
> that change but not this change?

Nick recently argued, very similarly, that having restored string 'u' prefixes was a reason to restore dict.iterxyz methods. You agreed with me that there were good reasons why B did not follow from A. To properly compare current and proposed changes, we must know the current 'modifications and changes', their reasons and effects, and the proposed changes and their reasons (any real parallels) and likely effects. If you were to do the research, I would be willing to discuss.

> Particularly for mathematically-focused code, I think it would be useful
> to be able to use identifiers like (say) σ² for variance, g₁ for sample
> skewness, or γ₁ for Pearson's skewness, to give a few real-world
> examples. Regular digits may be ambiguous: compare s₁² for the sample
> variance with Bessel's correction, versus s12. (s twelve?)

I agree that there are good uses for this restricted set of additions. Would you allow super/subscripts as prefixes rather than suffixes? I presume not, since we already disallow initial numbers.

> I'm going to give a tentative +1 vote to allowing superscript and
> subscript letters and digits in identifiers, if it can be done without
> excessive cost in complexity or performance.

Would you consider doubling the cost of checking each character (a reasonable estimate, I think) excessive or not?

> Anything else, like (say) ⑤ (CIRCLED DIGIT FIVE),
> I will give a firm -1.
-- Terry Jan Reedy

From tjreedy at udel.edu  Sun May  4 00:06:06 2014
From: tjreedy at udel.edu (Terry Reedy)
Date: Sat, 03 May 2014 18:06:06 -0400
Subject: [Python-ideas] Allow using symbols from Unicode block "Superscripts and Subscripts" in identifiers
References: <19168724.RU9dWeh3or@romas-x230-suse.lan> <20140503045022.GA4273@ando>

On 5/3/2014 5:48 PM, Terry Reedy wrote:

> Would you consider doubling the cost of checking each character (a
> reasonable estimate, I think) excessive or not?

Thinking about it more, I think double is an over-estimate. Since I do not know how the unicode lexer works, I won't guess or worry about the cost until someone times code with and without the change.

-- Terry Jan Reedy

From alexander.belopolsky at gmail.com  Sun May  4 00:08:51 2014
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Sat, 3 May 2014 18:08:51 -0400
Subject: [Python-ideas] Allow using symbols from Unicode block "Superscripts and Subscripts" in identifiers
References: <19168724.RU9dWeh3or@romas-x230-suse.lan> <20140503045022.GA4273@ando>

On Sat, May 3, 2014 at 5:48 PM, Terry Reedy wrote:
> Would you allow super/subscripts as prefixes rather than suffixes? I
> presume not, since we already disallow initial numbers.

Python 3 does not recognize subscripts as numbers:

>>> int('₂')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: invalid literal for int() with base 10: '₂'

From steve at pearwood.info  Sun May  4 04:40:44 2014
From: steve at pearwood.info (Steven D'Aprano)
Date: Sun, 4 May 2014 12:40:44 +1000
Subject: [Python-ideas] Allow using symbols from Unicode block "Superscripts and Subscripts" in identifiers
In-Reply-To: <8761lmlnif.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <19168724.RU9dWeh3or@romas-x230-suse.lan> <20140503045022.GA4273@ando> <53648EDD.8060201@canterbury.ac.nz> <20140503090523.GB4273@ando> <20140503175702.GF4273@ando> <8761lmlnif.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <20140504024043.GG4273@ando>

On Sun, May 04, 2014 at 03:34:32AM +0900, Stephen J. Turnbull wrote:

> Note that Unicode itself considers them *compatibility* characters and
> says:
>
>     Superscripts and subscripts have been included in the Unicode
>     Standard only to provide compatibility with existing character
>     sets. In general, the Unicode character encoding does not attempt
>     to describe the positioning of a character above or below the
>     baseline in typographical layout.
>
> In other words, Unicode is reluctant to guarantee that x2, x², and x₂
> are actually different identifiers!
[...]

I don't think this is a valid interpretation of what the Unicode standard is trying to say, but the point is moot. I think you've just identified (pun intended) a major objection to the proposal, one serious enough to change my mind from limited support to opposition.

Python identifiers are treated by their NFKC normalised form:

    All identifiers are converted into the normal form NFKC while
    parsing; comparison of identifiers is based on NFKC.

https://docs.python.org/3/reference/lexical_analysis.html

And superscripts and subscripts normalise to standard characters:

py> [unicodedata.normalize('NFKC', s) for s in 'x² x₂ x2'.split()]
['x2', 'x2', 'x2']

So that categorically rules out allowing superscripts and subscripts as *distinct* characters in identifiers. So even if they were allowed, it would mean that x² and x₂ would be treated as the same identifier as x2.

For my use-case, I would want x² and x₂ to be treated as distinct identifiers, not just as a funny way of writing x2. So from my perspective, *at best* there is now insufficient benefit to bother allowing them.

It's actually stronger than that: allowing superscripts and subscripts would be an attractive nuisance for my use-case. If they were allowed, I would be tempted to write x² and x₂, which could end up being a subtle source of bugs if I accidentally used them both in the same namespace, thinking that they were distinct when they actually aren't. So I am now -1 on allowing superscripts and subscripts.

-- Steven
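That NFKC folding can already be observed with characters that are currently legal in identifiers; a small demonstration (any NFKC-equivalent pair of identifier characters would do):

    # MICRO SIGN (U+00B5) and GREEK SMALL LETTER MU (U+03BC) are distinct
    # code points, but the parser NFKC-folds them to the same identifier.
    µ = 1          # spelled with U+00B5
    print(μ)       # spelled with U+03BC -- prints 1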
From infroma at gmail.com  Sun May  4 09:10:56 2014
From: infroma at gmail.com (Roman Inflianskas)
Date: Sun, 04 May 2014 11:10:56 +0400
Subject: [Python-ideas] Allow using symbols from Unicode block "Superscripts and Subscripts" in identifiers
In-Reply-To: <20140504024043.GG4273@ando>
References: <19168724.RU9dWeh3or@romas-x230-suse.lan> <8761lmlnif.fsf@uwakimon.sk.tsukuba.ac.jp> <20140504024043.GG4273@ando>
Message-ID: <2552140.U2lLahrYnB@romas-x230-suse.lan>

On Sunday 04 May 2014 12:40:44 Steven D'Aprano wrote:
> [...]
> So that categorically rules out allowing superscripts and subscripts as
> *distinct* characters in identifiers. So even if they were allowed, it
> would mean that x² and x₂ would be treated as the same identifier as x2.
>
> For my use-case, I would want x² and x₂ to be treated as distinct
> identifiers, not just as a funny way of writing x2. So from my
> perspective, *at best* there is now insufficient benefit to bother
> allowing them.
>
> It's actually stronger than that: allowing superscripts and subscripts
> would be an attractive nuisance for my use-case. If they were allowed, I
> would be tempted to write x² and x₂, which could end up being a subtle
> source of bugs if I accidentally used them both in the same namespace,
> thinking that they were distinct when they actually aren't. So I am now
> -1 on allowing superscripts and subscripts.

That's the strongest point against allowing superscripts and subscripts in the whole discussion, IMHO. I would want x² and x₂ to be treated as distinct identifiers too. I've tried this use case in Julia and it works:

julia> x₁ = 1
1

julia> x₂ = 2
2

julia> x₁
1

julia> x₂
2
But then I found a thread in Julia's bugtracker covering Unicode identifier normalization [1]. As I understood, they don't use NFKC. As a consequence, the symbols "µ" (0x00b5) and "μ" (0x03bc) are treated as different. They understood that it's weird and that they need to do something about this. Some of them don't want to use NFKC for the same reason (+ for example, "H" and "ℍ" would become equal identifiers). Others decided to give a warning when a new identifier is equal to an already defined one (in terms of NFKC normalization).

Now I understand that things are more complicated than I considered when I made the proposal. I think that there is no "good way" to add support for subscripts and superscripts. So it's better to leave the situation as is.

--
Regards, Roman Inflianskas

--------
[1] covering unicode identifiers normalization
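The warning approach Roman describes can be sketched in a few lines of Python (a hypothetical helper for illustration, not anything Julia or CPython actually ships):

    import unicodedata

    def check_nfkc_collisions(names):
        """Warn when two distinct spellings fold to one identifier under NFKC."""
        seen = {}
        for name in names:
            folded = unicodedata.normalize("NFKC", name)
            if folded in seen and seen[folded] != name:
                print("warning: %r and %r are the same identifier after NFKC"
                      % (name, seen[folded]))
            seen.setdefault(folded, name)

    check_nfkc_collisions(["\u00b5", "\u03bc"])   # micro sign vs. Greek mu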
From tjreedy at udel.edu  Sun May  4 11:51:25 2014
From: tjreedy at udel.edu (Terry Reedy)
Date: Sun, 04 May 2014 05:51:25 -0400
Subject: [Python-ideas] Allow using symbols from Unicode block "Superscripts and Subscripts" in identifiers
In-Reply-To: <2552140.U2lLahrYnB@romas-x230-suse.lan>
References: <19168724.RU9dWeh3or@romas-x230-suse.lan> <8761lmlnif.fsf@uwakimon.sk.tsukuba.ac.jp> <20140504024043.GG4273@ando> <2552140.U2lLahrYnB@romas-x230-suse.lan>

On 5/4/2014 3:10 AM, Roman Inflianskas wrote:
> Now I understand that things are more complicated than I considered
> when I made the proposal. I think that there is no "good way" to add
> support for subscripts and superscripts. So it's better to leave the
> situation as is.

If you are the one who opened the tracker issue, please close it. And thanks for bringing the discussion here.

-- Terry Jan Reedy

From infroma at gmail.com  Sun May  4 12:00:20 2014
From: infroma at gmail.com (Roman Inflianskas)
Date: Sun, 04 May 2014 14:00:20 +0400
Subject: [Python-ideas] Allow using symbols from Unicode block "Superscripts and Subscripts" in identifiers
References: <19168724.RU9dWeh3or@romas-x230-suse.lan> <2552140.U2lLahrYnB@romas-x230-suse.lan>
Message-ID: <2117120.VD1tLruAqg@romas-x230-suse.lan>

On Sunday 04 May 2014 05:51:25 Terry Reedy wrote:
> If you are the one who opened the tracker issue, please close it. And
> thanks for bringing the discussion here.

Done. Thank you for participating in this discussion. The next time I will not open a bug before the discussion, I promise :)

--
Regards, Roman Inflianskas

From ron3200 at gmail.com  Sun May  4 18:17:42 2014
From: ron3200 at gmail.com (Ron Adam)
Date: Sun, 04 May 2014 12:17:42 -0400
Subject: [Python-ideas] Allow using symbols from Unicode block "Superscripts and Subscripts" in identifiers
In-Reply-To: <20140503175702.GF4273@ando>
References: <19168724.RU9dWeh3or@romas-x230-suse.lan> <20140503045022.GA4273@ando> <53648EDD.8060201@canterbury.ac.nz> <20140503090523.GB4273@ando> <20140503175702.GF4273@ando>

On 05/03/2014 01:57 PM, Steven D'Aprano wrote:
> On Sat, May 03, 2014 at 11:39:23AM -0400, Ron Adam wrote:
>> The main problem I see is that many possible questions come to mind rather
>> than one simple or obvious interpretation.
>
> If I name a variable "x2", what is the "one simple or obvious
> interpretation" that such an identifier presumably has? If standard,
> ASCII-only identifiers don't have a single interpretation, why should
> identifiers like σ² be held to that requirement?
> [...]
> Or even a purely aesthetic judgement: "I just don't like it". (I don't
> like identifiers written in cyrillic, because I can't read them, but I'm
> not the target audience for such identifiers and I will never need to
> read them.
> Consequently I don't object if other people use cyrillic
> identifiers in their personal code.)
>
> Holding this proposal up to an impossible standard which plain ASCII
> identifiers don't even meet is simply not cricket.
>
> Thank you all for letting me get that off my chest, and apologies to Ron
> for singling him out.

No problem, you didn't comment on me, but expressed your own thoughts. That's fine. But thanks for clarifying the context of your message; it does help us avoid unintended misunderstandings in message-based conversations like these, where we don't get to hear the tone of a message.

I feel the same as you describe here in many of these discussions. Enough so that I'm attempting to write a minimal language that uses some of the features I've thought about. The exercise was/is helping me understand many of the lower-level language-design patterns in Python and some other languages. Some of the ideas I've wanted just don't fit with Python's design, and some would work, but not without many changes to other parts. And some ideas we can't do because they directly conflict with something we already have. Sigh. The ones that most interest me are the ones that simplify or unify existing features, but those are also the ones that are the hardest to do right. ;-)

Cheers,
Ron

From ram.rachum at gmail.com  Mon May  5 15:17:16 2014
From: ram.rachum at gmail.com (Ram Rachum)
Date: Mon, 5 May 2014 06:17:16 -0700 (PDT)
Subject: [Python-ideas] Implement `itertools.permutations.__getitem__` and `itertools.permutations.index`

I suggest implementing:

- `itertools.permutations.__getitem__`, for getting a permutation by its index number, and possibly also slicing, and
- `itertools.permutations.index` for getting the index number of a given permutation.

What do you think?

Thanks,
Ram.

From ram.rachum at gmail.com  Mon May  5 18:07:27 2014
From: ram.rachum at gmail.com (Ram Rachum)
Date: Mon, 5 May 2014 09:07:27 -0700 (PDT)
Subject: [Python-ideas] Implement `itertools.permutations.__getitem__` and `itertools.permutations.index`
Message-ID: <081c1efb-4f7e-4d70-9a50-957d9f347e1e@googlegroups.com>

And now that I think about it, I'd also like to give it a `__len__`, and to give `itertools.product` the same treatment. What do you think?

On Monday, May 5, 2014 4:17:16 PM UTC+3, Ram Rachum wrote:
> I suggest implementing:
>
> - `itertools.permutations.__getitem__`, for getting a permutation by its
> index number, and possibly also slicing, and
> - `itertools.permutations.index` for getting the index number of a given
> permutation.
>
> What do you think?

From steve at pearwood.info  Mon May  5 19:15:38 2014
From: steve at pearwood.info (Steven D'Aprano)
Date: Tue, 6 May 2014 03:15:38 +1000
Subject: [Python-ideas] Implement `itertools.permutations.__getitem__` and `itertools.permutations.index`
Message-ID: <20140505171538.GR4273@ando>

On Mon, May 05, 2014 at 06:17:16AM -0700, Ram Rachum wrote:
> I suggest implementing:
>
> - `itertools.permutations.__getitem__`, for getting a permutation by its
> index number, and possibly also slicing, and
> - `itertools.permutations.index` for getting the index number of a given
> permutation.
>
> What do you think?

An intriguing idea. range() objects also implement indexing, and len.
But range() objects have an obvious, unambiguous order: range(2, 6) *must* give 2, 3, 4, 5, in that order, by the definition of range. Permutations aren't like that. The order of the permutations is an implementation detail, not part of the definition. If permutations provides indexing operations, then the order becomes part of the interface. I'm not sure that's such a good idea.

I think, rather than adding __getitem__ to permutations, I would rather see a separate function (not iterator) which returns the nth permutation.

-- Steven

From steve at pearwood.info  Mon May  5 19:23:14 2014
From: steve at pearwood.info (Steven D'Aprano)
Date: Tue, 6 May 2014 03:23:14 +1000
Subject: [Python-ideas] Implement `itertools.permutations.__getitem__` and `itertools.permutations.index`
In-Reply-To: <081c1efb-4f7e-4d70-9a50-957d9f347e1e@googlegroups.com>
References: <081c1efb-4f7e-4d70-9a50-957d9f347e1e@googlegroups.com>
Message-ID: <20140505172314.GS4273@ando>

On Mon, May 05, 2014 at 09:07:27AM -0700, Ram Rachum wrote:
> And now that I think about it, I'd also like to give it a `__len__`, and to
> give `itertools.product` the same treatment. What do you think?

Consider:

p = itertools.permutations('CAT')
assert len(p) == 6

So far, that's obvious. But:

next(p)  => returns a permutation

Now what will len(p) return? If it still returns 6, that will lead to bugs when people check the len, but fail to realise that some of those permutations have already been consumed. In the most extreme case, you could have:

assert len(p) == 6
list(p) == []

which is terribly surprising. On the other hand, if len(p) returns the number of permutations remaining, apart from increasing the complexity of the iterator, it will also be surprising to those who expect the length to be the total number of permutations. I would rather have a separate API, perhaps something like this:

p.number()  # returns the total number of permutations

-- Steven
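For what it's worth, that total is cheap to compute without touching the iterator at all; a sketch of such a helper (n_permutations is a made-up name, not a stdlib function):

    from math import factorial

    def n_permutations(n, r=None):
        """Number of r-length permutations of n items: n! / (n - r)!."""
        if r is None:
            r = n
        return factorial(n) // factorial(n - r)

    print(n_permutations(3))      # 6, the length of list(permutations('CAT'))
    print(n_permutations(5, 2))   # 20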
From songofacandy at gmail.com  Tue May  6 00:22:56 2014
From: songofacandy at gmail.com (INADA Naoki)
Date: Tue, 6 May 2014 07:22:56 +0900
Subject: [Python-ideas] Implement `itertools.permutations.__getitem__` and `itertools.permutations.index`
In-Reply-To: <20140505171538.GR4273@ando>
References: <20140505171538.GR4273@ando>

> range() objects also implement indexing, and len. But range() objects
> have an obvious, unambiguous order: range(2, 6) *must* give 2, 3, 4, 5,
> in that order, by the definition of range. Permutations aren't like
> that. The order of the permutations is an implementation detail, not
> part of the definition. If permutations provides indexing operations,
> then the order becomes part of the interface. I'm not sure that's such a
> good idea.

I don't think the order of permutations is an implementation detail. Python implementations should follow CPython's documented order.

https://docs.python.org/3.4/library/itertools.html#itertools.permutations

> Permutations are emitted in lexicographic sort order. So, if the input iterable is sorted, the permutation tuples will be produced in sorted order.

On Tue, May 6, 2014 at 2:15 AM, Steven D'Aprano wrote:
> [...]

-- INADA Naoki

From ethan at stoneleaf.us  Tue May  6 01:06:39 2014
From: ethan at stoneleaf.us (Ethan Furman)
Date: Mon, 05 May 2014 16:06:39 -0700
Subject: [Python-ideas] Implement `itertools.permutations.__getitem__` and `itertools.permutations.index`
References: <20140505171538.GR4273@ando>
Message-ID: <5368197F.20507@stoneleaf.us>

On 05/05/2014 03:22 PM, INADA Naoki wrote:
> I don't think the order of permutations is an implementation detail.
> Python implementations should follow CPython's documented order.
>
> https://docs.python.org/3.4/library/itertools.html#itertools.permutations
>
>> Permutations are emitted in lexicographic sort order. So, if the input iterable is sorted, the permutation tuples will be produced in sorted order.

What does that mean? If permutations are emitted in an order, why does the input iterable have to be ordered? What happens if it's not?

--> list(''.join(p) for p in permutations('abc'))
['abc', 'acb', 'bac', 'bca', 'cab', 'cba']

--> list(''.join(p) for p in permutations('cab'))
['cab', 'cba', 'acb', 'abc', 'bca', 'bac']

Okay, read http://en.wikipedia.org/wiki/Lexicographical_order -- I think 'lexicographic' is not the best choice of word... maybe positional?

-- ~Ethan~

From ethan at stoneleaf.us  Tue May  6 01:57:38 2014
From: ethan at stoneleaf.us (Ethan Furman)
Date: Mon, 05 May 2014 16:57:38 -0700
Subject: [Python-ideas] A library for the deprecation of a function/class
Message-ID: <53682572.5070702@stoneleaf.us>

On 04/17/2014 12:28 PM, Stéphane Wirtel wrote:
> With the CPython sprint, I was thinking about a lib to mark a function/class as deprecated. [...]
> The deprecated decorator should check the version of the software and the version of Python if asked with the arguments.
> It will raise warnings.warn with PendingDeprecationWarning or DeprecationWarning. Can be used in the documentation, via
> introspection.

Seems like a useful idea.
Others have also thought so, and there are some code snippets at https://wiki.python.org/moin/PythonDecoratorLibrary#Generating_Deprecation_Warnings

-- ~Ethan~

From steve at pearwood.info  Tue May  6 04:39:02 2014
From: steve at pearwood.info (Steven D'Aprano)
Date: Tue, 6 May 2014 12:39:02 +1000
Subject: [Python-ideas] Implement `itertools.permutations.__getitem__` and `itertools.permutations.index`
References: <20140505171538.GR4273@ando>
Message-ID: <20140506023902.GV4273@ando>

On Tue, May 06, 2014 at 07:22:56AM +0900, INADA Naoki wrote:

> I don't think the order of permutations is an implementation detail.
> Python implementations should follow CPython's documented order.
>
> https://docs.python.org/3.4/library/itertools.html#itertools.permutations

Hmmm. Well, since the order of permutations is documented, I suppose my objection is answered. In that case, it becomes a question of whether or not there is an easy way to generate the Nth permutation without having to iterate through the previous N-1 permutations.

>> Permutations are emitted in lexicographic sort order. So, if the
>> input iterable is sorted, the permutation tuples will be produced in
>> sorted order.

I think I know what the docs are trying to say, but I'm not sure if they are quite saying it correctly. If the permutations are emitted in "lexicographic sort order", that implies that they are sortable, but that's not necessarily the case:

py> 4j > 2j
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: no ordering relation is defined for complex numbers
py> list(itertools.permutations([4j, 2j]))
[(4j, 2j), (2j, 4j)]

I think that just removing the word "sort" is sufficient: "Permutations are emitted in lexicographic order" is meaningful, and correct, even when the elements are not sortable.

-- Steven

From taleinat at gmail.com  Tue May  6 11:35:09 2014
From: taleinat at gmail.com (Tal Einat)
Date: Tue, 6 May 2014 12:35:09 +0300
Subject: [Python-ideas] Implement `itertools.permutations.__getitem__` and `itertools.permutations.index`
In-Reply-To: <20140506023902.GV4273@ando>
References: <20140505171538.GR4273@ando> <20140506023902.GV4273@ando>

On Tue, May 6, 2014 at 5:39 AM, Steven D'Aprano wrote:
> On Tue, May 06, 2014 at 07:22:56AM +0900, INADA Naoki wrote:
>
>> I don't think the order of permutations is an implementation detail.
>> Python implementations should follow CPython's documented order.
>>
>> https://docs.python.org/3.4/library/itertools.html#itertools.permutations
>
> Hmmm. Well, since the order of permutations is documented, I suppose my
> objection is answered. In that case, it becomes a question of whether or
> not there is an easy way to generate the Nth permutation without having
> to iterate through the previous N-1 permutations.

Yes, it is possible using factorial decomposition of N.

See, for an example: http://stackoverflow.com/a/7919887/40076

- Tal Einat
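A sketch of that factorial-decomposition approach, for concreteness (ithperm, the illustrative name also used in the next message, is 0-based, matches itertools' lexicographic order, and is not a stdlib function):

    from math import factorial

    def ithperm(n, i):
        """Return the i-th (0-based) lexicographic permutation of range(n)."""
        items = list(range(n))
        perm = []
        for k in range(n, 0, -1):
            # (k-1)! permutations share each choice of the next element.
            index, i = divmod(i, factorial(k - 1))
            perm.append(items.pop(index))
        return tuple(perm)

    # ithperm(3, 0) -> (0, 1, 2); ithperm(3, 5) -> (2, 1, 0)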
>> In that case, it becomes a question of whether or
>> not there is an easy way to generate the Nth permutation without having
>> to iterate through the previous N-1 permutations.
>
> Yes, it is possible using factorial decomposition of N.
>
> See, for an example: http://stackoverflow.com/a/7919887/40076

For large N, this is much slower than itertools.permutations when you
only want the first few entries.

    p = itertools.permutations(range(10000))
    for i in range(5):
        print(next(p))

vs

    for i in range(5):
        print(ithperm(10000, i))

The first is substantially faster. That's not to say that ithperm isn't
useful, just that its computational complexity may be surprising if it's
spelled as an indexing operation.

Paul

From alan.cristh at gmail.com  Tue May  6 15:04:44 2014
From: alan.cristh at gmail.com (Alan Cristhian Ruiz)
Date: Tue, 06 May 2014 10:04:44 -0300
Subject: [Python-ideas] Plug-ins for IDLE
Message-ID: <5368DDEC.40709@gmail.com>

I think it would be great to have a minimum API for writing plug-ins
for Python IDLE.

I don't know if there are people interested in this, but I do.

What do you think? Could Python benefit in some way from this?

From taleinat at gmail.com  Tue May  6 17:40:53 2014
From: taleinat at gmail.com (Tal Einat)
Date: Tue, 6 May 2014 18:40:53 +0300
Subject: [Python-ideas] Implement `itertools.permutations.__getitem__` and `itertools.permutations.index`
References: <20140505171538.GR4273@ando> <20140506023902.GV4273@ando>

On Tue, May 6, 2014 at 3:45 PM, Paul Moore wrote:
> On 6 May 2014 10:35, Tal Einat wrote:
>>> Hmmm. Well, since the order of permutations is documented, I suppose my
>>> objection is answered. In that case, it becomes a question of whether or
>>> not there is an easy way to generate the Nth permutation without having
>>> to iterate through the previous N-1 permutations.
>>
>> Yes, it is possible using factorial decomposition of N.
>>
>> See, for an example: http://stackoverflow.com/a/7919887/40076
>
> For large N, this is much slower than itertools.permutations when you
> only want the first few entries.

If someone just wants the first few entries, they probably aren't
worried about it being super fast. And if they were, they could just
iterate to get the first permutations.

As for getting anything past the first few permutations (e.g. an
arbitrary one), factorial decomposition would be faster by several
orders of magnitude than iterating from the beginning. For relatively
large permutations, iterating from the beginning could be unfeasible,
while factorial decomposition would still take far less than a second.

The real question IMO is if this is useful enough to bother including
in the stdlib. For example, I don't think it would pass the "potential
uses in the stdlib" test. Perhaps Ram (the OP) has some actual
use-cases for this?

- Tal

From p.f.moore at gmail.com  Tue May  6 17:49:36 2014
From: p.f.moore at gmail.com (Paul Moore)
Date: Tue, 6 May 2014 16:49:36 +0100
Subject: [Python-ideas] Implement `itertools.permutations.__getitem__` and `itertools.permutations.index`
References: <20140505171538.GR4273@ando> <20140506023902.GV4273@ando>

On 6 May 2014 16:40, Tal Einat wrote:
> The real question IMO is if this is useful enough to bother including
> in the stdlib. For example, I don't think it would pass the "potential
> uses in the stdlib" test. Perhaps Ram (the OP) has some actual
> use-cases for this?

Agreed, I suspect this is more appropriate as a utility on PyPI.
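(For reference, the factorial decomposition Tal pointed at only takes a
few lines -- this is just a rough sketch, not a tuned implementation:

    from math import factorial

    def ithperm(n, index):
        # Return the index-th permutation of range(n), in the same
        # positional order that itertools.permutations uses.
        items = list(range(n))
        result = []
        while items:
            i, index = divmod(index, factorial(len(items) - 1))
            result.append(items.pop(i))
        return tuple(result)

The cost grows with n, not with how far into the sequence you index.)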
But I stand by my statement that wherever it's implemented, it should
*not* be spelled permutations(x)[N], as having indexing with a small
index be significantly slower than a few calls to next() is a nasty
performance trap for the unwary (no matter how rare it will be in
practice).

Paul

From taleinat at gmail.com  Tue May  6 17:51:19 2014
From: taleinat at gmail.com (Tal Einat)
Date: Tue, 6 May 2014 18:51:19 +0300
Subject: [Python-ideas] Plug-ins for IDLE
In-Reply-To: <5368DDEC.40709@gmail.com>
References: <5368DDEC.40709@gmail.com>

On Tue, May 6, 2014 at 4:04 PM, Alan Cristhian Ruiz wrote:
> I think it would be great to have a minimum API for writing plug-ins
> for Python IDLE.
>
> I don't know if there are people interested in this, but I do.
>
> What do you think? Could Python benefit in some way from this?

This already exists! In IDLE they're called Extensions. See extend.txt
(link below) in the IDLE source code for details on how to write them.

IDLE also ships with several built-in plugins which provide some key
functionality, such as auto-completion and parenthesis matching. You
can check those out for examples.

http://hg.python.org/cpython/file/v3.4.0/Lib/idlelib/extend.txt

- Tal

From ram.rachum at gmail.com  Wed May  7 19:21:25 2014
From: ram.rachum at gmail.com (Ram Rachum)
Date: Wed, 7 May 2014 20:21:25 +0300
Subject: [Python-ideas] Implement `itertools.permutations.__getitem__` and `itertools.permutations.index`
References: <20140505171538.GR4273@ando> <20140506023902.GV4273@ando>

Hi Tal,

I'm using it for a project of my own (optimizing keyboard layout) but I
can't make the case that it's useful for the stdlib. I'd understand if
it would be omitted for not being enough of a common need.

On Tue, May 6, 2014 at 6:40 PM, Tal Einat wrote:

> If someone just wants the first few entries, they probably aren't
> worried about it being super fast. And if they were, they could just
> iterate to get the first permutations.
>
> As for getting anything past the first few permutations (e.g. an
> arbitrary one), factorial decomposition would be faster by several
> orders of magnitude than iterating from the beginning. For relatively
> large permutations, iterating from the beginning could be unfeasible,
> while factorial decomposition would still take far less than a second.
>
> The real question IMO is if this is useful enough to bother including
> in the stdlib. For example, I don't think it would pass the "potential
> uses in the stdlib" test. Perhaps Ram (the OP) has some actual
> use-cases for this?
>
> - Tal
From taleinat at gmail.com  Wed May  7 19:40:22 2014
From: taleinat at gmail.com (Tal Einat)
Date: Wed, 7 May 2014 20:40:22 +0300
Subject: [Python-ideas] Implement `itertools.permutations.__getitem__` and `itertools.permutations.index`
References: <20140505171538.GR4273@ando> <20140506023902.GV4273@ando>

On Wed, May 7, 2014 at 8:21 PM, Ram Rachum wrote:
> Hi Tal,
>
> I'm using it for a project of my own (optimizing keyboard layout) but I
> can't make the case that it's useful for the stdlib. I'd understand if
> it would be omitted for not being enough of a common need.

At the least, this (a function for getting a specific permutation by
lexicographical-order index) could make a nice cookbook recipe.

- Tal

From ram.rachum at gmail.com  Wed May  7 19:43:20 2014
From: ram.rachum at gmail.com (Ram Rachum)
Date: Wed, 7 May 2014 20:43:20 +0300
Subject: [Python-ideas] Implement `itertools.permutations.__getitem__` and `itertools.permutations.index`
References: <20140505171538.GR4273@ando> <20140506023902.GV4273@ando>

I'm probably going to implement it in my python_toolbox package. I
already implemented 30% and it's really cool. It's at the point where I
doubt that I want it in the stdlib because I've gotten so much awesome
functionality into it and I'd hate to (a) have 80% of it stripped and
(b) have the class names changed to be non-Pythonic :)

On Wed, May 7, 2014 at 8:40 PM, Tal Einat wrote:

> At the least, this (a function for getting a specific permutation by
> lexicographical-order index) could make a nice cookbook recipe.
>
> - Tal

From alonisser at gmail.com  Wed May  7 22:40:56 2014
From: alonisser at gmail.com (alonn)
Date: Wed, 7 May 2014 23:40:56 +0300
Subject: [Python-ideas] Things I wish Pip learned from Npm

A "Rant" I wrote about pip and npm. Maybe someone would find this
interesting or even useful in thinking on Pip's future.

I would like to stress that this is really not meant to hurt anyone and
in particular the great members of this open source community who bear
the burden of developing and maintaining Pip.

https://medium.com/devops-programming/f712fa26f5bc

Twitter: @alonisser
LinkedIn Profile
Facebook

From mal at egenix.com  Wed May  7 22:48:29 2014
From: mal at egenix.com (M.-A. Lemburg)
Date: Wed, 07 May 2014 22:48:29 +0200
Subject: [Python-ideas] Things I wish Pip learned from Npm
Message-ID: <536A9C1D.6000703@egenix.com>

On 07.05.2014 22:40, alonn wrote:
> A "Rant" I wrote about pip and npm. Maybe someone would find this
> interesting or even useful in thinking on Pip's future.
>
> I would like to stress that this is really not meant to hurt anyone
> and in particular the great members of this open source community who
> bear the burden of developing and maintaining Pip.
>
> https://medium.com/devops-programming/f712fa26f5bc

Please note that you should probably post this to the pip mailing list
and/or the distutils list.
python-ideas is about ideas for Python itself and even though Python 3.4
includes bootstrap code to install pip, pip itself is not developed by
the Python Core Devs.

Thanks,
--
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, May 07 2014)
>>> Python Projects, Consulting and Support ...   http://www.egenix.com/
>>> mxODBC.Zope/Plone.Database.Adapter ...       http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________
2014-04-24: Released mxODBC.Connect 2.0.5 ...     http://egenix.com/go55

::::: Try our mxODBC.Connect Python Database Interface for free ! ::::::

   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
    D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
           Registered at Amtsgericht Duesseldorf: HRB 46611
               http://www.egenix.com/company/contact/

From wegge at wegge.dk  Wed May  7 22:50:00 2014
From: wegge at wegge.dk (Anders Wegge Keller)
Date: 07 May 2014 22:50:00 +0200
Subject: [Python-ideas] Things I wish Pip learned from Npm
Message-ID: <871tw52u13.fsf@huddi.jernurt.dk>

alonn writes:

> A "Rant" I wrote about pip and npm. Maybe someone would find this
> interesting or even useful in thinking on Pip's future.
>
> https://medium.com/devops-programming/f712fa26f5bc

One nit:

  $ Selenium==2.4.1
  ...
  $ Selenium==2.35.0
  Pip Just downgraded a package version without even asking for

I would say that pip upgraded Selenium 31 minor releases. However, I
come from a background where each of major.minor.rev can be a
multi-digit number, so I might be wrong in this context.

--
/Wegge
Leder efter redundant peering af dk.*,linux.debian.*

From ncoghlan at gmail.com  Thu May  8 01:09:44 2014
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 8 May 2014 09:09:44 +1000
Subject: [Python-ideas] Things I wish Pip learned from Npm

On 8 May 2014 06:41, "alonn" wrote:
>
> A "Rant" I wrote about pip and npm. Maybe someone would find this
> interesting or even useful in thinking on Pip's future.
>
> https://medium.com/devops-programming/f712fa26f5bc

While MAL is correct that distutils-sig is a better list, I also suggest
reading packaging.python.org and the referenced PEPs (especially PEP
426), along with articles like http://lwn.net/Articles/580399/ to come
up to speed with the current state of play in the Python packaging
ecosystem.

Cheers,
Nick.

From flying-sheep at web.de  Thu May  8 17:04:12 2014
From: flying-sheep at web.de (Philipp A.)
Date: Thu, 8 May 2014 17:04:12 +0200
Subject: [Python-ideas] Things I wish Pip learned from Npm
In-Reply-To: <536A9C1D.6000703@egenix.com>
References: <536A9C1D.6000703@egenix.com>

2014-05-07 22:48 GMT+02:00 M.-A.
Lemburg:

> Please note that you should probably post this to the pip mailing list
> and/or the distutils list.
>
> python-ideas is about ideas for Python itself and even though Python 3.4
> includes bootstrap code to install pip, pip itself is not developed by
> the Python Core Devs.

There's one point that's relevant and worth discussing here, I quote:

*pip's choice of defaulting to a global installation is wrong*

Yes. Python's pip bootstrapping is there in order to guarantee that all
the "just type pip install foobar" tutorials work.

Linux distributions like Arch, Debian and Ubuntu deliberately broke this
guarantee, and for good reasons: Pip per default installs globally,
which should be the system package manager's territory.

It would be best if pip would work like this if you run it without some
-g, --global switch outside a venv:

On Linux: "You're on Linux, please use your package manager for global
installation or use a virtual environment. Use the -g switch to force
global installation"

On Windows and OSX: "Please use the -g switch for installations outside
of virtual environments"

Is it too late to change this? I really *want* the guarantee and pip to
work. But I also totally understand why all those distributions break
it!

From ncoghlan at gmail.com  Thu May  8 17:12:27 2014
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Fri, 9 May 2014 01:12:27 +1000
Subject: [Python-ideas] Things I wish Pip learned from Npm
References: <536A9C1D.6000703@egenix.com>

On 9 May 2014 01:05, "Philipp A." wrote:
>
> Is it too late to change this? I really want the guarantee and pip to
> work. But I also totally understand why all those distributions break
> it!

There's already an open pip issue to change the default install location
to be user installs (at least on POSIX systems), so no, that's not a
novel idea, and this still isn't the right list to discuss it.

(I'll also note that Fedora was able to implement PEP 453 successfully,
so it is certainly possible for distros to comply with it)

Cheers,
Nick.

From donald at stufft.io  Thu May  8 17:16:08 2014
From: donald at stufft.io (Donald Stufft)
Date: Thu, 8 May 2014 11:16:08 -0400
Subject: [Python-ideas] Things I wish Pip learned from Npm
References: <536A9C1D.6000703@egenix.com>

On May 8, 2014, at 11:04 AM, Philipp A. wrote:

> There's one point that's relevant and worth discussing here, I quote:
>
> *pip's choice of defaulting to a global installation is wrong*
>
> Yes. Python's pip bootstrapping is there in order to guarantee that all
> the "just type pip install foobar" tutorials work.
> Linux distributions like Arch, Debian and Ubuntu deliberately broke
> this guarantee, and for good reasons:
>
> Pip per default installs globally, which should be the system package
> manager's territory.

This isn't exactly accurate. Linux distributions aren't installing pip
globally using ensurepip, which is an entirely different thing than pip.
We never expected Linux distros *to* use ensurepip for that purpose. If
you install python-pip on any of these distros you still get a version
of pip that installs things globally. Some distros (Fedora I believe)
are making Python depend on pip so the outcome is exactly the same.

> It would be best if pip would work like this if you run it without
> some -g, --global switch outside a venv:

I don't think pip will ever require a virtual environment by default.
However there is an open ticket to make --user installs the default
when running as non-root.

> On Linux: "You're on Linux, please use your package manager for global
> installation or use a virtual environment. Use the -g switch to force
> global installation"
>
> On Windows and OSX: "Please use the -g switch for installations
> outside of virtual environments"
>
> Is it too late to change this? I really want the guarantee and pip to
> work. But I also totally understand why all those distributions break
> it!

Like I said above, no distribution currently breaks pip, a few have a
broken ensurepip but that is being fixed.

-----------------
Donald Stufft
PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926  F04F 6E3C BCE9 3372 DCFA

From barry at python.org  Thu May  8 21:22:31 2014
From: barry at python.org (Barry Warsaw)
Date: Thu, 8 May 2014 15:22:31 -0400
Subject: [Python-ideas] Things I wish Pip learned from Npm
References: <536A9C1D.6000703@egenix.com>
Message-ID: <20140508152231.3387e6d4@anarchist.wooz.org>

On May 08, 2014, at 11:16 AM, Donald Stufft wrote:

> Some distros (Fedora I believe) are making Python depend on pip so the
> outcome is exactly the same.

Debian and Ubuntu won't be providing pip automatically outside of a
virtualenv, but we'll provide some hints as to how to install it using
the OS package manager.

I'm close to a solution for
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=732703 which will fix
pyvenv and pip inside a pyvenv. It's not at all as straightforward as
you might think, but I have A Plan.

I still think pip-outside-venv should install to --user by default so
I'm glad pip upstream is moving in that direction. There's a use case,
although IMO fairly limited, for global installation via `sudo pip` so
we'll support that, but you'll have to use the python*-pip distro
package for that to work.

Cheers,
-Barry

From bunslow at gmail.com  Fri May  9 07:38:34 2014
From: bunslow at gmail.com (Bill Winslow)
Date: Fri, 9 May 2014 00:38:34 -0500
Subject: [Python-ideas] Adding an "export" decorator in (e.g.)
 functools

Hey guys.

This is a simple pattern I've added to my own "myutils.py" file, which I
think could see wide use if added to the standard library.

Simply, it's the following function meant to be used primarily as a
decorator.

    def export(func):
        global __all__
        if callable(func) and hasattr(func, '__name__'):
            try:
                __all__.append(func.__name__)
            except NameError:
                __all__ = [func.__name__]
        return func

Then, instead of having a magic __all__ declaration at the top of a
module with a list of strings (that may or may not be accurate [of
course stdlib modules are maintained more rigorously]), people writing
libraries can instead use the following idiom:

    import stuff_not_meant_to_be_visible

    def _private_func_1(): pass

    def _private_func_2(): pass

    @export
    def public_func_1(): pass

    @export
    def public_func_2(): pass

Of course, this doesn't actually solve any problem, because programmers
using best practice will prepend underscores to private functions and
define their __all__ properly. However I still think this might be worth
adding to the stdlib (presumably in functools) because

1) Readability counts (and explicit is better than implicit): it's easy
to determine that, other than "well there's no underscore so this is
probably a public function", that "yes, the library author meant for
this function to be used".

2) Proper maintenance becomes easier. Attaching a small decorator next
to each public function is easier to remember than remembering to add an
arbitrary string to an arbitrary global constant. There is also the
added benefit that renaming/refactoring also doesn't require modifying
the magic global when you're done.

3) It helps encourage best practices, especially among either lazy
programmers or those new to Python.

One possible counter-argument is that it's not very important/isn't a
core feature for library inclusion: well, things like lru_cache or
total_ordering aren't core features, but they are nice to have, which is
why they were added; export would fall into the same category.

What are everyone's thoughts?

From rosuav at gmail.com  Fri May  9 07:59:35 2014
From: rosuav at gmail.com (Chris Angelico)
Date: Fri, 9 May 2014 15:59:35 +1000
Subject: [Python-ideas] Adding an "export" decorator in (e.g.) functools

On Fri, May 9, 2014 at 3:38 PM, Bill Winslow wrote:
> 2) Proper maintenance becomes easier. Attaching a small decorator next
> to each public function is easier to remember than remembering to add
> an arbitrary string to an arbitrary global constant. There is also the
> added benefit that renaming/refactoring also doesn't require modifying
> the magic global when you're done.

+1 for this reason. Attaching info to code is the purpose of docstrings,
and it makes very good sense to implement __all__ the same way.

But your given implementation seems to have a problem: how can you
import that into another module? It looks at "global __all__", which
will look at the module it's implemented in. Would it work like this,
perhaps?

    # This is starting to read a little oddly, but oh well :)
    from eustace_scrubb import export, government, drain

    # ... define your functions with @export, as above ...

    __all__ = export.get_all()

The get_all() function would return the list, and empty it in readiness
for the next module. It's non-reentrant, so you'd have to make sure you
don't import any other modules in the middle of defining your own
exports.
Or is there something I'm not seeing about your original export() that
makes it work?

ChrisA

From bunslow at gmail.com  Fri May  9 09:12:01 2014
From: bunslow at gmail.com (Bill Winslow)
Date: Fri, 9 May 2014 02:12:01 -0500
Subject: [Python-ideas] Adding an "export" decorator in (e.g.) functools

> But your given implementation seems to have a problem: how can
> you import that into another module?

And thus I am caught as not having actually tested this -- you are of
course correct. Having a statement at the bottom defining "__all__" as
you suggest would (partially) defeat the point, which is to make it easy
for the programmer to state what should be public; that is, it shouldn't
be necessary to muck with __all__. I'm trying to avoid the
magicness/arbitrariness of assignments to __all__.

I'll try and think up an alternative implementation that would work as
advertised when imported from another module. (Note that if the code
itself were copy and pasted, the function would work fine, yet importing
it fails -- something I have not yet encountered in Python. This also
suggests one trivial solution -- import a function that instead exec()'s
the definition above.) I suspect I might have to learn something about
import internals to come up with a (better-than-the-trivial) solution.
(Copy and pasting code is of course unacceptable as well.)

> Here's a simpler implementation:

That is of course the same as mine, except less error checking and also
assumes the global already exists (remember, one goal is to not have to
muck with __all__ in any way, not even declaring it above the function
definition). At least somebody else also had the same idea; hopefully I
can come up with an importable solution...

From berker.peksag at gmail.com  Fri May  9 09:06:47 2014
From: berker.peksag at gmail.com (Berker Peksağ)
Date: Fri, 9 May 2014 10:06:47 +0300
Subject: [Python-ideas] Adding an "export" decorator in (e.g.) functools

On Fri, May 9, 2014 at 8:38 AM, Bill Winslow wrote:
> Hey guys.
>
> This is a simple pattern I've added to my own "myutils.py" file, which
> I think could see wide use if added to the standard library.
>
> Simply, it's the following function meant to be used primarily as a
> decorator.
>
>     def export(func):
>         global __all__
>         if callable(func) and hasattr(func, '__name__'):
>             try:
>                 __all__.append(func.__name__)
>             except NameError:
>                 __all__ = [func.__name__]
>         return func

Here's a simpler implementation:

http://hg.python.org/release/file/b270b4d5cf2c/3.4/dryparse/dryparse.py#l57

--Berker

From rosuav at gmail.com  Fri May  9 09:38:18 2014
From: rosuav at gmail.com (Chris Angelico)
Date: Fri, 9 May 2014 17:38:18 +1000
Subject: [Python-ideas] Adding an "export" decorator in (e.g.) functools

On Fri, May 9, 2014 at 5:12 PM, Bill Winslow wrote:
>> But your given implementation seems to have a problem: how can
>> you import that into another module?
>
> And thus I am caught as not having actually tested this -- you are of
> course correct.
> Having a statement at the bottom defining "__all__" as you suggest
> would (partially) defeat the point, which is to make it easy for the
> programmer to state what should be public; that is, it shouldn't be
> necessary to muck with __all__. I'm trying to avoid the
> magicness/arbitrariness of assignments to __all__.

I don't mind the concept of one-line directives to specify things. In
the same way that you would put "import socket" at the top if you use
sockets, you put "__all__ = export.get_all()" at the bottom to capture
all the __all__ entries. It still deduplicates and brings the
information right to where the function's defined, so there is some
value in it.

> I'll try and think up an alternative implementation that would work as
> advertised when imported from another module. (Note that if the code
> itself were copy and pasted, the function would work fine, yet
> importing it fails -- something I have not yet encountered in Python.
> This also suggests one trivial solution -- import a function that
> instead exec()'s the definition above.) I suspect I might have to
> learn something about import internals to come up with a
> (better-than-the-trivial) solution. (Copy and pasting code is of
> course unacceptable as well.)

For the exec method to work, it would have to be passed a reference to
globals() for the calling module, so you can simplify it. I don't know
how useful it would be, but this ought to work:

    # my_tools.py
    def make_all(globls, listname="__all__"):
        globls[listname] = []
        def grabber(obj):
            globls[listname].append(obj.__name__)
            return obj
        return grabber

    # your module
    import my_tools
    export = my_tools.make_all(globals())

    def _private_func_1(): pass

    def _private_func_2(): pass

    @export
    def public_func_1(): pass

    @export
    def public_func_2(): pass

This is still a bit magical, in that you assign to the name "export"
and it actually is for setting __all__, but it's better than exec :)

ChrisA

From bunslow at gmail.com  Fri May  9 10:03:02 2014
From: bunslow at gmail.com (Bill Winslow)
Date: Fri, 9 May 2014 03:03:02 -0500
Subject: [Python-ideas] Adding an "export" decorator in (e.g.) functools

On Fri, May 9, 2014 at 2:38 AM, Chris Angelico wrote:
> (By the way: You responded to two different posts in yours, and didn't
> cite either of them. Please keep the original-poster line, such as
> you'll see on the next non-blank line.)

Sorry -- I think this is correct? I'm new to mailing lists :P

> I don't mind the concept of one-line directives to specify things. In
> the same way that you would put "import socket" at the top if you use
> sockets, you put "__all__ = export.get_all()" at the bottom to capture
> all the __all__ entries. It still deduplicates and brings the
> information right to where the function's defined, so there is some
> value in it.

Fair enough, but I'd still like to avoid such statements if possible. I
think we can do better.
> For the exec method to work, it would have to be passed a reference to
> globals() for the calling module, so you can simplify it. I don't know
> how useful it would be, but this ought to work:
>
>     # my_tools.py
>     def make_all(globls, listname="__all__"):
>         globls[listname] = []
>         def grabber(obj):
>             globls[listname].append(obj.__name__)
>             return obj
>         return grabber
>
>     # your module
>     import my_tools
>     export = my_tools.make_all(globals())
>
>     @export
>     def public_func_1(): pass
>
>     @export
>     def public_func_2(): pass
>
> This is still a bit magical, in that you assign to the name "export"
> and it actually is for setting __all__, but it's better than exec :)

That's what I had basically just got working, except with exec instead
of just straight up modifying the globals... and if you're going to do
that, you may as well directly assign globls['export'] = grabber :P
(still better than an exec of course :D).

On the other hand, while testing my version of the above (with exec), I
ran into another issue (or at least I perceive it as such): the import
machinery only respects __all__ *if* we are importing * from the module.
If instead we do "import module as m", the *entire* namespace of the
module is made available in m, even names that start with an underscore.

Can someone please explain the rationale behind that? I would consider
this surprising. Why should "import module" give different results than
"from module import *"? If the latter can be author-limited, why not the
former as well? Even a pointer to relevant documentation would be
helpful.

(Basically, I had this idea because I thought that __all__ was way more
important than it apparently is.)

-Bill

From rosuav at gmail.com  Fri May  9 10:17:16 2014
From: rosuav at gmail.com (Chris Angelico)
Date: Fri, 9 May 2014 18:17:16 +1000
Subject: [Python-ideas] Adding an "export" decorator in (e.g.) functools

On Fri, May 9, 2014 at 6:03 PM, Bill Winslow wrote:
> On Fri, May 9, 2014 at 2:38 AM, Chris Angelico wrote:
>> (By the way: You responded to two different posts in yours, and didn't
>> cite either of them. Please keep the original-poster line, such as
>> you'll see on the next non-blank line.)
>
> Sorry -- I think this is correct? I'm new to mailing lists :P

Yep, looks good! Thanks! You'll find this sort of thing helpful even
with non-list mail, too. As soon as you get a long and complex
discussion (particularly between more than two people), it's useful to
bottom-post, trim quotes, maintain proper citations, etc, etc. Good
habits to be in.

> That's what I had basically just got working, except with exec instead
> of just straight up modifying the globals... and if you're going to do
> that, you may as well directly assign globls['export'] = grabber :P
> (still better than an exec of course :D).

You could do it that way too, yes. That would shorten the usage a
little.

> On the other hand, while testing my version of the above (with exec),
> I ran into another issue (or at least I perceive it as such): the
> import machinery only respects __all__ *if* we are importing * from
> the module. If instead we do "import module as m", the *entire*
> namespace of the module is made available in m, even names that start
> with an underscore.
>
> Can someone please explain the rationale behind that? I would consider
> this surprising.
Why should "import module" give different results than "from > module import *"? If the latter can be author-limited, why not the former as > well? Even a pointer to relevant documentation would be helpful. > > (Basically, I had this idea because I thought that __all__ was way more > important than it apparently is.) Yep. When you "import module" (optionally "as m"), you can reference whatever you want; __all__ is to help tame an "import * from". But neither of them can truly hide anything; Python doesn't work that way. You can always go digging around and finding the internals of something. The nearest you can get to truly hiding something from the namespace is to del it when you're done, which obviously means you can't reference it yourself either. (Has its uses, though; for instance, you might undefine export when you're done with it, in your above examples.) ChrisA From __peter__ at web.de Fri May 9 10:45:33 2014 From: __peter__ at web.de (Peter Otten) Date: Fri, 09 May 2014 10:45:33 +0200 Subject: [Python-ideas] Adding an "export" decorator in (e.g.) functools References: Message-ID: Bill Winslow wrote: > Hey guys. > > This is simple pattern I've added to my own "myutils.py" file, which I > think could see wide use if added to the standard library. > > Simply, it's the following function meant to be used primarily as a > decorator. > > def export(func): > global __all__ > if callable(func) and hasattr(func, '__name__'): > try: > __all__.append(func.__name__) > except NameError: > __all__ = [func.__name__] > return func > > > Then, instead of having a magic __all__ declaration at the top of a module > with a list of strings (that may or may not be accurate [of course stdlib > modules are maintained more rigorously]), people writing libraries can > instead use the following idiom: > > import stuff_not_meant_to_be_visible > > def _private_func_1(): pass > > def _private_func_2(): pass > > @export > def public_func_1(): pass > > @export > def public_func_2(): pass > > > Of course, this doesn't actually solve any problem, because programmers > using best-practice will prepend underscores to private functions and > define their __all__ properly. > > However I still think this might be worth adding to the stdlib (presumably > in functools) because > > 1) Readability counts (and explicit is better than implicit): it's easy to > determine that, other than "well there's no underscore so this is probably > a public function", that "yes, the library author meant for this function > to be used". > > 2) Proper maintenance becomes easier. Attaching a small decorator next to > each public function is easier to remember than remembering to add an > arbitrary string to an arbitrary global constant. There is also the added > benefit that renaming/refactoring also doesn't require modifying the magic > global when you're done. > > 3) It helps encourage best practices, especially among either lazy > programmers or those new to Python. > > > One possible counter argument is that it's not very important/isn't a core > feature for library inclusion: > Well, things like an lru_cache or total_ordering aren't core features, but > they are nice to have, which is why they were added; export would fall > into the same category. > > What are everyone's thoughts? I rarely use star imports, and the decorator is mostly noise in my eyes, so that's a clear -1 from me. 
I'm mostly posting to suggest an alternative implementation for your
personal use ;)

    import sys

    def export(f):
        sys._getframe(1).f_globals.setdefault("__all__", []).append(f.__name__)
        return f

From rosuav at gmail.com  Fri May  9 10:53:23 2014
From: rosuav at gmail.com (Chris Angelico)
Date: Fri, 9 May 2014 18:53:23 +1000
Subject: [Python-ideas] Adding an "export" decorator in (e.g.) functools

On Fri, May 9, 2014 at 6:45 PM, Peter Otten <__peter__ at web.de> wrote:
> I'm mostly posting to suggest an alternative implementation for your
> personal use ;)
>
>     def export(f):
>         sys._getframe(1).f_globals.setdefault("__all__", []).append(f.__name__)
>         return f

sys._getframe, in my opinion, isn't so much code "smell" as
"Mythbusters' 1987 Chevrolet"... :)

ChrisA

From niki.spahiev at gmail.com  Fri May  9 12:12:02 2014
From: niki.spahiev at gmail.com (Niki Spahiev)
Date: Fri, 09 May 2014 13:12:02 +0300
Subject: [Python-ideas] OrderedDict literals

Hello,

Currently the expression (a=1, b=2) is a syntax error. If it's defined
to mean (('a', 1), ('b', 2)) it can be used when making an OrderedDict
or anything that requires named, ordered args, e.g.

    OrderedDict((a=1, b=2))

Another variant, with more changes in the VM, is

    OrderedDict(**(a=1, b=2))

Niki

From jsbueno at python.org.br  Fri May  9 13:07:49 2014
From: jsbueno at python.org.br (Joao S. O. Bueno)
Date: Fri, 9 May 2014 08:07:49 -0300
Subject: [Python-ideas] Adding an "export" decorator in (e.g.) functools

On 9 May 2014 05:53, Chris Angelico wrote:
> On Fri, May 9, 2014 at 6:45 PM, Peter Otten <__peter__ at web.de> wrote:
>> I'm mostly posting to suggest an alternative implementation for your
>> personal use ;)
>>
>>     def export(f):
>>         sys._getframe(1).f_globals.setdefault("__all__", []).append(f.__name__)
>>         return f

In this case it could be .... f.__globals__.setdefault ... (instead of
the sys._getframe)

Anyway, I also dislike the idea on the basis that __all__ is not that
useful in itself, and people coming from static languages (and worse,
people building "pylinters") might come to find this a "good practice"
to the point of being mandatory (else, fail the linter): and voilà: a
lot of noise to the language.

js
-><-

> sys._getframe, in my opinion, isn't so much code "smell" as
> "Mythbusters' 1987 Chevrolet"... :)
>
> ChrisA

From ncoghlan at gmail.com  Fri May  9 13:58:51 2014
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Fri, 9 May 2014 21:58:51 +1000
Subject: [Python-ideas] Adding an "export" decorator in (e.g.) functools

On 9 May 2014 18:53, Chris Angelico wrote:
> sys._getframe, in my opinion, isn't so much code "smell" as
> "Mythbusters' 1987 Chevrolet"... :)

As this subthread suggests, the main problem with the concept of
decorator-based public/private markings is "If the implementation is
hard to explain, it's a bad idea."
You essentially have to use some form of dynamic scoping in order to
modify __all__ in the right module, and then that limits your ability to
wrap the export decorator inside other helper functions.

In many cases, the fact that underscore-prefixed names are excluded from
the implicit __all__ is sufficient to avoid the need to worry too much
about defining an explicit __all__ attribute. Problems typically only
arise due to imported modules being implicitly re-exported.

If folks really want to avoid defining an explicit __all__, then it
isn't that hard to define a helper function that allows a module to be
finished with a line like:

    __all__ = all_without_modules(globals())

It generally isn't worth the hassle, though, especially when star
imports are strongly discouraged in the first place.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From barry at python.org  Fri May  9 19:25:03 2014
From: barry at python.org (Barry Warsaw)
Date: Fri, 9 May 2014 13:25:03 -0400
Subject: [Python-ideas] Things I wish Pip learned from Npm
References: <536A9C1D.6000703@egenix.com> <20140508152231.3387e6d4@anarchist.wooz.org>
Message-ID: <20140509132503.0fd64423@anarchist.wooz.org>

On May 08, 2014, at 03:22 PM, Barry Warsaw wrote:

> Debian and Ubuntu won't be providing pip automatically outside of a
> virtualenv, but we'll provide some hints as to how to install it using
> the OS package manager.

I should clarify this. I think python3-pip should be a Recommends (for
python3 I guess) and apt-get installs Recommends by default, so to most
people it'll seem like you get it automatically. But you can disable
this with apt-get's --no-install-recommends flag.

Cheers,
-Barry

From greg.ewing at canterbury.ac.nz  Sat May 10 01:15:43 2014
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sat, 10 May 2014 11:15:43 +1200
Subject: [Python-ideas] Adding an "export" decorator in (e.g.) functools
Message-ID: <536D619F.2090007@canterbury.ac.nz>

Nick Coghlan wrote:
> You essentially have to use some form of dynamic scoping in order to
> modify __all__ in the right module, and then that limits your ability
> to wrap the export decorator inside other helper functions.

The version that uses the function's f_globals directly doesn't have
that problem.

--
Greg

From ncoghlan at gmail.com  Sat May 10 14:02:52 2014
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 10 May 2014 22:02:52 +1000
Subject: [Python-ideas] Adding an "export" decorator in (e.g.) functools
In-Reply-To: <536D619F.2090007@canterbury.ac.nz>
References: <536D619F.2090007@canterbury.ac.nz>

On 10 May 2014 09:16, "Greg Ewing" wrote:
> The version that uses the function's f_globals directly doesn't have
> that problem.

It has a different problem: f_globals may come from a wrapper function
applied by a decorator that lives in a different module (or replace it
with a callable that has no "f_globals" at all).
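For instance (a made-up helper module, purely to illustrate the failure
mode):

    # helpers.py
    import functools

    def trace(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            print('calling', func.__name__)
            return func(*args, **kwargs)
        return wrapper

    # mymodule.py -- assume an f_globals-based export() is in scope
    from helpers import trace

    @export   # export sees wrapper.__globals__, which is helpers'
    @trace    # namespace, so the name lands in helpers.__all__
    def f():
        pass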
An export decorator like this is a neat idea that might work within the
confines of a single project or organisation, but it's inherently too
fragile to make it a generally available part of the standard library.

Cheers,
Nick.

From dw+python-ideas at hmmz.org  Sat May 10 20:04:36 2014
From: dw+python-ideas at hmmz.org (dw+python-ideas at hmmz.org)
Date: Sat, 10 May 2014 18:04:36 +0000
Subject: [Python-ideas] Adding an "export" decorator in (e.g.) functools
Message-ID: <20140510180436.GA16032@k2>

On Fri, May 09, 2014 at 10:45:33AM +0200, Peter Otten wrote:

> I rarely use star imports, and the decorator is mostly noise in my
> eyes, so that's a clear -1 from me.

__all__ also prettifies e.g. pydoc output, which I'd consider a +1. I'm
overall not fond of hiding simple mechanisms using unnecessary magic,
especially when the result doesn't generalize to all possible uses of
that mechanism (e.g. this approach doesn't work for exporting simple
variables).

> def export(f):
>     sys._getframe(1).f_globals.setdefault("__all__", []).append(f.__name__)
>     return f

How about:

    import sys

    def export(fn):
        mod = sys.modules[fn.__module__]
        lst = vars(mod).setdefault('__all__', [])
        lst.append(fn.__name__)
        return fn

From markus at unterwaditzer.net  Sat May 10 21:30:55 2014
From: markus at unterwaditzer.net (Markus Unterwaditzer)
Date: Sat, 10 May 2014 21:30:55 +0200
Subject: [Python-ideas] Adding an "export" decorator in (e.g.) functools
Message-ID: <20140510193055.GA956@chromebot.unti>

On Fri, May 09, 2014 at 12:38:34AM -0500, Bill Winslow wrote:
> Hey guys.
>
> This is a simple pattern I've added to my own "myutils.py" file, which
> I think could see wide use if added to the standard library.
>
> Simply, it's the following function meant to be used primarily as a
> decorator.
>     def export(func):
>         global __all__
>         if callable(func) and hasattr(func, '__name__'):
>             try:
>                 __all__.append(func.__name__)
>             except NameError:
>                 __all__ = [func.__name__]
>         return func

Another version which doesn't have the bug mentioned by Chris, and works
without getframe:

    import sys

    def export(f):
        module = sys.modules[f.__module__]
        if not hasattr(module, '__all__'):
            module.__all__ = []
        module.__all__.append(f.__name__)
        return f

-- Markus

From ram.rachum at gmail.com  Thu May 15 22:02:56 2014
From: ram.rachum at gmail.com (Ram Rachum)
Date: Thu, 15 May 2014 13:02:56 -0700 (PDT)
Subject: [Python-ideas] Expose `itertools.count.start` and implement `itertools.count.__eq__` based on it, like `range`.
Message-ID: <082cd87a-aeb5-49bf-9f79-d99a6d18e402@googlegroups.com>

I suggest exposing `itertools.count.start` and implementing
`itertools.count.__eq__` based on it. This'll provide the same benefits
that `range` got by exposing `range.start` and allowing `range.__eq__`.

From tjreedy at udel.edu  Fri May 16 00:51:57 2014
From: tjreedy at udel.edu (Terry Reedy)
Date: Thu, 15 May 2014 18:51:57 -0400
Subject: [Python-ideas] Expose `itertools.count.start` and implement `itertools.count.__eq__` based on it, like `range`.
In-Reply-To: <082cd87a-aeb5-49bf-9f79-d99a6d18e402@googlegroups.com>
References: <082cd87a-aeb5-49bf-9f79-d99a6d18e402@googlegroups.com>

On 5/15/2014 4:02 PM, Ram Rachum wrote:
> I suggest exposing `itertools.count.start` and implementing
> `itertools.count.__eq__` based on it. This'll provide the same benefits
> that `range` got by exposing `range.start` and allowing `range.__eq__`.

The benefits cannot be the same because range and count are in different
categories.

A range object is an immutable, constant-attribute, reiterable sequence
object. It makes sense to expose the read-only constants and compare on
the basis of them. This is as sensible as comparing other sequences.

A count is an iterator. We do not try to compare iterators (except by
identity). The start value is only the initial value yielded. As soon as
values are pulled from the iterator, the starting value is history.

The generator equivalent in the doc can be condensed a bit to how I
would actually write it.

    def count(start=0, step=1):
        while True:
            yield start
            start += step

For an iterator class, I would save the start parameter as self.n,
.count, or .current. In other words, something equivalent to

    def __init__(self, start=0, step=1):
        self.count = start
        self.step = step

If you want an augmented iterator class, you should write one yourself
for your specific needs.
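A minimal sketch of such a wrapper (note that, per the above, comparing
by the remembered start is of dubious meaning once values have been
pulled):

    import itertools

    class Count:
        def __init__(self, start=0, step=1):
            self.start = start
            self.step = step
            self._it = itertools.count(start, step)

        def __iter__(self):
            return self

        def __next__(self):
            return next(self._it)

        def __eq__(self, other):
            return (isinstance(other, Count) and
                    (self.start, self.step) == (other.start, other.step))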
--
Terry Jan Reedy

From steve at pearwood.info  Fri May 16 01:35:36 2014
From: steve at pearwood.info (Steven D'Aprano)
Date: Fri, 16 May 2014 09:35:36 +1000
Subject: [Python-ideas] Expose `itertools.count.start` and implement `itertools.count.__eq__` based on it, like `range`.
In-Reply-To: <082cd87a-aeb5-49bf-9f79-d99a6d18e402@googlegroups.com>
References: <082cd87a-aeb5-49bf-9f79-d99a6d18e402@googlegroups.com>
Message-ID: <20140515233536.GO4273@ando>

On Thu, May 15, 2014 at 01:02:56PM -0700, Ram Rachum wrote:
> I suggest exposing `itertools.count.start` and implementing
> `itertools.count.__eq__` based on it. This'll provide the same benefits
> that `range` got by exposing `range.start` and allowing `range.__eq__`.

What benefits are those? Under what circumstances have you compared two
range objects or checked their start?

That's a serious question -- I don't recall ever wanting to compare
range objects for equality.

The iterator protocol is intentionally very simple, and I think that is
a good thing. Adding complexity to one specific standard iterator
without a good, solid use-case does not strike me as a good idea.

But even if you have a good use-case, I don't think the concept of
equality for count objects is very well defined. Consider:

    from itertools import count
    a = count(1)
    b = count(1)
    _ = next(b); _ = next(b)
    c = count(3)

a.start and b.start are the same, so one might argue that a and b should
compare equal. But next(a) and next(b) are different, so one might
equally argue that a and b should compare unequal. Likewise b.start and
c.start are different, but next(b) and next(c) return the same value, so
one might expect b and c to be both equal and unequal.

I think, whichever definition of equality you pick, people will be
surprised by it fifty percent of the time.

--
Steven

From ram.rachum at gmail.com  Thu May 15 22:04:02 2014
From: ram.rachum at gmail.com (Ram Rachum)
Date: Thu, 15 May 2014 13:04:02 -0700 (PDT)
Subject: [Python-ideas] Expose `itertools.count.start` and implement `itertools.count.__eq__` based on it, like `range`.
In-Reply-To: <082cd87a-aeb5-49bf-9f79-d99a6d18e402@googlegroups.com>
References: <082cd87a-aeb5-49bf-9f79-d99a6d18e402@googlegroups.com>

Now that I think about it, I would ideally want `itertools.count` to be
deprecated in favor of `range(float('inf'))`, but I know that would
never happen.

On Thursday, May 15, 2014 11:02:56 PM UTC+3, Ram Rachum wrote:
> I suggest exposing `itertools.count.start` and implementing
> `itertools.count.__eq__` based on it. This'll provide the same benefits
> that `range` got by exposing `range.start` and allowing `range.__eq__`.

From guettli at thomas-guettler.de  Fri May 16 09:05:11 2014
From: guettli at thomas-guettler.de (Thomas Güttler)
Date: Fri, 16 May 2014 09:05:11 +0200
Subject: [Python-ideas] logging.config.defaultConfig()
Message-ID: <5375B8A7.1000204@thomas-guettler.de>

Using logging (as a library) is easy in Python. But setting up logging
is done differently in nearly every environment.

I would like to have a common way to **load** the configuration. The
configuration itself should be done in the environment the script runs
in.

If you write a console script, and you want it to be reusable in
different environments, there is no default way at the moment (or at
least I don't see any) to let the environment set up the logging.

I want a standard hook: the console script should be able to call into
the surrounding environment. This improves reusability.

I know how to use dictConfig() or fileConfig(), but these methods need
parameters. And what the parameters look like should not be defined in
the reusable console script.

I think the following solution is very flexible and solves most needs to
set up logging, since I can implement your needs in for example
your_environment_module.set_up():

{{{
import importlib
import os

def defaultConfig():
    '''
    Load a module to set_up() the logging configuration of your
    environment.

    Reads the module name from:
    os.environ.get('LOGGINGCONFIG', 'loggingconfig')

    Calls set_up() on the imported module.
    Would be nice to have this as logging.config.defaultConfig.

    Related:
    https://docs.python.org/2/library/logging.config.html
    '''
    module_name = os.environ.get('LOGGINGCONFIG', 'loggingconfig')
    module = importlib.import_module(module_name)
    module.set_up()
}}}

Do you understand what I propose? What do you think?

Thomas Güttler

--
Thomas Güttler
http://thomas-guettler.de/

From solipsis at pitrou.net  Fri May 16 11:27:16 2014
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Fri, 16 May 2014 11:27:16 +0200
Subject: [Python-ideas] logging.config.defaultConfig()
References: <5375B8A7.1000204@thomas-guettler.de>
Message-ID: <20140516112716.6002e7e8@fsol>

On Fri, 16 May 2014 09:05:11 +0200, Thomas Güttler wrote:
> I think the following solution is very flexible and solves most needs
> to set up logging, since I can implement your needs in for example
> your_environment_module.set_up()

This looks dubious to me. There is no reason to have a shared Python
logging configuration, IMO. Also, I don't understand why this is
importing a module.

If all your scripts are part of an application, then it's reasonable for
them to share a mechanism for logging configuration. But it should be
done in your application, not in Python itself.

Regards

Antoine.

From j.wielicki at sotecware.net  Fri May 16 11:32:55 2014
From: j.wielicki at sotecware.net (Jonas Wielicki)
Date: Fri, 16 May 2014 11:32:55 +0200
Subject: [Python-ideas] logging.config.defaultConfig()
In-Reply-To: <20140516112716.6002e7e8@fsol>
References: <5375B8A7.1000204@thomas-guettler.de> <20140516112716.6002e7e8@fsol>
Message-ID: <5375DB47.1020708@sotecware.net>

On 16.05.2014 11:27, Antoine Pitrou wrote:
> On Fri, 16 May 2014 09:05:11 +0200, Thomas Güttler wrote:
>> I think the following solution is very flexible and solves most needs
>> to set up logging, since I can implement your needs in for example
>> your_environment_module.set_up()
> > > _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
>

From solipsis at pitrou.net Fri May 16 11:54:17 2014
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Fri, 16 May 2014 11:54:17 +0200
Subject: [Python-ideas] logging.config.defaultConfig()
References: <5375B8A7.1000204@thomas-guettler.de> <20140516112716.6002e7e8@fsol> <5375DB47.1020708@sotecware.net>
Message-ID: <20140516115417.0d7e3a54@fsol>

On Fri, 16 May 2014 11:32:55 +0200
Jonas Wielicki wrote:
> On 16.05.2014 11:27, Antoine Pitrou wrote:
> > On Fri, 16 May 2014 09:05:11 +0200
> > Thomas Güttler wrote:
> >>
> >> I think the following solution is very flexible and solves most needs to set up logging,
> >> since I can implement your needs in for example your_environment_module.set_up()
> >
> > This looks dubious to me. There is no reason to have a shared Python
> > logging configuration, IMO. Also, I don't understand why this is
> > importing a module.
>
> While I agree that importing a module might not be the right way, having
> a standard way to configure logging via environment variables might be
> helpful.

I entirely disagree. An environment variable is a very lousy way to specify a configuration file's location; and there is no reason to have a common logging configuration for all Python applications.

> Configuring logging is a difficult thing if done fully, like allowing
> different loglevels for different loggers. Having this implemented in
> the standard library might actually be useful (and it's also done that
> way in other languages).

What does this have to do with environment variables? logging.dictConfig() already does this.

Regards

Antoine.

From guettli at thomas-guettler.de Fri May 16 13:08:17 2014
From: guettli at thomas-guettler.de (Thomas Guettler)
Date: Fri, 16 May 2014 13:08:17 +0200
Subject: [Python-ideas] logging.config.defaultConfig()
In-Reply-To: <20140516115417.0d7e3a54@fsol>
References: <5375B8A7.1000204@thomas-guettler.de> <20140516112716.6002e7e8@fsol> <5375DB47.1020708@sotecware.net> <20140516115417.0d7e3a54@fsol>
Message-ID: <5375F1A1.7070602@thomas-guettler.de>

On 16.05.2014 11:54, Antoine Pitrou wrote:
> On Fri, 16 May 2014 11:32:55 +0200
> Jonas Wielicki wrote:
>> On 16.05.2014 11:27, Antoine Pitrou wrote:
>>> On Fri, 16 May 2014 09:05:11 +0200
>>> Thomas Güttler wrote:
>>>>
>>>> I think the following solution is very flexible and solves most needs to set up logging,
>>>> since I can implement your needs in for example your_environment_module.set_up()
>>>
>>> This looks dubious to me. There is no reason to have a shared Python
>>> logging configuration, IMO. Also, I don't understand why this is
>>> importing a module.
>>
>> While I agree that importing a module might not be the right way, having
>> a standard way to configure logging via environment variables might be
>> helpful.
>
> I entirely disagree. An environment variable is a very lousy way to
> specify a configuration file's location; and there is no reason to have
> a common logging configuration for all Python applications.

** I don't want a common logging configuration **

I want a standard hook to find the logging configuration. And I want it to be a Python method. If you prefer a file config, create a method which loads your config file. This would make the spec "simple and stupid". The configuration should be empty by default.
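For illustration, here is a minimal example of the module my hook would import (the module name "loggingconfig" and the set_up() entry point are just the defaults from my sketch; the configuration values are invented):

{{{
# loggingconfig.py -- written by the *user* of the console script
import logging.config

def set_up():
    logging.config.dictConfig({
        'version': 1,
        'formatters': {
            'brief': {'format': '%(levelname)s:%(name)s: %(message)s'},
        },
        'handlers': {
            'console': {
                'class': 'logging.StreamHandler',
                'formatter': 'brief',
            },
        },
        'root': {'handlers': ['console'], 'level': 'INFO'},
    })
}}}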
Only if the environment wants to have a common config should it provide one.

Thomas

-- 
Thomas Guettler
http://www.thomas-guettler.de/

From solipsis at pitrou.net Fri May 16 13:14:47 2014
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Fri, 16 May 2014 13:14:47 +0200
Subject: [Python-ideas] logging.config.defaultConfig()
References: <5375B8A7.1000204@thomas-guettler.de> <20140516112716.6002e7e8@fsol> <5375DB47.1020708@sotecware.net> <20140516115417.0d7e3a54@fsol> <5375F1A1.7070602@thomas-guettler.de>
Message-ID: <20140516131447.6d4ba9ef@fsol>

On Fri, 16 May 2014 13:08:17 +0200
Thomas Guettler wrote:
> On 16.05.2014 11:54, Antoine Pitrou wrote:
> > On Fri, 16 May 2014 11:32:55 +0200
> > Jonas Wielicki
> > wrote:
> >> On 16.05.2014 11:27, Antoine Pitrou wrote:
> >>> On Fri, 16 May 2014 09:05:11 +0200
> >>> Thomas Güttler
> >>> wrote:
> >>>>
> >>>> I think the following solution is very flexible and solves most needs to set up logging,
> >>>> since I can implement your needs in for example your_environment_module.set_up()
> >>>
> >>> This looks dubious to me. There is no reason to have a shared Python
> >>> logging configuration, IMO. Also, I don't understand why this is
> >>> importing a module.
> >>
> >> While I agree that importing a module might not be the right way, having
> >> a standard way to configure logging via environment variables might be
> >> helpful.
> >
> > I entirely disagree. An environment variable is a very lousy way to
> > specify a configuration file's location; and there is no reason to have
> > a common logging configuration for all Python applications.
>
> ** I don't want a common logging configuration **
>
> I want a standard hook to find the logging configuration.

Why would that be Python's business? If the hook is meant to be truly "standard", then it should be something like an LSB standard. End users don't really care whether some application is written in Python or another language. Why a Python-specific hook? What do users gain?

> And I want it to be a Python method.

Basically you are telling us what /you/ want, but not why it would be useful for the broader community.

Regards

Antoine.

From mal at egenix.com Fri May 16 13:34:54 2014
From: mal at egenix.com (M.-A. Lemburg)
Date: Fri, 16 May 2014 13:34:54 +0200
Subject: [Python-ideas] logging.config.defaultConfig()
In-Reply-To: <20140516115417.0d7e3a54@fsol>
References: <5375B8A7.1000204@thomas-guettler.de> <20140516112716.6002e7e8@fsol> <5375DB47.1020708@sotecware.net> <20140516115417.0d7e3a54@fsol>
Message-ID: <5375F7DE.9080902@egenix.com>

On 16.05.2014 11:54, Antoine Pitrou wrote:
>> While I agree that importing a module might not be the right way, having
>> a standard way to configure logging via environment variables might be
>> helpful.
>
> I entirely disagree. An environment variable is a very lousy way to
> specify a configuration file's location; and there is no reason to have
> a common logging configuration for all Python applications.

Hmm, it's a fairly standard way to define config file locations esp. on Unix platforms, so I don't follow you here. Perhaps I'm just missing some context.

Such env vars are often used in application environments to override system defaults, e.g. for finding OpenSSL or ODBC config files.

--
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, May 16 2014)
>>> Python Projects, Consulting and Support ...   http://www.egenix.com/
>>> mxODBC.Zope/Plone.Database.Adapter ...       http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...
http://python.egenix.com/
________________________________________________________________________

::::: Try our mxODBC.Connect Python Database Interface for free ! ::::::

eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
Registered at Amtsgericht Duesseldorf: HRB 46611
http://www.egenix.com/company/contact/

From accountearnstar at gmail.com Sat May 17 03:40:25 2014
From: accountearnstar at gmail.com (Chris B)
Date: Sat, 17 May 2014 03:40:25 +0200
Subject: [Python-ideas] Extending the usage of the @ decoration operator.
Message-ID: 

Right now, one can use the @ symbol only for decorations and only before function or class definition. ("A decorator is just a callable that takes a function as an argument and returns a replacement function.")

    @dec1(arg)
    @dec2
    def func(): pass

If a function has already been defined, it cannot be decorated using the decoration operator. But it can still be decorated by explicitly calling the decorator:

    @dec1(arg)
    @dec2
    def func(): pass

is equivalent to:

    def func(): pass
    func = dec1(arg)(dec2(func))

Now I propose that the @ symbol should also be usable as an assignment operator, in which case a succeeding function definition would not be decorated:

    def foo(): pass
    foo @ decorator
    def bar(): pass

is equivalent to:

    def foo(): pass
    foo = decorator(foo)
    def bar(): pass

This doesn't allow us to have stacked decorators, so the use of a tuple is needed:

    def func(): pass
    func @ (dec2, dec1(arg))

is equivalent to:

    def func(): pass
    func = dec1(arg)(dec2(func))

Why not decorate more than one function at once?:

    func1, func2, func3 @ dec1(arg), dec2

is equivalent to:

    func1 = dec1(arg)(dec2(func1))
    func2 = dec1(arg)(dec2(func2))
    func3 = dec1(arg)(dec2(func3))

or better:

    _temp1 = dec1(arg)(dec2(func1))
    _temp2 = dec1(arg)(dec2(func2))
    _temp3 = dec1(arg)(dec2(func3))
    func1, func2, func3 = _temp1, _temp2, _temp3

The @ operator would still be only used for function decoration. But it should pass any object preceding it as the (only) argument to the first callable - let's call them modifiers - in the tuple succeeding it and then pass the return value to the next modifier in the tuple. The last return value should then be assigned to the variable again. Consider the following example:

    from os.path import expandvars, abspath, normcase

    p1 = input('Insert path here: ')
    p2 = input('And another path here: ')

    # Fix the path strings
    p1, p2 @ expandvars, abspath, normcase

Functions that take more than one argument can't be used as modifiers. But simply currying them solves the problem:

    from os.path import expandvars, abspath, normcase, relpath

    def curry(f, *args, **kwargs):
        def curried_f(arg1):
            return f(arg1, *args, **kwargs)
        return curried_f

    # Fix the path strings
    p1, p2 @ expandvars, abspath, normcase, curry(relpath, start)

Storing the modifiers in a mutable like a list, one could do rather complex stuff:

    def add(a, b):
        return a + b
    def sub(...
    def mult(...
    def div(...  # ...the obvious way.

    def permutations(L):
        for _ in range(possible_permutations):
            next_permutation = ...
            yield next_permutation

    L = [curry(add, 1), curry(sub, 2), curry(mult, 3), curry(div, 4)]

    # Prints the result for all possible combinations of the four
    # operations +1, -2, *3, /4 applied to 1.
    for permutation in permutations(L):
        x = 1
        x @ permutation
        print(x)

I'm not sure where to go from here. Does this idea qualify for a PEP? Is it even possible to be implemented? Has it already been discussed? What do you think about it?
Please share your opinions, suggestions and improvements!
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From python at mrabarnett.plus.com Sat May 17 04:08:31 2014
From: python at mrabarnett.plus.com (MRAB)
Date: Sat, 17 May 2014 03:08:31 +0100
Subject: [Python-ideas] Extending the usage of the @ decoration operator.
In-Reply-To: 
References: 
Message-ID: <5376C49F.3040604@mrabarnett.plus.com>

On 2014-05-17 02:40, Chris B wrote:
> Right now, one can use the @ symbol only for decorations and only before
> function or class definition. ("A decorator is just a callable that
> takes a function as an argument and returns a replacement function.")
>
> @dec1(arg)
> @dec2
> def func(): pass
>
> If a function has already been defined, it cannot be decorated using the
> decoration operator. But it can still be decorated by explicitly calling
> the decorator:
>
> @dec1(arg)
> @dec2
> def func(): pass
>
> is equivalent to:
>
> def func(): pass
> func = dec1(arg)(dec2(func))
>
> Now I propose that the @ symbol should also be usable as an assignment
> operator, in which case a succeeding function definition would not be
> decorated:
>
[snip]
There is a proposal to use @ as an operator for matrix multiplication:

http://legacy.python.org/dev/peps/pep-0465/

From steve at pearwood.info Sat May 17 04:53:22 2014
From: steve at pearwood.info (Steven D'Aprano)
Date: Sat, 17 May 2014 12:53:22 +1000
Subject: [Python-ideas] Extending the usage of the @ decoration operator.
In-Reply-To: <5376C49F.3040604@mrabarnett.plus.com>
References: <5376C49F.3040604@mrabarnett.plus.com>
Message-ID: <20140517025322.GQ4273@ando>

On Sat, May 17, 2014 at 03:08:31AM +0100, MRAB wrote:
> On 2014-05-17 02:40, Chris B wrote:
> >Right now, one can use the @ symbol only for decorations and only before
> >function or class definition. ("A decorator is just a callable that
> >takes a function as an argument and returns a replacement function.")
[...]
> There is a proposal to use @ as an operator for matrix multiplication:
>
> http://legacy.python.org/dev/peps/pep-0465/

It's not just a proposal, it's accepted and implemented in Python 3.5:

http://bugs.python.org/issue21176

So regardless of the merits of this proposal (if any), it isn't going to happen.
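(Under PEP 465, `a @ b` dispatches to the new __matmul__ special method, so the syntax is spoken for. A toy sketch of mine, not from the PEP, of the accepted semantics:

    class Mat2:
        """Toy 2x2 matrix, only to show the PEP 465 hook."""
        def __init__(self, rows):
            self.rows = rows
        def __matmul__(self, other):
            a, b = self.rows, other.rows
            return Mat2([[sum(a[i][k] * b[k][j] for k in range(2))
                          for j in range(2)] for i in range(2)])

)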
-- 
Steven

From ethan at stoneleaf.us Sat May 17 04:42:54 2014
From: ethan at stoneleaf.us (Ethan Furman)
Date: Fri, 16 May 2014 19:42:54 -0700
Subject: [Python-ideas] Extending the usage of the @ decoration operator.
In-Reply-To: <5376C49F.3040604@mrabarnett.plus.com>
References: <5376C49F.3040604@mrabarnett.plus.com>
Message-ID: <5376CCAE.2090002@stoneleaf.us>

On 05/16/2014 07:08 PM, MRAB wrote:
> On 2014-05-17 02:40, Chris B wrote:
>> Right now, one can use the @ symbol only for decorations and only before
>> function or class definition. ("A decorator is just a callable that
>> takes a function as an argument and returns a replacement function.")
>>
>> @dec1(arg)
>> @dec2
>> def func(): pass
>>
>> If a function has already been defined, it cannot be decorated using the
>> decoration operator. But it can still be decorated by explicitly calling
>> the decorator:
>>
>> @dec1(arg)
>> @dec2
>> def func(): pass
>>
>> is equivalent to:
>>
>> def func(): pass
>> func = dec1(arg)(dec2(func))
>>
>> Now I propose that the @ symbol should also be usable as an assignment
>> operator, in which case a succeeding function definition would not be
>> decorated:
>>
> [snip]
> There is a proposal to use @ as an operator for matrix multiplication:
>
> http://legacy.python.org/dev/peps/pep-0465/

Which has been accepted.

--
~Ethan~

From ncoghlan at gmail.com Sat May 17 10:56:07 2014
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 17 May 2014 18:56:07 +1000
Subject: [Python-ideas] logging.config.defaultConfig()
In-Reply-To: <5375F7DE.9080902@egenix.com>
References: <5375B8A7.1000204@thomas-guettler.de> <20140516112716.6002e7e8@fsol> <5375DB47.1020708@sotecware.net> <20140516115417.0d7e3a54@fsol> <5375F7DE.9080902@egenix.com>
Message-ID: 

On 16 May 2014 21:34, M.-A. Lemburg wrote:
> On 16.05.2014 11:54, Antoine Pitrou wrote:
>>> While I agree that importing a module might not be the right way, having
>>> a standard way to configure logging via environment variables might be
>>> helpful.
>>
>> I entirely disagree. An environment variable is a very lousy way to
>> specify a configuration file's location; and there is no reason to have
>> a common logging configuration for all Python applications.
>
> Hmm, it's a fairly standard way to define config file locations esp.
> on Unix platforms, so I don't follow you here. Perhaps I'm just
> missing some context.
>
> Such env vars are often used in application environments to override
> system defaults, e.g. for finding OpenSSL or ODBC config files.

Python is a language runtime, not an application. Having globally configurable behaviours for a runtime is, in general, questionable, which is why we have the options to ignore the environment variables, site-packages, user site-packages and now the "isolated mode" flag that basically says "ignore *every* explicitly configurable Python setting in the environment".

For 3.2+, we defined a sensible default logging configuration (warning and above written to stderr, everything else ignored), so users should be free to just use the logging module when writing libraries without worrying about whether or not it has been configured for reporting properly. That doesn't help Python 2 users, but that's the case for a lot of things.
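To illustrate that default: with 3.2+, a library can just do the following, and warnings still reach the user on stderr even if the application never configures logging (snippet mine, not from the docs):

    import logging

    log = logging.getLogger(__name__)
    log.warning("shown on stderr via the handler of last resort")
    log.info("ignored until an application configures logging")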
Trying to provide a way to actually *configure* logging in a general way would be fraught with backwards compatibility issues when it came to interfering with frameworks (whether for writing CLI applications, web applications, or GUI applications) that already provide their own way of handling logging configuration.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From ncoghlan at gmail.com Sat May 17 11:30:53 2014
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 17 May 2014 19:30:53 +1000
Subject: [Python-ideas] Extending the usage of the @ decoration operator.
In-Reply-To: 
References: 
Message-ID: 

On 17 May 2014 11:40, Chris B wrote:
> I'm not sure where to go from here. Does this idea qualify for a PEP? Is it
> even possible to be implemented? Has it already been discussed? What do you
> think about it? Please share your opinions, suggestions and improvements!

Others have noted that this specific proposal conflicts with the already accepted matrix multiplication operator, but there are some more general questions to ask yourself when making syntax proposals:

* what problem am I trying to solve?
* how common is that problem in general?
* what are the existing solutions to that problem?
* how easy is it to make a mistake when relying on the existing solutions?
* how does the readability of the new syntax compare to existing code?
* how much harder will it be to learn Python after this proposal is added?

For example, the original decorator syntax solved a significant readability problem:

    def method(a, b, c):
        # Where is self????
        # many
        # lines
        # of
        # implementation
    method = staticmethod(method)  # Oh, it's a static method

vs

    @staticmethod
    def method(a, b, c):
        # Obviously no self needed
        # many
        # lines
        # of
        # implementation

By contrast, a new way of spelling the "method = staticmethod(method)" line isn't particularly interesting - it doesn't add much expressiveness to the language, just a new way of spelling something that can already be written out explicitly.

Adding a complicated way of avoiding writing multiple assignment statements or a helper function also isn't compelling:

    p1, p2 @ expandvars, abspath, normcase, curry(relpath, start)

vs

    def fixpath(p):
        return expandvars(abspath(normcase(relpath(p, start))))

    p1 = fixpath(input('Insert path here: '))
    p2 = fixpath(input('And another path here: '))

Python aspires to be "executable pseudocode". While we often fall short of that mark, it does at least mean we're willing to sacrifice a little brevity for the sake of clarity.

For a more recent example of a successful syntax change proposal, the numeric Python community were able to make their case for a new matrix multiplication operator because they have been trying to solve it *without* a new operator for more than a decade, but haven't been able to come up with a non-syntactic solution that they were all happy with. The PEP was accepted in short order because they were able to demonstrate two things:

1. Yes, they really needed new syntax to solve the problem properly
2. No, they weren't likely to be back in a couple of years time asking for *another* operator in 3.6 - matrix multiplication really was the only thing they had found they didn't have a good clean spelling for

http://www.curiousefficiency.org/posts/2011/02/justifying-python-language-changes.html has a few more examples of past changes that were accepted, and some of the key reasons why.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From guettli at thomas-guettler.de Sat May 17 14:04:12 2014
From: guettli at thomas-guettler.de (=?ISO-8859-1?Q?Thomas_G=FCttler?=)
Date: Sat, 17 May 2014 14:04:12 +0200
Subject: [Python-ideas] logging.config.defaultConfig()
In-Reply-To: 
References: <5375B8A7.1000204@thomas-guettler.de> <20140516112716.6002e7e8@fsol> <5375DB47.1020708@sotecware.net> <20140516115417.0d7e3a54@fsol> <5375F7DE.9080902@egenix.com>
Message-ID: <5377503C.2070000@thomas-guettler.de>

On 17.05.2014 10:56, Nick Coghlan wrote:
> On 16 May 2014 21:34, M.-A. Lemburg wrote:
>> On 16.05.2014 11:54, Antoine Pitrou wrote:
>>>> While I agree that importing a module might not be the right way, having
>>>> a standard way to configure logging via environment variables might be
>>>> helpful.
>>>
>>> I entirely disagree.
>>> An environment variable is a very lousy way to
>>> specify a configuration file's location; and there is no reason to have
>>> a common logging configuration for all Python applications.
>>
>> Hmm, it's a fairly standard way to define config file locations esp.
>> on Unix platforms, so I don't follow you here. Perhaps I'm just
>> missing some context.
>>
>> Such env vars are often used in application environments to override
>> system defaults, e.g. for finding OpenSSL or ODBC config files.
>
> Python is a language runtime, not an application. Having globally
> configurable behaviours for a runtime is, in general, questionable,
> which is why we have the options to ignore the environment variables,
> site-packages, user site-packages and now the "isolated mode" flag
> that basically says "ignore *every* explicitly configurable Python
> setting in the environment".

Using logging as a library works well in Python. But writing console scripts which use logging forces the developer to solve the same problems again and again: how to set up the logging?

And the developer of the console script does not know how the user of the console script wants to handle logging.

That's why all Python applications have a different way to set up the logging.

> For 3.2+, we defined a sensible default logging configuration (warning
> and above written to stderr, everything else ignored), so users should
> be free to just use the logging module when writing libraries without
> worrying about whether or not it has been configured for reporting
> properly. That doesn't help Python 2 users, but that's the case for a
> lot of things.

> Trying to provide a way to actually *configure* logging in a general
> way would be fraught with backwards compatibility issues when it came
> to interfering with frameworks (whether for writing CLI applications,
> web applications, or GUI applications) that already provide their
> own way of handling logging configuration.

Of course a standard way to get the logging configuration defined by the end user should be optional. I don't see any backwards compatibility issues.

The author of the console script should just need one line to get the defaults which the console script user wants:

{{{
import argparse

def main():
    logging.config.defaultConfig()
    argparse...
}}}

The end user can set up the logging in the way he wants:

- Log to a file
- Log to a daemon
- Format the messages the way he likes it
- ...

Since I know that some logging environments are complicated, I think it is best to hook into a method call. There are environments where fileConfig() does not solve all needs.

Please ask if you don't understand what I want.

Thomas Güttler

--
Thomas Güttler
http://thomas-guettler.de/

From solipsis at pitrou.net Sat May 17 14:08:36 2014
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Sat, 17 May 2014 14:08:36 +0200
Subject: [Python-ideas] logging.config.defaultConfig()
References: <5375B8A7.1000204@thomas-guettler.de> <20140516112716.6002e7e8@fsol> <5375DB47.1020708@sotecware.net> <20140516115417.0d7e3a54@fsol> <5375F7DE.9080902@egenix.com> <5377503C.2070000@thomas-guettler.de>
Message-ID: <20140517140836.2af88817@fsol>

On Sat, 17 May 2014 14:04:12 +0200
Thomas Güttler wrote:
>
> There are environments where fileConfig() does not solve all needs.

Please explain how.
From mal at egenix.com Sat May 17 14:27:32 2014
From: mal at egenix.com (M.-A. Lemburg)
Date: Sat, 17 May 2014 14:27:32 +0200
Subject: [Python-ideas] logging.config.defaultConfig()
In-Reply-To: 
References: <5375B8A7.1000204@thomas-guettler.de> <20140516112716.6002e7e8@fsol> <5375DB47.1020708@sotecware.net> <20140516115417.0d7e3a54@fsol> <5375F7DE.9080902@egenix.com>
Message-ID: <537755B4.9070003@egenix.com>

On 17.05.2014 10:56, Nick Coghlan wrote:
> On 16 May 2014 21:34, M.-A. Lemburg wrote:
>> On 16.05.2014 11:54, Antoine Pitrou wrote:
>>>> While I agree that importing a module might not be the right way, having
>>>> a standard way to configure logging via environment variables might be
>>>> helpful.
>>>
>>> I entirely disagree. An environment variable is a very lousy way to
>>> specify a configuration file's location; and there is no reason to have
>>> a common logging configuration for all Python applications.
>>
>> Hmm, it's a fairly standard way to define config file locations esp.
>> on Unix platforms, so I don't follow you here. Perhaps I'm just
>> missing some context.
>>
>> Such env vars are often used in application environments to override
>> system defaults, e.g. for finding OpenSSL or ODBC config files.
>
> Python is a language runtime, not an application. Having globally
> configurable behaviours for a runtime is, in general, questionable,
> which is why we have the options to ignore the environment variables,
> site-packages, user site-packages and now the "isolated mode" flag
> that basically says "ignore *every* explicitly configurable Python
> setting in the environment".

Right, but those options address specific use cases (e.g. for setting up testing environments). Their existence does not imply that having config variables for all of the above is a bad thing, as you seem to imply - otherwise, we wouldn't have them in the first place ;-)

Logging is just another runtime feature, just like writing PYC files or setting a search path.

Now, configuring logging is too complex to do on the command line, so pointing the runtime to a logging config file instead seems like a good idea. Of course, an application could just as well do this, so the question really is whether we should have it in general or not.

PS: Note that with "application environment" I'm referring to exactly that: a shell environment with environment options specifically set up for a specific application. You typically use those for application-specific user accounts, not globally.
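E.g. something along these lines in the application's startup code (the variable name is invented; fileConfig() is the existing stdlib API):

    import logging.config
    import os

    config_file = os.environ.get('MYAPP_LOGGING_CONF')
    if config_file:
        logging.config.fileConfig(config_file,
                                  disable_existing_loggers=False)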
--
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, May 17 2014)
>>> Python Projects, Consulting and Support ...   http://www.egenix.com/
>>> mxODBC.Zope/Plone.Database.Adapter ...       http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________

::::: Try our mxODBC.Connect Python Database Interface for free ! ::::::

eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
Registered at Amtsgericht Duesseldorf: HRB 46611
http://www.egenix.com/company/contact/

From guettli at thomas-guettler.de Sat May 17 14:41:05 2014
From: guettli at thomas-guettler.de (=?UTF-8?B?VGhvbWFzIEfDvHR0bGVy?=)
Date: Sat, 17 May 2014 14:41:05 +0200
Subject: [Python-ideas] logging.config.defaultConfig()
In-Reply-To: <20140517140836.2af88817@fsol>
References: <5375B8A7.1000204@thomas-guettler.de> <20140516112716.6002e7e8@fsol> <5375DB47.1020708@sotecware.net> <20140516115417.0d7e3a54@fsol> <5375F7DE.9080902@egenix.com> <5377503C.2070000@thomas-guettler.de> <20140517140836.2af88817@fsol>
Message-ID: <537758E1.7010403@thomas-guettler.de>

On 17.05.2014 14:08, Antoine Pitrou wrote:
> On Sat, 17 May 2014 14:04:12 +0200
> Thomas Güttler wrote:
>>
>> There are environments where fileConfig() does not solve all needs.
>
> Please explain how.

Suppose you want to get the config from a database or LDAP. Is this supported by fileConfig()?

--
Thomas Güttler
http://thomas-guettler.de/

From solipsis at pitrou.net Sat May 17 14:55:52 2014
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Sat, 17 May 2014 14:55:52 +0200
Subject: [Python-ideas] logging.config.defaultConfig()
References: <5375B8A7.1000204@thomas-guettler.de> <20140516112716.6002e7e8@fsol> <5375DB47.1020708@sotecware.net> <20140516115417.0d7e3a54@fsol> <5375F7DE.9080902@egenix.com> <5377503C.2070000@thomas-guettler.de> <20140517140836.2af88817@fsol> <537758E1.7010403@thomas-guettler.de>
Message-ID: <20140517145552.1436833b@fsol>

On Sat, 17 May 2014 14:41:05 +0200
Thomas Güttler wrote:
> On 17.05.2014 14:08, Antoine Pitrou wrote:
> > On Sat, 17 May 2014 14:04:12 +0200
> > Thomas Güttler wrote:
> >>
> >> There are environments where fileConfig() does not solve all needs.
> >
> > Please explain how.
>
> Suppose you want to get the config from a database or LDAP. Is
> this supported by fileConfig()?

Obviously not, but you should be able to use dictConfig() for that. Mapping the database contents to the dict representation expected by dictConfig() is a domain-specific task that cannot be provided by the standard library, so it's the application's job to provide it.

Regards

Antoine.

From ncoghlan at gmail.com Sat May 17 16:34:55 2014
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 18 May 2014 00:34:55 +1000
Subject: [Python-ideas] logging.config.defaultConfig()
In-Reply-To: <5377503C.2070000@thomas-guettler.de>
References: <5375B8A7.1000204@thomas-guettler.de> <20140516112716.6002e7e8@fsol> <5375DB47.1020708@sotecware.net> <20140516115417.0d7e3a54@fsol> <5375F7DE.9080902@egenix.com> <5377503C.2070000@thomas-guettler.de>
Message-ID: 

On 17 May 2014 22:05, "Thomas Güttler" wrote:
> Using logging as a library works well in Python. But writing console
> scripts which use logging forces the developer to solve the same
> problems again and again: how to set up the logging?
>
> And the developer of the console script does
> not know how the user of the console script wants to handle logging.
>
> That's why all Python applications have a different way to set up the logging.

But this is also why command line frameworks like Cement exist (disclosure: Cement was just the first example I found of the kind of full featured CLI framework I mean. There may be other examples).

I guess my question is, if an application is to the point of worrying about configuring logging, why should we handle it in the interpreter for command line applications, when we leave it to frameworks to handle for web and GUI applications?
Application configuration is a complicated problem - you have to decide what to do about global defaults, user defaults, environment variables, command line options, potentially runtime adjustable options for daemons, SIGHUP handling, etc, etc. This complexity, along with other questions like ini-format vs JSON vs YAML, is a key part of *why* PEP 391 punted on the question and just defined logging.dictConfig() instead.

Cheers,
Nick.

>
> > For 3.2+, we defined a sensible default logging configuration (warning
> > and above written to stderr, everything else ignored), so users should
> > be free to just use the logging module when writing libraries without
> > worrying about whether or not it has been configured for reporting
> > properly. That doesn't help Python 2 users, but that's the case for a
> > lot of things.
> >
> > Trying to provide a way to actually *configure* logging in a general
> > way would be fraught with backwards compatibility issues when it came
> > to interfering with frameworks (whether for writing CLI applications,
> > web applications, or GUI applications) that already provide their
> > own way of handling logging configuration.
>
> Of course a standard way to get the logging configuration defined by the end user
> should be optional. I don't see any backwards compatibility issues.
>
> The author of the console script should just need one line to
> get the defaults which the console script user wants:
>
> {{{
> import argparse
>
> def main():
>     logging.config.defaultConfig()
>     argparse...
>
> }}}
>
> The end user can set up the logging in the way he wants:
>
> - Log to a file
> - Log to a daemon
> - Format the messages the way he likes it
> - ...
>
> Since I know that some logging environments are complicated, I think
> it is best to hook into a method call. There are environments
> where fileConfig() does not solve all needs.
>
> Please ask if you don't understand what I want.
>
> Thomas Güttler
>
> --
> Thomas Güttler
> http://thomas-guettler.de/
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From guettli at thomas-guettler.de Sun May 18 09:34:52 2014
From: guettli at thomas-guettler.de (=?ISO-8859-15?Q?Thomas_G=FCttler?=)
Date: Sun, 18 May 2014 09:34:52 +0200
Subject: [Python-ideas] Application start up hooks
Message-ID: <5378629C.4040208@thomas-guettler.de>

This is the successor to the thread "logging.config.defaultConfig()". Thank you for your replies to the first post.

I thought again about my needs: setting up logging. The more abstract description is "setting up an application". And again, that's something that should not be done by Python itself. But there could be standard hooks to build a bridge between developer and operator:

- developer: develops libraries, command line apps, web apps, ...
- app user: responsible for configuring the app.

Up to now every application has its own way to set up the application, and this diversity is good. All environments are different, but this pattern is common for most applications:

1. use default config provided by the app. These defaults are from the developer of the app.
2. use default config provided by the app user. This can overwrite previous config.
3. use explicit config (for example command line arguments) provided by the app user. This can overwrite previous config.
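A rough sketch of these three steps (all names here are placeholders, not a proposed API):

{{{
import argparse
import json
import os

APP_DEFAULTS = {'loglevel': 'WARNING'}       # 1. defaults from the developer

def load_config():
    config = dict(APP_DEFAULTS)
    user_file = os.path.expanduser('~/.myapp.json')
    if os.path.exists(user_file):            # 2. defaults from the app user
        with open(user_file) as f:
            config.update(json.load(f))
    parser = argparse.ArgumentParser()       # 3. explicit config
    parser.add_argument('--loglevel')
    args = parser.parse_args()
    if args.loglevel is not None:
        config['loglevel'] = args.loglevel
    return config
}}}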
I looked at Cement. It provides the above steps, but it provides a lot of other things, too. This makes it too heavyweight a dependency for many small applications.

I like argparse for command line tools. But it misses loading defaults before parsing the command line args.

Many posts in my first thread said something like "this is not the job of Python, handle this in your application yourself". Now I think you are right. If there is a good reusable module for loading configs (setting up logging is one part of this) it can live outside the standard library. And if it is really good, it can get into the standard library in the future.

I know that it is off topic on this list, but it might be useful for other people, too: does anyone know a lightweight module for loading configuration settings?

Thomas

--
Thomas Güttler
http://thomas-guettler.de/

From tjreedy at udel.edu Sun May 18 10:30:27 2014
From: tjreedy at udel.edu (Terry Reedy)
Date: Sun, 18 May 2014 04:30:27 -0400
Subject: [Python-ideas] Application start up hooks
In-Reply-To: <5378629C.4040208@thomas-guettler.de>
References: <5378629C.4040208@thomas-guettler.de>
Message-ID: 

On 5/18/2014 3:34 AM, Thomas Güttler wrote:
[snip]
> Does anyone know a lightweight module for loading configuration settings?

Questions like this are a good subject for a python-list post.

--
Terry Jan Reedy

From machyniak at gmail.com Mon May 19 18:16:31 2014
From: machyniak at gmail.com (Pavel Machyniak)
Date: Mon, 19 May 2014 18:16:31 +0200
Subject: [Python-ideas] python configure --with-ssl
Message-ID: <537A2E5F.4060004@gmail.com>

Hello Python developers,

please add an option to Python's configure for setting a custom `openssl` installation on Python build, e.g. `--with-ssl=path` as commonly used. Otherwise it is difficult to build Python with a specific `openssl` installation/compilation, see e.g. http://stackoverflow.com/questions/22409092/coredump-when-compiling-python-with-a-custom-openssl-version.

Thank you.

Sincerely,
Pavel Machyniak
machyniak at gmail.com

From random832 at fastmail.us Mon May 19 22:40:31 2014
From: random832 at fastmail.us (random832 at fastmail.us)
Date: Mon, 19 May 2014 16:40:31 -0400
Subject: [Python-ideas] Delivery Status Notification (Failure)
In-Reply-To: 
References: 
Message-ID: <1400532031.8059.119201577.158D70D7@webmail.messagingengine.com>

It tried to send my message to a @googlegroups.com address, likely because that address for some reason shows up in the other user's headers instead of the correct one.

On Mon, May 19, 2014, at 14:48, Mail Delivery Subsystem wrote:
> Hello random832 at fastmail.us,
>
> We're writing to let you know that the group you tried to contact
> (python-ideas) may not exist, or you may not have permission to post
> messages to the group. A few more details on why you weren't able to
> post:
>
> * You might have spelled or formatted the group name incorrectly.
> * The owner of the group may have removed this group.
> * You may need to join the group before receiving permission to post.
> * This group may not be open to posting.
>
> If you have questions related to this or any other Google Group, visit
> the Help Center at http://groups.google.com/support/.
>
> Thanks,
>
> Google Groups
>
>
> ----- Original message -----
> Message-Id: <1400525308.6670.119156649.17F929FF at webmail.messagingengine.com>
> From: random832 at fastmail.us
> To: Ram Rachum , python-ideas at googlegroups.com
> Subject: Re: [Python-ideas] Expose `itertools.count.start` and implement
>  `itertools.count.__eq__` based on it, like `range`.
> Date: Mon, 19 May 2014 14:48:28 -0400
> In-Reply-To: <082cd87a-aeb5-49bf-9f79-d99a6d18e402 at googlegroups.com>
> References: <082cd87a-aeb5-49bf-9f79-d99a6d18e402 at googlegroups.com>
>
> On Thu, May 15, 2014, at 16:02, Ram Rachum wrote:
> > I suggest exposing `itertools.count.start` and implementing
> > `itertools.count.__eq__` based on it. This'll provide the same benefits
> > that `range` got by exposing `range.start` and allowing `range.__eq__`.
>
> I think this _and_ your other request reveal a misunderstanding of what
> itertools are. They're not "magic sequences", they're generators - the
> fact that you can use either a sequence or a generator in a for loop may
> have confused you. In other words, they're more like Python 2
> dict.iteritems than Python 2 xrange. It might be more reasonable to
> propose that a new module be created for "magic sequence" objects.
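To expand on that last paragraph, by a "magic sequence" module I mean something like the following sketch (the name and details are made up):

    class Count:
        """Re-iterable 'magic sequence' counterpart of itertools.count."""
        def __init__(self, start=0, step=1):
            self.start = start
            self.step = step
        def __iter__(self):
            value = self.start
            while True:
                yield value
                value += self.step
        def __eq__(self, other):
            if isinstance(other, Count):
                return (self.start, self.step) == (other.start, other.step)
            return NotImplemented

    # Each __iter__() call returns a fresh generator, so equality by
    # (start, step) is well defined - unlike itertools.count, which is
    # itself an iterator and is consumed as you advance it.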
-- 
Random832

From nad at acm.org Tue May 20 00:30:41 2014
From: nad at acm.org (Ned Deily)
Date: Mon, 19 May 2014 15:30:41 -0700
Subject: [Python-ideas] python configure --with-ssl
References: <537A2E5F.4060004@gmail.com>
Message-ID: 

In article <537A2E5F.4060004 at gmail.com>, Pavel Machyniak wrote:
> please add an option to Python's configure for setting a custom `openssl`
> installation on Python build, e.g. `--with-ssl=path` as commonly used.
> Otherwise it is difficult to build Python with a specific `openssl`
> installation/compilation, see e.g.
> http://stackoverflow.com/questions/22409092/coredump-when-compiling-python-with-a-custom-openssl-version.

This sort of request has come up before on the Python bug tracker. With a quick search, I didn't find an exact match for what you request, although there might be a more general issue open to allow more control over all third-party libraries. However, there is http://bugs.python.org/issue5575 which provides a patch to allow control using environment variables rather than configure options. Feel free to open a new issue or comment on this one.

--
Ned Deily,
nad at acm.org

From darren.rmc at gmail.com Tue May 20 06:27:38 2014
From: darren.rmc at gmail.com (Darren McCleary)
Date: Tue, 20 May 2014 00:27:38 -0400
Subject: [Python-ideas] Break if *condition*
Message-ID: 

Hello,

This is my first Python idea suggestion. I often find myself writing code like:

    for i in iterable:
        # do something
        if i == some_condition:
            break

I feel that condensing this down to one line would be a novel idea. Those same two lines could be written as:

    for i in iterable:
        # do something
        break if i == some_condition

It seems to me this would be logical and in the same vein as conditional variable assignments (i.e. x = 0 if y == True else 1).

Thoughts?

Cheers,
Darren
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From rosuav at gmail.com Tue May 20 07:23:47 2014
From: rosuav at gmail.com (Chris Angelico)
Date: Tue, 20 May 2014 15:23:47 +1000
Subject: [Python-ideas] Break if *condition*
In-Reply-To: 
References: 
Message-ID: 

On Tue, May 20, 2014 at 2:27 PM, Darren McCleary wrote:
> if i == some_condition:
>     break
>
> I feel that condensing this down to one line would be a novel idea. Those
> same two lines could be written as:
>
> for i in iterable:
>     # do something
>     break if i == some_condition

    if i == some_condition: break

That's one line too, and works on existing Python interpreters :)

ChrisA

From theller at ctypes.org Tue May 20 10:51:22 2014
From: theller at ctypes.org (Thomas Heller)
Date: Tue, 20 May 2014 10:51:22 +0200
Subject: [Python-ideas] pathlib suggestion
Message-ID: 

Python 3.4's pathlib uses str(path) to get the full pathname as a string.

I'd like to suggest adding a property which allows access to the full pathname. IMO this should make it easier to understand the code and make it possible to search for it in sources.

I'm unsure about the name this property should get; maybe .fullpath or something like that. I'm also unsure whether there should be separate properties to get the full pathname as a string or bytes object.
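For concreteness, the behaviour I have in mind, written as plain functions (the names are placeholders, not a concrete proposal):

    import os
    import pathlib

    def fullpath(path):
        # What a .fullpath (or similar) property would return as str.
        return str(path)

    def fullpath_bytes(path):
        # A possible bytes counterpart.
        return os.fsencode(str(path))

    p = pathlib.PurePosixPath('/usr', 'lib', 'python3.4')
    print(fullpath(p))        # /usr/lib/python3.4
    print(fullpath_bytes(p))  # b'/usr/lib/python3.4'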
Opinions?

Thomas

From machyniak at gmail.com Tue May 20 11:33:58 2014
From: machyniak at gmail.com (Pavel Machyniak)
Date: Tue, 20 May 2014 11:33:58 +0200
Subject: [Python-ideas] python configure --with-ssl
In-Reply-To: 
References: <537A2E5F.4060004@gmail.com>
Message-ID: <537B2186.3080907@gmail.com>

On 20.5.2014 0:30, Ned Deily wrote:
> In article <537A2E5F.4060004 at gmail.com>,
> Pavel Machyniak wrote:
>> please add an option to Python's configure for setting a custom `openssl`
>> installation on Python build, e.g. `--with-ssl=path` as commonly used.
>> Otherwise it is difficult to build Python with a specific `openssl`
>> installation/compilation, see e.g.
>> http://stackoverflow.com/questions/22409092/coredump-when-compiling-python-with-a-custom-openssl-version.
>
> This sort of request has come up before on the Python bug tracker. With
> a quick search, I didn't find an exact match for what you request,
> although there might be a more general issue open to allow more control
> over all third-party libraries. However, there is
> http://bugs.python.org/issue5575 which provides a patch to allow control
> using environment variables rather than configure options. Feel free to
> open a new issue or comment on this one.

Thanks, I am well aware of the patch, but it does not work if there is a default openssl installation within the system (because it only adds another path to the END of the search list), and although the patch is from the year 2009 it is not released (accepted?) yet.

I will probably find some time and propose a solution/patch using configure options --with-ssl (and also --with-ssl-includes, --with-ssl-libs, --with-krb5, and maybe --with-sqlite, --with-sqlite-includes, --with-sqlite-libs as well).

Pavel Machyniak

From g.rodola at gmail.com Tue May 20 12:22:12 2014
From: g.rodola at gmail.com (Giampaolo Rodola')
Date: Tue, 20 May 2014 12:22:12 +0200
Subject: [Python-ideas] pathlib suggestion
In-Reply-To: 
References: 
Message-ID: 

On Tue, May 20, 2014 at 10:51 AM, Thomas Heller wrote:
>
> I'm unsure about the name this property should get; maybe .fullpath
> or something like that.

Probably "abspath" in order to be consistent with os.path.abspath.

--
Giampaolo - http://grodola.blogspot.com
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From theller at ctypes.org Tue May 20 17:08:45 2014
From: theller at ctypes.org (Thomas Heller)
Date: Tue, 20 May 2014 17:08:45 +0200
Subject: [Python-ideas] pathlib suggestion
In-Reply-To: 
References: 
Message-ID: 

On 20.05.2014 12:22, Giampaolo Rodola' wrote:
> On Tue, May 20, 2014 at 10:51 AM, Thomas Heller wrote:
>
>     I'm unsure about the name this property should get; maybe .fullpath
>     or something like that.
>
> Probably "abspath" in order to be consistent with os.path.abspath.

Well, it would not always be an absolute pathname, so .abspath looks wrong to me.

From solipsis at pitrou.net Tue May 20 17:25:18 2014
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Tue, 20 May 2014 17:25:18 +0200
Subject: [Python-ideas] pathlib suggestion
References: 
Message-ID: <20140520172518.4576b754@fsol>

On Tue, 20 May 2014 10:51:22 +0200
Thomas Heller wrote:
> Python 3.4's pathlib uses str(path) to get the full pathname
> as a string.
>
> I'd like to suggest adding a property which allows access to
> the full pathname. IMO this should make it easier to understand
> the code and make it possible to search for it in sources.
>
> I'm unsure about the name this property should get; maybe .fullpath
> or something like that. I'm also unsure whether there should be
> separate properties to get the full pathname as a string or bytes object.

.strpath perhaps? (also .bytespath if desired)

It was once proposed as a "filesystem path" protocol where classes purporting to represent filesystem paths could define e.g. a __strpath__ method returning the string representation of the path. I can only find the following allusions on python-ideas:

https://mail.python.org/pipermail/python-ideas/2012-October/016912.html
https://mail.python.org/pipermail/python-ideas/2012-October/016974.html

Regards

Antoine.

From victor.stinner at gmail.com Tue May 20 18:57:53 2014
From: victor.stinner at gmail.com (Victor Stinner)
Date: Tue, 20 May 2014 18:57:53 +0200
Subject: [Python-ideas] Make Python code read-only
Message-ID: 

Hi,

I'm trying to find the best option to make CPython faster. I would like to discuss here a first idea of making the Python code read-only to allow new optimizations.

Make Python code read-only
==========================

I propose to add an option to Python to make the code read-only. In this mode, module namespaces, class namespaces and function attributes become read-only. It would still be possible to add a "__readonly__ = False" marker to keep a module, a class and/or a function modifiable.

I chose to make the code read-only by default instead of the opposite. In my test, almost all code can be made read-only without major issues; only a little code requires the "__readonly__ = False" marker.

A module is only made read-only by importlib after the module is loaded. The module is still modifiable while code is executed, until importlib has set all its attributes (ex: __loader__).

I have a proof of concept: a fork of Python 3.5 making code read-only if the PYTHONREADONLY environment variable is set to 1. Commands to try it:

    hg clone http://hg.python.org/sandbox/readonly
    cd readonly && ./configure && make
    PYTHONREADONLY=1 ./python -c 'import os; os.x = 1'
    # ValueError: read-only dictionary

Status of the standard library (Lib/*.py): 139 modules are read-only, 25 are modifiable. Except for the sys module, all modules written in C are read-only. I'm surprised that so little code relies on the ability to modify everything. Most of the code can be read-only.

Optimizations possible when the code is read-only
=================================================

* Inline calls to functions.

* Replace calls to pure functions (without side effects) with the result. For example, len("abc") can be replaced with 3.

* Constants can be replaced with their values (at least for simple types like bytes, int and str).

It is for example possible to implement these optimizations by manipulating the Abstract Syntax Tree (AST) during the compilation from source code to bytecode. See my astoptimizer project which already implements similar optimizations:
https://bitbucket.org/haypo/astoptimizer

More optimizations
==================

My main motivation to make code read-only is to specialize a function: optimize a function for a specific environment (type of parameters, external symbols like other functions, etc). Checking the type of parameters can be fast (especially when implemented in C), but it would be expensive to check that all global variables used in the function were not modified since the function was "specialized".

For example, if os.path.isabs(path) is called: you have to check that the "os.path" and "os.path.isabs" attributes were not modified and that isabs() was not modified.
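A sketch in pure Python of the guards such a specialized call site would need *without* the read-only property (illustration only; the real checks would be implemented in C):

    import os.path

    _expected_path = os.path
    _expected_isabs = os.path.isabs

    def isabs_specialized(path):
        # Guards: were the globals modified since specialization?
        if os.path is _expected_path and os.path.isabs is _expected_isabs:
            return path.startswith('/')  # specialized body (POSIX str paths)
        return os.path.isabs(path)       # fall back to the generic call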
If we know that globals are read-only, these checks are no longer needed, and so it becomes cheap to decide whether the specialized function can be used or not.

It becomes possible to "learn" types (trace the execution of the application, and then compile for the recorded types). Knowing the types of function parameters, results and local variables opens an interesting class of new optimizations, but I prefer to discuss this later, after discussing the idea of making the code read-only.

One point remains unclear to me. There is a short time window between the moment a module is loaded and the moment the module is made read-only. During this window, we cannot rely on the read-only property of the code. Specialized code cannot be used safely before the module is known to be read-only. I don't know yet how the switch from "slow" code to optimized code should be implemented.

Issues with read-only code
==========================

* Currently, it's not possible to re-enable modification of a module, class or function; this keeps my implementation simple. With a registry of callbacks, it may be possible to re-enable modification and call code to disable optimizations.

* PyPy implements this, but thanks to its JIT it can re-optimize the modified code during execution. Writing a JIT is very complex; I'm trying to find a compromise between the fast PyPy and the slow CPython. Adding a JIT to CPython is out of my scope; it requires too many modifications of the code.

* With read-only code, monkey-patching cannot be used anymore. It's annoying to run tests. An obvious solution is to disable read-only mode to run tests, which can be seen as unsafe since tests are usually used to trust the code.

* The sys module cannot be made read-only because modifying sys.stdout and sys.ps1 is a common use case.

* The warnings module tries to add a __warningregistry__ global variable in the module where the warning was emitted, to not repeat warnings that should only be emitted once. The problem is that the module namespace is made read-only before this variable is added. A workaround would be to maintain these dictionaries in the warnings module directly, but it becomes harder to clear the dictionary when a module is unloaded or reloaded. Another workaround is to add __warningregistry__ before making a module read-only.

* Lazy initialization of module variables does not work anymore. A workaround is to use a mutable type. It can be a dict used as a namespace for the module's modifiable variables.

* The interactive interpreter sets a "_" variable in the builtins namespace. I have no workaround for this. The "_" variable is no longer created in read-only mode. Don't run the interactive interpreter in read-only mode.

* It is not possible yet to make the namespace of packages read-only. For example, "import encodings.utf_8" adds the symbol "utf_8" to the encodings namespace. A workaround is to load all submodules before making the namespace read-only. This cannot be done for some large packages. For example, the encodings package has a lot of submodules; only a few are needed.

Read the documentation for more information:
http://hg.python.org/sandbox/readonly/file/tip/READONLY.txt

More optimizations
==================

See my notes for all ideas to optimize CPython:
http://haypo-notes.readthedocs.org/faster_cpython.html

I explain there why I prefer to optimize CPython instead of working on PyPy or another Python implementation like Pyston, Numba or similar projects.
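PS: to make two of the workarounds from the "Issues" section concrete (only the __readonly__ marker is part of the proposal; the rest is illustration):

    # mymodule.py

    # Workaround 1: keep this module modifiable even in read-only mode.
    __readonly__ = False

    # Workaround 2 (for modules that stay read-only): do lazy
    # initialization through a mutable container created at import
    # time, instead of rebinding a module global later.
    _state = {'cache': None}

    def get_cache():
        if _state['cache'] is None:
            _state['cache'] = {}
        return _state['cache']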
Victor

From dw+python-ideas at hmmz.org Tue May 20 19:22:42 2014
From: dw+python-ideas at hmmz.org (dw+python-ideas at hmmz.org)
Date: Tue, 20 May 2014 17:22:42 +0000
Subject: [Python-ideas] Make Python code read-only
In-Reply-To: 
References: 
Message-ID: <20140520172242.GA26262@k2>

On Tue, May 20, 2014 at 06:57:53PM +0200, Victor Stinner wrote:
> * With read-only code, monkey-patching cannot be used anymore. It's
> annoying to run tests. An obvious solution is to disable read-only
> mode to run tests, which can be seen as unsafe since tests are usually
> used to trust the code.

At least for me, this represents a material change to the philosophy of the language. While frowned upon, monkey patching is extremely useful while debugging, and occasionally in emergencies. :)

Definitely not worth it for a few extra % IMHO

David

From pmawhorter at gmail.com Tue May 20 19:36:51 2014
From: pmawhorter at gmail.com (Peter Mawhorter)
Date: Tue, 20 May 2014 10:36:51 -0700
Subject: [Python-ideas] Make Python code read-only
In-Reply-To: <20140520172242.GA26262@k2>
References: <20140520172242.GA26262@k2>
Message-ID: 

On Tue, May 20, 2014 at 10:22 AM, wrote:
> On Tue, May 20, 2014 at 06:57:53PM +0200, Victor Stinner wrote:
>
>> * With read-only code, monkey-patching cannot be used anymore. It's
>> annoying to run tests. An obvious solution is to disable read-only
>> mode to run tests, which can be seen as unsafe since tests are usually
>> used to trust the code.
>
> At least for me, this represents a material change to the philosophy of
> the language. While frowned upon, monkey patching is extremely useful
> while debugging, and occasionally in emergencies. :)
>
> Definitely not worth it for a few extra % IMHO
>
> David
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/

I think part of the point was that this read-only mode would be entirely optional. One of the main reasons that I don't use Python for all of my projects is the speed issue, so anything that's a "free" speedup seems like a great thing.

The main cost I can see here is in maintaining the readonly mode and the perhaps subtle bugs that would arise in many people's code when run in readonly mode. As an official feature, there would be a documentation and maintenance cost to the community, but I do think that there's substantial benefit; especially as an opt-in feature, if the optimizations really speed things up, this seems quite useful.

I guess the question is: how does this compare to other "drop-in" speedup solutions, like PyPy? Is it applicable to more existing code? Is it easier to apply? Does it provide a better speed increase? If there's a niche for it in one of those three areas and it's an opt-in system, I see the issue being a cost-benefit analysis of what is gained (whatever that niche is) vs. the maintenance cost in terms of bug reports etc.

-Peter Mawhorter

From rosuav at gmail.com Tue May 20 19:37:42 2014
From: rosuav at gmail.com (Chris Angelico)
Date: Wed, 21 May 2014 03:37:42 +1000
Subject: [Python-ideas] Make Python code read-only
In-Reply-To: 
References: 
Message-ID: 

On Wed, May 21, 2014 at 2:57 AM, Victor Stinner wrote:
> * The sys module cannot be made read-only because modifying sys.stdout
> and sys.ps1 is a common use case.

I think this highlights the biggest concern with defaulting to read-only.
Currently, most Python code won't reach into another module and change anything, but any program could monkey-patch any module at any time. You've noted that modifying sys's attributes is common enough to prevent its being made read-only; how do you know what else will be broken if this change goes through?

For that reason, even though the read-only state would be the more common one, I would strongly recommend flagging those modules which _are_ read-only, rather than those which aren't. Then it becomes a documentable part of the module's interface: "This module will be frozen when Python is run in read-only mode". Setting that flag and then modifying your own state would be a mistake on par with using assert for crucial checks; monkey-patching someone else's read-only module makes your app incompatible with read-only mode. Any problems would come from use of *both* read-only mode *and* the __readonly__ flag, rather than unexpectedly cropping up when someone loads up a module from PyPI and it turns out to depend on mutability.

Also, flagging the ones that have the changed behaviour means it's quite easy to get partial benefit from this, with no risk. In fact, you could probably turn this on for arbitrary Python programs, as long as only the standard library uses __readonly__; going the other way, having a single module that doesn't have the flag and requires mutability would prevent the whole app from being run in read-only mode.

With that (rather big, and yet quite trivial) caveat, though: Looks interesting. Optimizing for the >99% of code that doesn't do weird things makes very good sense, just as long as the <1% can be catered for.

ChrisA

From rosuav at gmail.com  Tue May 20 19:44:43 2014
From: rosuav at gmail.com (Chris Angelico)
Date: Wed, 21 May 2014 03:44:43 +1000
Subject: [Python-ideas] Make Python code read-only
In-Reply-To: 
References: <20140520172242.GA26262@k2>
Message-ID: 

On Wed, May 21, 2014 at 3:36 AM, Peter Mawhorter wrote:

> The main cost I can see here is in
> maintaining the read-only mode and the perhaps-subtle bugs that would
> arise in many people's code when run in read-only mode.

Here's a stupid-crazy idea to chew on. (Fortunately this is not python-good-ideas at python.org - I wouldn't have much to contribute there!)

Make the per-module flag opt-in-only, but the overall per-application flag active by default. Then, read-only mode applies to a small number of standard library modules (plus any user modules that specifically request it), and will thus be less surprising; and a small rewording of the error message (eg "... - run Python with the -O0 parameter to disable this check") would mean the monkey-patchers could still do their stuff, at the cost of this optimization. It's less likely to be surprising, because development would be done with read-only mode active, rather than "Okay, let's try this in optimized mode now - no asserts and read-only dicts... oh dear, it's not working".

Big downside: Time machine policy prevents us from going back to 2.0 and implementing it there. There's going to be an even worse boundary when people upgrade to a Python with this active by default. So it's probably better to NOT make either half active by default, but to recommend that new projects be developed with read-only mode active.
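To make that concrete, the opt-in marker could be spelled like this (entirely hypothetical, mirroring the __readonly__ attribute Victor already uses):

# mymodule.py: declares itself safe to freeze
__readonly__ = True

CONSTANT = 42

def helper():
    return CONSTANT

# In read-only mode the interpreter would then be free to reject
#     import mymodule
#     mymodule.CONSTANT = 43
# while modules without the marker keep today's behaviour.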
ChrisA

From ethan at stoneleaf.us  Tue May 20 19:34:58 2014
From: ethan at stoneleaf.us (Ethan Furman)
Date: Tue, 20 May 2014 10:34:58 -0700
Subject: [Python-ideas] Make Python code read-only
In-Reply-To: 
References: 
Message-ID: <537B9242.3080601@stoneleaf.us>

On 05/20/2014 09:57 AM, Victor Stinner wrote:
>
> I'm trying to find the best option to make CPython faster. I would
> like to discuss here a first idea of making the Python code read-only
> to allow new optimizations.

-1 to a forced read-only by default
+0 to a command-line switch to enable read-only by default

--
~Ethan~

From victor.stinner at gmail.com  Tue May 20 22:32:25 2014
From: victor.stinner at gmail.com (Victor Stinner)
Date: Tue, 20 May 2014 22:32:25 +0200
Subject: [Python-ideas] Make Python code read-only
In-Reply-To: 
References: 
Message-ID: 

2014-05-20 19:37 GMT+02:00 Chris Angelico :
>> * The sys module cannot be made read-only because modifying sys.stdout
>> and sys.ps1 is a common use case.
>
> I think this highlights the biggest concern with defaulting to
> read-only.

Hum, maybe my email was unclear: the read-only mode is disabled by default.

When you enable the read-only mode, all modules are read-only except the modules explicitly configured to be modifiable.

I don't have a strong opinion on this choice. We may only make modules read-only when the read-only mode is activated and the module is explicitly configured to be read-only. Another option is to have a list of modules which should be made read-only, configurable by the application.

> With that (rather big, and yet quite trivial) caveat, though: Looks
> interesting. Optimizing for the >99% of code that doesn't do weird
> things makes very good sense, just as long as the <1% can be catered
> for.

Yeah, the whole stdlib doesn't need to be read-only to make an application faster.

Victor

From ethan at stoneleaf.us  Tue May 20 23:21:35 2014
From: ethan at stoneleaf.us (Ethan Furman)
Date: Tue, 20 May 2014 14:21:35 -0700
Subject: [Python-ideas] Make Python code read-only
In-Reply-To: 
References: 
Message-ID: <537BC75F.2060605@stoneleaf.us>

On 05/20/2014 01:32 PM, Victor Stinner wrote:
> 2014-05-20 19:37 GMT+02:00 Chris Angelico :
>>> * The sys module cannot be made read-only because modifying sys.stdout
>>> and sys.ps1 is a common use case.
>>
>> I think this highlights the biggest concern with defaulting to
>> read-only.
>
> Hum, maybe my email was unclear: the read-only mode is disabled by default.

Ah, that's good.

> Another option is to have a list of modules which should be made
> read-only, configurable by the application.

Or a list of modules that should remain read/write. As an application dev I should know which modules I am going to be modifying after initial load, so as long as I can easily add them to a read/write list I would be happy (especially when it came time to debug something).

--
~Ethan~

From ericsnowcurrently at gmail.com  Wed May 21 00:04:22 2014
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Tue, 20 May 2014 16:04:22 -0600
Subject: [Python-ideas] Make Python code read-only
In-Reply-To: 
References: 
Message-ID: 

An interesting idea. Comments below.

On May 20, 2014 10:58 AM, "Victor Stinner" wrote:
> Make Python code read-only
> ==========================
>
> I propose to add an option to Python to make the code read-only. In
> this mode, module namespace, class namespace and function attributes
> become read-only. It will still be possible to add a "__readonly__ =
> False" marker to keep a module, a class and/or a function modifiable.
Make __readonly__ a data descriptor (getset in the C-API) on ModuleType, type, and FunctionType and people could toggle it as needed. The descriptor could look something like this (in pure Python):

import os

class ReadonlyDescriptor:

    DEFAULT = os.environ.get('PYTHONREADONLY', False)
    # captured once, i.e. later changes to PYTHONREADONLY are ignored

    def __init__(self, *, default=None):
        if default is None:
            default = self.DEFAULT
        self.default = default

    def __get__(self, obj, cls):
        if obj is None:
            return self
        try:
            return obj.__dict__['__readonly__']
        except KeyError:
            readonly = bool(self.default)
            obj.__dict__['__readonly__'] = readonly
            return readonly

    def __set__(self, obj, value):
        obj.__dict__['__readonly__'] = value

Alternatively, the object structs for the 3 types (e.g. PyModuleObject) could each grow a "readonly" field (or an extra flag option if there is an appropriate flag). The descriptor (in C) would use that instead of obj.__dict__['__readonly__']. However, I'd prefer going through __dict__.

Either way, the 3 types would share a tp_setattro implementation that checks the read-only flag. That way there's no need to make sweeping changes to the 3 types, nor to the dict type.

    def __setattr__(self, name, value):
        if self.__readonly__:
            raise AttributeError('readonly')
        super().__setattr__(name, value)

FWIW, the idea of a flag for read-only could be applied to objects in general, particularly in a future language addition. "__readonly__" is a good name for the flag, so the precedent set by the three types in this proposal would be a good one.

> I chose to make the code read-only by default instead of the opposite.
> In my test, almost all code can be made read-only without major issues;
> only a little code requires the "__readonly__ = False" marker.

Read-only by default would be backwards-incompatible, but having a command-line flag (and/or env var) to enable it would be useful.

For classes a decorator could be nice, though it should wait until it is more obviously worth doing. I'm not sure it would matter for functions, though the same decorator would probably work.

> A module is only made read-only by importlib after the module is
> loaded. The module is still modifiable while code is executed, until
> importlib has set all its attributes (ex: __loader__).

With a data descriptor and __setattr__ like I described above, there is no need to make any changes to importlib.

> Optimizations possible when the code is read-only
> =================================================
...
> More optimizations
> ==================

+1

> One point remains unclear to me. There is a short time window between
> the moment a module is loaded and the moment it is made read-only.
> During this window, we cannot rely on the read-only property of the
> code. Specialized code cannot be used safely before the module is
> known to be read-only.

How big a problem would this be in practice?

> Issues with read-only code
> ==========================
>
> * Currently, it's not possible to make a module, class or function
> modifiable again; I kept it that way to keep my implementation simple.
> With a registry of callbacks, it may be possible to re-enable
> modification and call code to disable optimizations.

With the data descriptor approach toggling read-only would work. Enabling/disabling optimizations at that point would depend on how they were implemented.

> * Lazy initialization of module variables does not work anymore. A
> workaround is to use a mutable type: a dict can serve as a namespace
> for the module's modifiable variables.

What do you mean by "lazy initialization of module variables"?
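To tie the two sketches above together, a hypothetical toy class using the descriptor (with __setattr__ tweaked to let the flag itself be toggled, which is what would make re-enabling modification possible):

class Namespace:
    __readonly__ = ReadonlyDescriptor()   # the class sketched above

    def __setattr__(self, name, value):
        # allow the flag itself to change; block everything else
        # while the namespace is frozen
        if name != '__readonly__' and self.__readonly__:
            raise AttributeError('readonly')
        super().__setattr__(name, value)

ns = Namespace()
ns.x = 1                 # allowed: not read-only by default
ns.__readonly__ = True
# ns.y = 2 would now raise AttributeError('readonly')
ns.__readonly__ = False  # toggled back; mutation works again
ns.y = 2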
> * It is not possible yet to make the namespace of packages read-only.
> For example, "import encodings.utf_8" adds the symbol "utf_8" to the
> encodings namespace. A workaround is to load all submodules before
> making the namespace read-only. This cannot be done for some large
> packages: encodings, for example, has a lot of submodules and only a
> few are needed.

If read-only is only enforced via __setattr__ then the workaround is to bind the submodule directly via pkg.__dict__.

-eric

From greg.ewing at canterbury.ac.nz  Wed May 21 00:44:25 2014
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Wed, 21 May 2014 10:44:25 +1200
Subject: [Python-ideas] Make Python code read-only
In-Reply-To: 
References: 
Message-ID: <537BDAC9.2010100@canterbury.ac.nz>

Victor Stinner wrote:
> Optimizations possible when the code is read-only
> =================================================
>
> * Inline calls to functions.
>
> * Replace calls to pure functions (without side effect) with the
> result. For example, len("abc") can be replaced with 3.

I'm skeptical about how much difference this would make. In most of the Python code I've seen, calls to module-level functions are relatively rare -- most calls are method calls.

--
Greg

From rosuav at gmail.com  Wed May 21 00:46:04 2014
From: rosuav at gmail.com (Chris Angelico)
Date: Wed, 21 May 2014 08:46:04 +1000
Subject: [Python-ideas] Make Python code read-only
In-Reply-To: 
References: 
Message-ID: 

On Wed, May 21, 2014 at 6:32 AM, Victor Stinner wrote:
> 2014-05-20 19:37 GMT+02:00 Chris Angelico :
>>> * The sys module cannot be made read-only because modifying sys.stdout
>>> and sys.ps1 is a common use case.
>>
>> I think this highlights the biggest concern with defaulting to
>> read-only.
>
> Hum, maybe my email was unclear: the read-only mode is disabled by default.
>
> When you enable the read-only mode, all modules are read-only except
> the modules explicitly configured to be modifiable.

There are two read-only states:

1) Is this application running in read-only mode? (You give an example of setting this by an env var.)

2) Is this module read-only? (You give an example of setting this to False.)

It's the second one that I'm talking about. If, once you turn on read-only mode (the first state), every module is read-only except those marked __readonly__=False, you're going to have major backward incompatibility problems. All it takes is one single module that ought to be marked __readonly__=False and isn't, and read-only mode is broken.

Yes, it may be that most of the standard library can be made read-only; but I believe it would still be better to explicitly say __readonly__=True on each of those modules than __readonly__=False on the others - because of all the *non* stdlib modules.

ChrisA

From victor.stinner at gmail.com  Wed May 21 01:46:34 2014
From: victor.stinner at gmail.com (Victor Stinner)
Date: Wed, 21 May 2014 01:46:34 +0200
Subject: [Python-ideas] Make Python code read-only
In-Reply-To: 
References: 
Message-ID: 

2014-05-21 0:04 GMT+02:00 Eric Snow :
> Make __readonly__ a data descriptor (getset in the C-API) on
> ModuleType, type, and FunctionType and people could toggle it as
> needed.

In my PoC, I chose to modify the builtin type "dict" directly. I don't think that I will keep this solution, because I would prefer not to touch such a critical Python type. I may use a subclass instead.

I added a dict.setreadonly() method which can be used to make a dict read-only, but a read-only dict cannot be made modifiable again.
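So with the PoC, the intended usage is roughly this (setreadonly() only exists in the patched interpreter, not in stock CPython):

d = {'x': 1}
d.setreadonly()   # freeze the dict, permanently
d['y'] = 2        # expected to fail in the patched build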
I added a type.setreadonly() method which calls type.__dict__.setreadonly(). I did this to access the underlying dict; type.setreadonly() also works on builtin types like str. For example, str.__dict__ is a mappingproxy, not the real dictionary.

> Alternatively, the object structs for the 3 types (e.g. PyModuleObject)
> could each grow a "readonly" field (or an extra flag option if there
> is an appropriate flag). The descriptor (in C) would use that instead
> of obj.__dict__['__readonly__']. However, I'd prefer going through
> __dict__.

There is already a function.__readonly__ property (I just renamed it; it was called __modifiable__ before, with the opposite meaning). It is used by importlib to make a function read-only.

> Either way, the 3 types would share a tp_setattro implementation that
> checks the read-only flag. That way there's no need to make sweeping
> changes to the 3 types, nor to the dict type.
>
>     def __setattr__(self, name, value):
>         if self.__readonly__:
>             raise AttributeError('readonly')
>         super().__setattr__(name, value)

Are you sure that it's not possible to retrieve the underlying dictionary somehow? For example, functions have a func.__dict__ attribute.

> Read-only by default would be backwards-incompatible, but having a
> command-line flag (and/or env var) to enable it would be useful.

My PoC had a PYTHONREADONLY env var to enable the read-only mode. I just added a -r command line option for the same purpose. It's disabled by default for backward compatibility. Only enable it if you want to try my optimizations :-)

> For classes a decorator could be nice, though it should wait until it
> is more obviously worth doing. I'm not sure it would matter for
> functions, though the same decorator would probably work.

I just pushed a change that makes classes read-only by default, so that nested classes are read-only too. I modified the builtin __build_class__ function for that. A decorator is called after the class is defined, which is too late. That's why I chose a class attribute.

>> One point remains unclear to me. There is a short time window between
>> the moment a module is loaded and the moment it is made read-only.
>> During this window, we cannot rely on the read-only property of the
>> code. Specialized code cannot be used safely before the module is
>> known to be read-only.
>
> How big a problem would this be in practice?

I have no idea right now :)

>> Issues with read-only code
>> ==========================
>>
>> * Currently, it's not possible to make a module, class or function
>> modifiable again; I kept it that way to keep my implementation simple.
>> With a registry of callbacks, it may be possible to re-enable
>> modification and call code to disable optimizations.
>
> With the data descriptor approach toggling read-only would work.
> Enabling/disabling optimizations at that point would depend on how
> they were implemented.

Hum, I should try to use your descriptor. I'm not sure that it works for modules and classes. (Functions already have a __readonly__ property.)

>> * Lazy initialization of module variables does not work anymore. A
>> workaround is to use a mutable type: a dict can serve as a namespace
>> for the module's modifiable variables.
>
> What do you mean by "lazy initialization of module variables"?

To reduce the memory footprint, "large" precomputed tables of the base64 module are only filled at the first call of a function needing the tables. I have also seen other modules where an import is only performed the first time it is needed.
Example: "def _lazy_import_sys(): global sys; import sys" and then "if sys is None: _lazy_import_sys(); # use sys". >> * It is not possible yet to make the namespace of packages read-only. >> For example, "import encodings.utf_8" adds the symbol "utf_8" to the >> encodings namespace. A workaround is to load all submodules before >> making the namespace read-only. This cannot be done for some large >> modules. For example, the encodings has a lot of submodules, only a >> few are needed. > > If read-only is only enforced via __setattr__ then the workaround is > to bind the submodule directly via pkg.__dict__. I don't like the idea of an "almost" read-only module object. In one of my project, I would like to emit machine code. If a module is modified whereas the machine code relies on the module read-only property, Python may crash. Victor From ncoghlan at gmail.com Wed May 21 02:00:31 2014 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 21 May 2014 10:00:31 +1000 Subject: [Python-ideas] Make Python code read-only In-Reply-To: References: Message-ID: > More optimizations > ================== > > See my notes for all ideas to optimize CPython: > > http://haypo-notes.readthedocs.org/faster_cpython.html > > I explain there why I prefer to optimize CPython instead of working on > PyPy or another Python implementation like Pyston, Numba or similar > projects. You don't explain why you don't want to go with the selective optimisation approach of Numba. That isn't its own implementation - it's a way of marking particular functions to be accelerated. Cheers, Nick. > > Victor > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From victor.stinner at gmail.com Wed May 21 02:09:03 2014 From: victor.stinner at gmail.com (Victor Stinner) Date: Wed, 21 May 2014 02:09:03 +0200 Subject: [Python-ideas] Make Python code read-only In-Reply-To: <537BDAC9.2010100@canterbury.ac.nz> References: <537BDAC9.2010100@canterbury.ac.nz> Message-ID: 2014-05-21 0:44 GMT+02:00 Greg Ewing : >> Optimizations possible when the code is read-only >> ================================================= >> >> * Inline calls to functions. >> >> * Replace calls to pure functions (without side effect) with the >> result. For example, len("abc") can be replaced with 3. > > I'm skeptical about how much difference this would make. > In most of the Python code I've seen, calls to module-level > functions are relatively rare -- most calls are method > calls. If the class is read-only and has a __slots__ class attribute, methods cannot be modified anymore. If you are able to get (compute) the type of an object, you can optimize the call to the method. Dummy example: --- chars=[] for ch in range(32, 126): chars.append(chr(ch)) print(''.join(chars)) --- Here you can guess that the type of chars in "chars.append" is list. The list.append() method is well known (and it is read-only, even if my global read-only mode is disabled, because list.append is a builtin type). You may inline the call in the body of the loop. Or you can at least move the lookup of the append method out of the loop. 
Victor

From victor.stinner at gmail.com  Wed May 21 02:16:41 2014
From: victor.stinner at gmail.com (Victor Stinner)
Date: Wed, 21 May 2014 02:16:41 +0200
Subject: [Python-ideas] Make Python code read-only
In-Reply-To: 
References: 
Message-ID: 

2014-05-21 2:00 GMT+02:00 Nick Coghlan :
>> More optimizations
>> ==================
>>
>> See my notes for all ideas to optimize CPython:
>>
>> http://haypo-notes.readthedocs.org/faster_cpython.html
>>
>> I explain there why I prefer to optimize CPython instead of working on
>> PyPy or another Python implementation like Pyston, Numba or similar
>> projects.
>
> You don't explain why you don't want to go with the selective optimisation
> approach of Numba.
>
> That isn't its own implementation - it's a way of marking particular
> functions to be accelerated.

I don't want to optimize a single function, I want to optimize a whole application. If possible, I would prefer not to have to modify the application to run it faster.

Numba plays very well with numbers and arrays, but I'm not sure that it is able to inline arbitrary Python functions, for example.

Victor

From steve at pearwood.info  Wed May 21 03:42:04 2014
From: steve at pearwood.info (Steven D'Aprano)
Date: Wed, 21 May 2014 11:42:04 +1000
Subject: [Python-ideas] Make Python code read-only
In-Reply-To: 
References: 
Message-ID: <20140521014203.GE10355@ando>

On Wed, May 21, 2014 at 03:37:42AM +1000, Chris Angelico wrote:

> With that (rather big, and yet quite trivial) caveat, though: Looks
> interesting. Optimizing for the >99% of code that doesn't do weird
> things makes very good sense, just as long as the <1% can be catered
> for.

"99% of Python code doesn't do weird things..."

It seems to me that this is a myth, or at least unjustified by the facts as we have seen them. Victor's experiment shows 25 modules from the standard library are modifiable, with 139 read-only. That's more like 15% than 1% "weird".

I don't consider setting sys.ps1 and sys.stdout to be "weird", which is why Victor has to leave sys unlocked.

Read-only by default would play havoc with such simple idioms as global variables. (Globals may be considered harmful, but they're not considered "weird", and they're also more intuitive to many beginners than function arguments and return values. Strange but true.) As much as I wish to discourage people from using the global statement to rebind globals, I consider it completely unacceptable to have to teach beginners how to disable read-only mode before they've even mastered writing simple functions.

I applaud Victor for his experiment, and would like to suggest a couple of approaches he might like to think about. I assume that read-only mode can be set on a per-module basis.

* For simplicity, read-only mode is all-or-nothing on a per module basis. If the module is locked, so are the functions and classes defined by that module. If the module is not locked, neither are the functions and classes. (By locked, I mean Victor's read-only mode where globals and class attributes cannot be re-bound, etc.)

* For backwards compatibility, any (eventual) production use of this would have to default to off. Perhaps in Python 4 or 5 we can consider defaulting to on.

* Define an (optional) module global, say, __readonly__, which defaults to False. The module author must explicitly set it to True if they wish to lock the module in read-only mode. There's no way to enable the read-only optimizations by accident, you have to explicitly turn them on.
* However there are ways to auto-detect when *not* to enable them. E.g. if a module uses the global statement in any function or method, read-only mode is disabled for that module.

* Similarly, a Python switch to enable/disable read-only mode. I don't mind if the switch --enable-readonly is true by default, so long as individual modules default to unlocked.

How about testing? It's very common, useful, and very much non-weird to reach into a module and monkey-patch it for the purposes of testing. I don't have a good solution to that, but a couple of stream-of-consciousness suggestions:

- Would it help if there was a statement "import unlocked mymodule" that forces mymodule to remain unlocked rather than read-only?

- Would it help if you could make a copy of a read-only module in an unlocked state?

- Obviously the "best" (most obvious) solution would be if there was a way to unlock modules on the fly, but Victor suggests that's hard.

--
Steven

From rosuav at gmail.com  Wed May 21 04:20:14 2014
From: rosuav at gmail.com (Chris Angelico)
Date: Wed, 21 May 2014 12:20:14 +1000
Subject: [Python-ideas] Make Python code read-only
In-Reply-To: <20140521014203.GE10355@ando>
References: <20140521014203.GE10355@ando>
Message-ID: 

On Wed, May 21, 2014 at 11:42 AM, Steven D'Aprano wrote:
> On Wed, May 21, 2014 at 03:37:42AM +1000, Chris Angelico wrote:
>
>> With that (rather big, and yet quite trivial) caveat, though: Looks
>> interesting. Optimizing for the >99% of code that doesn't do weird
>> things makes very good sense, just as long as the <1% can be catered
>> for.
>
> "99% of Python code doesn't do weird things..."
>
> It seems to me that this is a myth, or at least unjustified by the
> facts as we have seen them. Victor's experiment shows 25 modules from the
> standard library are modifiable, with 139 read-only. That's more like
> 15% than 1% "weird".
>
> I don't consider setting sys.ps1 and sys.stdout to be "weird", which is
> why Victor has to leave sys unlocked.

Allow me to clarify. A module mutating its own globals is not at all weird; the only thing I'm calling weird is reaching into another module's globals and changing things. In a few rare cases (like sys.ps1 and sys.stdout), this is part of the documented interface of the module; but if (in a parallel universe) Python were designed such that this sort of thing were impossible, it wouldn't be illogical to have a "sys.set_ps1()" function, because the author(s) of the sys module *expect* ps1 to be changed.

In contrast, the random module makes use of a bunch of stuff from math (importing them all with underscores, presumably to keep them out of "from random import *", although __all__ is also set), and it is NOT normal to reach in and change them. And before you say "Well, that has an underscore, of course you don't fiddle with it", other modules like imaplib will happily import without underscores - is it expected that you should be able to change imaplib.random to have it use a different random number generator? Or, for that matter, to replace some of its helper functions like imaplib.Int2AP? That, I think, would be considered weird.

So there are 15% that change their own globals, which is fine. In this particular instance, we can't optimize for the whole of the 99%, but I maintain that the 15% is not all "weird" just because it's not optimizable. How many modules actually expect that their globals will be externally changed?

ChrisA

From greg at krypto.org  Wed May 21 06:45:26 2014
From: greg at krypto.org (Gregory P. Smith)
Date: Tue, 20 May 2014 21:45:26 -0700
Subject: [Python-ideas] python configure --with-ssl
In-Reply-To: <537B2186.3080907@gmail.com>
References: <537A2E5F.4060004@gmail.com> <537B2186.3080907@gmail.com>
Message-ID: 

On Tue, May 20, 2014 at 2:33 AM, Pavel Machyniak wrote:

> On 20.5.2014 0:30, Ned Deily wrote:
> > In article <537A2E5F.4060004 at gmail.com>,
> > Pavel Machyniak wrote:
> >> please add option to python configure for setting custom `openssl`
> >> installation on python build, eg `--with-ssl=path` as used commonly.
> >> Otherwise it is difficult to build python with specific `openssl`
> >> installation/compilation, see eg.
> >> http://stackoverflow.com/questions/22409092/coredump-when-compiling-python-with-a-custom-openssl-version.
> >
> > This sort of request has come up before on the Python bug tracker. With
> > a quick search, I didn't find an exact match for what you request,
> > although there might be a more general issue open to allow more control
> > over all third-party libraries. However, there is
> > http://bugs.python.org/issue5575 which provides a patch to allow control
> > using environment variables rather than configure options. Feel free to
> > open a new issue or comment on this one.
>
> Thanks,
>
> I am well aware of the patch but it does not work if there is a default
> openssl installation within the system (because it only adds another
> path to the END of the search list), and although the patch is from the
> year 2009 it is not released (accepted?) yet.
>
> I will probably find some time and propose the solution/patch using
> configure options --with-ssl (and also --with-ssl-includes,
> --with-ssl-libs, --with_krb5, and maybe --with-sqlite,
> --with-sqlite-includes, --with-sqlite-libs as well).

If you ever go so far as to include options for everything, please include the ability to point to a specific path for readline, ncurses and zlib as well. :)

From ericsnowcurrently at gmail.com  Wed May 21 08:26:03 2014
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Wed, 21 May 2014 00:26:03 -0600
Subject: [Python-ideas] Make Python code read-only
In-Reply-To: 
References: 
Message-ID: 

On Tue, May 20, 2014 at 10:57 AM, Victor Stinner wrote:
> Issues with read-only code
> ==========================

Other things to consider:

* reload() will no longer work (it loads into the existing module ns)
* the module-replaces-self-in-sys-modules hack will be weird
* class decorators that modify the class will no longer work
* caching class attrs that are lazily set by instances will no longer
  work (similar to modules)
* singletons stored on the class will break

-eric

From greg.ewing at canterbury.ac.nz  Wed May 21 08:29:11 2014
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Wed, 21 May 2014 18:29:11 +1200
Subject: [Python-ideas] Make Python code read-only
In-Reply-To: <20140521014203.GE10355@ando>
References: <20140521014203.GE10355@ando>
Message-ID: <537C47B7.2080404@canterbury.ac.nz>

Steven D'Aprano wrote:
> Read-only by default would play havoc with such simple idioms as global
> variables.

I don't see why there couldn't be a way to exempt selected names in a module from read-only status. An exemption could be inferred whenever a name is referenced by a 'global' statement.
There should also be a way to explicitly mark a name as exempt, to take care of sys.stdout etc., and cases where the only mutations are done from a different module, so there is no global statement.

For modules implemented in Python, the explicit marker could consist of a global statement at the top level, which is currently allowed but redundant.

--
Greg

From theller at ctypes.org  Wed May 21 08:58:48 2014
From: theller at ctypes.org (Thomas Heller)
Date: Wed, 21 May 2014 08:58:48 +0200
Subject: [Python-ideas] pathlib suggestion
In-Reply-To: <20140520172518.4576b754@fsol>
References: <20140520172518.4576b754@fsol>
Message-ID: 

On 20.05.2014 17:25, Antoine Pitrou wrote:
> On Tue, 20 May 2014 10:51:22 +0200
> Thomas Heller wrote:
>> Python 3.4's pathlib uses str(path) to get the full pathname
>> as a string.
>>
>> I'd like to suggest adding a property which allows access to
>> the full pathname. IMO this should make it easier to understand
>> the code and make it possible to search for it in sources.
>>
>> I'm unsure about the name this property should get; maybe .fullpath
>> or something like that. I'm also unsure whether there should be
>> separate properties to get the full pathname as a string or bytes object.
>
> .strpath perhaps?
> (also .bytespath if desired)

The names .strpath and .bytespath look good to me.

> It was once proposed as a "filesystem path" protocol where classes
> purporting to represent filesystem paths could define e.g. a
> __strpath__ method returning the string representation of the path. I
> can only find the following allusions on python-ideas:
> https://mail.python.org/pipermail/python-ideas/2012-October/016912.html
> https://mail.python.org/pipermail/python-ideas/2012-October/016974.html

This is not directly related to my proposal, but it may be a good idea. So __strpath__() would return the .strpath property?

Thomas

From machyniak at gmail.com  Wed May 21 10:46:32 2014
From: machyniak at gmail.com (Pavel Machyniak)
Date: Wed, 21 May 2014 10:46:32 +0200
Subject: [Python-ideas] python configure --with-ssl
In-Reply-To: 
References: <537A2E5F.4060004@gmail.com> <537B2186.3080907@gmail.com>
Message-ID: <537C67E8.4010505@gmail.com>

On 21.5.2014 6:45, Gregory P. Smith wrote:
> On Tue, May 20, 2014 at 2:33 AM, Pavel Machyniak wrote:
>> On 20.5.2014 0:30, Ned Deily wrote:
>>> In article <537A2E5F.4060004 at gmail.com>,
>>> Pavel Machyniak wrote:
>>>> please add option to python configure for setting custom `openssl`
>>>> installation on python build, eg `--with-ssl=path` as used commonly.
>>>> Otherwise it is difficult to build python with specific `openssl`
>>>> installation/compilation, see eg.
>>>> http://stackoverflow.com/questions/22409092/coredump-when-compiling-python-with-a-custom-openssl-version.
>>>
>>> This sort of request has come up before on the Python bug tracker. With
>>> a quick search, I didn't find an exact match for what you request,
>>> although there might be a more general issue open to allow more control
>>> over all third-party libraries. However, there is
>>> http://bugs.python.org/issue5575 which provides a patch to allow control
>>> using environment variables rather than configure options. Feel free to
>>> open a new issue or comment on this one.
>>
>> Thanks,
>>
>> I am well aware of the patch but it does not work if there is a default
>> openssl installation within the system (because it only adds another
>> path to the END of the search list), and although the patch is from the
>> year 2009 it is not released (accepted?) yet.
>>
>> I will probably find some time and propose the solution/patch using
>> configure options --with-ssl (and also --with-ssl-includes,
>> --with-ssl-libs, --with_krb5, and maybe --with-sqlite,
>> --with-sqlite-includes, --with-sqlite-libs as well).
>
> If you ever go so far as to include options for everything, please
> include the ability to point to a specific path for readline, ncurses
> and zlib as well. :)

Created an issue, please see and comment there:
http://bugs.python.org/issue21541

From ncoghlan at gmail.com  Wed May 21 11:43:05 2014
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 21 May 2014 19:43:05 +1000
Subject: [Python-ideas] Make Python code read-only
In-Reply-To: <20140521014203.GE10355@ando>
References: <20140521014203.GE10355@ando>
Message-ID: 

On 21 May 2014 11:48, "Steven D'Aprano" wrote:
>
> On Wed, May 21, 2014 at 03:37:42AM +1000, Chris Angelico wrote:
>
> > With that (rather big, and yet quite trivial) caveat, though: Looks
> > interesting. Optimizing for the >99% of code that doesn't do weird
> > things makes very good sense, just as long as the <1% can be catered
> > for.
>
> "99% of Python code doesn't do weird things..."
>
> It seems to me that this is a myth, or at least unjustified by the
> facts as we have seen them. Victor's experiment shows 25 modules from the
> standard library are modifiable, with 139 read-only. That's more like
> 15% than 1% "weird".

It also misses the big reason I am a Python programmer rather than a Java programmer. For me, Python is primarily an orchestration language. It is the language for the code that is telling everything else what to do. If my Python code is an overall performance bottleneck, then "Huzzah!", as it means I have finally engineered all the other structural bottlenecks out of the system.

For this use case, monkey patching is not an incidental feature to be tolerated merely for backwards compatibility reasons: it is a key capability that makes Python an ideal language for me, as it takes ultimate control of what dependencies do away from the original author and places it in my hands as the system integrator. This is a dangerous power, not to be used lightly, but it also grants me the ability to work around critical bugs in dependencies at run time, rather than having to fork and patch the source the way Java developers tend to do.

Victor's proposal is to make Python more complicated and a worse orchestration language, for the sake of making it a better applications programming language. In isolation, it might be possible to make that case, but in the presence of PyPy for a fully dynamically optimised runtime and tools like Cython and Numba for selective optimisation within CPython, no.

Regards,
Nick.

From ned at nedbatchelder.com  Wed May 21 13:05:49 2014
From: ned at nedbatchelder.com (Ned Batchelder)
Date: Wed, 21 May 2014 07:05:49 -0400
Subject: [Python-ideas] Disable all peephole optimizations
Message-ID: <537C888D.7060903@nedbatchelder.com>

** The problem

A long-standing problem with CPython is that the peephole optimizer cannot be completely disabled. Normally, peephole optimization is a good thing, it improves execution speed.
But in some situations, like coverage testing, it's more important to be able to reason about the code's execution. I propose that we add a way to completely disable the optimizer.

To demonstrate the problem, here is continue.py:

a = b = c = 0
for n in range(100):
    if n % 2:
        if n % 4:
            a += 1
        continue
    else:
        b += 1
    c += 1
assert a == 50 and b == 50 and c == 50

If you execute "python3.4 -m trace -c -m continue.py", it produces this continue.cover file:

    1: a = b = c = 0
  101: for n in range(100):
  100:     if n % 2:
   50:         if n % 4:
   50:             a += 1
>>>>>>         continue
           else:
   50:         b += 1
   50:     c += 1
    1: assert a == 50 and b == 50 and c == 50

This indicates that the continue line is not executed. It's true: the byte code for that statement is not executed, because the peephole optimizer has removed the jump to the jump. But in reasoning about the code, the continue statement is clearly part of the semantics of this program. If you remove the statement, the program will run differently. If you had to explain this code to a learner, you would of course describe the continue statement as part of the execution. So the trace output does not match our (correct) understanding of the program. The reason we are running trace (or coverage.py) in the first place is to learn something about our code, but it is misleading us. The peephole optimizer is interfering with our ability to reason about the code. We need a way to disable the optimizer so that this won't happen.

This type of control is well-known in C compilers, for the same reasons: when running code, optimization is good for speed; when reasoning about code, optimization gets in the way.

More details are in http://bugs.python.org/issue2506, which also includes previous discussion of the idea. This has come up on Python-Dev, and Guido seemed supportive: https://mail.python.org/pipermail/python-dev/2012-December/123099.html .

** Implementation

Although it may seem like a big change to be able to disable the optimizer, the heart of it is quite simple. In compile.c is the only call to PyCode_Optimize. That function takes a string of bytecode and returns another. If we skip that call, the peephole optimizer is disabled.

** User Interface

Unfortunately, the -O command-line switch does not lend itself to a new value that means "less optimization than the default." I propose a new switch -P, to control the peephole optimizer, with a value of -P0 meaning no optimization at all. The PYTHONPEEPHOLE environment variable would also control the option.

There are about a dozen places internal to CPython where optimization level is indicated with an integer, for example, in Py_CompileStringObject. Those uses also don't allow for new values indicating less optimization than the default: 0 and -1 already have meanings. Unless we want to start using -2 for less than the default. I'm not sure we need to provide for those values, or if the PYTHONPEEPHOLE environment variable provides enough control.

** Ramifications

This switch makes no changes to the semantics of Python programs, although clearly, if you are tracing a program, the exact sequence of lines and bytecodes will be different (this is the whole point).

In the ticket, one objection raised is that providing this option will complicate testing, and that optimization is a difficult enough thing to get right as it is. I disagree: I think providing this option will help test the optimizer, because it will give us a way to test that code runs the same with and without the optimizer.
This gives us a tool to use to demonstrate that the optimizer isn't changing the behavior of programs.

From ncoghlan at gmail.com  Wed May 21 13:41:45 2014
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 21 May 2014 21:41:45 +1000
Subject: [Python-ideas] Disable all peephole optimizations
In-Reply-To: <537C888D.7060903@nedbatchelder.com>
References: <537C888D.7060903@nedbatchelder.com>
Message-ID: 

On 21 May 2014 21:06, "Ned Batchelder" wrote:
> ** Implementation
>
> Although it may seem like a big change to be able to disable the
> optimizer, the heart of it is quite simple. In compile.c is the only
> call to PyCode_Optimize. That function takes a string of bytecode and
> returns another. If we skip that call, the peephole optimizer is
> disabled.
>
> ** User Interface
>
> Unfortunately, the -O command-line switch does not lend itself to a new
> value that means "less optimization than the default." I propose a new
> switch -P, to control the peephole optimizer, with a value of -P0 meaning
> no optimization at all. The PYTHONPEEPHOLE environment variable would
> also control the option.

Since this is a CPython-specific thing, a -X named command line option would be more appropriate.

> There are about a dozen places internal to CPython where optimization
> level is indicated with an integer, for example, in
> Py_CompileStringObject. Those uses also don't allow for new values
> indicating less optimization than the default: 0 and -1 already have
> meanings. Unless we want to start using -2 for less than the default.
> I'm not sure we need to provide for those values, or if the
> PYTHONPEEPHOLE environment variable provides enough control.

I assume you want the environment variable so the setting can be inherited by subprocesses?

Cheers,
Nick.

From steve at pearwood.info  Wed May 21 14:13:19 2014
From: steve at pearwood.info (Steven D'Aprano)
Date: Wed, 21 May 2014 22:13:19 +1000
Subject: [Python-ideas] Disable all peephole optimizations
In-Reply-To: <537C888D.7060903@nedbatchelder.com>
References: <537C888D.7060903@nedbatchelder.com>
Message-ID: <20140521121319.GG10355@ando>

On Wed, May 21, 2014 at 07:05:49AM -0400, Ned Batchelder wrote:
> ** The problem
>
> A long-standing problem with CPython is that the peephole optimizer
> cannot be completely disabled. Normally, peephole optimization is a
> good thing, it improves execution speed. But in some situations, like
> coverage testing, it's more important to be able to reason about the
> code's execution. I propose that we add a way to completely disable the
> optimizer.

I'm not sure whether this is an argument for or against your proposal, but the continue statement shown below is *not* dead code and should not be optimized out. The assert fails if you remove the continue statement.

I don't have 3.4 on this machine to test with, but using 3.3, I can see no evidence that `continue` is optimized away. Later in your post, you say:

> It's true: the
> byte code for that statement [the continue] is not executed, because
> the peephole optimizer has removed the jump to the jump.

But that cannot be true, because if it were, the assertion would fail.
Here's your code again:

> To demonstrate the problem, here is continue.py:
>
> a = b = c = 0
> for n in range(100):
>     if n % 2:
>         if n % 4:
>             a += 1
>         continue
>     else:
>         b += 1
>     c += 1
> assert a == 50 and b == 50 and c == 50

If the continue were not executed, c would equal 100 and the assertion would fail. Have I misunderstood something?

(By the way, as given, your indents are inconsistent: some are 4 spaces and some are 5.)

--
Steven

From j.wielicki at sotecware.net  Wed May 21 14:21:50 2014
From: j.wielicki at sotecware.net (Jonas Wielicki)
Date: Wed, 21 May 2014 14:21:50 +0200
Subject: [Python-ideas] Disable all peephole optimizations
In-Reply-To: <20140521121319.GG10355@ando>
References: <537C888D.7060903@nedbatchelder.com> <20140521121319.GG10355@ando>
Message-ID: <537C9A5E.1060502@sotecware.net>

On 21.05.2014 14:13, Steven D'Aprano wrote:
> On Wed, May 21, 2014 at 07:05:49AM -0400, Ned Batchelder wrote:
>> ** The problem
>>
>> A long-standing problem with CPython is that the peephole optimizer
>> cannot be completely disabled. Normally, peephole optimization is a
>> good thing, it improves execution speed. But in some situations, like
>> coverage testing, it's more important to be able to reason about the
>> code's execution. I propose that we add a way to completely disable the
>> optimizer.
>
> I'm not sure whether this is an argument for or against your proposal,
> but the continue statement shown below is *not* dead code and should not
> be optimized out. The assert fails if you remove the continue statement.
>
> I don't have 3.4 on this machine to test with, but using 3.3, I can see
> no evidence that `continue` is optimized away.

The logical continue is still there -- what happens is that the optimizer rewrites the `else` jump at the preceding `if` condition, which would normally point at the `continue` statement, to point at the beginning of the loop, because it would otherwise be a jump (to the continue) to a jump (to the for loop header).

Thus, the actual continue statement is not reached, but logically the code does the same thing, because the only jump that would have reached the continue was itself transformed into a continue.

> Later in your post, you say:
>
>> It's true: the
>> byte code for that statement [the continue] is not executed, because
>> the peephole optimizer has removed the jump to the jump.
>
> But that cannot be true, because if it were, the assertion would
> fail. Here's your code again:
>
>> To demonstrate the problem, here is continue.py:
>>
>> a = b = c = 0
>> for n in range(100):
>>     if n % 2:
>>         if n % 4:
>>             a += 1
>>         continue
>>     else:
>>         b += 1
>>     c += 1
>> assert a == 50 and b == 50 and c == 50
>
> If the continue were not executed, c would equal 100 and the assertion
> would fail. Have I misunderstood something?
>
> (By the way, as given, your indents are inconsistent: some are 4 spaces
> and some are 5.)
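To watch the optimizer do this rewriting, it is enough to disassemble the snippet (a sketch; opcode names and exact offsets vary between CPython versions):

import dis

src = (
    "for n in range(100):\n"
    "    if n % 2:\n"
    "        if n % 4:\n"
    "            a += 1\n"
    "        continue\n"
    "    else:\n"
    "        b += 1\n"
    "    c += 1\n"
)
# In the output, the conditional jump for "if n % 4" points back at the
# FOR_ITER at the top of the loop instead of at the continue's own jump,
# which is why trace/coverage tools report the continue line as never
# executed.
dis.dis(compile(src, "<continue>", "exec"))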
The idea sounds reasonable (pretty specialised, but that's OK). But one pitfall is that unless you encode the PYTHONPEEPHOLE setting in the bytecode filename then people will have to remember to delete all bytecode files before using the flag, or the interpreter will pick up an optimised pyc file. Or maybe pyc/pyo files should be ignored if PYTHONPEEPHOLE is set? That's probably simpler. Paul From bcannon at gmail.com Wed May 21 15:51:48 2014 From: bcannon at gmail.com (Brett Cannon) Date: Wed, 21 May 2014 13:51:48 +0000 Subject: [Python-ideas] Disable all peephole optimizations References: <537C888D.7060903@nedbatchelder.com> Message-ID: On Wed May 21 2014 at 9:05:48 AM, Paul Moore wrote: > On 21 May 2014 12:05, Ned Batchelder wrote: > > Unfortunately, the -O command-line switch does not lend itself to a new > > value that means, "less optimization than the default." I propose a new > > switch -P, to control the peephole optimizer, with a value of -P0 > meaning no > > optimization at all. The PYTHONPEEPHOLE environment variable would also > > control the option. > > The idea sounds reasonable (pretty specialised, but that's OK). But > one pitfall is that unless you encode the PYTHONPEEPHOLE setting in > the bytecode filename then people will have to remember to delete all > bytecode files before using the flag, or the interpreter will pick up > an optimised pyc file. Or maybe pyc/pyo files should be ignored if > PYTHONPEEPHOLE is set? That's probably simpler. > There are constant rumblings about trying to make .pyc/.pyo aware of what optimizations were applied so that this kind of thing wouldn't occur. It would require tweaking how optimizations are expressed/added so that they are more easily controlled and can somehow contribute to the labeling of what optimizations were applied. All totally doable but will require thinking about the proper API and such (reading .pyc/.pyo files would also break but that's happened before when we added file size to the header and .pyc/.pyo files are viewed as internal optimizations anyway). -------------- next part -------------- An HTML attachment was scrubbed... URL: From ned at nedbatchelder.com Wed May 21 16:12:58 2014 From: ned at nedbatchelder.com (Ned Batchelder) Date: Wed, 21 May 2014 10:12:58 -0400 Subject: [Python-ideas] Disable all peephole optimizations In-Reply-To: References: <537C888D.7060903@nedbatchelder.com> Message-ID: <537CB46A.3040002@nedbatchelder.com> On 5/21/14 7:41 AM, Nick Coghlan wrote: > > > On 21 May 2014 21:06, "Ned Batchelder" > wrote: > > ** Implementation > > > > Although it may seem like a big change to be able to disable the > optimizer, the heart of it is quite simple. In compile.c is the only > call to PyCode_Optimize. That function takes a string of bytecode and > returns another. If we skip that call, the peephole optimizer is > disabled. > > > > ** User Interface > > > > Unfortunately, the -O command-line switch does not lend itself to a > new value that means, "less optimization than the default." I propose > a new switch -P, to control the peephole optimizer, with a value of > -P0 meaning no optimization at all. The PYTHONPEEPHOLE environment > variable would also control the option. > > Since this is a CPython specific thing, a -X named command line option > would be more appropriate. > I had overlooked the introduction of -X. 
Yes, that seems like the right way: -Xpeephole=0

>> There are about a dozen places internal to CPython where optimization
>> level is indicated with an integer, for example, in
>> Py_CompileStringObject. Those uses also don't allow for new values
>> indicating less optimization than the default: 0 and -1 already have
>> meanings. Unless we want to start using -2 for less than the default.
>> I'm not sure we need to provide for those values, or if the
>> PYTHONPEEPHOLE environment variable provides enough control.
>
> I assume you want the environment variable so the setting can be
> inherited by subprocesses?

It allows it to be inherited by subprocesses, yes. I was hoping it would mean the setting would be available deeper in the interpreter, but now that I think about it, environment variables are interpreted at the top of the interpreter, and then the settings are passed along internally. I'll do a survey to figure out where the setting has to be plumbed through the layers to get to compile.c properly.

--Ned.

> Cheers,
> Nick.

From ned at nedbatchelder.com  Wed May 21 16:13:57 2014
From: ned at nedbatchelder.com (Ned Batchelder)
Date: Wed, 21 May 2014 10:13:57 -0400
Subject: [Python-ideas] Disable all peephole optimizations
In-Reply-To: 
References: <537C888D.7060903@nedbatchelder.com>
Message-ID: <537CB4A5.1030003@nedbatchelder.com>

On 5/21/14 9:05 AM, Paul Moore wrote:
> On 21 May 2014 12:05, Ned Batchelder wrote:
>> Unfortunately, the -O command-line switch does not lend itself to a new
>> value that means "less optimization than the default." I propose a new
>> switch -P, to control the peephole optimizer, with a value of -P0 meaning no
>> optimization at all. The PYTHONPEEPHOLE environment variable would also
>> control the option.
>
> The idea sounds reasonable (pretty specialised, but that's OK). But
> one pitfall is that unless you encode the PYTHONPEEPHOLE setting in
> the bytecode filename then people will have to remember to delete all
> bytecode files before using the flag, or the interpreter will pick up
> an optimised pyc file. Or maybe pyc/pyo files should be ignored if
> PYTHONPEEPHOLE is set? That's probably simpler.

For my use case, it would be enough to use whatever .pyc files the interpreter finds. For a testing scenario, it is fine to delete all the .pyc files, set PYTHONPEEPHOLE, and then run the test suite to be sure to avoid optimized pyc files.

> Paul

From ned at nedbatchelder.com  Wed May 21 16:23:41 2014
From: ned at nedbatchelder.com (Ned Batchelder)
Date: Wed, 21 May 2014 10:23:41 -0400
Subject: [Python-ideas] Disable all peephole optimizations
In-Reply-To: <537C9A5E.1060502@sotecware.net>
References: <537C888D.7060903@nedbatchelder.com> <20140521121319.GG10355@ando> <537C9A5E.1060502@sotecware.net>
Message-ID: <537CB6ED.2050906@nedbatchelder.com>
>> > >> >I'm not sure whether this is an argument for or against your proposal, >> >but the continue statement shown below is*not* dead code and should not >> >be optimized out. The assert fails if you remove the continue statement. >> > >> >I don't have 3.4 on this machine to test with, but using 3.3, I can see >> >no evidence that `continue` is optimized away. > The logical continue is still there -- what happens is that the > optimizer rewrites the `else` jump at the preceding `if` condition, > which would normally point at the `continue` statement, to the beginning > of the loop, because it would be a jump (to the continue) to a jump (to > the for loop header). > > Thus, the actual continue statement is not reached, but logically the > code does the same, because the only way continue would have been > reached was transformed to a continue itself. > To make the details more explicit, here is the source again, and the disassembled code, with the original source interspersed: a = b = c = 0 for n in range(100): if n % 2: if n % 4: a += 1 continue else: b += 1 c += 1 assert a == 50 and b == 50 and c == 50 Disassembled (Python 3.4, but the same effect is visible in 2.7, 3.3, etc): a = b = c = 0 1 0 LOAD_CONST 0 (0) 3 DUP_TOP 4 STORE_NAME 0 (a) 7 DUP_TOP 8 STORE_NAME 1 (b) 11 STORE_NAME 2 (c) for n in range(100): 2 14 SETUP_LOOP 79 (to 96) 17 LOAD_NAME 3 (range) 20 LOAD_CONST 1 (100) 23 CALL_FUNCTION 1 (1 positional, 0 keyword pair) 26 GET_ITER >> 27 FOR_ITER 65 (to 95) 30 STORE_NAME 4 (n) if n % 2: 3 33 LOAD_NAME 4 (n) 36 LOAD_CONST 2 (2) 39 BINARY_MODULO 40 POP_JUMP_IF_FALSE 72 if n % 4: 4 43 LOAD_NAME 4 (n) 46 LOAD_CONST 3 (4) 49 BINARY_MODULO 50 POP_JUMP_IF_FALSE 27 a += 1 5 53 LOAD_NAME 0 (a) 56 LOAD_CONST 4 (1) 59 INPLACE_ADD 60 STORE_NAME 0 (a) 63 JUMP_ABSOLUTE 27 continue 6 66 JUMP_ABSOLUTE 27 69 JUMP_FORWARD 10 (to 82) b += 1 8 >> 72 LOAD_NAME 1 (b) 75 LOAD_CONST 4 (1) 78 INPLACE_ADD 79 STORE_NAME 1 (b) c += 1 9 >> 82 LOAD_NAME 2 (c) 85 LOAD_CONST 4 (1) 88 INPLACE_ADD 89 STORE_NAME 2 (c) 92 JUMP_ABSOLUTE 27 >> 95 POP_BLOCK assert a == 50 and b == 50 and c == 50 10 >> 96 LOAD_NAME 0 (a) 99 LOAD_CONST 5 (50) 102 COMPARE_OP 2 (==) 105 POP_JUMP_IF_FALSE 132 108 LOAD_NAME 1 (b) 111 LOAD_CONST 5 (50) 114 COMPARE_OP 2 (==) 117 POP_JUMP_IF_FALSE 132 120 LOAD_NAME 2 (c) 123 LOAD_CONST 5 (50) 126 COMPARE_OP 2 (==) 129 POP_JUMP_IF_TRUE 138 >> 132 LOAD_GLOBAL 5 (AssertionError) 135 RAISE_VARARGS 1 >> 138 LOAD_CONST 6 (None) 141 RETURN_VALUE Notice that line 6 (the continue) is unreachable, because the else-jump from line 4 has been turned into a jump to bytecode offset 27 (the for loop), and the end of line 5 has also been turned into a jump to 27, rather than letting it flow to line 6. So line 6 still exists in the bytecode, but is never executed, leading tracing tools to indicate that line 6 is never executed. --Ned. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: 

From ethan at stoneleaf.us Wed May 21 16:11:21 2014
From: ethan at stoneleaf.us (Ethan Furman)
Date: Wed, 21 May 2014 07:11:21 -0700
Subject: [Python-ideas] Make Python code read-only
In-Reply-To: 
References: <20140521014203.GE10355@ando>
Message-ID: <537CB409.5000104@stoneleaf.us>

On 05/21/2014 02:43 AM, Nick Coghlan wrote:
>
> For this use case, monkey patching is not an incidental feature to be tolerated merely for backwards compatibility
> reasons: it is a key capability that makes Python an ideal language for me, as it takes ultimate control of what
> dependencies do away from the original author and places it in my hands as the system integrator. This is a dangerous
> power, not to be used lightly, but it also grants me the ability to work around critical bugs in dependencies at run
> time, rather than having to fork and patch the source the way Java developers tend to do.

+inf

--
~Ethan~

From mulhern at gmail.com Wed May 21 16:49:36 2014
From: mulhern at gmail.com (mulhern)
Date: Wed, 21 May 2014 10:49:36 -0400
Subject: [Python-ideas] Maybe/Option builtin
Message-ID: 

I feel that a Maybe/Option type, analogous to the types found in Haskell or OCaml would actually be useful in Python. The value analogous to the None constructor should be Python's None.

Obviously, it wouldn't give the type-checking benefits that it gives in statically checked languages, but every use of a Maybe object as if it were the contained object would give an error, alerting the user to the fact that None is a possibility and allowing them to address the problem sooner rather than later.

I feel that it would be kind of tricky to implement it as a class. Something like:

class Maybe(object):

    def __init__(self, value=None):
        self.value = value

    def value(self):
        return self.value

is a start but I'm not able to see how to make

if Maybe():
    print("nothing") # never prints

but

if Maybe({}):
    print("yes a value") #always prints

which is definitely the desired behaviour.

I also think that it would be the first Python type introduced solely because of its typey properties, not because it provided any actual functionality, which might be considered unpythonic.

Any comments?

Thanks!

- mulhern

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From Steve.Dower at microsoft.com Wed May 21 16:56:23 2014
From: Steve.Dower at microsoft.com (Steve Dower)
Date: Wed, 21 May 2014 14:56:23 +0000
Subject: [Python-ideas] Make Python code read-only
In-Reply-To: <537CB409.5000104@stoneleaf.us>
References: <20140521014203.GE10355@ando> , <537CB409.5000104@stoneleaf.us>
Message-ID: <1d4b9976723c4013addf2c8ec3c77f0f@BLUPR03MB389.namprd03.prod.outlook.com>

Another +inf from me.

Mind if I quote you on this next time I'm trying to convince C# developers to take Python seriously? :)

Top-posted from my Windows Phone

________________________________
From: Ethan Furman
Sent: 5/21/2014 7:37
To: python-ideas at python.org
Subject: Re: [Python-ideas] Make Python code read-only

On 05/21/2014 02:43 AM, Nick Coghlan wrote:
>
> For this use case, monkey patching is not an incidental feature to be tolerated merely for backwards compatibility
> reasons: it is a key capability that makes Python an ideal language for me, as it takes ultimate control of what
> dependencies do away from the original author and places it in my hands as the system integrator. This is a dangerous
> power, not to be used lightly, but it also grants me the ability to work around critical bugs in dependencies at run
> time, rather than having to fork and patch the source the way Java developers tend to do.

+inf

--
~Ethan~

_______________________________________________
Python-ideas mailing list
Python-ideas at python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From j.wielicki at sotecware.net Wed May 21 17:16:17 2014
From: j.wielicki at sotecware.net (Jonas Wielicki)
Date: Wed, 21 May 2014 17:16:17 +0200
Subject: [Python-ideas] Maybe/Option builtin
In-Reply-To: 
References: 
Message-ID: <537CC341.7030901@sotecware.net>

On 21.05.2014 16:49, mulhern wrote:
> Something like:
>
> class Maybe(object):
>
>     def __init__(self, value=None):
>         self.value = value
>
>     def value(self):
>         return self.value
>
> is a start but I'm not able to see how to make
>
> if Maybe():
>     print("nothing") # never prints
>
> but
>
> if Maybe({}):
>     print("yes a value") #always prints

Implement the __bool__ method: .

I don't have any opinion on the proposal itself.

regards,
Jonas

From ahammel87 at gmail.com Wed May 21 17:38:16 2014
From: ahammel87 at gmail.com (Alex Hammel)
Date: Wed, 21 May 2014 08:38:16 -0700
Subject: [Python-ideas] Maybe/Option builtin
In-Reply-To: 
References: 
Message-ID: 

What's a specific use case? Usually a Maybe is used to model a chain of computations which might fail. You can use exceptions for that in Python:

try:
    foo1 = bar()
    foo2 = baz(foo1)
    foo3 = quux(foo2)
except FooException:
    # recover

On occasion I've wanted to do the opposite: call a number of functions and keep the value of the first one that doesn't throw an exception. I implemented it like this. That certainly doesn't need to be a built-in, though, and I'm not convinced it belongs in the standard library. It's a relatively rare use-case, and it's easy to roll-your-own if you need it.

On Wed, May 21, 2014 at 7:49 AM, mulhern wrote:

> I feel that a Maybe/Option type, analogous to the types found in Haskell
> or OCaml would actually be useful in Python. The value analogous to the
> None constructor should be Python's None.
>
> Obviously, it wouldn't give the type-checking benefits that it gives in
> statically checked languages, but every use of a Maybe object as if it were
> the contained object would give an error, alerting the user to the fact
> that None is a possibility and allowing them to address the problem sooner
> rather than later.
>
> I feel that it would be kind of tricky to implement it as a class.
> Something like:
>
> class Maybe(object):
>
>     def __init__(self, value=None):
>         self.value = value
>
>     def value(self):
>         return self.value
>
> is a start but I'm not able to see how to make
>
> if Maybe():
>     print("nothing") # never prints
>
> but
>
> if Maybe({}):
>     print("yes a value") #always prints
>
> which is definitely the desired behaviour.
>
> I also think that it would be the first Python type introduced solely
> because of its typey properties, not because it provided any actual
> functionality, which might be considered unpythonic.
>
> Any comments?
>
> Thanks!
>
> - mulhern
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From raymond.hettinger at gmail.com Wed May 21 21:44:24 2014
From: raymond.hettinger at gmail.com (Raymond Hettinger)
Date: Wed, 21 May 2014 20:44:24 +0100
Subject: [Python-ideas] Python-ideas Digest, Vol 90, Issue 30
In-Reply-To: 
References: 
Message-ID: <0B902BFA-920E-4383-A09C-08A558168509@gmail.com>

On May 21, 2014, at 1:21 PM, python-ideas-request at python.org wrote:

> I propose that we add a way to completely disable the
> optimizer.

I think this opens a can of worms that is better left closed.

* We will have to start running tests both with and without the switch turned on for example (because you're exposing yet another way to run Python with different code).

* Over time, I expect that some of the functionality of the peepholer is going to be moved upstream into AST transformations, and you will have even less ability to switch something on-and-off.

* The code has been in place for over a decade and the tracker item has languished for years. That provides some evidence that the "need" here is very small.

* I sympathize with "there is an irritating dimple in coverage.py" but that hasn't actually impaired its usability beyond creating a curiosity. Using that as a reason to add a new CPython-only command-line switch seems like having the tail wag the dog.

* As the other implementations of Python continue to develop, I don't think we should tie their hands with respect to code generation.

* Ideally, the peepholer should be thought of as part of the code generation. As compilation improves over time, it should start to generate the same code as we're getting now. It probably isn't wise to expose the implementation detail that the constant folding and jump tweaks are done in a separate second pass.

* Mostly, I don't want to open a new crack in the Python veneer where people are switching on and off two different streams of code generation (currently, there is one way to do it). I can't fully articulate my instincts here, but I think we'll regret opening this door when we didn't have to.

That being said, I know how the politics of python-ideas works and I expect that my thoughts on the subject will quickly get buried by a discussion of which lettercode should be used for the command-line switch.

Hopefully, some readers will focus on the question of whether it is worth it. Others might look at ways to improve the existing code (without an off-switch) so that the continue-statement jump-to-jump shows up in your coverage tool.

IMO, adding a new command-line switch is a big deal (we should do it very infrequently, limit it to things with a big payoff, and think about whether there are any downsides). Personally, I don't see any big wins here and have a sense that there are downsides that would make us regret exposing alternate code generation.

Raymond

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From antony.lee at berkeley.edu Wed May 21 22:38:01 2014
From: antony.lee at berkeley.edu (Antony Lee)
Date: Wed, 21 May 2014 13:38:01 -0700
Subject: [Python-ideas] Another pathlib suggestion
Message-ID: 

Handling of Paths with multiple extensions is currently not so easy with pathlib.
Specifically, I don't think there is an easy way to go from "foo.tar.gz" to "foo.ext", because Path.with_suffix only replaces the last suffix. I would therefore like to suggest either 1/ add Path.replace_suffix, such that Path("foo.tar.gz").replace_suffix(".tar.gz", ".ext") == Path("foo.ext") (this would also provide extension-checking capabilities, raising ValueError if the first argument is not a valid suffix of the initial path); or 2/ add a second argument to Path.with_suffix, "n_to_strip" (although perhaps with a better name), defaulting to 1, such that Path("foo.tar.gz").with_suffix(".ext", 0) == Path("foo.tar.gz.ext") Path("foo.tar.gz").with_suffix(".ext", 1) == Path("foo.tar.ext") Path("foo.tar.gz").with_suffix(".ext", 2) == Path("foo.ext") # set n_to_strip to len(path.suffixes) for stripping all of them. Path("foo.tar.gz").with_suffix(".ext", 3) raises a ValueError. Best, Antony -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Wed May 21 23:14:41 2014 From: njs at pobox.com (Nathaniel Smith) Date: Wed, 21 May 2014 22:14:41 +0100 Subject: [Python-ideas] Python-ideas Digest, Vol 90, Issue 30 In-Reply-To: <0B902BFA-920E-4383-A09C-08A558168509@gmail.com> References: <0B902BFA-920E-4383-A09C-08A558168509@gmail.com> Message-ID: On Wed, May 21, 2014 at 8:44 PM, Raymond Hettinger wrote: > * I sympathize with "there is an irritating dimple in coverage.py" > but that hasn't actually impaired its usability beyond creating a > curiosity. Using that a reason to add a new CPython-only > command-line switch seems like having the tail wag the dog. I've certainly been frustrated by this wart in coverage.py's output -- if one uses a dev cycle where you constantly review every uncovered line to make sure that tests are doing what you want, then even a small number of spurious uncovered lines that appear and disappear based on the optimizer's whim can result in a lot of wasted time. (Not to mention the hours wasted the first time I ran into this, trying to figure out why my tests weren't working and writing new ones specifically to target the optimized-out line...) That said, I'm also sympathetic to your point. Isn't the real problem here that the peephole optimizer violates the first rule of optimization ("don't change semantics") by breaking sys.settrace? Couldn't we fix this directly? One approach might be to enhance co_lnotab (if anyone dares touch it) so that it can record that a peepholed jump instruction logically belongs to multiple *different* lines, and when we encounter such an instruction we call the trace function multiple times. Then the peephole optimizer just has to propagate line number information whenever it short-circuits a jump. Or perhaps it would be enough to add a dead-code optimization pass after the peephole optimizer, so that coverage.py can at least see that things like Ned's "continue" didn't actually generate any code. (This is suboptimal as well, since it will still cause coverage.py to produce somewhat confusing output, as if the "continue" line had a comment instead of real code -- but it'd still be better than the status quo.) -n -- Nathaniel J. 
Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org From solipsis at pitrou.net Wed May 21 23:24:14 2014 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 21 May 2014 23:24:14 +0200 Subject: [Python-ideas] Python-ideas Digest, Vol 90, Issue 30 References: <0B902BFA-920E-4383-A09C-08A558168509@gmail.com> Message-ID: <20140521232414.2720b5ed@fsol> Hi, On Wed, 21 May 2014 20:44:24 +0100 Raymond Hettinger wrote: > > I think this opens a can of worms that is better left closed. FWIW, I agree with Raymond's arguments here. Regards Antoine. From p.f.moore at gmail.com Thu May 22 00:17:59 2014 From: p.f.moore at gmail.com (Paul Moore) Date: Wed, 21 May 2014 23:17:59 +0100 Subject: [Python-ideas] Python-ideas Digest, Vol 90, Issue 30 In-Reply-To: <20140521232414.2720b5ed@fsol> References: <0B902BFA-920E-4383-A09C-08A558168509@gmail.com> <20140521232414.2720b5ed@fsol> Message-ID: On 21 May 2014 22:24, Antoine Pitrou wrote: > On Wed, 21 May 2014 20:44:24 +0100 > Raymond Hettinger > wrote: >> >> I think this opens a can of worms that is better left closed. > > FWIW, I agree with Raymond's arguments here. I tend to agree as well. It's a pretty specialised case, and presumably tools similar to coverage for languages like C manage to deal with the issue. Like Raymond, I can't quite explain my reservations, but it feels like this proposal leans towards overspecifying implementation details, in a way that will limit future development of the optimiser. Paul From trip at flowroute.com Thu May 22 00:30:38 2014 From: trip at flowroute.com (Trip Volpe) Date: Wed, 21 May 2014 15:30:38 -0700 Subject: [Python-ideas] Python-ideas Digest, Vol 90, Issue 30 In-Reply-To: References: <0B902BFA-920E-4383-A09C-08A558168509@gmail.com> Message-ID: (First, shouldn't this be in the "disable all peephole optimizations" thread? Raymond seems to have replied to the digest..!) On Wed, May 21, 2014 at 2:14 PM, Nathaniel Smith wrote: > Isn't the real problem here that the peephole optimizer violates the > first rule of optimization ("don't change semantics") by breaking > sys.settrace? Couldn't we fix this directly? I agree with this. Adding a command line flag to tinker with code generation may well be opening a can of worms, but "the peephole optimizer shouldn't change semantics" is a more compelling argument, although fixing it from that angle is obviously more involved. One problem is that functions like settrace() expose low-level details to the higher-level semantics. It's a fair question as to whether it should be considered kosher to expose implementation details like the peephole optimizer through such interfaces. I could get behind an implementation that hides the erasure of lines that are still (semantically) being executed, without disabling the peephole optimizer. - Trip On Wed, May 21, 2014 at 2:14 PM, Nathaniel Smith wrote: > On Wed, May 21, 2014 at 8:44 PM, Raymond Hettinger > wrote: > > * I sympathize with "there is an irritating dimple in coverage.py" > > but that hasn't actually impaired its usability beyond creating a > > curiosity. Using that a reason to add a new CPython-only > > command-line switch seems like having the tail wag the dog. 
> > I've certainly been frustrated by this wart in coverage.py's output -- > if one uses a dev cycle where you constantly review every uncovered > line to make sure that tests are doing what you want, then even a > small number of spurious uncovered lines that appear and disappear > based on the optimizer's whim can result in a lot of wasted time. (Not > to mention the hours wasted the first time I ran into this, trying to > figure out why my tests weren't working and writing new ones > specifically to target the optimized-out line...) > > That said, I'm also sympathetic to your point. > > Isn't the real problem here that the peephole optimizer violates the > first rule of optimization ("don't change semantics") by breaking > sys.settrace? Couldn't we fix this directly? > > One approach might be to enhance co_lnotab (if anyone dares touch it) > so that it can record that a peepholed jump instruction logically > belongs to multiple *different* lines, and when we encounter such an > instruction we call the trace function multiple times. Then the > peephole optimizer just has to propagate line number information > whenever it short-circuits a jump. > > Or perhaps it would be enough to add a dead-code optimization pass > after the peephole optimizer, so that coverage.py can at least see > that things like Ned's "continue" didn't actually generate any code. > (This is suboptimal as well, since it will still cause coverage.py to > produce somewhat confusing output, as if the "continue" line had a > comment instead of real code -- but it'd still be better than the > status quo.) > > -n > > -- > Nathaniel J. Smith > Postdoctoral researcher - Informatics - University of Edinburgh > http://vorpus.org > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ethan at stoneleaf.us Wed May 21 23:37:36 2014 From: ethan at stoneleaf.us (Ethan Furman) Date: Wed, 21 May 2014 14:37:36 -0700 Subject: [Python-ideas] Python-ideas Digest, Vol 90, Issue 30 In-Reply-To: <20140521232414.2720b5ed@fsol> References: <0B902BFA-920E-4383-A09C-08A558168509@gmail.com> <20140521232414.2720b5ed@fsol> Message-ID: <537D1CA0.7070106@stoneleaf.us> On 05/21/2014 02:24 PM, Antoine Pitrou wrote: > On Wed, 21 May 2014 20:44:24 +0100 Raymond Hettinger wrote: >> >> I think this opens a can of worms that is better left closed. > > FWIW, I agree with Raymond's arguments here. Wow, did a new star fire up somewhere? I also agree. :) -- ~Ethan~ From p.f.moore at gmail.com Thu May 22 00:47:24 2014 From: p.f.moore at gmail.com (Paul Moore) Date: Wed, 21 May 2014 23:47:24 +0100 Subject: [Python-ideas] Python-ideas Digest, Vol 90, Issue 30 In-Reply-To: References: <0B902BFA-920E-4383-A09C-08A558168509@gmail.com> Message-ID: On 21 May 2014 23:30, Trip Volpe wrote: > On Wed, May 21, 2014 at 2:14 PM, Nathaniel Smith wrote: >> Isn't the real problem here that the peephole optimizer violates the >> first rule of optimization ("don't change semantics") by breaking >> sys.settrace? Couldn't we fix this directly? > > I agree with this. Adding a command line flag to tinker with code generation > may well be opening a can of worms, but "the peephole optimizer shouldn't > change semantics" is a more compelling argument, although fixing it from > that angle is obviously more involved. 
One problem is that functions like > settrace() expose low-level details to the higher-level semantics. It's a > fair question as to whether it should be considered kosher to expose > implementation details like the peephole optimizer through such interfaces.

While I'm happy to be proved wrong with code, my instinct is that "making sys.settrace work" would likely be too complex to be practical.

In any case, as you say, it exposes low-level details, and I would personally consider "glitches" like this as implementation details. To put it another way, I don't consider the exact lines traced by sys.settrace to be part of the semantics of a program, any more than I consider the output of dis.dis to be. So in my view it is acceptable for the optimiser to change the lines that get traced in the way that coverage experienced.

Paul.

From ned at nedbatchelder.com Thu May 22 00:51:41 2014
From: ned at nedbatchelder.com (Ned Batchelder)
Date: Wed, 21 May 2014 18:51:41 -0400
Subject: [Python-ideas] Python-ideas Digest, Vol 90, Issue 30
In-Reply-To: <0B902BFA-920E-4383-A09C-08A558168509@gmail.com>
References: <0B902BFA-920E-4383-A09C-08A558168509@gmail.com>
Message-ID: <537D2DFD.6000504@nedbatchelder.com>

On 5/21/14 3:44 PM, Raymond Hettinger wrote:
>
> On May 21, 2014, at 1:21 PM, python-ideas-request at python.org wrote:
>
>> I propose that we add a way to completely disable the
>> optimizer.
>
> I think this opens a can of worms that is better left closed.
>
> * We will have to start running tests both with and without the switch
> turned on for example (because you're exposing yet another way to
> run Python with different code).

Yes, this could mean an increased testing burden. But that scales horizontally, and will not require a large amount of engineering work. Besides, what better way to test the optimizer?
>
> * Over time, I expect that some of the functionality of the peepholer
> is going to be moved upstream into AST transformations, and you will
> have even less ability to switch something on-and-off.

I'm perfectly happy to remove the word "peephole" from the feature. If we expect the set of optimizations to grow in the future, then we can expect that more cases of code analysis will be misled by optimizations. All the more reason to establish a way now that will disable all optimizations.
>
> * The code has been in place for over a decade and
> the tracker item has languished for years. That provides some
> evidence that the "need" here is very small.
>
> * I sympathize with "there is an irritating dimple in coverage.py"
> but that hasn't actually impaired its usability beyond creating a
> curiosity. Using that as a reason to add a new CPython-only
> command-line switch seems like having the tail wag the dog.

I don't think you should dismiss real users' concerns as a curiosity. We already have -X as a way to provide implementation-specific switches, I'm not sure why the CPython-only nature of this is an issue?
>
> * As the other implementations of Python continue to develop,
> I don't think we should tie their hands with respect to code
> generation.

This proposal only applies to CPython.
>
> * Ideally, the peepholer should be thought of as part of the code
> generation. As compilation improves over time, it should start
> to generate the same code as we're getting now. It probably
> isn't wise to expose the implementation detail that the constant
> folding and jump tweaks are done in a separate second pass.

I'm happy to remove the word "peephole".
I think a way to disable optimization is useful. I've heard the concern from a number of coverage.py users. If as we all think, optimizations will expand in CPython, then the number of mis-diagnosed code problems will grow. --Ned. > > * Mostly, I don't want to open a new crack in the Python veneer > where people are switching on and off two different streams of > code generation (currently, there is one way to do it). I can't > fully articulate my instincts here, but I think we'll regret opening > this door when we didn't have to. > > That being said, I know how the politics of python-ideas works > and I expect that my thoughts on the subject will quickly get > buried by a discussion of which lettercode should be used for the > command-line switch. > > Hopefully, some readers will focus on the question of whether > it is worth it. Others might look at ways to improve the existing > code (without an off-switch) so that the continue-statement > jump-to-jump shows-up in your coverage tool. > > IMO, adding a new command-line switch is a big deal (we should > do it very infrequently, limit it to things with a big payoff, and > think about whether there are any downsides). Personally, I don't > see any big wins here and have a sense that there are downsides > that would make us regret exposing alternate code generation. > > > Raymond > > > > > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From greg.ewing at canterbury.ac.nz Thu May 22 00:50:47 2014 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 22 May 2014 10:50:47 +1200 Subject: [Python-ideas] Python-ideas Digest, Vol 90, Issue 30 In-Reply-To: <0B902BFA-920E-4383-A09C-08A558168509@gmail.com> References: <0B902BFA-920E-4383-A09C-08A558168509@gmail.com> Message-ID: <537D2DC7.5030504@canterbury.ac.nz> Raymond Hettinger wrote: > * I sympathize with "there is an irritating dimple in coverage.py" > but that hasn't actually impaired its usability beyond creating a > curiosity. Another way to address this would be to make coverage.py smart enough to understand when a source line has been optimised away and always report it as executed. -- Greg From ned at nedbatchelder.com Thu May 22 00:54:13 2014 From: ned at nedbatchelder.com (Ned Batchelder) Date: Wed, 21 May 2014 18:54:13 -0400 Subject: [Python-ideas] Python-ideas Digest, Vol 90, Issue 30 In-Reply-To: References: <0B902BFA-920E-4383-A09C-08A558168509@gmail.com> Message-ID: <537D2E95.8060806@nedbatchelder.com> On 5/21/14 6:47 PM, Paul Moore wrote: > On 21 May 2014 23:30, Trip Volpe wrote: >> On Wed, May 21, 2014 at 2:14 PM, Nathaniel Smith wrote: >>> Isn't the real problem here that the peephole optimizer violates the >>> first rule of optimization ("don't change semantics") by breaking >>> sys.settrace? Couldn't we fix this directly? >> I agree with this. Adding a command line flag to tinker with code generation >> may well be opening a can of worms, but "the peephole optimizer shouldn't >> change semantics" is a more compelling argument, although fixing it from >> that angle is obviously more involved. One problem is that functions like >> settrace() expose low-level details to the higher-level semantics. 
It's a >> fair question as to whether it should be considered kosher to expose >> implementation details like the peephole optimizer through such interfaces. > While I'm happy to be proved wrong with code, my instinct is that > "making sys.settrace work" would likely be too complex to be > practical. I absolutely agree that "fixing settrace" is likely to be 100x more complex than disabling the optimizer. > > In any case, as you say, it exposes low-level details, and I would > personally consider "glitches" like this as implementation details. I also agree that the exact lines reported by settrace are to some extent an implementation detail. All I'm asking for is a way to make the implementation match the expectations. > To > put it another way, I don't consider the exact lines traced by > sys.settrace to be part of the semantics of a program, any more than I > consider the output of dis.dis to be. So in my view it is acceptable > for the optimiser to change the lines that get traced in the way that > coverage experienced. > > Paul. > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ From ned at nedbatchelder.com Thu May 22 00:59:38 2014 From: ned at nedbatchelder.com (Ned Batchelder) Date: Wed, 21 May 2014 18:59:38 -0400 Subject: [Python-ideas] Python-ideas Digest, Vol 90, Issue 30 In-Reply-To: References: <0B902BFA-920E-4383-A09C-08A558168509@gmail.com> <20140521232414.2720b5ed@fsol> Message-ID: <537D2FDA.6030907@nedbatchelder.com> On 5/21/14 6:17 PM, Paul Moore wrote: > On 21 May 2014 22:24, Antoine Pitrou wrote: >> On Wed, 21 May 2014 20:44:24 +0100 >> Raymond Hettinger >> wrote: >>> I think this opens a can of worms that is better left closed. >> FWIW, I agree with Raymond's arguments here. > I tend to agree as well. It's a pretty specialised case, and > presumably tools similar to coverage for languages like C manage to > deal with the issue. Yes, C and its tools have a way to deal with this. Are you familiar with the -O0 switch? It disables optimization. BTW: As C programmers know, if you want to debug your program, you use the -O0 switch. Debugging is about reasoning about the code rather than executing it. Trying to debug optimized C code is very difficult, because nothing matches your expectations. If, as others in this thread have said, we expect the set of optimizations to grow, the need for an off switch will become greater, even to debug the code. > > Like Raymond, I can't quite explain my reservations, but it feels like > this proposal leans towards overspecifying implementation details, in > a way that will limit future development of the optimiser. If by implementation details, you mean the word "peephole", then let's remove it, and simply have a switch that disables all optimization. Rather than limiting the future of the optimizer, it will provide an escape hatch for people who would rather not have the optimizer's effects. --Ned. 
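
As a concrete illustration of the behavior under discussion, here is a minimal sketch that uses only the standard dis module. It is not part of the proposal itself, and exact opcodes and offsets vary across CPython versions:

import dis

def f():
    return 1 + 2   # the peephole optimizer folds this to the constant 3

# On current CPython this prints a single LOAD_CONST with the value 3:
# the addition never happens at run time. Under a hypothetical
# "no optimization" switch like the one proposed in this thread, one
# would instead expect LOAD_CONST 1, LOAD_CONST 2, BINARY_ADD.
dis.dis(f)

This is the same kind of divergence between source and generated code, on a smaller scale, as the unreachable continue statement shown earlier in the thread.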
>
> Paul
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/

From ned at nedbatchelder.com Thu May 22 01:04:58 2014
From: ned at nedbatchelder.com (Ned Batchelder)
Date: Wed, 21 May 2014 19:04:58 -0400
Subject: [Python-ideas] Python-ideas Digest, Vol 90, Issue 30
In-Reply-To: <537D1CA0.7070106@stoneleaf.us>
References: <0B902BFA-920E-4383-A09C-08A558168509@gmail.com> <20140521232414.2720b5ed@fsol> <537D1CA0.7070106@stoneleaf.us>
Message-ID: <537D311A.2010101@nedbatchelder.com>

On 5/21/14 5:37 PM, Ethan Furman wrote:
> On 05/21/2014 02:24 PM, Antoine Pitrou wrote:
>> On Wed, 21 May 2014 20:44:24 +0100 Raymond Hettinger wrote:
>>>
>>> I think this opens a can of worms that is better left closed.
>>
>> FWIW, I agree with Raymond's arguments here.
>
> Wow, did a new star fire up somewhere? I also agree. :)

I'm not sure what can of worms you are imagining. Let's look to our experience with C compilers. They have a switch to disable optimization. What trouble has that brought?

When I think of problems with optimizers in C compilers, I think of incorrect or buggy optimizations. I can't think of something that has gone wrong because there was a switch to turn it off.

People in this thread have contrasted this proposal with an apparent desire to expand the set of optimizations performed. It seems to me that the complexity and danger lie in expanded optimizations, not disabled ones.

--Ned.
>
> --
> ~Ethan~
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/

From ned at nedbatchelder.com Thu May 22 01:09:33 2014
From: ned at nedbatchelder.com (Ned Batchelder)
Date: Wed, 21 May 2014 19:09:33 -0400
Subject: [Python-ideas] Python-ideas Digest, Vol 90, Issue 30
In-Reply-To: <537D2DC7.5030504@canterbury.ac.nz>
References: <0B902BFA-920E-4383-A09C-08A558168509@gmail.com> <537D2DC7.5030504@canterbury.ac.nz>
Message-ID: <537D322D.7030909@nedbatchelder.com>

On 5/21/14 6:50 PM, Greg Ewing wrote:
> Raymond Hettinger wrote:
>> * I sympathize with "there is an irritating dimple in coverage.py"
>> but that hasn't actually impaired its usability beyond creating a
>> curiosity.
>
> Another way to address this would be to make coverage.py
> smart enough to understand when a source line has been
> optimised away and always report it as executed.
>
Do you have any ideas about how that could possibly work? Reverse-engineering optimized code is difficult if not impossible. I'm open to concrete ideas though.

--Ned.

From donald at stufft.io Thu May 22 01:18:31 2014
From: donald at stufft.io (Donald Stufft)
Date: Wed, 21 May 2014 19:18:31 -0400
Subject: [Python-ideas] Python-ideas Digest, Vol 90, Issue 30
In-Reply-To: <537D2DFD.6000504@nedbatchelder.com>
References: <0B902BFA-920E-4383-A09C-08A558168509@gmail.com> <537D2DFD.6000504@nedbatchelder.com>
Message-ID: 

On May 21, 2014, at 6:51 PM, Ned Batchelder wrote:
>> * I sympathize with "there is an irritating dimple in coverage.py"
>> but that hasn't actually impaired its usability beyond creating a
>> curiosity. Using that as a reason to add a new CPython-only
>> command-line switch seems like having the tail wag the dog.
> I don't think you should dismiss real users' concerns as a curiosity.
We already have -X as a way to provide implementation-specific switches, I'm not sure why the CPython-only nature of this is an issue?

I think it has impacted its usability. I've certainly burned some amount of time trying to figure out why an optimized line was showing up as uncovered.

-----------------
Donald Stufft
PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 801 bytes
Desc: Message signed with OpenPGP using GPGMail
URL: 

From victor.stinner at gmail.com Thu May 22 01:22:49 2014
From: victor.stinner at gmail.com (Victor Stinner)
Date: Thu, 22 May 2014 01:22:49 +0200
Subject: [Python-ideas] Make Python code read-only
In-Reply-To: <20140521014203.GE10355@ando>
References: <20140521014203.GE10355@ando>
Message-ID: 

2014-05-21 3:42 GMT+02:00 Steven D'Aprano :
> - Obviously the "best" (most obvious) solution would be if there was a
> way to unlock modules on the fly, but Victor suggests that's hard.

The problem is to react to such an event. If a function has a specialized version for a set of read-only objects, the specialized version should not be used anymore.

Ok, I create a new branch "readonly_cb" branch where it is possible to again make modules, types and functions modifiable. *But* when the readonly state is modified, a callback is called. It can be used to disable optimizations relying on it.

So all issues listed in this thread are gone. It's possible again to use monkey-patching, lazy initialization of module variables and class variables, etc.

I hope that such a callback is enough to make optimizations efficient.

Victor

From njs at pobox.com Thu May 22 01:09:51 2014
From: njs at pobox.com (Nathaniel Smith)
Date: Thu, 22 May 2014 00:09:51 +0100
Subject: [Python-ideas] Python-ideas Digest, Vol 90, Issue 30
In-Reply-To: <537D2DC7.5030504@canterbury.ac.nz>
References: <0B902BFA-920E-4383-A09C-08A558168509@gmail.com> <537D2DC7.5030504@canterbury.ac.nz>
Message-ID: 

On Wed, May 21, 2014 at 11:50 PM, Greg Ewing wrote:
> Raymond Hettinger wrote:
>>
>> * I sympathize with "there is an irritating dimple in coverage.py"
>> but that hasn't actually impaired its usability beyond creating a
>> curiosity.
>
> Another way to address this would be to make coverage.py
> smart enough to understand when a source line has been
> optimised away and always report it as executed.

AFAICT the only ways to make coverage.py "smart enough" would be:

1) Teach coverage.py to perform a full (sound) reachability analysis on bytecode.

2) Teach coverage.py to notice when a jump instruction doesn't go where you might expect it to based on a naive reading of the source code, and then reverse-engineer from this what sequence of jump instructions must have been merged to produce the one we observe. I guess in practice this probably would require carrying around a patched copy of the full compiler code from every Python release.

The problem here is that the Python compiler is throwing away information that only it has. Asking coverage.py to reconstruct that without help from the compiler isn't reasonable IMO.

-n

--
Nathaniel J. Smith
Postdoctoral researcher - Informatics - University of Edinburgh
http://vorpus.org
From steve at pearwood.info Thu May 22 01:50:31 2014
From: steve at pearwood.info (Steven D'Aprano)
Date: Thu, 22 May 2014 09:50:31 +1000
Subject: [Python-ideas] Maybe/Option builtin
In-Reply-To: 
References: 
Message-ID: <20140521235031.GH10355@ando>

On Wed, May 21, 2014 at 10:49:36AM -0400, mulhern wrote:
> I feel that a Maybe/Option type, analogous to the types found in Haskell or
> OCaml would actually be useful in Python. The value analogous to the None
> constructor should be Python's None.
>
> Obviously, it wouldn't give the type-checking benefits that it gives in
> statically checked languages, but every use of a Maybe object as if it were
> the contained object would give an error, alerting the user to the fact
> that None is a possibility and allowing them to address the problem sooner
> rather than later.

Since this is a Python list, you shouldn't take it for granted that we will be familiar with Haskell or Ocaml, nor expect us to go off and study those languages well enough to understand what a Maybe object is used for or how it will fit into Python's execution model.

I'm no expert on either of those two languages, but it seems to me that the only point of Maybe is to allow the static type checker to distinguish between (for example) "this function returns a string" and "this function returns either a string or None". Since Python doesn't do static, compile-time type checks, I'm having difficulty in seeing what would be the point of Maybe in Python.

As I said, I'm not an expert in Haskell, but I don't think this proposal is helpful, and certainly not helpful enough to make it a built-in or standard part of the language. If I have missed some benefit of Maybe, please explain how it would apply in Python terms.

> I feel that it would be kind of tricky to implement it as a class.
> Something like:
>
> class Maybe(object):
>     def __init__(self, value=None):
>         self.value = value
>     def value(self):
>         return self.value
>
> is a start but I'm not able to see how to make
>
> if Maybe():
>     print("nothing") # never prints

In Python 2, define __nonzero__. In Python 3, define __bool__.

https://docs.python.org/2/reference/datamodel.html#object.__nonzero__
https://docs.python.org/3/reference/datamodel.html#object.__bool__

> but
>
> if Maybe({}):
>     print("yes a value") #always prints
>
> which is definitely the desired behaviour.

Not to me it isn't. This goes against the standard Python convention that empty containers are falsey. Since Maybe({}) wraps an empty dict, it should be considered a false value.

--
Steven

From haoyi.sg at gmail.com Thu May 22 01:53:01 2014
From: haoyi.sg at gmail.com (Haoyi Li)
Date: Wed, 21 May 2014 16:53:01 -0700
Subject: [Python-ideas] Maybe/Option builtin
In-Reply-To: <20140521235031.GH10355@ando>
References: <20140521235031.GH10355@ando>
Message-ID: 

I've been using [x] and [] for my Option type. It works great, can even be chained monadically using for comprehensions =)

On Wed, May 21, 2014 at 4:50 PM, Steven D'Aprano wrote:

> On Wed, May 21, 2014 at 10:49:36AM -0400, mulhern wrote:
> > I feel that a Maybe/Option type, analogous to the types found in Haskell
> or
> > OCaml would actually be useful in Python. The value analogous to the None
> > constructor should be Python's None.
> > > > Obviously, it wouldn't give the type-checking benefits that it gives in > > statically checked languages, but every use of a Maybe object as if it > were > > the contained object would give an error, alerting the user to the fact > > that None is a possibility and allowing them to address the problem > sooner > > rather than later. > > Since this is a Python list, you shouldn't take it for granted that we > will be familiar with Haskell or Ocaml, nor expect us to go off and > study those languages well enough to understand what a Maybe object is > used for or how it will fit into Python's execution model. > > I'm no expect on either of those two languages, but it seems to me that > that the only point of Maybe is to allow the static type checker to > distinguish between (for example) "this function returns a string" and > "this function returns either a string or None". Since Python doesn't do > static, compile-time type checks, I'm having difficulty in seeing what > would be the point of Maybe in Python. > > As I said, I'm not an expert in Haskell, but I don't think this proposal > is helpful, and certainly not helpful enough to make it a built-in or > standard part of the language. If I have missed some benefit of Maybe, > please explain how it would apply in Python terms. > > > > > I feel that it would be kind of tricky to implement it as a class. > > Something like: > > > > class Maybe(object): > > def __init__(self, value=None): > > self.value = value > > def value(self): > > return self.value > > > > is a start but I'm not able to see how to make > > > > if Maybe(): > > print("nothing") # never prints > > In Python 2, define __nonzero__. In Python 3, define __bool__. > > https://docs.python.org/2/reference/datamodel.html#object.__nonzero__ > https://docs.python.org/3/reference/datamodel.html#object.__bool__ > > > > but > > > > if Maybe({}): > > print("yes a value") #always prints > > > > which is definitely the desired behaviour. > > Not to me it isn't. This goes against the standard Python convention > that empty containers are falsey. Since Maybe({}) wraps an empty dict, > it should be considered a false value. > > > > > -- > Steven > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Thu May 22 02:01:40 2014 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 22 May 2014 10:01:40 +1000 Subject: [Python-ideas] Make Python code read-only In-Reply-To: <1d4b9976723c4013addf2c8ec3c77f0f@BLUPR03MB389.namprd03.prod.outlook.com> References: <20140521014203.GE10355@ando> <537CB409.5000104@stoneleaf.us> <1d4b9976723c4013addf2c8ec3c77f0f@BLUPR03MB389.namprd03.prod.outlook.com> Message-ID: On 22 May 2014 00:56, "Steve Dower" wrote: > > Another +inf from me. > > Mind if I quote you on this next time I'm trying to convince C# developers to take Python seriously? :) Sure - I expect your conversations with C# devs resemble some of mine with Java devs :) Cheers, Nick. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ncoghlan at gmail.com Thu May 22 02:07:30 2014 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 22 May 2014 10:07:30 +1000 Subject: [Python-ideas] Disable all peephole optimizations In-Reply-To: <537CB6ED.2050906@nedbatchelder.com> References: <537C888D.7060903@nedbatchelder.com> <20140521121319.GG10355@ando> <537C9A5E.1060502@sotecware.net> <537CB6ED.2050906@nedbatchelder.com> Message-ID: On 22 May 2014 00:24, "Ned Batchelder" wrote: > > On 5/21/14 8:21 AM, Jonas Wielicki wrote: >> >> On 21.05.2014 14:13, Steven D'Aprano wrote: >>> >>> > On Wed, May 21, 2014 at 07:05:49AM -0400, Ned Batchelder wrote: >>>> >>>> >> ** The problem >>>> >> >>>> >> A long-standing problem with CPython is that the peephole optimizer >>>> >> cannot be completely disabled. Normally, peephole optimization is a >>>> >> good thing, it improves execution speed. But in some situations, like >>>> >> coverage testing, it's more important to be able to reason about the >>>> >> code's execution. I propose that we add a way to completely disable the >>>> >> optimizer. >>> >>> > >>> > I'm not sure whether this is an argument for or against your proposal, >>> > but the continue statement shown below is *not* dead code and should not >>> > be optimized out. The assert fails if you remove the continue statement. >>> > >>> > I don't have 3.4 on this machine to test with, but using 3.3, I can see >>> > no evidence that `continue` is optimized away. >> >> The logical continue is still there -- what happens is that the >> optimizer rewrites the `else` jump at the preceding `if` condition, >> which would normally point at the `continue` statement, to the beginning >> of the loop, because it would be a jump (to the continue) to a jump (to >> the for loop header). >> >> Thus, the actual continue statement is not reached, but logically the >> code does the same, because the only way continue would have been >> reached was transformed to a continue itself. 
>> > To make the details more explicit, here is the source again, and the disassembled code, with the original source interspersed: > >> a = b = c = 0 >> for n in range(100): >> if n % 2: >> if n % 4: >> a += 1 >> continue >> else: >> b += 1 >> c += 1 >> assert a == 50 and b == 50 and c == 50 > > Disassembled (Python 3.4, but the same effect is visible in 2.7, 3.3, etc): > > > a = b = c = 0 > 1 0 LOAD_CONST 0 (0) > 3 DUP_TOP > 4 STORE_NAME 0 (a) > 7 DUP_TOP > 8 STORE_NAME 1 (b) > 11 STORE_NAME 2 (c) > > for n in range(100): > 2 14 SETUP_LOOP 79 (to 96) > 17 LOAD_NAME 3 (range) > 20 LOAD_CONST 1 (100) > 23 CALL_FUNCTION 1 (1 positional, 0 keyword pair) > 26 GET_ITER > >> 27 FOR_ITER 65 (to 95) > 30 STORE_NAME 4 (n) > > if n % 2: > 3 33 LOAD_NAME 4 (n) > 36 LOAD_CONST 2 (2) > 39 BINARY_MODULO > 40 POP_JUMP_IF_FALSE 72 > > if n % 4: > 4 43 LOAD_NAME 4 (n) > 46 LOAD_CONST 3 (4) > 49 BINARY_MODULO > 50 POP_JUMP_IF_FALSE 27 > > a += 1 > 5 53 LOAD_NAME 0 (a) > 56 LOAD_CONST 4 (1) > 59 INPLACE_ADD > 60 STORE_NAME 0 (a) > 63 JUMP_ABSOLUTE 27 > > continue > 6 66 JUMP_ABSOLUTE 27 > 69 JUMP_FORWARD 10 (to 82) > > b += 1 > 8 >> 72 LOAD_NAME 1 (b) > 75 LOAD_CONST 4 (1) > 78 INPLACE_ADD > 79 STORE_NAME 1 (b) > > c += 1 > 9 >> 82 LOAD_NAME 2 (c) > 85 LOAD_CONST 4 (1) > 88 INPLACE_ADD > 89 STORE_NAME 2 (c) > 92 JUMP_ABSOLUTE 27 > >> 95 POP_BLOCK > > > assert a == 50 and b == 50 and c == 50 > 10 >> 96 LOAD_NAME 0 (a) > 99 LOAD_CONST 5 (50) > 102 COMPARE_OP 2 (==) > 105 POP_JUMP_IF_FALSE 132 > 108 LOAD_NAME 1 (b) > 111 LOAD_CONST 5 (50) > 114 COMPARE_OP 2 (==) > 117 POP_JUMP_IF_FALSE 132 > 120 LOAD_NAME 2 (c) > 123 LOAD_CONST 5 (50) > 126 COMPARE_OP 2 (==) > 129 POP_JUMP_IF_TRUE 138 > >> 132 LOAD_GLOBAL 5 (AssertionError) > 135 RAISE_VARARGS 1 > >> 138 LOAD_CONST 6 (None) > 141 RETURN_VALUE > > Notice that line 6 (the continue) is unreachable, because the else-jump from line 4 has been turned into a jump to bytecode offset 27 (the for loop), and the end of line 5 has also been turned into a jump to 27, rather than letting it flow to line 6. So line 6 still exists in the bytecode, but is never executed, leading tracing tools to indicate that line 6 is never executed. So isn't this just a bug in the dead code elimination? Fixing that (so there's no bytecode behind that line and coverage tools can know it has been optimised out) sounds better than adding an obscure config option. Potentially less risky would be to provide a utility in the dis module to flag such lines after the fact. Cheers, Nick. > > --Ned. > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From dw+python-ideas at hmmz.org Thu May 22 02:10:39 2014 From: dw+python-ideas at hmmz.org (dw+python-ideas at hmmz.org) Date: Thu, 22 May 2014 00:10:39 +0000 Subject: [Python-ideas] Python-ideas Digest, Vol 90, Issue 30 In-Reply-To: <537D311A.2010101@nedbatchelder.com> References: <0B902BFA-920E-4383-A09C-08A558168509@gmail.com> <20140521232414.2720b5ed@fsol> <537D1CA0.7070106@stoneleaf.us> <537D311A.2010101@nedbatchelder.com> Message-ID: <20140522001038.GB15946@k2> On Wed, May 21, 2014 at 07:04:58PM -0400, Ned Batchelder wrote: > I can't think of something that has gone wrong because there was a > switch to turn it off. > I'm not sure what can of worms you are imagining. 
Let's look to our > experience with C compilers. They have a switch to disable optimization. > What trouble has that brought? Are you serious? Somehow I'm reminded of the funroll-loops.info Gentoo parody site. As others mention, there is a difficult to quantify, but very real non-zero cost in introducing new major execution modes. > When I think of problems with optimizers in C compilers, I think of > incorrect or buggy optimizations. Sure, it if were still the early 90s. Most optimization bugs come from inexperienced developers relying on undefined behaviour of one form or another, and Python doesn't suffer from UB quite the way C does. > People in this thread have contrasted this proposal with an apparent desire > to expand the set of optimizations performed. It seems to me that the > complexity and danger lie in expanded optimizations, not disabled ones. Agreed, and so I'd suggest a better fix would be removing the peephole optimizer, for the little benefit that it offers, if it could be shown that it really truly does hinder peoples' comprehension of Python. It seems the proposed feature is all about avoiding saying "oh, don't worry about that for the moment" while teaching, assuming the question comes up at all. Adding another special case to disable a minor performance improvement seems pointless when the implementation is slow regardless, kind of along the same lines as adding another -O or -OO flag, and we all know how useful they ended up being. If there really was a problem here, it seems preferable to just remove the optimizer entirely and find more general ways to fix performance without creating a mess. David From jeanpierreda at gmail.com Thu May 22 02:11:19 2014 From: jeanpierreda at gmail.com (Devin Jeanpierre) Date: Wed, 21 May 2014 17:11:19 -0700 Subject: [Python-ideas] Maybe/Option builtin In-Reply-To: <20140521235031.GH10355@ando> References: <20140521235031.GH10355@ando> Message-ID: On Wed, May 21, 2014 at 4:50 PM, Steven D'Aprano wrote: > On Wed, May 21, 2014 at 10:49:36AM -0400, mulhern wrote: >> I feel that a Maybe/Option type, analogous to the types found in Haskell or >> OCaml would actually be useful in Python. The value analogous to the None >> constructor should be Python's None. >> >> Obviously, it wouldn't give the type-checking benefits that it gives in >> statically checked languages, but every use of a Maybe object as if it were >> the contained object would give an error, alerting the user to the fact >> that None is a possibility and allowing them to address the problem sooner >> rather than later. > > Since this is a Python list, you shouldn't take it for granted that we > will be familiar with Haskell or Ocaml, nor expect us to go off and > study those languages well enough to understand what a Maybe object is > used for or how it will fit into Python's execution model. > > I'm no expect on either of those two languages, but it seems to me that > that the only point of Maybe is to allow the static type checker to > distinguish between (for example) "this function returns a string" and > "this function returns either a string or None". Since Python doesn't do > static, compile-time type checks, I'm having difficulty in seeing what > would be the point of Maybe in Python. Python has no way to represent "I have a value, and that value is None". e.g. what is the difference between x.get('a') and x.get('b') for x == {'a': None} ? Otherwise, agree 100% . And anyway, retrofitting this into Python can't really work. 
--
Devin

> As I said, I'm not an expert in Haskell, but I don't think this proposal
> is helpful, and certainly not helpful enough to make it a built-in or
> standard part of the language. If I have missed some benefit of Maybe,
> please explain how it would apply in Python terms.
>
>
> >> I feel that it would be kind of tricky to implement it as a class.
> >> Something like:
> >>
> >> class Maybe(object):
> >>     def __init__(self, value=None):
> >>         self.value = value
> >>     def value(self):
> >>         return self.value
> >>
> >> is a start but I'm not able to see how to make
> >>
> >> if Maybe():
> >>     print("nothing") # never prints
> >
> > In Python 2, define __nonzero__. In Python 3, define __bool__.
> >
> > https://docs.python.org/2/reference/datamodel.html#object.__nonzero__
> > https://docs.python.org/3/reference/datamodel.html#object.__bool__
> >
> >> but
> >>
> >> if Maybe({}):
> >>     print("yes a value") #always prints
> >>
> >> which is definitely the desired behaviour.
> >
> > Not to me it isn't. This goes against the standard Python convention
> > that empty containers are falsey. Since Maybe({}) wraps an empty dict,
> > it should be considered a false value.
> >
> > --
> > Steven
> > _______________________________________________
> > Python-ideas mailing list
> > Python-ideas at python.org
> > https://mail.python.org/mailman/listinfo/python-ideas
> > Code of Conduct: http://python.org/psf/codeofconduct/
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/

From amber.yust at gmail.com Thu May 22 02:15:15 2014
From: amber.yust at gmail.com (Amber Yust)
Date: Wed, 21 May 2014 17:15:15 -0700
Subject: [Python-ideas] Maybe/Option builtin
In-Reply-To: 
References: <20140521235031.GH10355@ando>
Message-ID: 

If you care about None as a value, you specify a different default.

NO_VALUE = object()
foo = bar.get("baz", NO_VALUE)
if foo is NO_VALUE:
    # ....

On May 21, 2014 5:12 PM, "Devin Jeanpierre" wrote:

> On Wed, May 21, 2014 at 4:50 PM, Steven D'Aprano wrote:
> > On Wed, May 21, 2014 at 10:49:36AM -0400, mulhern wrote:
> >> I feel that a Maybe/Option type, analogous to the types found in
> Haskell or
> >> OCaml would actually be useful in Python. The value analogous to the
> None
> >> constructor should be Python's None.
> >>
> >> Obviously, it wouldn't give the type-checking benefits that it gives in
> >> statically checked languages, but every use of a Maybe object as if it
> were
> >> the contained object would give an error, alerting the user to the fact
> >> that None is a possibility and allowing them to address the problem
> sooner
> >> rather than later.
> >
> > Since this is a Python list, you shouldn't take it for granted that we
> > will be familiar with Haskell or Ocaml, nor expect us to go off and
> > study those languages well enough to understand what a Maybe object is
> > used for or how it will fit into Python's execution model.
> >
> > I'm no expert on either of those two languages, but it seems to me that
> > the only point of Maybe is to allow the static type checker to
> > distinguish between (for example) "this function returns a string" and
> > "this function returns either a string or None". Since Python doesn't do
> > static, compile-time type checks, I'm having difficulty in seeing what
> > would be the point of Maybe in Python.
>
> Python has no way to represent "I have a value, and that value is None".
>
> e.g. what is the difference between x.get('a') and x.get('b') for x ==
> {'a': None} ?
>
> Otherwise, agree 100% . And anyway, retrofitting this into Python
> can't really work.
>
> --
> Devin
>
> > As I said, I'm not an expert in Haskell, but I don't think this proposal
> > is helpful, and certainly not helpful enough to make it a built-in or
> > standard part of the language. If I have missed some benefit of Maybe,
> > please explain how it would apply in Python terms.
> >
> >
> > >> I feel that it would be kind of tricky to implement it as a class.
> > >> Something like:
> > >>
> > >> class Maybe(object):
> > >>     def __init__(self, value=None):
> > >>         self.value = value
> > >>     def value(self):
> > >>         return self.value
> > >>
> > >> is a start but I'm not able to see how to make
> > >>
> > >> if Maybe():
> > >>     print("nothing") # never prints
> > >
> > > In Python 2, define __nonzero__. In Python 3, define __bool__.
> > >
> > > https://docs.python.org/2/reference/datamodel.html#object.__nonzero__
> > > https://docs.python.org/3/reference/datamodel.html#object.__bool__
> > >
> > >> but
> > >>
> > >> if Maybe({}):
> > >>     print("yes a value") #always prints
> > >>
> > >> which is definitely the desired behaviour.
> > >
> > > Not to me it isn't. This goes against the standard Python convention
> > > that empty containers are falsey. Since Maybe({}) wraps an empty dict,
> > > it should be considered a false value.
> > >
> > > --
> > > Steven
> > > _______________________________________________
> > > Python-ideas mailing list
> > > Python-ideas at python.org
> > > https://mail.python.org/mailman/listinfo/python-ideas
> > > Code of Conduct: http://python.org/psf/codeofconduct/
> > _______________________________________________
> > Python-ideas mailing list
> > Python-ideas at python.org
> > https://mail.python.org/mailman/listinfo/python-ideas
> > Code of Conduct: http://python.org/psf/codeofconduct/
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From ericsnowcurrently at gmail.com Thu May 22 02:17:18 2014
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Wed, 21 May 2014 18:17:18 -0600
Subject: [Python-ideas] Python-ideas Digest, Vol 90, Issue 30
In-Reply-To: <537D2DFD.6000504@nedbatchelder.com>
References: <0B902BFA-920E-4383-A09C-08A558168509@gmail.com> <537D2DFD.6000504@nedbatchelder.com>
Message-ID: 

On Wed, May 21, 2014 at 4:51 PM, Ned Batchelder wrote:
> On 5/21/14 3:44 PM, Raymond Hettinger wrote:
>> I think this opens a can of worms that is better left closed.
>>
>> * We will have to start running tests both with and without the switch
>> turned on for example (because you're exposing yet another way to
>> run Python with different code).
>
> Yes, this could mean an increased testing burden. But that scales
> horizontally, and will not require a large amount of engineering work.
> Besides, what better way to test the optimizer?

I buy that to an extent. It would definitely be helpful when adding or changing optimizations, particularly to identify the impact of changes both in semantics and performance. However, work on optimizations isn't too common.

Aside from direct work on optimizations, optimization-free testing could be useful for identifying optimizer-related bugs (which I expect are quite rare). However, that doesn't add a lot of benefit over a normal buildbot run considering that each run has few changes it is testing.

Having said all that, I think it would still be worth testing with and without optimizations. Unless the optimizations are platform-specific, would we need more than one buildbot running with optimizations turned off?
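
A minimal sketch of what such a with-and-without comparison could look like, assuming the PYTHONPEEPHOLE=0 switch proposed earlier in this thread existed (it does not exist in any released CPython, so today the two runs produce identical output):

import os
import subprocess
import sys

SNIPPET = "import dis\ndef f(): return 1 + 2\ndis.dis(f)"

def bytecode_listing(disable_optimizer):
    env = dict(os.environ)
    if disable_optimizer:
        env["PYTHONPEEPHOLE"] = "0"   # hypothetical switch from this thread
    return subprocess.check_output([sys.executable, "-c", SNIPPET], env=env)

# With the proposed switch, the second listing would show the unfolded
# LOAD_CONST 1, LOAD_CONST 2, BINARY_ADD sequence instead of the folded
# LOAD_CONST 3, and a test could assert on that difference.
print(bytecode_listing(False) == bytecode_listing(True))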
>> * Over time, I expect that some of the functionality of the peepholer
>> is going to be moved upstream into AST transformations, where you will
>> have even less ability to switch something on-and-off.
>
> I'm perfectly happy to remove the word "peephole" from the feature. If we
> expect the set of optimizations to grow in the future, then we can expect
> that more cases of code analysis will be misled by optimizations. All the
> more reason to establish a way now that will disable all optimizations.

While the use-case is very specific, I think it's a valid motivator
for a means of disabling all optimizations, particularly if disabling
optimizations is isolated to a very focused location as you've
indicated.

The big question then is the impact on implementing optimizations (in
general) in the future. There has been talk of AST-based
optimizations. Raymond indicates that this makes it harder to
conditionally optimize. So how much harder would it make this future
optimization work? Is that a premature optimization?

>> * I sympathize with "there is an irritating dimple in coverage.py"
>> but that hasn't actually impaired its usability beyond creating a
>> curiosity. Using that as a reason to add a new CPython-only
>> command-line switch seems like having the tail wag the dog.
>
> I don't think you should dismiss real users' concerns as a curiosity. We
> already have -X as a way to provide implementation-specific switches, I'm
> not sure why the CPython-only nature of this is an issue?

If optimizations can break coverage tools when run on other Python
implementations, does that make a case for a more general command-line
option? Or is it just a matter of CPython's optimizations behaving
badly by breaking some perceived invariants that coverage tools rely
on, and other implementations behaving correctly?

If it's the latter, then perhaps Python needs a few tests added to the
test suite that verify that the optimizer doesn't break the invariants.
Such tests would benefit all implementations. However, even if it's the
right approach, if the burden of fixing things is so much more than the
burden of adding a no-optimizations option, it may make more sense to
just add the option and move on. It's all about who has the time to do
something about it. (And of course "Now is better than never. Although
never is often better than *right* now.")

Of course, if the coverage tools rely on CPython implementation details
then an implementation-specific -X option makes even more sense.

FWIW, regardless of the scenario, a -X option makes practical sense in
that it would relatively immediately relieve the (infrequent? but
onerous) pain point encountered in coverage tools. However, keep in
mind that such an option would not be backported and would not be
released until 3.5 (in late 2015). So I suppose it would be more about
relieving future pain and not helping current coverage tool users.

>
>> * Ideally, the peepholer should be thought of as part of the code
>> generation. As compilation improves over time, it should start
>> to generate the same code as we're getting now. It probably
>> isn't wise to expose the implementation detail that the constant
>> folding and jump tweaks are done in a separate second pass.
>
> I'm happy to remove the word "peephole". I think a way to disable
> optimization is useful. I've heard the concern from a number of coverage.py
> users. If, as we all think, optimizations will expand in CPython, then the
> number of mis-diagnosed code problems will grow.
The comparison made elsewhere with -O0 option in other compilers is also appropriate here. -eric From steve at pearwood.info Thu May 22 02:24:34 2014 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 22 May 2014 10:24:34 +1000 Subject: [Python-ideas] Maybe/Option builtin In-Reply-To: References: <20140521235031.GH10355@ando> Message-ID: <20140522002434.GJ10355@ando> On Wed, May 21, 2014 at 05:11:19PM -0700, Devin Jeanpierre wrote: > Python has no way to represent "I have a value, and that value is None". Sure it has. 'a' in x and x['a'] is None > e.g. what is the difference between x.get('a') and x.get('b') for x == > {'a': None} ? dict.get is explicitly designed to blur the distinction between "I have a key and here is its value" and "I may or may not have a key, it doesn't matter which, return its value or this default regardless". (The default value defaults to None.) If you care about the difference, as most people do most of the time, you should avoid the get method and call x['a'] directly. -- Steven From ned at nedbatchelder.com Thu May 22 03:29:40 2014 From: ned at nedbatchelder.com (Ned Batchelder) Date: Wed, 21 May 2014 21:29:40 -0400 Subject: [Python-ideas] Python-ideas Digest, Vol 90, Issue 30 In-Reply-To: References: <0B902BFA-920E-4383-A09C-08A558168509@gmail.com> <537D2DFD.6000504@nedbatchelder.com> Message-ID: <537D5304.6030109@nedbatchelder.com> On 5/21/14 8:17 PM, Eric Snow wrote: >>> * Over time, I expect that some of the functionality of the peepholer >>> >>is going to be moved upstream into AST transformations you will >>> >>have even less ability switch something on-and-off. >> > >> >I'm perfectly happy to remove the word "peephole" from the feature. If we >> >expect the set of optimizations to grow in the future, then we can expect >> >that more cases of code analysis will be misled by optimizations. All the >> >more reason to establish a way now that will disable all optimizations. > While the use-case is very specific, I think it's a valid motivator > for a means of disabling all optimizations, particularly if disabling > optimizations is isolated to a very focused location as you've > indicated. > > The big question then is the impact on implementing optimizations (in > general) in the future. There has been talk of AST-based > optimizations. Raymond indicates that this makes it harder to > conditionally optimize. So how much harder would it make this future > optimization work? Is that a premature optimization? > I don't understand the claim that AST transformations will have less ability to switch something on-and-off. The very term "AST transformations" outlines the implementation: step 1, construct an AST; step 2, transform the AST; step 3, generate code from the AST. My proposal is that a switch would let you skip step 2. This is analogous to the current optimizer, which generates bytecode, then as a separate (and skippable!) step, performs peephole optimizations on that bytecode. --Ned. 
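P.S. To make the three-step model concrete, here is a minimal sketch of
what I mean. The FoldAdd transformer is purely illustrative (it is not
CPython's actual optimizer, and today's peephole pass still runs inside
compile() regardless):

    import ast

    class FoldAdd(ast.NodeTransformer):
        """Illustrative step-2 transform: fold integer additions like 1 + 2."""
        def visit_BinOp(self, node):
            self.generic_visit(node)
            if (isinstance(node.op, ast.Add)
                    and isinstance(node.left, ast.Num)
                    and isinstance(node.right, ast.Num)):
                return ast.copy_location(ast.Num(n=node.left.n + node.right.n), node)
            return node

    def compile_source(source, filename="<string>", optimize=True):
        tree = ast.parse(source, filename)      # step 1: construct the AST
        if optimize:                            # step 2: transform the AST (the skippable step)
            tree = FoldAdd().visit(tree)
            ast.fix_missing_locations(tree)
        return compile(tree, filename, "exec")  # step 3: generate code from the AST

The switch I am proposing is exactly that "if optimize:" guard, applied
to whatever transformations CPython grows.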
From ned at nedbatchelder.com Thu May 22 04:00:38 2014 From: ned at nedbatchelder.com (Ned Batchelder) Date: Wed, 21 May 2014 22:00:38 -0400 Subject: [Python-ideas] Disable all peephole optimizations In-Reply-To: References: <537C888D.7060903@nedbatchelder.com> <20140521121319.GG10355@ando> <537C9A5E.1060502@sotecware.net> <537CB6ED.2050906@nedbatchelder.com> Message-ID: <537D5A46.4040200@nedbatchelder.com> On 5/21/14 8:07 PM, Nick Coghlan wrote: > > > Notice that line 6 (the continue) is unreachable, because the > else-jump from line 4 has been turned into a jump to bytecode offset > 27 (the for loop), and the end of line 5 has also been turned into a > jump to 27, rather than letting it flow to line 6. So line 6 still > exists in the bytecode, but is never executed, leading tracing tools > to indicate that line 6 is never executed. > > So isn't this just a bug in the dead code elimination? Fixing that (so > there's no bytecode behind that line and coverage tools can know it > has been optimised out) sounds better than adding an obscure config > option. > Perhaps I don't know how much dead code elimination was intended. Assuming we can get to the point that the statement has been completely removed, you'll still have the confusing state that a perfectly good statement is marked as not executable (because it has no corresponding bytecode). And getting to that point means adding more complexity to the bytecode optimizer. > > Potentially less risky would be to provide a utility in the dis module > to flag such lines after the fact. > I don't see how the dis module would know which lines these are? I'm surprised at the amount of invention and mystery code people will propose to avoid having an off-switch for the code we already have. > > Cheers, > Nick. > From ned at nedbatchelder.com Thu May 22 04:03:22 2014 From: ned at nedbatchelder.com (Ned Batchelder) Date: Wed, 21 May 2014 22:03:22 -0400 Subject: [Python-ideas] Python-ideas Digest, Vol 90, Issue 30 In-Reply-To: <20140522001038.GB15946@k2> References: <0B902BFA-920E-4383-A09C-08A558168509@gmail.com> <20140521232414.2720b5ed@fsol> <537D1CA0.7070106@stoneleaf.us> <537D311A.2010101@nedbatchelder.com> <20140522001038.GB15946@k2> Message-ID: <537D5AEA.901@nedbatchelder.com> On 5/21/14 8:10 PM, dw+python-ideas at hmmz.org wrote: > Agreed, and so I'd suggest a better fix would be removing the peephole > optimizer, for the little benefit that it offers, if it could be shown > that it really truly does hinder peoples' comprehension of Python. > > It seems the proposed feature is all about avoiding saying "oh, don't > worry about that for the moment" while teaching, assuming the question > comes up at all. The point is not about teaching Python. It's about getting useful information from code analysis tools. When you run coverage tools or debuggers, you are hoping to learn something about your code. It is bad when those tools give you incorrect or misleading information. Being able to disable the optimizer will prevent certain kinds of incorrect information. --Ned. 
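P.S. For anyone who hasn't looked at how these tools work: here is a
stripped-down sketch of a coverage-style tracer, run against the
continue.py example from earlier in the thread. Lines whose bytecode the
optimizer removed are never reported by the interpreter, so they wrongly
show up as unexecuted:

    import sys

    executed = set()

    def tracer(frame, event, arg):
        # CPython reports a "line" event for each source line it starts
        # executing; a line whose bytecode was optimized away is never
        # reported, so it looks like it never ran.
        if event == "line":
            executed.add((frame.f_code.co_filename, frame.f_lineno))
        return tracer

    sys.settrace(tracer)
    try:
        with open("continue.py") as f:
            exec(compile(f.read(), "continue.py", "exec"), {"__name__": "__main__"})
    finally:
        sys.settrace(None)

    print(sorted(line for name, line in executed if name == "continue.py"))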
From ethan at stoneleaf.us Thu May 22 04:00:49 2014
From: ethan at stoneleaf.us (Ethan Furman)
Date: Wed, 21 May 2014 19:00:49 -0700
Subject: [Python-ideas] Python-ideas Digest, Vol 90, Issue 30
In-Reply-To: <537D2DFD.6000504@nedbatchelder.com>
References: <0B902BFA-920E-4383-A09C-08A558168509@gmail.com>
	<537D2DFD.6000504@nedbatchelder.com>
Message-ID: <537D5A51.3010809@stoneleaf.us>

On 05/21/2014 03:51 PM, Ned Batchelder wrote:
>
> I'm perfectly happy to remove the word "peephole" from the feature. If we expect the set of optimizations to grow in the
> future, then we can expect that more cases of code analysis will be misled by optimizations. All the more reason to
> establish a way now that will disable all optimizations.

I think a big part of the problem is that there are more than just
peephole optimizations. For example, what about all the fast-path
optimizations? Do we want to be able to turn those off? How about the
heapq optimizations that Raymond put in a few months ago?

As Nick suggested, I think it would be better to fix whichever part is
broken and is allowing dead code to stay in the bytecode.

--
~Ethan~

From jeanpierreda at gmail.com Thu May 22 08:25:14 2014
From: jeanpierreda at gmail.com (Devin Jeanpierre)
Date: Wed, 21 May 2014 23:25:14 -0700
Subject: [Python-ideas] Maybe/Option builtin
In-Reply-To: <20140522002434.GJ10355@ando>
References: <20140521235031.GH10355@ando> <20140522002434.GJ10355@ando>
Message-ID: 

On Wed, May 21, 2014 at 5:24 PM, Steven D'Aprano wrote:
> On Wed, May 21, 2014 at 05:11:19PM -0700, Devin Jeanpierre wrote:
>
>> Python has no way to represent "I have a value, and that value is None".
>
> Sure it has. 'a' in x and x['a'] is None

Hm, why do you bring this up? Do you think I didn't know about this?
While we're at it, you could just implement the Option type, and then
the problem is also solved! I am sorry I did not communicate what I
meant to say.

>> e.g. what is the difference between x.get('a') and x.get('b') for x ==
>> {'a': None} ?
>
> dict.get is explicitly designed to blur the distinction between "I have
> a key and here is its value" and "I may or may not have a key, it
> doesn't matter which, return its value or this default regardless". (The
> default value defaults to None.) If you care about the difference, as
> most people do most of the time, you should avoid the get method and
> call x['a'] directly.

I was not trying to criticize dict.get, I was noting the same property
you have agreed with: it blurs the distinction. That distinction does
not have to be blurred, and for some other APIs, must not be blurred. I
appreciate that dict.get is not the former case, so, for the sake of
making the discussion about something important, let's choose such an
API: imagine that we steal Guido's time machine to go into the past and
redesign next().

With next(), it is absolutely vital that we are able to know when the
iterator has stopped vs when the iterator has produced a value.

API ideas:

- Use next's current primary API: raise an exception if the iterator
  has ended. This is a very good API, except that if the caller forgets
  to catch the exception, this can result in generators or iterators up
  the call stack spontaneously and silently stopping, which can be hard
  to debug - and, unfortunately, is almost never actually your fault.

- Use next's current alternate API: give next a default parameter which
  is returned when the iterator stops, and follow Amber's solution.
However, you have to be careful that this sentinel value won't ever be wanted as a return value, and that you don't accidentally match things against the sentinel by mistake: i.e. it has to be defined inline with a unique object and checked by identity. - make next() return None when the generator is exhausted. How do you differentiate None as a value vs None as a no-result sentinel? ideas: - give iterators a .is_exhausted attribute which is True if the None it returned indicates end-of-iterator, False otherwise. This is acceptable, but now, if the caller forgets to check for exhaustion, they start silently giving None as values. This might also be hard to debug. - make all values be wrapped in Some(v), so that next(it) returns either None or Some(v). This can cause problems if you forget to actually unwrap the Some to get at the value. Unlike the above problems, however, such things are very easy to debug - you will get a TypeError when you add Some(3) + 5. And, similarly, if you forget to check for None, this will be an error when you treat it like a Some value (e.g. 'NoneType' object has no attribute 'unwrap'). So this option is pretty resilient against mistakes, even in a dynamically typed language. Stylistically, I feel like exceptions and option types give you a clean separation of cases, where the other two options just munge data together and let you sort it out afterwards - this feels ugly to me, and would seem to push programmers towards not caring about the value. (Probably not a good idea when the cases are important.) Similarly, it's easy to forget to catch an exception, especially since quite often people use next in a situation where they think it will never raise an exception. And in the particular case of StopIteration, unlike most other exceptions, it's frequently silently caught by various functions that might sit between you and where the exception was raised, meaning that you might not easily identify the cause of the problem -- especially in a badly tested codebase. (And what other kind of codebase would miss this error? ;). So for me personally, if Python had all these options and they were all equally easy and natural, I would probably choose option types. This is of course subjective, but I hope it sort of ties together everything and explains the utility. If you value certain things and have a certain sense of elegance, you might want to use option types instead of some of the alternatives, in some circumstances, even in a dynamically typed language. Though, I am not trying to suggest that you personally would ever want to use it, or that it's even particularly Pythonic. I personally feel like probably Some/None would be awkward/unpythonic to retrofit (although I think sum types in general belong in the language, which is why I am talking about this at all - awareness is important!). Does that help explain how Some/None can sometimes be useful or desirable (to some people) even without static typing? 
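In case code says it better than prose, here is a rough sketch of the
wrapped-value idea (the names Some, unwrap and safe_next are just
illustrative, not a concrete proposal):

    class Some(object):
        """Wrap a present value so it cannot be confused with "no value"."""
        def __init__(self, value):
            self._value = value
        def unwrap(self):
            return self._value

    def safe_next(it):
        # Returns Some(value) while the iterator produces values, and plain
        # None once it is exhausted -- even when the value itself is None.
        try:
            return Some(next(it))
        except StopIteration:
            return None

    it = iter([None])
    result = safe_next(it)
    if result is not None:
        print(result.unwrap())    # prints None: a real value, not exhaustion
    assert safe_next(it) is None  # now the iterator really is exhausted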
-- Devin

From tjreedy at udel.edu Thu May 22 08:44:52 2014
From: tjreedy at udel.edu (Terry Reedy)
Date: Thu, 22 May 2014 02:44:52 -0400
Subject: [Python-ideas] Python-ideas Digest, Vol 90, Issue 30
In-Reply-To: <537D2FDA.6030907@nedbatchelder.com>
References: <0B902BFA-920E-4383-A09C-08A558168509@gmail.com>
	<20140521232414.2720b5ed@fsol> <537D2FDA.6030907@nedbatchelder.com>
Message-ID: 

On 5/21/2014 6:59 PM, Ned Batchelder wrote:

> If by implementation details, you mean the word "peephole", then let's
> remove it, and simply have a switch that disables all optimization.
> Rather than limiting the future of the optimizer, it will provide an
> escape hatch for people who would rather not have the optimizer's effects.

The presumption of this idea is that there is a proper, canonical
unoptimized version of 'compiled Python'. For Python there obviously is
not. For CPython, there is not either. What Raymond has been saying is
that the output of the CPython compiler is the output of the CPython
compiler.

Sys.settrace is not intended to mandate. It reports on the operations of
a particular version of CPython as well as it can with the line number
table it gets. The existence of the table is not mandated by the
language definition, but is provided on a best effort basis.

Another issue on the tracker points out that if an ast is constructed
directly, and then compiled, the notion of 'source line numbers' has no
meaning.

When I used coverage (last summer) with tested Idle modules, I could not
get a reported 100% coverage because coverage counts the body of a final
"if __name__ == '__main__':" statement. So I had to visually check that
those were the only 'uncovered' lines. I do not see doing the same for an
'uncovered' continue as much different. In either case, coverage could
leave such lines out of the denominator.

-- 
Terry Jan Reedy

From ncoghlan at gmail.com Thu May 22 10:25:26 2014
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 22 May 2014 18:25:26 +1000
Subject: [Python-ideas] Disable all peephole optimizations
In-Reply-To: <537D5A46.4040200@nedbatchelder.com>
References: <537C888D.7060903@nedbatchelder.com>
	<20140521121319.GG10355@ando> <537C9A5E.1060502@sotecware.net>
	<537CB6ED.2050906@nedbatchelder.com> <537D5A46.4040200@nedbatchelder.com>
Message-ID: 

On 22 May 2014 12:00, "Ned Batchelder" wrote:
>
> I'm surprised at the amount of invention and mystery code people will
propose to avoid having an off-switch for the code we already have.

It's not the off switch per se, it's the documentation and testing
consequences. Better to figure out a way to let the code generator and
analysis tools collaborate more effectively than to complicate the
execution model further.

Cheers,
Nick.

>>
>>
>> Cheers,
>> Nick.
>>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From solipsis at pitrou.net Thu May 22 10:43:34 2014
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Thu, 22 May 2014 10:43:34 +0200
Subject: [Python-ideas] Python-ideas Digest, Vol 90, Issue 30
References: <0B902BFA-920E-4383-A09C-08A558168509@gmail.com>
	<20140521232414.2720b5ed@fsol> <537D2FDA.6030907@nedbatchelder.com>
Message-ID: <20140522104334.36b2b07f@fsol>

On Thu, 22 May 2014 02:44:52 -0400
Terry Reedy wrote:
>
> When I used coverage (last summer) with tested Idle modules, I could not
> get a reported 100% coverage because coverage counts the body of a final
> "if __name__ == '__main__':" statement.

There are flags to modify this behaviour.

Regards

Antoine.
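P.S. For Terry's "if __name__" case specifically, no command-line flag
is even needed: coverage.py's default exclusion pragma already keeps
such lines out of the denominator. A sketch (main() stands in for
whatever the module actually runs):

    def main():
        print("only runs as a script")

    if __name__ == '__main__':   # pragma: no cover
        main()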
From solipsis at pitrou.net Thu May 22 10:52:20 2014
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Thu, 22 May 2014 10:52:20 +0200
Subject: [Python-ideas] Python-ideas Digest, Vol 90, Issue 30
References: <0B902BFA-920E-4383-A09C-08A558168509@gmail.com>
	<20140521232414.2720b5ed@fsol> <537D1CA0.7070106@stoneleaf.us>
	<537D311A.2010101@nedbatchelder.com>
Message-ID: <20140522105220.70b360fe@fsol>

On Wed, 21 May 2014 19:04:58 -0400
Ned Batchelder wrote:
>
> I'm not sure what can of worms you are imagining. Let's look to our
> experience with C compilers. They have a switch to disable
> optimization. What trouble has that brought? When I think of problems
> with optimizers in C compilers, I think of incorrect or buggy
> optimizations. I can't think of something that has gone wrong because
> there was a switch to turn it off.

Python's usage model does not contain the notion of compiler
optimizations. Hardly anybody uses the misnamed -O flags. There is a
single compilation mode, which everyone is content with. It is part of
the simplicity of the language (or, at least, of CPython); by adding
some flags that can affect the level of "optimization" you make the
model more complicated for users to understand, and for us to support.

(having used coverage several times, I haven't found those missed lines
really annoying, by the way; not to the point that I would have wanted
a specific command-line flag to disable optimizations)

The use case for disabling optimizations in C is to make programs
actually debuggable. Python doesn't have that problem.

Regards

Antoine.

From p.f.moore at gmail.com Thu May 22 11:02:06 2014
From: p.f.moore at gmail.com (Paul Moore)
Date: Thu, 22 May 2014 10:02:06 +0100
Subject: [Python-ideas] Python-ideas Digest, Vol 90, Issue 30
In-Reply-To: <20140522105220.70b360fe@fsol>
References: <0B902BFA-920E-4383-A09C-08A558168509@gmail.com>
	<20140521232414.2720b5ed@fsol> <537D1CA0.7070106@stoneleaf.us>
	<537D311A.2010101@nedbatchelder.com> <20140522105220.70b360fe@fsol>
Message-ID: 

On 22 May 2014 09:52, Antoine Pitrou wrote:
> adding some flags that can affect the level of "optimization" you make
> the model more complicated for users to understand, and for us to
> support.

As a concrete example, note my earlier comment about pyc files.
Switching off optimisation results in unoptimised bytecode being
written to pyc files, which could then be read in a subsequent
(supposedly) optimised run. And vice versa.

This may not be a huge problem for the coverage use case, but it does
add an extra level of complexity into the model of caching bytecode.
Handwaving it away as "not a big deal - just delete the bytecode files
before and after the coverage run" doesn't alter the fact that the
bytecode caching model isn't handling the new mode properly.

Paul

From theller at ctypes.org Thu May 22 14:50:08 2014
From: theller at ctypes.org (Thomas Heller)
Date: Thu, 22 May 2014 14:50:08 +0200
Subject: [Python-ideas] Python-ideas Digest, Vol 90, Issue 30
In-Reply-To: <20140522105220.70b360fe@fsol>
References: <0B902BFA-920E-4383-A09C-08A558168509@gmail.com>
	<20140521232414.2720b5ed@fsol> <537D1CA0.7070106@stoneleaf.us>
	<537D311A.2010101@nedbatchelder.com> <20140522105220.70b360fe@fsol>
Message-ID: 

On 22.05.2014 10:52, Antoine Pitrou wrote:

> The use case for disabling optimizations in C is to make programs
> actually debuggable. Python doesn't have that problem.

Well, setting a breakpoint on the 'continue' line in Ned's test code
and running it with pdb does NOT trigger the breakpoint.
So 'Python doesn't have this problem' is not really true. Thomas From ned at nedbatchelder.com Thu May 22 15:24:37 2014 From: ned at nedbatchelder.com (Ned Batchelder) Date: Thu, 22 May 2014 09:24:37 -0400 Subject: [Python-ideas] Python-ideas Digest, Vol 90, Issue 30 In-Reply-To: References: <0B902BFA-920E-4383-A09C-08A558168509@gmail.com> <20140521232414.2720b5ed@fsol> <537D2FDA.6030907@nedbatchelder.com> Message-ID: <537DFA95.4040000@nedbatchelder.com> On 5/22/14 2:44 AM, Terry Reedy wrote: > On 5/21/2014 6:59 PM, Ned Batchelder wrote: > >> If by implementation details, you mean the word "peephole", then let's >> remove it, and simply have a switch that disables all optimization. >> Rather than limiting the future of the optimizer, it will provide an >> escape hatch for people who would rather not have the optimizer's >> effects. > > The presumption of this idea is that there is a proper, canonical > unoptimized version of 'compiled Python'. For Python there obviously > is not. For CPython, there is not either. What Raymond has been saying > is that the output of the CPython compiler is the output of the > CPython compiler. > I'd like to understand why we think the Python compiler is different in this regard than a C compiler. We all use C compilers that have a -O0 switch. It's there to disable optimizations so that programs can be debugged. The C compiler also has no "canonical unoptimized compiled output". But the switch is there to make it possible to debug (reason about) the compiled code. I don't care if we have a command line switch or some other mechanism to disable optimizations. I just think it's useful to be able to do it somehow. When this came up 18 months ago on Python-Dev, it was part of a thread about adding more optimizations to CPython. Guido said "+1" to the idea of being able to disable the optimizers (https://mail.python.org/pipermail/python-dev/2012-December/123099.html). Our need is not as great as C's, the unrecognizability of the compiled code is much less, but current optimizations are already interfering with the ability to debug and analyze code, and new optimizations will only broaden the possibility of interference. --Ned. From ned at nedbatchelder.com Thu May 22 15:29:46 2014 From: ned at nedbatchelder.com (Ned Batchelder) Date: Thu, 22 May 2014 09:29:46 -0400 Subject: [Python-ideas] Disable all peephole optimizations In-Reply-To: References: <537C888D.7060903@nedbatchelder.com> <20140521121319.GG10355@ando> <537C9A5E.1060502@sotecware.net> <537CB6ED.2050906@nedbatchelder.com> <537D5A46.4040200@nedbatchelder.com> Message-ID: <537DFBCA.2070006@nedbatchelder.com> On 5/22/14 4:25 AM, Nick Coghlan wrote: > > > On 22 May 2014 12:00, "Ned Batchelder" > wrote: > > > > I'm surprised at the amount of invention and mystery code people > will propose to avoid having an off-switch for the code we already have. > > It's not the off switch per se, it's the documentation and testing > consequences. Better to figure out a way to let the code generator and > analysis tools collaborate more effectively than to complicate the > execution model further. > The problem with "letting them collaborate more effectively" is that we don't know how to do that. If we can come up with a way to do it, it will involve much more complex code than I am proposing. As far as documentation, we have three possibilities for optimization level now. This will add a fourth. I don't see that as a burden. 
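(For context, those existing levels are already exposed on the builtin
compile() via its optimize parameter. A quick sketch; note that even
optimize=0 still gets the peephole pass today, which is exactly the gap
this proposal would fill:)

    # optimize=-1: follow the interpreter's -O setting
    # optimize=0:  no -O optimizations; __debug__ is True, asserts are kept
    # optimize=1:  like -O (asserts removed, __debug__ is False)
    # optimize=2:  like -OO (additionally strips docstrings)
    code = compile("assert False, 'kept at level 0'", "<example>", "exec", optimize=0)
    try:
        exec(code)
    except AssertionError as err:
        print("assert survived:", err)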
On the testing front, if I were the developer of an optimizer, I would welcome a switch to disable it, as a way to test that optimizations don't change semantics. I understand that this is a different mode of execution. I guess we have different opinions about the tradeoff of risk and benefit of that new mode. > Cheers, > Nick. > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rosuav at gmail.com Thu May 22 15:05:51 2014 From: rosuav at gmail.com (Chris Angelico) Date: Thu, 22 May 2014 23:05:51 +1000 Subject: [Python-ideas] Python-ideas Digest, Vol 90, Issue 30 In-Reply-To: References: <0B902BFA-920E-4383-A09C-08A558168509@gmail.com> <20140521232414.2720b5ed@fsol> <537D1CA0.7070106@stoneleaf.us> <537D311A.2010101@nedbatchelder.com> <20140522105220.70b360fe@fsol> Message-ID: On Thu, May 22, 2014 at 10:50 PM, Thomas Heller wrote: > Am 22.05.2014 10:52, schrieb Antoine Pitrou: > >> The use case for disabling optimizations in C is to make programs >> actually debuggable. Python doesn't have that problem. > > > Well, setting a breakpoint to the 'continue' line in Ned's test code > and running it with pdb does NOT trigger the breakpoint. > So 'Python doesn't have this problem' is not really true. Correct me if I'm wrong, but as I understand it, the problem is that the peephole optimizer eliminated an entire line of code. Would it be possible to have it notice when it merges two pieces from different lines, and somehow mark that the resulting bytecode comes from both lines? That would solve the breakpoint and coverage problems simultaneously. ChrisA From skip at pobox.com Thu May 22 15:49:49 2014 From: skip at pobox.com (Skip Montanaro) Date: Thu, 22 May 2014 08:49:49 -0500 Subject: [Python-ideas] Python-ideas Digest, Vol 90, Issue 30 In-Reply-To: References: <0B902BFA-920E-4383-A09C-08A558168509@gmail.com> <20140521232414.2720b5ed@fsol> <537D1CA0.7070106@stoneleaf.us> <537D311A.2010101@nedbatchelder.com> <20140522105220.70b360fe@fsol> Message-ID: On Thu, May 22, 2014 at 8:05 AM, Chris Angelico wrote: > Correct me if I'm wrong, but as I understand it, the problem is that > the peephole optimizer eliminated an entire line of code. Would it be > possible to have it notice when it merges two pieces from different > lines, and somehow mark that the resulting bytecode comes from both > lines? That would solve the breakpoint and coverage problems > simultaneously. It seems to me that Ned has revealed a bug in the peephole optimizer. It zapped an entire source line's worth of bytecode, but failed to delete the relevant entry in the line number table of the resulting code object. If I had my druthers, that would be the change I'd prefer. That said, I think Ned's proposal is fairly simple. As for the increased testing load, I think the extra cost would be the duplication of the buildbots (or the adjustment of their setup to test with -O and -O0 flags). Is it still the case that -O effectively does nothing (maybe only eliding __debug__ checks)? Skip From ethan at stoneleaf.us Thu May 22 16:02:36 2014 From: ethan at stoneleaf.us (Ethan Furman) Date: Thu, 22 May 2014 07:02:36 -0700 Subject: [Python-ideas] Maybe/Option builtin In-Reply-To: References: <20140521235031.GH10355@ando> <20140522002434.GJ10355@ando> Message-ID: <537E037C.1020202@stoneleaf.us> On 05/21/2014 11:25 PM, Devin Jeanpierre wrote: > > Does that help explain how Some/None can sometimes be useful or > desirable (to some people) even without static typing? Nice essay, thanks! 
-- ~Ethan~ From p.f.moore at gmail.com Thu May 22 16:29:13 2014 From: p.f.moore at gmail.com (Paul Moore) Date: Thu, 22 May 2014 15:29:13 +0100 Subject: [Python-ideas] Python-ideas Digest, Vol 90, Issue 30 In-Reply-To: <537DFA95.4040000@nedbatchelder.com> References: <0B902BFA-920E-4383-A09C-08A558168509@gmail.com> <20140521232414.2720b5ed@fsol> <537D2FDA.6030907@nedbatchelder.com> <537DFA95.4040000@nedbatchelder.com> Message-ID: On 22 May 2014 14:24, Ned Batchelder wrote: > When this came up 18 months ago on Python-Dev, it was part of a thread about > adding more optimizations to CPython. Guido said "+1" to the idea of being > able to disable the optimizers > (https://mail.python.org/pipermail/python-dev/2012-December/123099.html). > Our need is not as great as C's, the unrecognizability of the compiled code > is much less, but current optimizations are already interfering with the > ability to debug and analyze code, and new optimizations will only broaden > the possibility of interference. So I'm a bit confused. This was debated on python-dev and (presumably) agreement reached, so why does it need a whole new thread here? Paul From wolfgang.maier at biologie.uni-freiburg.de Thu May 22 16:44:52 2014 From: wolfgang.maier at biologie.uni-freiburg.de (Wolfgang Maier) Date: Thu, 22 May 2014 16:44:52 +0200 Subject: [Python-ideas] Make Python code read-only In-Reply-To: References: <20140521014203.GE10355@ando> Message-ID: On 21.05.2014 11:43, Nick Coghlan wrote: > > It also misses the big reason I am a Python programmer rather than a > Java programmer. > > For me, Python is primarily an orchestration language. It is the > language for the code that is telling everything else what to do. If my > Python code is an overall performance bottleneck, then "Huzzah!", as it > means I have finally engineered all the other structural bottlenecks out > of the system. > > For this use case, monkey patching is not an incidental feature to be > tolerated merely for backwards compatibility reasons: it is a key > capability that makes Python an ideal language for me, as it takes > ultimate control of what dependencies do away from the original author > and places it in my hands as the system integrator. This is a dangerous > power, not to be used lightly, but it also grants me the ability to work > around critical bugs in dependencies at run time, rather than having to > fork and patch the source the way Java developers tend to do. > Very intriguing perspective! I am not experienced enough with Java to judge your comparison, but if you ever find the time to elaborate on this in a longer post or article somewhere I'd be eager to read it. Thanks, Wolfgang From ned at nedbatchelder.com Thu May 22 17:29:07 2014 From: ned at nedbatchelder.com (Ned Batchelder) Date: Thu, 22 May 2014 11:29:07 -0400 Subject: [Python-ideas] Python-ideas Digest, Vol 90, Issue 30 In-Reply-To: References: <0B902BFA-920E-4383-A09C-08A558168509@gmail.com> <20140521232414.2720b5ed@fsol> <537D2FDA.6030907@nedbatchelder.com> <537DFA95.4040000@nedbatchelder.com> Message-ID: <537E17C3.8000307@nedbatchelder.com> On 5/22/14 10:29 AM, Paul Moore wrote: > On 22 May 2014 14:24, Ned Batchelder wrote: >> When this came up 18 months ago on Python-Dev, it was part of a thread about >> adding more optimizations to CPython. Guido said "+1" to the idea of being >> able to disable the optimizers >> (https://mail.python.org/pipermail/python-dev/2012-December/123099.html). 
>> Our need is not as great as C's, the unrecognizability of the compiled code >> is much less, but current optimizations are already interfering with the >> ability to debug and analyze code, and new optimizations will only broaden >> the possibility of interference. > So I'm a bit confused. This was debated on python-dev and (presumably) > agreement reached, so why does it need a whole new thread here? > > Paul I would not say the idea was debated. You can read the (short) thread here: https://mail.python.org/pipermail/python-dev/2012-December/123022.html . Mark Shannon proposed emitting different bytecode for while loops and some other constructs. Guido said no PEP was needed. Nick Coghlan said "main challenge is to keep stepping through the code with pdb sane" (I agree with that!). I said it would be good to have a way to disable optimizations, Guido said "+1". I put this idea here because the discussion on issue2506 got involved enough that someone suggested this was the right place for it. I linked to Guido's sentiment in my initial post here, and had hoped that he would chime in. --Ned. -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Thu May 22 17:32:10 2014 From: guido at python.org (Guido van Rossum) Date: Thu, 22 May 2014 08:32:10 -0700 Subject: [Python-ideas] Disable all peephole optimizations In-Reply-To: <537DFBCA.2070006@nedbatchelder.com> References: <537C888D.7060903@nedbatchelder.com> <20140521121319.GG10355@ando> <537C9A5E.1060502@sotecware.net> <537CB6ED.2050906@nedbatchelder.com> <537D5A46.4040200@nedbatchelder.com> <537DFBCA.2070006@nedbatchelder.com> Message-ID: FWIW, I am strictly with Ned here. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ned at nedbatchelder.com Thu May 22 17:32:33 2014 From: ned at nedbatchelder.com (Ned Batchelder) Date: Thu, 22 May 2014 11:32:33 -0400 Subject: [Python-ideas] Python-ideas Digest, Vol 90, Issue 30 In-Reply-To: References: <0B902BFA-920E-4383-A09C-08A558168509@gmail.com> <20140521232414.2720b5ed@fsol> <537D1CA0.7070106@stoneleaf.us> <537D311A.2010101@nedbatchelder.com> <20140522105220.70b360fe@fsol> Message-ID: <537E1891.5050808@nedbatchelder.com> On 5/22/14 9:49 AM, Skip Montanaro wrote: > On Thu, May 22, 2014 at 8:05 AM, Chris Angelico wrote: >> Correct me if I'm wrong, but as I understand it, the problem is that >> the peephole optimizer eliminated an entire line of code. Would it be >> possible to have it notice when it merges two pieces from different >> lines, and somehow mark that the resulting bytecode comes from both >> lines? That would solve the breakpoint and coverage problems >> simultaneously. > It seems to me that Ned has revealed a bug in the peephole optimizer. > It zapped an entire source line's worth of bytecode, but failed to > delete the relevant entry in the line number table of the resulting > code object. If I had my druthers, that would be the change I'd > prefer. I think it is the nature of optimization that it will destroy useful information. I don't think it will always be possible to retain enough back-mapping that the optimized code can be understood as if it had not been optimized. For example, the debug issue would still be present: if you run pdb and set a breakpoint on the "continue" line, it will never be hit. Even if the optimizer cleaned up after itself perfectly (in fact, especially so), that breakpoint will still not be hit. 
You simply cannot reason about optimized code without having to mentally
understand the transformations that have been applied.

The whole point of this proposal is to recognize that there are times
(debugging, coverage measurement) when optimizations are harmful, and to
avoid them.

>
> That said, I think Ned's proposal is fairly simple. As for the
> increased testing load, I think the extra cost would be the
> duplication of the buildbots (or the adjustment of their setup to test
> with -O and -O0 flags). Is it still the case that -O effectively does
> nothing (maybe only eliding __debug__ checks)?
>
> Skip
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/

From p.f.moore at gmail.com Thu May 22 17:39:57 2014
From: p.f.moore at gmail.com (Paul Moore)
Date: Thu, 22 May 2014 16:39:57 +0100
Subject: [Python-ideas] Python-ideas Digest, Vol 90, Issue 30
In-Reply-To: <537E17C3.8000307@nedbatchelder.com>
References: <0B902BFA-920E-4383-A09C-08A558168509@gmail.com>
	<20140521232414.2720b5ed@fsol> <537D2FDA.6030907@nedbatchelder.com>
	<537DFA95.4040000@nedbatchelder.com> <537E17C3.8000307@nedbatchelder.com>
Message-ID: 

On 22 May 2014 16:29, Ned Batchelder wrote:
> I would not say the idea was debated. You can read the (short) thread here:
> https://mail.python.org/pipermail/python-dev/2012-December/123022.html .
> Mark Shannon proposed emitting different bytecode for while loops and some
> other constructs. Guido said no PEP was needed. Nick Coghlan said "main
> challenge is to keep stepping through the code with pdb sane" (I agree with
> that!). I said it would be good to have a way to disable optimizations,
> Guido said "+1".
>
> I put this idea here because the discussion on issue2506 got involved enough
> that someone suggested this was the right place for it. I linked to Guido's
> sentiment in my initial post here, and had hoped that he would chime in.

OK, thanks for the summary.

Personally, I still think the biggest issue is around pyc files. I
think any proposal needs an answer to that (even if it's just that
no-optimisation mode never reads or writes bytecode files). Expecting
users to manually manage pyc files is a bad idea. Well, that and any
implementation complexity, which I'll leave to others to consider.

Paul

From mal at egenix.com Thu May 22 17:40:56 2014
From: mal at egenix.com (M.-A. Lemburg)
Date: Thu, 22 May 2014 17:40:56 +0200
Subject: [Python-ideas] Python-ideas Digest, Vol 90, Issue 30
In-Reply-To: <537E1891.5050808@nedbatchelder.com>
References: <0B902BFA-920E-4383-A09C-08A558168509@gmail.com>
	<20140521232414.2720b5ed@fsol> <537D1CA0.7070106@stoneleaf.us>
	<537D311A.2010101@nedbatchelder.com> <20140522105220.70b360fe@fsol>
	<537E1891.5050808@nedbatchelder.com>
Message-ID: <537E1A88.8080501@egenix.com>

On 22.05.2014 17:32, Ned Batchelder wrote:
>
> The whole point of this proposal is to recognize that there are times (debugging, coverage
> measurement) when optimizations are harmful, and to avoid them.

+1

It's regular practice in other languages to disable optimizations when
debugging code. I don't see why Python should be different in this
respect.

Debuggers, testing, coverage and other such tools should be able to
invoke a Python runtime mode that lets the compiler work strictly by
the book, without applying any kind of optimization.
This used to be the default in Python, but over the years, we gradually
moved away from this as the default, with no options to get the old
non-optimizing behavior back.

I think it's fine to make safe optimizations default in Python, but
there's definitely a need for being able to run Python in a debugger
without having it skip perfectly valid code lines (even if they are
no-ops).

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, May 22 2014)
>>> Python Projects, Consulting and Support ...   http://www.egenix.com/
>>> mxODBC.Zope/Plone.Database.Adapter ...       http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________

::::: Try our mxODBC.Connect Python Database Interface for free ! ::::::

eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
Registered at Amtsgericht Duesseldorf: HRB 46611
http://www.egenix.com/company/contact/

From mal at egenix.com Thu May 22 17:41:52 2014
From: mal at egenix.com (M.-A. Lemburg)
Date: Thu, 22 May 2014 17:41:52 +0200
Subject: [Python-ideas] Python-ideas Digest, Vol 90, Issue 30
In-Reply-To: 
References: <0B902BFA-920E-4383-A09C-08A558168509@gmail.com>
	<20140521232414.2720b5ed@fsol> <537D2FDA.6030907@nedbatchelder.com>
	<537DFA95.4040000@nedbatchelder.com> <537E17C3.8000307@nedbatchelder.com>
Message-ID: <537E1AC0.1000708@egenix.com>

On 22.05.2014 17:39, Paul Moore wrote:
> On 22 May 2014 16:29, Ned Batchelder wrote:
>> I would not say the idea was debated. You can read the (short) thread here:
>> https://mail.python.org/pipermail/python-dev/2012-December/123022.html .
>> Mark Shannon proposed emitting different bytecode for while loops and some
>> other constructs. Guido said no PEP was needed. Nick Coghlan said "main
>> challenge is to keep stepping through the code with pdb sane" (I agree with
>> that!). I said it would be good to have a way to disable optimizations,
>> Guido said "+1".
>>
>> I put this idea here because the discussion on issue2506 got involved enough
>> that someone suggested this was the right place for it. I linked to Guido's
>> sentiment in my initial post here, and had hoped that he would chime in.
>
> OK, thanks for the summary.
>
> Personally, I still think the biggest issue is around pyc files. I
> think any proposal needs an answer to that (even if it's just that
> no-optimisation mode never reads or writes bytecode files). Expecting
> users to manually manage pyc files is a bad idea. Well, that and any
> implementation complexity, which I'll leave to others to consider.

Why not simply have the new option disable writing PYC files?

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, May 22 2014)
>>> Python Projects, Consulting and Support ...   http://www.egenix.com/
>>> mxODBC.Zope/Plone.Database.Adapter ...       http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________

::::: Try our mxODBC.Connect Python Database Interface for free ! ::::::

eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
Registered at Amtsgericht Duesseldorf: HRB 46611
http://www.egenix.com/company/contact/

From p.f.moore at gmail.com Thu May 22 17:46:13 2014
From: p.f.moore at gmail.com (Paul Moore)
Date: Thu, 22 May 2014 16:46:13 +0100
Subject: [Python-ideas] Python-ideas Digest, Vol 90, Issue 30
In-Reply-To: <537E1AC0.1000708@egenix.com>
References: <0B902BFA-920E-4383-A09C-08A558168509@gmail.com>
	<20140521232414.2720b5ed@fsol> <537D2FDA.6030907@nedbatchelder.com>
	<537DFA95.4040000@nedbatchelder.com> <537E17C3.8000307@nedbatchelder.com>
	<537E1AC0.1000708@egenix.com>
Message-ID: 

On 22 May 2014 16:41, M.-A. Lemburg wrote:
> Why not simply have the new option disable writing PYC files?

That's what I said. But you also need to not read them as well,
because otherwise you could read an optimised file if the source
hasn't changed.

Paul

From mal at egenix.com Thu May 22 17:49:27 2014
From: mal at egenix.com (M.-A. Lemburg)
Date: Thu, 22 May 2014 17:49:27 +0200
Subject: [Python-ideas] Python-ideas Digest, Vol 90, Issue 30
In-Reply-To: 
References: <0B902BFA-920E-4383-A09C-08A558168509@gmail.com>
	<20140521232414.2720b5ed@fsol> <537D2FDA.6030907@nedbatchelder.com>
	<537DFA95.4040000@nedbatchelder.com> <537E17C3.8000307@nedbatchelder.com>
	<537E1AC0.1000708@egenix.com>
Message-ID: <537E1C87.7020407@egenix.com>

On 22.05.2014 17:46, Paul Moore wrote:
> On 22 May 2014 16:41, M.-A. Lemburg wrote:
>> Why not simply have the new option disable writing PYC files?
>
> That's what I said. But you also need to not read them as well,
> because otherwise you could read an optimised file if the source
> hasn't changed.

Good point :-)

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, May 22 2014)
>>> Python Projects, Consulting and Support ...   http://www.egenix.com/
>>> mxODBC.Zope/Plone.Database.Adapter ...       http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________

::::: Try our mxODBC.Connect Python Database Interface for free ! ::::::

eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
Registered at Amtsgericht Duesseldorf: HRB 46611
http://www.egenix.com/company/contact/

From victor.stinner at gmail.com Thu May 22 17:55:33 2014
From: victor.stinner at gmail.com (Victor Stinner)
Date: Thu, 22 May 2014 17:55:33 +0200
Subject: [Python-ideas] Disable all peephole optimizations
In-Reply-To: <537C888D.7060903@nedbatchelder.com>
References: <537C888D.7060903@nedbatchelder.com>
Message-ID: 

Hi,

2014-05-21 13:05 GMT+02:00 Ned Batchelder :
> A long-standing problem with CPython is that the peephole optimizer cannot
> be completely disabled.

I had a similar concern when I worked on my astoptimizer project. I
wanted to reimplement the peephole optimizer using the AST instead of
the bytecode. Since the peephole optimizer is always on, I was not able
to compare the bytecode generated by my AST optimizer without the
peephole optimizer to the bytecode generated with the peephole
optimizer.

I would also be curious to see the code before the peephole optimizer
modifies it.
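One can at least observe the optimizer's effect after the fact by
disassembling a trivial snippet (a small sketch using the standard dis
module):

    import dis

    # With the peephole optimizer, 2 * 3 is folded at compile time: the
    # disassembly shows a single LOAD_CONST of 6 instead of LOAD_CONST 2,
    # LOAD_CONST 3, BINARY_MULTIPLY.
    dis.dis(compile("x = 2 * 3", "<example>", "exec"))

But what I really want is to see what the compiler would emit *without*
that pass.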
> If you execute "python3.4 -m trace -c -m continue.py", it produces this
> continue.cover file:
>
>     1: a = b = c = 0
>   101: for n in range(100):
>   100:     if n % 2:
>    50:         if n % 4:
>    50:             a += 1
> >>>>>>             continue
>            else:
>    50:         b += 1
>    50:     c += 1
>     1: assert a == 50 and b == 50 and c == 50
>
> This indicates that the continue line is not executed.

I have spent long hours in gdb, and this is a common issue with
compiler optimizations. In gdb, the program sometimes looks like it
goes backward, or re-executes the same instruction twice. I hate losing
my time with that; I prefer to recompile the whole (C) application with
gcc -O0 -ggdb.

> ** User Interface
>
> Unfortunately, the -O command-line switch does not lend itself to a new
> value that means, "less optimization than the default." I propose a new
> switch -P, to control the peephole optimizer, with a value of -P0 meaning no
> optimization at all. The PYTHONPEEPHOLE environment variable would also
> control the option.

I propose "python -X nopeephole", "python -X peephole=0" or
"python -X optim=0".

I don't like "python -X peephole=0" because "python -X peephole" should
activate the optimizer, which is already the default.

For the "optim" proposal, should we keep it in sync with -O and -OO?

    (no -O equivalent)   : -X optim=0
    (default)          <=> -X optim=1
    -O                 <=> -X optim=2
    -OO                <=> -X optim=3

I have never understood -O and -OO. What is optimized, exactly? To me,
stripping docstrings is not really an "optimization"; it should be a
different option. Because of this confusion, the peephole optimizer
option should maybe be disconnected from -O and -OO. So take
"python -X nopeephole".

IMO you should not write .pyc or .pyo files if the peephole optimizer is
deactivated. It avoids the question of "was this .pyc generated with or
without the peephole optimizer?". Usually, when you disable
optimizations, you don't care about performance (.pyc files are created
to speed up Python startup time).

I also suggest to add a new flag to the builtin compile() function:
PyCF_NO_PEEPHOLE.

> There are about a dozen places internal to CPython where optimization level
> is indicated with an integer, for example, in Py_CompileStringObject. Those
> uses also don't allow for new values indicating less optimization than the
> default: 0 and -1 already have meanings. Unless we want to start using -2
> for less than the default. I'm not sure we need to provide for those
> values, or if the PYTHONPEEPHOLE environment variable provides enough
> control.

Add a new flag to sys.flags: "peephole" or "peephole_optimizer"
(boolean, True by default).

Victor

From ethan at stoneleaf.us Thu May 22 17:43:35 2014
From: ethan at stoneleaf.us (Ethan Furman)
Date: Thu, 22 May 2014 08:43:35 -0700
Subject: [Python-ideas] Python-ideas Digest, Vol 90, Issue 30
In-Reply-To: <537E1891.5050808@nedbatchelder.com>
References: <0B902BFA-920E-4383-A09C-08A558168509@gmail.com>
	<20140521232414.2720b5ed@fsol> <537D1CA0.7070106@stoneleaf.us>
	<537D311A.2010101@nedbatchelder.com> <20140522105220.70b360fe@fsol>
	<537E1891.5050808@nedbatchelder.com>
Message-ID: <537E1B27.7050201@stoneleaf.us>

On 05/22/2014 08:32 AM, Ned Batchelder wrote:
> On 5/22/14 9:49 AM, Skip Montanaro wrote:
>> On Thu, May 22, 2014 at 8:05 AM, Chris Angelico wrote:
>>>
>>> Correct me if I'm wrong, but as I understand it, the problem is that
>>> the peephole optimizer eliminated an entire line of code. Would it be
>>> possible to have it notice when it merges two pieces from different
>>> lines, and somehow mark that the resulting bytecode comes from both
>>> lines?
>>> That would solve the breakpoint and coverage problems
>>> simultaneously.
>>
>> It seems to me that Ned has revealed a bug in the peephole optimizer.
>> It zapped an entire source line's worth of bytecode, but failed to
>> delete the relevant entry in the line number table of the resulting
>> code object. If I had my druthers, that would be the change I'd
>> prefer.
>
> I think it is the nature of optimization that it will destroy useful information. I don't think it will always be
> possible to retain enough back-mapping that the optimized code can be understood as if it had not been optimized. For
> example, the debug issue would still be present: if you run pdb and set a breakpoint on the "continue" line, it will
> never be hit. Even if the optimizer cleaned up after itself perfectly (in fact, especially so), that breakpoint will
> still not be hit. You simply cannot reason about optimized code without having to mentally understand the
> transformations that have been applied.
>
> The whole point of this proposal is to recognize that there are times (debugging, coverage measurement) when
> optimizations are harmful, and to avoid them.

Having read through the issue on the tracker, I find myself swayed
towards Ned's point of view. However, I do still agree with Raymond
that a full-fledged command-line switch is overkill, especially since
the unoptimized runs serve very special cases (debugging, coverage,
curiosity, learning about optimizing, etc.).

If we had a sys flag that could be set before a module was loaded, then
coverage, pdb, etc., could use that to recompile the source, not save a
.pyc file, and move forward. For debugging purposes perhaps a
`__no_optimize__ = True` or `from __future__ import no_optimize` would
help in those cases where you're dropping into the debugger.

The dead-code elimination still has a bug to be fixed, though, because
if a line has been optimized away, trying to set a break-point at it
should fail.

--
~Ethan~

From barry at python.org Thu May 22 18:41:32 2014
From: barry at python.org (Barry Warsaw)
Date: Thu, 22 May 2014 12:41:32 -0400
Subject: [Python-ideas] Python-ideas Digest, Vol 90, Issue 30
References: <0B902BFA-920E-4383-A09C-08A558168509@gmail.com>
	<20140521232414.2720b5ed@fsol> <537D1CA0.7070106@stoneleaf.us>
	<537D311A.2010101@nedbatchelder.com> <20140522105220.70b360fe@fsol>
Message-ID: <20140522124132.14bffcb3@anarchist.wooz.org>

On May 22, 2014, at 10:02 AM, Paul Moore wrote:

>As a concrete example, note my earlier comment about pyc files.
>Switching off optimisation results in unoptimised bytecode being
>written to pyc files, which could then be read in a subsequent
>(supposedly) optimised run.

Seems to me that PEP 3147 tagging could be extended to describe various
optimization levels. It might even be nice to get rid of the overloaded
.pyo files. The use of .pyo for both -O and -OO optimization levels
causes some issues.

-Barry
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 819 bytes
Desc: not available
URL: 

From ericsnowcurrently at gmail.com Thu May 22 18:47:40 2014
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Thu, 22 May 2014 10:47:40 -0600
Subject: [Python-ideas] Python-ideas Digest, Vol 90, Issue 30
In-Reply-To: 
References: <0B902BFA-920E-4383-A09C-08A558168509@gmail.com>
	<20140521232414.2720b5ed@fsol> <537D2FDA.6030907@nedbatchelder.com>
	<537DFA95.4040000@nedbatchelder.com> <537E17C3.8000307@nedbatchelder.com>
Message-ID: 

On May 22, 2014 9:40 AM, "Paul Moore" wrote:
> Personally, I still think the biggest issue is around pyc files. I
> think any proposal needs an answer to that (even if it's just that
> no-optimisation mode never reads or writes bytecode files).

So the flag for that would be set implicitly? That sounds reasonable
(and easy).

> Expecting
> users to manually manage pyc files is a bad idea. Well, that and any
> implementation complexity, which I'll leave to others to consider.

As a fallback, Victor already pointed out that changing
sys.implementation.cache_tag would be easy too.

-eric
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From Steve.Dower at microsoft.com Thu May 22 17:41:31 2014
From: Steve.Dower at microsoft.com (Steve Dower)
Date: Thu, 22 May 2014 15:41:31 +0000
Subject: [Python-ideas] Disable all peephole optimizations
In-Reply-To: 
References: <537C888D.7060903@nedbatchelder.com>
	<20140521121319.GG10355@ando> <537C9A5E.1060502@sotecware.net>
	<537CB6ED.2050906@nedbatchelder.com> <537D5A46.4040200@nedbatchelder.com>
	<537DFBCA.2070006@nedbatchelder.com>
Message-ID: 

Guido van Rossum wrote:
> FWIW, I am strictly with Ned here.

As someone who maintains/develops a debugger for Python, I'm with Ned
as well (and also Raymond, since I really don't want to have to worry
about one-more-mode that Python might be running in).

Why not move the existing optimisation into -O mode and put future
optimisations in there too? It may just start having enough value that
people switch to using it.

From stefan at bytereef.org Thu May 22 19:23:02 2014
From: stefan at bytereef.org (Stefan Krah)
Date: Thu, 22 May 2014 19:23:02 +0200
Subject: [Python-ideas] Disable all peephole optimizations
In-Reply-To: 
References: <537C888D.7060903@nedbatchelder.com>
Message-ID: <20140522172302.GA25613@sleipnir.bytereef.org>

Victor Stinner wrote:
> I have never understood -O and -OO. What is optimized, exactly? To me,
> stripping docstrings is not really an "optimization"; it should be a
> different option.

Indeed, it should be -Os (optimize for space).

Stefan Krah

From ethan at stoneleaf.us Thu May 22 19:29:48 2014
From: ethan at stoneleaf.us (Ethan Furman)
Date: Thu, 22 May 2014 10:29:48 -0700
Subject: [Python-ideas] Disable all peephole optimizations
In-Reply-To: 
References: <537C888D.7060903@nedbatchelder.com>
	<20140521121319.GG10355@ando> <537C9A5E.1060502@sotecware.net>
	<537CB6ED.2050906@nedbatchelder.com> <537D5A46.4040200@nedbatchelder.com>
	<537DFBCA.2070006@nedbatchelder.com>
Message-ID: <537E340C.9090001@stoneleaf.us>

On 05/22/2014 08:41 AM, Steve Dower wrote:
> Guido van Rossum wrote:
>>
>> FWIW, I am strictly with Ned here.
>
> As someone who maintains/develops a debugger for Python, I'm with
> Ned as well (and also Raymond, since I really don't want to have
> to worry about one-more-mode that Python might be running in).
>
> Why not move the existing optimisation into -O mode and put future
> optimisations in there too?
> It may just start having enough value
> that people switch to using it.

I will admit to being very surprised the day I realized that the normal run mode for python is debugging mode! For anyone who hasn't yet realized this, without -O, __debug__ is True, but with any -O __debug__ is False.

Given that, it does seem kind of odd to have source-altering optimizations active when __debug__ is True. Of course, we can't change that mid-3.x stream. However, we could turn off optimizations by default, and then have -O remove assertions /and/ turn on optimizations.

Which would still work nicely with .pyc and .pyo files as ... wait, let me make a table:

flag    | optimizations      | saved files
--------+--------------------+--------------
none    | none               | none
--------+--------------------+--------------
-O      | asserts removed    | .pyc
        | peephole, etc.     |
--------+--------------------+--------------
-OO     | -O plus            |
        | docstrings removed | .pyo

That would certainly make the -O flags make more sense than they do now. It would also emphasize the fact that assert is not for user data verification. ;)

--
~Ethan~

From steve at pearwood.info  Thu May 22 19:59:10 2014
From: steve at pearwood.info (Steven D'Aprano)
Date: Fri, 23 May 2014 03:59:10 +1000
Subject: [Python-ideas] Disable all peephole optimizations
In-Reply-To:
References: <537C888D.7060903@nedbatchelder.com>
 <20140521121319.GG10355@ando> <537C9A5E.1060502@sotecware.net>
 <537CB6ED.2050906@nedbatchelder.com> <537D5A46.4040200@nedbatchelder.com>
 <537DFBCA.2070006@nedbatchelder.com>
Message-ID: <20140522175910.GM10355@ando>

On Thu, May 22, 2014 at 03:41:31PM +0000, Steve Dower wrote:

> Why not move the existing optimisation into -O mode and put future
> optimisations in there too? It may just start having enough value that
> people switch to using it.

I just had the same idea, you beat me to it.

There's a steady but small stream of people asking "why do we have -O, it does so little we might as well get rid of it". If I remember correctly (and apologies if I do not), Guido has even suggested getting rid of simple constant folding. So let's make -O more attractive, while simplifying the default behaviour:

* By default, no optimizations operate at all.

* With -O, you get assert disabling, the tricky string concatenation
  optimization, constant folding, and whatever else the peepholer does.

* The double -OO switch should be deprecated, for eventual removal
  in the very distant future. (4.0? 5.0?)

* Instead, a separate switch for removing docstrings can be added,
  to support implementations in low-memory devices or other
  constrained situations.

This will make Python's compilation model a little more familiar to people coming from other languages. It will make -O more attractive, instead of being viewed by some as a waste of effort, and ensure that by default there are no tricks played with byte-code.

A big advantage: we already have separate .pyo and .pyc files, so no risk of confusion.

Downside of this suggestion:

- To the extent that constant folding and other optimizations actually lead
  to a speed-up, turning them off by default will be a performance regression.

- Experienced programmers ought to know not to rely on the string
  concatenation optimization, as it is non-portable and prone to surprising
  failures even in CPython. The optimization really only exists for naive
  programmers, but they are unlikely to know about, or bother using, -O to
  get that optimization.

- Surely I don't expect PyPy to perform no optimizations at all unless the
  -O switch is given?
  I'd have to be mad to suggest that.

--
Steven

From njs at pobox.com  Thu May 22 19:16:33 2014
From: njs at pobox.com (Nathaniel Smith)
Date: Thu, 22 May 2014 18:16:33 +0100
Subject: [Python-ideas] Python-ideas Digest, Vol 90, Issue 30
In-Reply-To: <537E1891.5050808@nedbatchelder.com>
References: <0B902BFA-920E-4383-A09C-08A558168509@gmail.com>
 <20140521232414.2720b5ed@fsol> <537D1CA0.7070106@stoneleaf.us>
 <537D311A.2010101@nedbatchelder.com> <20140522105220.70b360fe@fsol>
 <537E1891.5050808@nedbatchelder.com>
Message-ID:

On Thu, May 22, 2014 at 4:32 PM, Ned Batchelder wrote:
> On 5/22/14 9:49 AM, Skip Montanaro wrote:
>> It seems to me that Ned has revealed a bug in the peephole optimizer.
>> It zapped an entire source line's worth of bytecode, but failed to
>> delete the relevant entry in the line number table of the resulting
>> code object. If I had my druthers, that would be the change I'd
>> prefer.
>
> I think it is the nature of optimization that it will destroy useful
> information. I don't think it will always be possible to retain enough
> back-mapping that the optimized code can be understood as if it had not been
> optimized. For example, the debug issue would still be present: if you run
> pdb and set a breakpoint on the "continue" line, it will never be hit. Even
> if the optimizer cleaned up after itself perfectly (in fact, especially so),
> that breakpoint will still not be hit. You simply cannot reason about
> optimized code without having to mentally understand the transformations
> that have been applied.

In this particular case, the back-mapping problem is pretty minor. IIUC the optimization is that if we have (abusing BASIC notation)

10 GOTO 20
20 GOTO 30
30 ...

then in fact the operations at lines 10 and 20 are, from the point of view of the rest of the program, indivisible -- every time you execute 10 you also execute 20, there is no way to tell from outside whether we paused in between executing 10 and 20, etc. Effectively we just have a single uber-instruction that does both:

(10, 20) GOTO 30
30 ...

So from the coverage point of view, just marking line 20 as covered every time line 10 is executed is the Right Thing To Do. From the debugging point of view, a breakpoint set at line 20 should just trip whenever line 10 is executed -- it's not like there's any way to tell whether we're "half way through" the jump sequence or not. It's a pretty solid abstraction.

-n

--
Nathaniel J. Smith
Postdoctoral researcher - Informatics - University of Edinburgh
http://vorpus.org

From steve at pearwood.info  Thu May 22 20:14:11 2014
From: steve at pearwood.info (Steven D'Aprano)
Date: Fri, 23 May 2014 04:14:11 +1000
Subject: [Python-ideas] Disable all peephole optimizations
In-Reply-To: <537E340C.9090001@stoneleaf.us>
References: <20140521121319.GG10355@ando> <537C9A5E.1060502@sotecware.net>
 <537CB6ED.2050906@nedbatchelder.com> <537D5A46.4040200@nedbatchelder.com>
 <537DFBCA.2070006@nedbatchelder.com> <537E340C.9090001@stoneleaf.us>
Message-ID: <20140522181411.GN10355@ando>

On Thu, May 22, 2014 at 10:29:48AM -0700, Ethan Furman wrote:

> However, we could turn off optimizations by default, and then have -O
> remove assertions /and/ turn on optimizations.
>
> Which would still work nicely with .pyc and .pyo files as ... wait, let me
> make a table:
>
> flag    | optimizations      | saved files
> --------+--------------------+--------------
> none    | none               | none
> --------+--------------------+--------------
> -O      | asserts removed    | .pyc
>         | peephole, etc.     |
> --------+--------------------+--------------
> -OO     | -O plus            |
>         | docstrings removed | .pyo

I think we still want to cache byte code in .pyc files by default. Technically, yes, it's an optimization, but it's not the sort of optimization that makes a difference to debugging[1]. As I understand it, generating the parse tree is *extremely* expensive. Run python -v to see just how many modules would have to be parsed and compiled every single time without the cached .pyc files.

> That would certainly make the -O flags make more sense than they do now.
> It would also emphasize the fact that assert is not for user data
> verification. ;)

:-)

[1] Except perhaps under very rare and unusual circumstances, but there are already mechanisms in place to disable the generation of .pyc files.

--
Steven

From ericsnowcurrently at gmail.com  Thu May 22 20:49:32 2014
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Thu, 22 May 2014 12:49:32 -0600
Subject: [Python-ideas] Disable all peephole optimizations
In-Reply-To: <20140522175910.GM10355@ando>
References: <537C888D.7060903@nedbatchelder.com>
 <20140521121319.GG10355@ando> <537C9A5E.1060502@sotecware.net>
 <537CB6ED.2050906@nedbatchelder.com> <537D5A46.4040200@nedbatchelder.com>
 <537DFBCA.2070006@nedbatchelder.com> <20140522175910.GM10355@ando>
Message-ID:

On Thu, May 22, 2014 at 11:59 AM, Steven D'Aprano wrote:
> On Thu, May 22, 2014 at 03:41:31PM +0000, Steve Dower wrote:
>
>> Why not move the existing optimisation into -O mode and put future
>> optimisations in there too? It may just start having enough value that
>> people switch to using it.
>
> I just had the same idea, you beat me to it.

Same here. More concretely:

-O0 -- no optimizations at all
-O1 -- level 1 optimizations (current peephole optimizations), asserts disabled
-O2 -- level 2 optimizations (currently nothing extra)
-O3 -- ...
-ONs or -X nodocstrings or -X compact or --compact or --nodocstrings
    -- remove docstrings (for space savings)
--debug or -X debug -- sets __debug__ to True (also implies -O0)

Compatibility (keeping the current behavior):

Default: -O + __debug__ = True (deprecate setting __debug__ to True?)
-O -- same as -O
-OO -- same as -Os (deprecate)

Having the current optimizations correspond to -O1 makes sense in that we don't have anything more granular. However, if more optimizations were added I'd expect them to fall under a higher optimization level. Adding a new option just for docstrings/compact seems like a waste, so I like Stefan's idea of optionally appending "s" (for space) onto the -O option.

As Barry noted, we would also build on PEPs 3147/3149 to add a tag for the optimization level, etc. The default mode would keep the current cache tag and -O/-OO would likewise stay the same (with the .pyo suffix).

> * The double -OO switch should be deprecated, for eventual removal
> in the very distant future. (4.0? 5.0?)

Good idea.

> * Instead, a separate switch for removing docstrings can be added,
> to support implementations in low-memory devices or other
> constrained situations.

Also a good idea.

> This will make Python's compilation model a little more familiar to
> people coming from other languages. It will make -O more attractive,
> instead of being viewed by some as a waste of effort, and ensure that by
> default there are no tricks played with byte-code.
+1 -eric From ericsnowcurrently at gmail.com Thu May 22 20:57:41 2014 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Thu, 22 May 2014 12:57:41 -0600 Subject: [Python-ideas] Disable all peephole optimizations In-Reply-To: References: <537C888D.7060903@nedbatchelder.com> <20140521121319.GG10355@ando> <537C9A5E.1060502@sotecware.net> <537CB6ED.2050906@nedbatchelder.com> <537D5A46.4040200@nedbatchelder.com> <537DFBCA.2070006@nedbatchelder.com> <20140522175910.GM10355@ando> Message-ID: On Thu, May 22, 2014 at 12:49 PM, Eric Snow wrote: > Same here. More concretely: ... Having said that, revamping those options and our current optimization mechanism is a far cry from just adding -X nopeephole as Ned has implied. While the former may make sense on its own, those broader changes may languish as nice-to-haves. It may be better to go with the latter in the short-term while the broader changes swirl in the maelstrom of discussion indefinitely. -eric From stefan_ml at behnel.de Thu May 22 21:11:46 2014 From: stefan_ml at behnel.de (Stefan Behnel) Date: Thu, 22 May 2014 21:11:46 +0200 Subject: [Python-ideas] Disable all peephole optimizations In-Reply-To: References: <537C888D.7060903@nedbatchelder.com> Message-ID: Brett Cannon, 21.05.2014 15:51: > There are constant rumblings about trying to make .pyc/.pyo aware of what > optimizations were applied so that this kind of thing wouldn't occur. It > would require tweaking how optimizations are expressed/added so that they > are more easily controlled and can somehow contribute to the labeling of > what optimizations were applied. All totally doable but will require > thinking about the proper API and such (reading .pyc/.pyo files would also > break but that's happened before when we added file size to the header and > .pyc/.pyo files are viewed as internal optimizations anyway). It might be possible to move the peephole optimiser run into the code loader, i.e. the .pyc files could be written out *before* it runs, as plain unoptimised byte code. There might be a tiny performance impact on load, but I doubt that it would be serious. Stefan From ned at nedbatchelder.com Thu May 22 22:10:11 2014 From: ned at nedbatchelder.com (Ned Batchelder) Date: Thu, 22 May 2014 16:10:11 -0400 Subject: [Python-ideas] Python-ideas Digest, Vol 90, Issue 30 In-Reply-To: <537E1C87.7020407@egenix.com> References: <0B902BFA-920E-4383-A09C-08A558168509@gmail.com> <20140521232414.2720b5ed@fsol> <537D2FDA.6030907@nedbatchelder.com> <537DFA95.4040000@nedbatchelder.com> <537E17C3.8000307@nedbatchelder.com> <537E1AC0.1000708@egenix.com> <537E1C87.7020407@egenix.com> Message-ID: <537E59A3.8090100@nedbatchelder.com> On 5/22/14 11:49 AM, M.-A. Lemburg wrote: > On 22.05.2014 17:46, Paul Moore wrote: >> On 22 May 2014 16:41, M.-A. Lemburg wrote: >>> Why not simply have the new option disable writing PYC files ? >> That's what I said. But you also need to not read them as well, >> because otherwise you could read an optimised file if the source >> hasn't changed. > Good point :-) > For the use-case I am considering, it would be best to write .pyc files as usual. These are large test suites that already have detailed choreography, usually involving new working trees for each run, or explicitly deleted pyc files. Avoiding pyc's altogether will slow things down, and test suites are universally considered to take too long as it is. --Ned. 
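[The "mechanism to avoid writing bytecode" referred to in this exchange is the existing -B switch and its equivalents, all standard CPython; a minimal illustration:]

    # Three existing ways to stop CPython from *writing* .pyc files (note
    # that they do not stop it *reading* ones that already exist, which is
    # Paul's point above):
    #
    #   python -B myscript.py
    #   PYTHONDONTWRITEBYTECODE=1 python myscript.py
    #
    import sys
    sys.dont_write_bytecode = True   # affects modules imported after this point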
From ned at nedbatchelder.com Thu May 22 22:13:48 2014 From: ned at nedbatchelder.com (Ned Batchelder) Date: Thu, 22 May 2014 16:13:48 -0400 Subject: [Python-ideas] Python-ideas Digest, Vol 90, Issue 30 In-Reply-To: <537E1B27.7050201@stoneleaf.us> References: <0B902BFA-920E-4383-A09C-08A558168509@gmail.com> <20140521232414.2720b5ed@fsol> <537D1CA0.7070106@stoneleaf.us> <537D311A.2010101@nedbatchelder.com> <20140522105220.70b360fe@fsol> <537E1891.5050808@nedbatchelder.com> <537E1B27.7050201@stoneleaf.us> Message-ID: <537E5A7C.3080708@nedbatchelder.com> On 5/22/14 11:43 AM, Ethan Furman wrote: > On 05/22/2014 08:32 AM, Ned Batchelder wrote: >> On 5/22/14 9:49 AM, Skip Montanaro wrote: >>> On Thu, May 22, 2014 at 8:05 AM, Chris Angelico wrote: >>>> >>>> Correct me if I'm wrong, but as I understand it, the problem is that >>>> the peephole optimizer eliminated an entire line of code. Would it be >>>> possible to have it notice when it merges two pieces from different >>>> lines, and somehow mark that the resulting bytecode comes from both >>>> lines? That would solve the breakpoint and coverage problems >>>> simultaneously. >>> >>> It seems to me that Ned has revealed a bug in the peephole optimizer. >>> It zapped an entire source line's worth of bytecode, but failed to >>> delete the relevant entry in the line number table of the resulting >>> code object. If I had my druthers, that would be the change I'd >>> prefer. >> >> I think it is the nature of optimization that it will destroy useful >> information. I don't think it will always be >> possible to retain enough back-mapping that the optimized code can be >> understood as if it had not been optimized. For >> example, the debug issue would still be present: if you run pdb and >> set a breakpoint on the "continue" line, it will >> never be hit. Even if the optimizer cleaned up after itself >> perfectly (in fact, especially so), that breakpoint will >> still not be hit. You simply cannot reason about optimized code >> without having to mentally understand the >> transformations that have been applied. >> >> The whole point of this proposal is to recognize that there are times >> (debugging, coverage measurement) when >> optimizations are harmful, and to avoid them. > > Having read through the issue on the tracker, I find myself swayed > towards Neds point of view. However, I do still agree with Raymond > that a full-fledged command-line switch is overkill, especially since > the unoptimized runs are very special-cased (meaning useful for > debugging, coverage, curiosity, learning about optimizing, etc). > I'm perfectly happy to drop the idea of the command-line switch. An environment variable would be a fine way to control this behavior. > If we had a sys flag that could be set before a module was loaded, > then coverage, pdb, etc., could use that to recompile the source, not > save a .pyc file, and move forward. For debugging purposes perhaps a > `__no_optimize__ = True` or `from __future__ import no_optimize` would > help in those cases where you're dropping into the debugger. I don't understand these ideas, but having to add an import to the top of the file seems like a non-starter to me. > > The dead-code elimination still has a bug to be fixed, though, because > if a line has been optimized away trying to set a break-point at it > should fail. If we get a way to disable optimization, we don't need to fix that bug. Everyone knows that optimized code acts oddly in debuggers. 
:) > > -- > ~Ethan~ From ned at nedbatchelder.com Thu May 22 22:17:18 2014 From: ned at nedbatchelder.com (Ned Batchelder) Date: Thu, 22 May 2014 16:17:18 -0400 Subject: [Python-ideas] Python-ideas Digest, Vol 90, Issue 30 In-Reply-To: References: <0B902BFA-920E-4383-A09C-08A558168509@gmail.com> <20140521232414.2720b5ed@fsol> <537D1CA0.7070106@stoneleaf.us> <537D311A.2010101@nedbatchelder.com> <20140522105220.70b360fe@fsol> <537E1891.5050808@nedbatchelder.com> Message-ID: <537E5B4E.8050905@nedbatchelder.com> On 5/22/14 1:16 PM, Nathaniel Smith wrote: > On Thu, May 22, 2014 at 4:32 PM, Ned Batchelder wrote: >> On 5/22/14 9:49 AM, Skip Montanaro wrote: >>> It seems to me that Ned has revealed a bug in the peephole optimizer. >>> It zapped an entire source line's worth of bytecode, but failed to >>> delete the relevant entry in the line number table of the resulting >>> code object. If I had my druthers, that would be the change I'd >>> prefer. >> I think it is the nature of optimization that it will destroy useful >> information. I don't think it will always be possible to retain enough >> back-mapping that the optimized code can be understood as if it had not been >> optimized. For example, the debug issue would still be present: if you run >> pdb and set a breakpoint on the "continue" line, it will never be hit. Even >> if the optimizer cleaned up after itself perfectly (in fact, especially so), >> that breakpoint will still not be hit. You simply cannot reason about >> optimized code without having to mentally understand the transformations >> that have been applied. > In this particular case, the back-mapping problem is pretty minor. > IIUC the optimization is that if we have (abusing BASIC notation) > > 10 GOTO 20 > 20 GOTO 30 > 30 ... > > then in fact the operations at lines 10 and 20 are, from the point of > view of the rest of the program, indivisible -- every time you execute > 10 you also execute 20, there is no way to tell from outside whether > we paused in betwen executing 10 and 20, etc. Effectively we just have > a single uber-instruction that does both: > > (10, 20) GOTO 30 > 30 ... > > So from the coverage point of view, just marking line 20 as covered > every time line 10 is executed is the Right Thing To Do. From the > debugging point of view, a breakpoint set at line 20 should just trip > whenever line 10 is executed -- it's not like there's any way to tell > whether we're "half way through" the jump sequence or not. It's a > pretty solid abstraction. > > -n > You've used the word "just" three times, glossing over the fact that we have no facility for marking statements as an uber instruction, and you've made no proposal for how it might work. Even if we build (and test!) a way to do that, it only covers this particular kind of oddity with optimized code. --Ned. From ned at nedbatchelder.com Thu May 22 22:26:15 2014 From: ned at nedbatchelder.com (Ned Batchelder) Date: Thu, 22 May 2014 16:26:15 -0400 Subject: [Python-ideas] Disable all peephole optimizations In-Reply-To: References: <537C888D.7060903@nedbatchelder.com> <20140521121319.GG10355@ando> <537C9A5E.1060502@sotecware.net> <537CB6ED.2050906@nedbatchelder.com> <537D5A46.4040200@nedbatchelder.com> <537DFBCA.2070006@nedbatchelder.com> <20140522175910.GM10355@ando> Message-ID: <537E5D67.90101@nedbatchelder.com> On 5/22/14 2:57 PM, Eric Snow wrote: > On Thu, May 22, 2014 at 12:49 PM, Eric Snow wrote: >> Same here. More concretely: > ... 
>
> Having said that, revamping those options and our current optimization
> mechanism is a far cry from just adding -X nopeephole as Ned has
> implied. While the former may make sense on its own, those broader
> changes may languish as nice-to-haves. It may be better to go with
> the latter in the short-term while the broader changes swirl in the
> maelstrom of discussion indefinitely.

I get distracted (by work...) for the afternoon, and things take an unexpected turn!

I definitely did not mean to throw open the floodgates to reconsider the entire -O switch. I agree that the -O switch seems like too much UI for too little change in results, and I think a different set of settings and defaults makes more sense. But I do not suppose that we have much appetite to take on that large a change.

For my purposes, an environment variable and no change or addition to the switches would be fine.

--Ned

> -eric
>

From tjreedy at udel.edu  Fri May 23 02:45:33 2014
From: tjreedy at udel.edu (Terry Reedy)
Date: Thu, 22 May 2014 20:45:33 -0400
Subject: [Python-ideas] Python-ideas Digest, Vol 90, Issue 30
In-Reply-To: <20140522104334.36b2b07f@fsol>
References: <0B902BFA-920E-4383-A09C-08A558168509@gmail.com>
 <20140521232414.2720b5ed@fsol> <537D2FDA.6030907@nedbatchelder.com>
 <20140522104334.36b2b07f@fsol>
Message-ID:

On 5/22/2014 4:43 AM, Antoine Pitrou wrote:
> On Thu, 22 May 2014 02:44:52 -0400
> Terry Reedy wrote:
>>
>> When I used coverage (last summer) with tested Idle modules, I could not
>> get a reported 100% coverage because coverage counts the body of a final
>> "if __name__ == '__main__':" statement.
>
> There are flags to modify this behaviour.

Not directly, but yes, indirectly via --rcfile=FILE where FILE defaults to .coveragerc and the configuration file has

[report]
exclude_lines =
    if __name__ == .__main__.:

I believe Ned pointed that out to me when I reported the 'problem' to him. If 'continue' were added under 'exclude_lines', the 'can't get 100% coverage' continue issue should go away also. (Yes, I know it is not quite that simple, as there will be times when continue is skipped that should be reported. But I suspect that there will nearly always be some other line skipped and reported, so that a false 100% will be rare.)

--
Terry Jan Reedy

From ethan at stoneleaf.us  Fri May 23 02:40:51 2014
From: ethan at stoneleaf.us (Ethan Furman)
Date: Thu, 22 May 2014 17:40:51 -0700
Subject: [Python-ideas] Disabling optimizations
Message-ID: <537E9913.7070501@stoneleaf.us>

So, to hopefully summarize where we seem to have come to something of a consensus:

- disabling optimizations can be a good thing

- creating a new command-line switch is an overpowered solution

- having a sys flag could work

- redefining the existing -O switch could work

- care must be taken to properly handle what is written to .pyc/.pyo files

Personally, I could live with either a sys flag type solution or the -O solution, but I strongly favor the -O solution.

Why?

Partly because -O is for optimizations, so it naturally lends itself to turning them off; partly because I think the current state of the -O switches is sub-optimal (almost-pun intended ;); partly because I see assert being used incorrectly and want to encourage the use of at least -O; partly because running in __debug__ mode by default seems a bit strange; and partly because running in __debug__ mode but having optimizations turned on also seems a bit strange.

I think the big question if we go this route is what gets written to pyc files, and what to pyo files?
-- ~Ethan~ From ned at nedbatchelder.com Fri May 23 03:44:27 2014 From: ned at nedbatchelder.com (Ned Batchelder) Date: Thu, 22 May 2014 21:44:27 -0400 Subject: [Python-ideas] Disabling optimizations In-Reply-To: <537E9913.7070501@stoneleaf.us> References: <537E9913.7070501@stoneleaf.us> Message-ID: <537EA7FB.8070806@nedbatchelder.com> On 5/22/14 8:40 PM, Ethan Furman wrote: > So, to hopefully summarize where we seem to have come to something of > a consensus: > > - disabling optimizations can be a good thing > > - creating a new command-line switch is an overpowered solution > > - having a sys flag could work > > - redefining the existing -O switch could work > > - care must be taken to properly handle what is written to .pyc/.pyo > files > > Personally, I could live with either a sys flag type solution or the > -O solution, but I strongly favor the -O solution. > > Why? > > Partly because -O is for optimizations, so it naturally lends itself > to turning them off; partly because I think the current state of the > -O switches is sub-optimal (almost-pun intended ;); partly because I > see assert being used incorrectly and want to encourage the use of at > least -O; partly because running in __debug__ mode by default seems a > bit strange; and partly because running in __debug__ mode but having > optimizations turned on also seems a bit strange. > > I think the big question if we go this route is what gets written to > pyc files, and what to pyo files? I'm of the opinion that we don't need to segregate bytecode into different files depending on the options used to create the bytecode. How often is the same program run in the same place with different options at different times? I'm happy to have optimized and non-optimized code both written to .pyc files, and if you are fiddling with the options like that, you should delete your pyc files when you change the options. If we come up with a way to have the bytecode file-segregated, I'm OK with that too. I definitely don't like the alternative that says unoptimized code isn't written to disk at all. If people want to solve the problem that way, there is already a mechanism to avoid writing bytecode, you can use it with the optimizer controls to achieve the effect you want. --Ned. From cs at zip.com.au Fri May 23 03:56:57 2014 From: cs at zip.com.au (Cameron Simpson) Date: Fri, 23 May 2014 11:56:57 +1000 Subject: [Python-ideas] Python-ideas Digest, Vol 90, Issue 30 In-Reply-To: <537E1B27.7050201@stoneleaf.us> References: <537E1B27.7050201@stoneleaf.us> Message-ID: <20140523015657.GA35202@cskk.homeip.net> On 22May2014 08:43, Ethan Furman wrote: >On 05/22/2014 08:32 AM, Ned Batchelder wrote: >>The whole point of this proposal is to recognize that there are times (debugging, coverage measurement) when >>optimizations are harmful, and to avoid them. > >Having read through the issue on the tracker, I find myself swayed >towards Neds point of view. I've been with Ned from the first post, but have been playing (slow) catchup on the discussion. I'd personally be fine with a -O0 command line switch in keeping with a somewhat common C-compiler convention, or with an environment variable. If all the optimizations in the compiler/interpreter are a distinct step, then having a switch that just says "skip this step, we do not want the naive code transformed at all" seems both desirable and easy. And finally, the sig quote below really did come up at random for this message. 
Cheers,
Cameron Simpson

We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. - Donald Knuth

From tjreedy at udel.edu  Fri May 23 04:07:58 2014
From: tjreedy at udel.edu (Terry Reedy)
Date: Thu, 22 May 2014 22:07:58 -0400
Subject: [Python-ideas] Python-ideas Digest, Vol 90, Issue 30
In-Reply-To: <537E1A88.8080501@egenix.com>
References: <0B902BFA-920E-4383-A09C-08A558168509@gmail.com>
 <20140521232414.2720b5ed@fsol> <537D1CA0.7070106@stoneleaf.us>
 <537D311A.2010101@nedbatchelder.com> <20140522105220.70b360fe@fsol>
 <537E1891.5050808@nedbatchelder.com> <537E1A88.8080501@egenix.com>
Message-ID:

On 5/22/2014 11:40 AM, M.-A. Lemburg wrote:
> On 22.05.2014 17:32, Ned Batchelder wrote:
>>
>> The whole point of this proposal is to recognize that there are times (debugging, coverage
>> measurement) when optimizations are harmful, and to avoid them.
>
> +1
>
> It's regular practice in other languages to disable optimizations
> when debugging code. I don't see why Python should be different in this
> respect.
>
> Debuggers, testing, coverage and other such tools should be able to
> invoke a Python runtime mode that let's the compiler work strictly
> by the book, without applying any kind of optimization.
>
> This used to be the default in Python,

I believe that Python has always had an 'as if' rule that allows more or less 'hidden' optimizations, as long as the net effect of a statement is as defined.

1. By the book, "a,b = b,a" means create a tuple from b,a, unpack the contents to a and b, and delete the reference to the tuple. An obvious optimization is to not create the tuple. As I remember, this was once tried out before tuple unpacking was generalized to iterable unpacking. I don't know if CPython was ever released with that optimization, or if other implementations have or do use it. By the 'as if' rule, it does not matter, even though an allocation tracer (such as the one added to 3.4?) might detect the non-allocation.

2. The manual says
'''
@f1(arg)
@f2
def func(): pass

is equivalent to

def func(): pass
func = f1(arg)(f2(func))
'''
The equivalent is 'as if', in net effect, not in the detailed process. CPython actually executes (or at least did at one time)

def (): pass
func = f1(arg)(f2())

Ignore f1. The difference can be detected when f2 is called by examining the appropriate namespace within f2. When someone filed an issue about the 'bug' of 'func' never being bound to the unwrapped function object, Guido said that he wanted to change neither the doc nor the implementation. (Sorry, I cannot find the issue.)

3. "a + b" is *usually* equivalent to "a.__class__.__add__(b)" or possibly "b.__class__.__radd__(a)". However, my understanding is that if a and b are ints, a 'fast path' optimization is applied that bypasses the int.__add__ slot wrapper. If so, a call tracer could notice the difference and, if unaware of such optimizations, falsely report a problem.

4. Some Python implementations delay object destruction. I suspect that some (many?) do not really destroy objects (zero out the memory block).

> but there's definitely a need for being able to run Python in
> a debugger without having it perfectly valid skip code lines
> (even if they are no ops).

This is a different issue from 'disable the peephole optimizer'.
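[Terry's point 1 can be checked directly against what the compiler emits, using the dis module; an illustrative session, with output abbreviated from a CPython 3.4 build. The second snippet shows the peephole constant folding at issue in this thread:]

    import dis

    # The swap compiles to a ROT_TWO, not to a tuple build/unpack -- the
    # 'hidden' optimization covered by the 'as if' rule:
    dis.dis(compile("a, b = b, a", "<example>", "exec"))
    #   LOAD_NAME    b
    #   LOAD_NAME    a
    #   ROT_TWO
    #   STORE_NAME   a
    #   STORE_NAME   b

    # The same pass also folds constants:
    dis.dis(compile("x = 1 + 2", "<example>", "exec"))
    #   LOAD_CONST   3    -- the addition never happens at run time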
--
Terry Jan Reedy

From ethan at stoneleaf.us  Fri May 23 04:03:53 2014
From: ethan at stoneleaf.us (Ethan Furman)
Date: Thu, 22 May 2014 19:03:53 -0700
Subject: [Python-ideas] Disabling optimizations
In-Reply-To: <537E9913.7070501@stoneleaf.us>
References: <537E9913.7070501@stoneleaf.us>
Message-ID: <537EAC89.6090802@stoneleaf.us>

Oh, and just to be clear, we are only talking about optimizations that modify the byte-code, correct?

--
~Ethan~

From tjreedy at udel.edu  Fri May 23 04:53:28 2014
From: tjreedy at udel.edu (Terry Reedy)
Date: Thu, 22 May 2014 22:53:28 -0400
Subject: [Python-ideas] Python-ideas Digest, Vol 90, Issue 30
In-Reply-To: <537DFA95.4040000@nedbatchelder.com>
References: <0B902BFA-920E-4383-A09C-08A558168509@gmail.com>
 <20140521232414.2720b5ed@fsol> <537D2FDA.6030907@nedbatchelder.com>
 <537DFA95.4040000@nedbatchelder.com>
Message-ID:

On 5/22/2014 9:24 AM, Ned Batchelder wrote:
> On 5/22/14 2:44 AM, Terry Reedy wrote:
>> On 5/21/2014 6:59 PM, Ned Batchelder wrote:
>>
>>> If by implementation details, you mean the word "peephole", then let's
>>> remove it, and simply have a switch that disables all optimization.
>>> Rather than limiting the future of the optimizer, it will provide an
>>> escape hatch for people who would rather not have the optimizer's
>>> effects.
>>
>> The presumption of this idea is that there is a proper, canonical
>> unoptimized version of 'compiled Python'. For Python there obviously
>> is not. For CPython, there is not either. What Raymond has been saying
>> is that the output of the CPython compiler is the output of the
>> CPython compiler.
> I'd like to understand why we think the Python compiler is different in
> this regard than a C compiler.

Python is a different language. But let us not get sidetracked on that.

> When this came up 18 months ago on Python-Dev, it was part of a thread
> about adding more optimizations to CPython. Guido said "+1" to the idea
> of being able to disable the optimizers
> (https://mail.python.org/pipermail/python-dev/2012-December/123099.html).

I read that and it is not clear to me exactly what his quick, top-posted '+1' really means.

I claimed in response to Marc-Andre that CPython has always had an as-if rule and numerous optimizations, some of which cannot, realistically, be disabled. Nor would we really want to disable 'all optimization' (as you requested in your post). My objection to 'disable the peephole optimizer' is that it likely disables too much, and perhaps too little (as more is done with asts). Also, it seems it may add a continuing burden to a relatively small core developer team, which also has a stdlib to maintain.

I think we should initially focus on the ghosting of 'continue'. While the coverage problem can be partly solved by adding 'continue' to 'exclude lines', that will not solve the problem of a debugger checkpoint not working. I think you could argue (very Pythonically ;-) that the total machine-time saving of ghosting 'continue' is not worth the extra time waste of humans. I would be happier removing that particular optimization than adding machinery to make it optional.

If, as has been proposed, some or all of the peephole (code) optimizations were moved to the ast stage, where continue jumps are still distinguished by Continue nodes, it might be easier to selectively avoid undesirable ghosting of continue statements.
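[A sketch of the workaround Terry describes, in .coveragerc syntax -- the 'continue' entry is the addition under discussion, not a recommendation; note that exclude_lines entries are regexes matched against source lines:]

    [report]
    exclude_lines =
        if __name__ == .__main__.:
        continue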
-- Terry Jan Reedy From ethan at stoneleaf.us Fri May 23 05:10:01 2014 From: ethan at stoneleaf.us (Ethan Furman) Date: Thu, 22 May 2014 20:10:01 -0700 Subject: [Python-ideas] Python-ideas Digest, Vol 90, Issue 30 In-Reply-To: References: <0B902BFA-920E-4383-A09C-08A558168509@gmail.com> <20140521232414.2720b5ed@fsol> <537D2FDA.6030907@nedbatchelder.com> <537DFA95.4040000@nedbatchelder.com> Message-ID: <537EBC09.6000004@stoneleaf.us> On 05/22/2014 07:53 PM, Terry Reedy wrote: > On 5/22/2014 9:24 AM, Ned Batchelder wrote: >> >> When this came up 18 months ago on Python-Dev, it was part of a thread >> about adding more optimizations to CPython. Guido said "+1" to the idea >> of being able to disable the optimizers > > I read that and it is not to me exactly what his quick, top-posted '+1' really means. In the interest of not debating what Guido meant way back when, he has posted (today?) that "I am strictly with Ned here." I think we can count that as a +1 for Ned's request. -- ~Ethan~ From stefan_ml at behnel.de Fri May 23 07:02:36 2014 From: stefan_ml at behnel.de (Stefan Behnel) Date: Fri, 23 May 2014 07:02:36 +0200 Subject: [Python-ideas] Disabling optimizations In-Reply-To: <537EA7FB.8070806@nedbatchelder.com> References: <537E9913.7070501@stoneleaf.us> <537EA7FB.8070806@nedbatchelder.com> Message-ID: Ned Batchelder, 23.05.2014 03:44: > I'm of the opinion that we don't need to segregate bytecode into different > files depending on the options used to create the bytecode. How often is > the same program run in the same place with different options at different > times? I'm happy to have optimized and non-optimized code both written to > .pyc files, and if you are fiddling with the options like that, you should > delete your pyc files when you change the options. If we come up with a > way to have the bytecode file-segregated, I'm OK with that too. > > I definitely don't like the alternative that says unoptimized code isn't > written to disk at all. If people want to solve the problem that way, > there is already a mechanism to avoid writing bytecode, you can use it with > the optimizer controls to achieve the effect you want. As I already proposed, we could get rid of .pyo files all together and only write unoptimised .pyc files, and then apply the optimisations at load time based on the current interpreter config. I think that would give us a good tradeoff between fast (precompiled) code loading and differing requirements on byte code optimisations. Stefan From stefan_ml at behnel.de Fri May 23 07:28:32 2014 From: stefan_ml at behnel.de (Stefan Behnel) Date: Fri, 23 May 2014 07:28:32 +0200 Subject: [Python-ideas] Disabling optimizations In-Reply-To: References: <537E9913.7070501@stoneleaf.us> <537EA7FB.8070806@nedbatchelder.com> Message-ID: Stefan Behnel, 23.05.2014 07:02: > Ned Batchelder, 23.05.2014 03:44: >> I'm of the opinion that we don't need to segregate bytecode into different >> files depending on the options used to create the bytecode. How often is >> the same program run in the same place with different options at different >> times? I'm happy to have optimized and non-optimized code both written to >> .pyc files, and if you are fiddling with the options like that, you should >> delete your pyc files when you change the options. If we come up with a >> way to have the bytecode file-segregated, I'm OK with that too. >> >> I definitely don't like the alternative that says unoptimized code isn't >> written to disk at all. 
If people want to solve the problem that way, >> there is already a mechanism to avoid writing bytecode, you can use it with >> the optimizer controls to achieve the effect you want. > > As I already proposed, we could get rid of .pyo files all together and only > write unoptimised .pyc files, and then apply the optimisations at load time > based on the current interpreter config. I think that would give us a good > tradeoff between fast (precompiled) code loading and differing requirements > on byte code optimisations. Stefan Krah already proposed -Os (optimise for space) for the cases where you want to reduce the size of the byte code file, e.g. by removing doc strings. That could become the next .pyo file. Although it's unclear to me why you would do that, instead of just compressing them. Stefan From ethan at stoneleaf.us Fri May 23 07:42:07 2014 From: ethan at stoneleaf.us (Ethan Furman) Date: Thu, 22 May 2014 22:42:07 -0700 Subject: [Python-ideas] Disabling optimizations In-Reply-To: References: <537E9913.7070501@stoneleaf.us> <537EA7FB.8070806@nedbatchelder.com> Message-ID: <537EDFAF.4060802@stoneleaf.us> On 05/22/2014 10:02 PM, Stefan Behnel wrote: > Ned Batchelder, 23.05.2014 03:44: >> I'm of the opinion that we don't need to segregate bytecode into different >> files depending on the options used to create the bytecode. How often is >> the same program run in the same place with different options at different >> times? I'm happy to have optimized and non-optimized code both written to >> .pyc files, and if you are fiddling with the options like that, you should >> delete your pyc files when you change the options. If we come up with a >> way to have the bytecode file-segregated, I'm OK with that too. >> >> I definitely don't like the alternative that says unoptimized code isn't >> written to disk at all. If people want to solve the problem that way, >> there is already a mechanism to avoid writing bytecode, you can use it with >> the optimizer controls to achieve the effect you want. > > As I already proposed, we could get rid of .pyo files all together and only > write unoptimised .pyc files, and then apply the optimisations at load time > based on the current interpreter config. I think that would give us a good > tradeoff between fast (precompiled) code loading and differing requirements > on byte code optimisations. -1 The whole point of saving the compiled version to disk is to load-and-go. I have no problem with having the pyc contain the info on which optimizations it was compiled with, and if the current options are different then it gets recompiled. As Ned said, "How often is the same program run in the same place with different options at different times?" -- ~Ethan~ From p.f.moore at gmail.com Fri May 23 09:20:00 2014 From: p.f.moore at gmail.com (Paul Moore) Date: Fri, 23 May 2014 08:20:00 +0100 Subject: [Python-ideas] Disabling optimizations In-Reply-To: <537EA7FB.8070806@nedbatchelder.com> References: <537E9913.7070501@stoneleaf.us> <537EA7FB.8070806@nedbatchelder.com> Message-ID: On 23 May 2014 02:44, Ned Batchelder wrote: > I'm happy to have optimized and non-optimized code both written to .pyc > files, and if you are fiddling with the options like that, you should delete > your pyc files when you change the options. 
Surely the net effect of this on your original issue would be that instead of people wondering why continue is not shown as covered, doing a lot of debugging, realising it was an eliminated line and moving on, you would have people wondering why continue is shown as not covered, doing a lot of debugging, realising they forgot to delete the pyc file, removing it, rerunning the coverage report and moving on? I doubt that diagnosing "I forgot to remove the pyc file, and it matters" would be much easier than the current situation. Both could pretty easily be documented in the coverage docs. -1 on having Python fail to distinguish pyc files that have different user-visible behaviour that we care about. Paul From mal at egenix.com Fri May 23 10:25:29 2014 From: mal at egenix.com (M.-A. Lemburg) Date: Fri, 23 May 2014 10:25:29 +0200 Subject: [Python-ideas] Python-ideas Digest, Vol 90, Issue 30 In-Reply-To: References: <0B902BFA-920E-4383-A09C-08A558168509@gmail.com> <20140521232414.2720b5ed@fsol> <537D1CA0.7070106@stoneleaf.us> <537D311A.2010101@nedbatchelder.com> <20140522105220.70b360fe@fsol> <537E1891.5050808@nedbatchelder.com> <537E1A88.8080501@egenix.com> Message-ID: <537F05F9.7070406@egenix.com> On 23.05.2014 04:07, Terry Reedy wrote: > On 5/22/2014 11:40 AM, M.-A. Lemburg wrote: >> On 22.05.2014 17:32, Ned Batchelder wrote: >>> >>> The whole point of this proposal is to recognize that there are times (debugging, coverage >>> measurement) when optimizations are harmful, and to avoid them. >> >> +1 >> >> It's regular practice in other languages to disable optimizations >> when debugging code. I don't see why Python should be different in this >> respect. >> >> Debuggers, testing, coverage and other such tools should be able to >> invoke a Python runtime mode that let's the compiler work strictly >> by the book, without applying any kind of optimization. >> >> This used to be the default in Python, > > I believe that Python has always had an 'as if' rule that allows more or less 'hidden' > optimizations, as long as the net effect of a statement is as defined. I was referring to the times before the peephole optimizer was introduced (Python 2.3 and earlier). What's important here is to look at the difference between what the compiler generates by simply following its rule book and the version of the byte code which is the result of running an optimizer on the byte code or even on the AST before running the transform to byte code. Note that I'm not talking about optimizations applied at the VM level implementations of bytecodes and I think neither was Ned. > 1. By the book, "a,b = b,a" means create a tuple from b,a, unpack the contents to a and b, and > delete the reference to the tuple. An obvious optimization is to not create the tuple. As I > remember, this was once tried out before tuple unpacking was generalized to iterable unpacking. I > don't know if CPython was ever released with that optimization, or if other implementations have or > do use it. By the 'as if' rule, it does not matter, even though an allocation tracer (such as the > one added to 3.4?) might detect the non-allocation. This is an implementation detail of the VM. The code generated by the compiler is byte code saying rotate the top two arguments on the stack (ROT_TWO). > 2. The manual says > ''' > @f1(arg) > @f2 > def func(): pass > > is equivalent to > > def func(): pass > func = f1(arg)(f2(func)) > ''' > The equivalent is 'as if', in net effect, not in the detailed process. 
CPython actually executes (or > at least did at one time) > > def (): pass > func = f1(arg)(f2()) > > Ignore f1. The difference can be detected when f2 is called by examining the approriate namespace > within f2. When someone filed an issue about the 'bug' of 'func' never being bound to the unwrapped > function object, Guido said that he neither wanted to change the doc or the implementation. (Sorry, > I cannot find the issue.) I'd put that under documentation bug, if at all :-) Note that the function func does get the name "func". It's just not bound to the name in the intermediate step, since the function object serves as parameter to the function f2. > 3. "a + b" is *usually* equivalent to "a.__class__.__add__(b)" or possibly > "b.__class__.__radd__(a)". However, my understanding is that if a and b are ints, a 'fast path' > optimization is applied that bypasses the int.__add slot wrapper. Is so, a call tracer could notice > the difference and if unaware of such optimizations, falsely report a problem. Again, this is an optimization in the implementation of the byte code, not one applied by the compiler. There are quite a few more such optimizations going in the VM. > 4. Some Python implementations delay object destruction. I suspect that some (many?) do not really > destroy objects (zero out the memory block). I don't see what this has to do with the compiler. Isn't that just a implementation detail of how GC works on a particular Python platform ? >> but there's definitely a need for being able to run Python in >> a debugger without having it perfectly valid skip code lines >> (even if they are no ops). > > This is a different issue from 'disable the peephole optimizer'. For me, a key argument for having a runtime mode without compiler optimizations is that the compiler gains more freedom in applying more aggressive optimizations. Tools will no longer have to adapt to whatever optimizations are added with each new Python release, since there will be a defined non-optimized runtime mode they can use as basis for their work. The net result would be faster Pythons and better working debugging tools (well, at least that's the hope ;-). -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, May 23 2014) >>> Python Projects, Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope/Plone.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::::: Try our mxODBC.Connect Python Database Interface for free ! :::::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From me+python at ixokai.io Fri May 23 10:02:29 2014 From: me+python at ixokai.io (Stephen Hansen) Date: Fri, 23 May 2014 01:02:29 -0700 Subject: [Python-ideas] Disabling optimizations In-Reply-To: <537EA7FB.8070806@nedbatchelder.com> References: <537E9913.7070501@stoneleaf.us> <537EA7FB.8070806@nedbatchelder.com> Message-ID: On Thu, May 22, 2014 at 6:44 PM, Ned Batchelder wrote: > I'm of the opinion that we don't need to segregate bytecode into different > files depending on the options used to create the bytecode. How often is > the same program run in the same place with different options at different > times? 
I'm happy to have optimized and non-optimized code both written to > .pyc files, and if you are fiddling with the options like that, you should > delete your pyc files when you change the options. If we come up with a > way to have the bytecode file-segregated, I'm OK with that too. > What madness is this? Any suggestion that "you should delete your pyc files" strikes me as remarkably wrongheaded. You shouldn't even have to think about pyc (or pyo) files -- they're a convenience, not something there is any expectation on anyone to *manage*. When I edit my .py file, I don't have to go delete the pyc; I don't need to be sure to do a 'make clean' like on some of my C projects. Python sees my source is modified, and discards the compiled bit -- expecting anything more from people using python is a serious thing. Things have gotten a bit more complex in modern Python with the __pycache__ directory, yet still there is no expectation that users *manage* these files. That's a bit shocking to me. I definitely don't like the alternative that says unoptimized code isn't > written to disk at all. If people want to solve the problem that way, > there is already a mechanism to avoid writing bytecode, you can use it with > the optimizer controls to achieve the effect you want. > I don't understand this point. It seems natural to me that if you have an option to run code with optimizations disabled, its not written to disk...: after all the entire assumption of the point is the code isn't doing everything it can to be as efficient as it can. At that point, what does speed matter? You've decided you want precise traceable semantics even when its known that certain branches aren't needed -- you want to trace the precise logic. Do you really then care about the cost it takes to compile the source to bytecode? I get that there are reasons to not want optimizations, but I don't get the desire to complicate the compilation and running step. Optimizations on/off makes some sense: in testing environments and the like. Its something else entirely to demand people manually delete files, or where the burden is upon those who run the app/test suites/etc to deal with files created as a side-effect of what they're doing. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Fri May 23 11:11:02 2014 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 23 May 2014 19:11:02 +1000 Subject: [Python-ideas] Disable all peephole optimizations In-Reply-To: <537E5D67.90101@nedbatchelder.com> References: <537C888D.7060903@nedbatchelder.com> <20140521121319.GG10355@ando> <537C9A5E.1060502@sotecware.net> <537CB6ED.2050906@nedbatchelder.com> <537D5A46.4040200@nedbatchelder.com> <537DFBCA.2070006@nedbatchelder.com> <20140522175910.GM10355@ando> <537E5D67.90101@nedbatchelder.com> Message-ID: On 23 May 2014 06:27, "Ned Batchelder" wrote: > > On 5/22/14 2:57 PM, Eric Snow wrote: >> >> On Thu, May 22, 2014 at 12:49 PM, Eric Snow wrote: >>> >>> Same here. More concretely: >> >> ... >> >> Having said that, revamping those options and our current optimization >> mechanism is a far cry from just adding -X nopeephole as Ned has >> implied. While the former may make sense on its own, those broader >> changes may languish as nice-to-haves. It may be better to go with >> the latter in the short-term while the broader changes swirl in the >> maelstrom of discussion indefinitely. > > > I get distracted (by work...) for the afternoon, and things take an unexpected turn! 
> > I definitely did not mean to throw open the floodgates to reconsider the entire -O switch. I agree that the -O switch seems like too much UI for too little change in results, and I think a different set of settings and defaults makes more sense. But I do not suppose that we have much appetite to take on that large a change. > > For my purposes, an environment variable and no change or addition to the switches would be fine. Given how far away 3.5 is, I'd actually be interested in seeing a full write-up of Eric's proposal, comparing it to the "let's just add some more technical debt to the pile" -X option based approach. I don't think *anyone* really likes the current state of the optimisation flags, so if this proposal tips us over the edge into finally fixing them properly, huzzah! Cheers, Nick. > > --Ned > >> -eric >> > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From victor.stinner at gmail.com Fri May 23 11:30:26 2014 From: victor.stinner at gmail.com (Victor Stinner) Date: Fri, 23 May 2014 11:30:26 +0200 Subject: [Python-ideas] Disable all peephole optimizations In-Reply-To: References: <537C888D.7060903@nedbatchelder.com> <20140521121319.GG10355@ando> <537C9A5E.1060502@sotecware.net> <537CB6ED.2050906@nedbatchelder.com> <537D5A46.4040200@nedbatchelder.com> <537DFBCA.2070006@nedbatchelder.com> <20140522175910.GM10355@ando> <537E5D67.90101@nedbatchelder.com> Message-ID: 2014-05-23 11:11 GMT+02:00 Nick Coghlan : > Given how far away 3.5 is, I'd actually be interested in seeing a full > write-up of Eric's proposal, comparing it to the "let's just add some more > technical debt to the pile" -X option based approach. The discussion in now splitted in 4 places: 3 threads on this mailing list, 1 issue in the bug tracker. And there are some old discussions on python-dev. It's maybe time to use the power of the PEP process to summarize this in a clear document? (Write a PEP.) Victor From solipsis at pitrou.net Fri May 23 11:53:57 2014 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 23 May 2014 11:53:57 +0200 Subject: [Python-ideas] Python-ideas Digest, Vol 90, Issue 30 References: <0B902BFA-920E-4383-A09C-08A558168509@gmail.com> <20140521232414.2720b5ed@fsol> <537D2FDA.6030907@nedbatchelder.com> <537DFA95.4040000@nedbatchelder.com> Message-ID: <20140523115357.435531a4@fsol> On Thu, 22 May 2014 22:53:28 -0400 Terry Reedy wrote: > > > I'd like to understand why we think the Python compiler is different in > > this regard than a C compiler. > > Python is a different language. But let us not get sidetracked on that. The number one difference is that people don't compile code explicitly when writing Python code (well, except packagers who call compileall(), and a few advanced uses). So "choosing compilation options" is really not part of the standard workflow for developing in Python. Regards Antoine. 
From solipsis at pitrou.net Fri May 23 11:57:09 2014 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 23 May 2014 11:57:09 +0200 Subject: [Python-ideas] Disabling optimizations References: <537E9913.7070501@stoneleaf.us> <537EA7FB.8070806@nedbatchelder.com> Message-ID: <20140523115709.58b6d016@fsol> On Fri, 23 May 2014 07:28:32 +0200 Stefan Behnel wrote: > > > > As I already proposed, we could get rid of .pyo files all together and only > > write unoptimised .pyc files, and then apply the optimisations at load time > > based on the current interpreter config. I think that would give us a good > > tradeoff between fast (precompiled) code loading and differing requirements > > on byte code optimisations. > > Stefan Krah already proposed -Os (optimise for space) for the cases where > you want to reduce the size of the byte code file, e.g. by removing doc > strings. That could become the next .pyo file. Although it's unclear to me > why you would do that, instead of just compressing them. People who are really short on disk space (embedded devs?) probably do both: first strip docstrings and friends, then compress. For the same reason, optimizing in-memory would be detrimental: optimizations can usually reduce the size of pyc files. (besides, optimizing at compile-time allows us to do more costly optimizations without caring *too much* about their overhead) Regards Antoine. From njs at pobox.com Fri May 23 12:30:27 2014 From: njs at pobox.com (Nathaniel Smith) Date: Fri, 23 May 2014 11:30:27 +0100 Subject: [Python-ideas] Python-ideas Digest, Vol 90, Issue 30 In-Reply-To: <537E5B4E.8050905@nedbatchelder.com> References: <0B902BFA-920E-4383-A09C-08A558168509@gmail.com> <20140521232414.2720b5ed@fsol> <537D1CA0.7070106@stoneleaf.us> <537D311A.2010101@nedbatchelder.com> <20140522105220.70b360fe@fsol> <537E1891.5050808@nedbatchelder.com> <537E5B4E.8050905@nedbatchelder.com> Message-ID: On Thu, May 22, 2014 at 9:17 PM, Ned Batchelder wrote: > On 5/22/14 1:16 PM, Nathaniel Smith wrote: >> >> On Thu, May 22, 2014 at 4:32 PM, Ned Batchelder >> wrote: >>> >>> On 5/22/14 9:49 AM, Skip Montanaro wrote: >>>> >>>> It seems to me that Ned has revealed a bug in the peephole optimizer. >>>> It zapped an entire source line's worth of bytecode, but failed to >>>> delete the relevant entry in the line number table of the resulting >>>> code object. If I had my druthers, that would be the change I'd >>>> prefer. >>> >>> I think it is the nature of optimization that it will destroy useful >>> information. I don't think it will always be possible to retain enough >>> back-mapping that the optimized code can be understood as if it had not >>> been >>> optimized. For example, the debug issue would still be present: if you >>> run >>> pdb and set a breakpoint on the "continue" line, it will never be hit. >>> Even >>> if the optimizer cleaned up after itself perfectly (in fact, especially >>> so), >>> that breakpoint will still not be hit. You simply cannot reason about >>> optimized code without having to mentally understand the transformations >>> that have been applied. >> >> In this particular case, the back-mapping problem is pretty minor. >> IIUC the optimization is that if we have (abusing BASIC notation) >> >> 10 GOTO 20 >> 20 GOTO 30 >> 30 ... 
>>
>> then in fact the operations at lines 10 and 20 are, from the point of
>> view of the rest of the program, indivisible -- every time you execute
>> 10 you also execute 20, there is no way to tell from outside whether
>> we paused in between executing 10 and 20, etc. Effectively we just have
>> a single uber-instruction that does both:
>>
>> (10, 20) GOTO 30
>> 30 ...
>>
>> So from the coverage point of view, just marking line 20 as covered
>> every time line 10 is executed is the Right Thing To Do. From the
>> debugging point of view, a breakpoint set at line 20 should just trip
>> whenever line 10 is executed -- it's not like there's any way to tell
>> whether we're "half way through" the jump sequence or not. It's a
>> pretty solid abstraction.
>
> You've used the word "just" three times, glossing over the fact that we have
> no facility for marking statements as an uber instruction, and you've made
> no proposal for how it might work.

What we have right now is co_lnotab. It encodes a many-to-one mapping
from bytecode locations to line number:

    # bytecode offset -> line no
    lnotab = {
        0: 10, 1: 10, 2: 10, 3: 11, 4: 12, ...
    }

AFAIK, the main operations it supports are (a) given a bytecode
location, return the relevant line (for backtraces etc.), (b) when
executing bytecode, detect transitions from an instruction associated
with one line to an instruction associated with another line (for
sys.settrace, used by coverage and pdb).

    def backtrace_lineno(offset):
        return lnotab[offset]

    def do_trace(offset1, offset2):
        if lnotab[offset1] != lnotab[offset2]:
            call_trace_fn(lnotab[offset2])

My proposal is to make this a many-to-many mapping:

    lnotab = {
        0: {10},
        1: {10},
        2: {10, 11},  # optimized jump
        3: {12},
        ...
    }

    def backtrace_lineno(offset):
        # if there are multiple linenos, then it's indistinguishable which one the
        # exception occurred on, so just pick one to display
        return min(lnotab[offset])

    def do_trace(offset1, offset2):
        for lineno in sorted(lnotab[offset2].difference(lnotab[offset1])):
            call_trace_fn(lineno)

Yes, there is some complexity in practice because currently co_lnotab
is a ridiculously optimized data structure for encoding the many-to-one
mapping, and so some work needs to be done to come up with a similarly
optimized way of encoding a many-to-many mapping. But this is all
fundamentally trivial. "Compactly encoding a dict of sets of ints" is
not the sort of challenge that we should find daunting and impossible.

> Even if we build (and test!) a way to do
> that, it only covers this particular kind of oddity with optimized code.

Well, this is the only oddity that is causing problems. And future
optimizations might well be covered by my proposed mechanism. Any
optimization that works by taking in a set of line-number-tagged
objects (ast nodes, bytecode instructions, whatever) and spits out a
set of new objects could potentially make use of this -- just set the
lineno annotation on the output objects to be the union of the lineno
annotations on the input objects. Will that actually be enough in
practice? Who knows, we'll have to wait until we get there. Trying to
handle hypothetical future optimizations now is just borrowing trouble.

And even if we do add a minimal-optimization mode, that shouldn't be
taken as a blank check to stop worrying about the debuggability of the
default-optimization mode, so we'll still need something like this
sooner or later.
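For anyone who wants to poke at the existing table, the stdlib already decodes it; a small runnable illustration (the function body is arbitrary):

    import dis

    def f(a):
        x = a + 1
        y = x * 2
        return y

    # findlinestarts() walks co_lnotab and yields (bytecode offset, line
    # number) pairs -- the many-to-one mapping described above.
    for offset, lineno in dis.findlinestarts(f.__code__):
        print(offset, lineno)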
gdb actually works extremely well on optimized C/C++ code -- sure, sometimes it's a bit confusing and you have to recompile with -O0 to wrap your head around what's happening, but gdb keeps working regardless and I almost never bother. And this is because the C/C++ crowd has spent a lot of time on coming up with solid systems for describing really really complicated relationships between compiler output and the original source code -- much worse than the ones we have to deal with. Just throwing up our hands and giving up seems like a rather cowardly solution. -n -- Nathaniel J. Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org From ned at nedbatchelder.com Fri May 23 12:39:54 2014 From: ned at nedbatchelder.com (Ned Batchelder) Date: Fri, 23 May 2014 06:39:54 -0400 Subject: [Python-ideas] Python-ideas Digest, Vol 90, Issue 30 In-Reply-To: <20140523115357.435531a4@fsol> References: <0B902BFA-920E-4383-A09C-08A558168509@gmail.com> <20140521232414.2720b5ed@fsol> <537D2FDA.6030907@nedbatchelder.com> <537DFA95.4040000@nedbatchelder.com> <20140523115357.435531a4@fsol> Message-ID: <537F257A.5000509@nedbatchelder.com> On 5/23/14 5:53 AM, Antoine Pitrou wrote: > On Thu, 22 May 2014 22:53:28 -0400 > Terry Reedy wrote: >>> I'd like to understand why we think the Python compiler is different in >>> this regard than a C compiler. >> Python is a different language. But let us not get sidetracked on that. > The number one difference is that people don't compile code explicitly > when writing Python code (well, except packagers who call compileall(), > and a few advanced uses). So "choosing compilation options" is really > not part of the standard workflow for developing in Python. That seems an odd distinction to make, given that we already do have ways to control how the compilation step happens, and we are having no trouble imagining other ways to control it. Whether you like those options or not, you have to admit that we do have ways to tell Python how we want compilation to happen. > > Regards > > Antoine. > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ From solipsis at pitrou.net Fri May 23 12:44:31 2014 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 23 May 2014 12:44:31 +0200 Subject: [Python-ideas] Python-ideas Digest, Vol 90, Issue 30 References: <0B902BFA-920E-4383-A09C-08A558168509@gmail.com> <20140521232414.2720b5ed@fsol> <537D2FDA.6030907@nedbatchelder.com> <537DFA95.4040000@nedbatchelder.com> <20140523115357.435531a4@fsol> <537F257A.5000509@nedbatchelder.com> Message-ID: <20140523124431.7e79aba3@fsol> On Fri, 23 May 2014 06:39:54 -0400 Ned Batchelder wrote: > On 5/23/14 5:53 AM, Antoine Pitrou wrote: > > On Thu, 22 May 2014 22:53:28 -0400 > > Terry Reedy wrote: > >>> I'd like to understand why we think the Python compiler is different in > >>> this regard than a C compiler. > >> Python is a different language. But let us not get sidetracked on that. > > The number one difference is that people don't compile code explicitly > > when writing Python code (well, except packagers who call compileall(), > > and a few advanced uses). So "choosing compilation options" is really > > not part of the standard workflow for developing in Python. 
> > That seems an odd distinction to make, given that we already do have
> ways to control how the compilation step happens, and we are having no
> trouble imagining other ways to control it. Whether you like those
> options or not, you have to admit that we do have ways to tell Python
> how we want compilation to happen.

My point is that almost nobody ever cares about them. The standard
model for executing Python code is "python mycode.py" or "python -m
mymodule". Compilation is invisible for the average user.

Regards

Antoine.

From ned at nedbatchelder.com  Fri May 23 14:04:23 2014
From: ned at nedbatchelder.com (Ned Batchelder)
Date: Fri, 23 May 2014 08:04:23 -0400
Subject: [Python-ideas] Disabling optimizations
In-Reply-To: 
References: <537E9913.7070501@stoneleaf.us> <537EA7FB.8070806@nedbatchelder.com>
Message-ID: <537F3947.6000401@nedbatchelder.com>

On 5/23/14 4:02 AM, Stephen Hansen wrote:
> On Thu, May 22, 2014 at 6:44 PM, Ned Batchelder
> wrote:
>
>     I'm of the opinion that we don't need to segregate bytecode into
>     different files depending on the options used to create the
>     bytecode.  How often is the same program run in the same place
>     with different options at different times?  I'm happy to have
>     optimized and non-optimized code both written to .pyc files, and
>     if you are fiddling with the options like that, you should delete
>     your pyc files when you change the options.  If we come up with a
>     way to have the bytecode file-segregated, I'm OK with that too.
>
>
> What madness is this?
>
> Any suggestion that "you should delete your pyc files" strikes me as
> remarkably wrongheaded. You shouldn't even have to think about pyc (or
> pyo) files -- they're a convenience, not something there is any
> expectation on anyone to *manage*. When I edit my .py file, I don't
> have to go delete the pyc; I don't need to be sure to do a 'make
> clean' like on some of my C projects. Python sees my source is
> modified, and discards the compiled bit -- expecting anything more
> from people using Python is a serious thing.
>
> Things have gotten a bit more complex in modern Python with the
> __pycache__ directory, yet still there is no expectation that users
> *manage* these files. That's a bit shocking to me.
>
>     I definitely don't like the alternative that says unoptimized code
>     isn't written to disk at all.  If people want to solve the problem
>     that way, there is already a mechanism to avoid writing bytecode,
>     you can use it with the optimizer controls to achieve the effect
>     you want.
>
>
> I don't understand this point. It seems natural to me that if you have
> an option to run code with optimizations disabled, it's not written to
> disk: after all, the entire point is that the code isn't doing
> everything it can to be as efficient as it can. At that point, what
> does speed matter? You've decided you want precise traceable semantics
> even when it's known that certain branches aren't needed -- you want to
> trace the precise logic. Do you really then care about the cost it
> takes to compile the source to bytecode?
>
> I get that there are reasons to not want optimizations, but I don't
> get the desire to complicate the compilation and running step.
> Optimizations on/off makes some sense: in testing environments and the
> like. It's something else entirely to demand people manually delete
> files, or where the burden is upon those who run the app/test
> suites/etc to deal with files created as a side-effect of what they're
> doing.
> I may not have been clear, sorry: I would love to find a way to make this
transparent to the user, and not have to have the user delete .pyc files. I
was merely trying to make my requirements precise. In my particular use
case, having to delete .pyc files is not a problem. If we can engineer it
so that is not necessary, all the better.

The .pyc file already has metadata that indicates the source timestamp and
the version of the Python interpreter. If those numbers don't mesh well
with the Python source and interpreter that finds the pyc file, then the
file is discarded transparently. We could put the compilation options into
the pyc file as well, and automatically discard the file if it had been made
with different options than the running interpreter.

--Ned.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From bcannon at gmail.com  Fri May 23 14:27:26 2014
From: bcannon at gmail.com (Brett Cannon)
Date: Fri, 23 May 2014 12:27:26 +0000
Subject: [Python-ideas] Disabling optimizations
References: <537E9913.7070501@stoneleaf.us> <537EA7FB.8070806@nedbatchelder.com> <20140523115709.58b6d016@fsol>
Message-ID: 

On Fri May 23 2014 at 6:03:59 AM, Antoine Pitrou wrote:

> On Fri, 23 May 2014 07:28:32 +0200
> Stefan Behnel wrote:
> > >
> > > As I already proposed, we could get rid of .pyo files all together and
> only
> > > write unoptimised .pyc files, and then apply the optimisations at load
> time
> > > based on the current interpreter config. I think that would give us a
> good
> > > tradeoff between fast (precompiled) code loading and differing
> requirements
> > > on byte code optimisations.
> >
> > Stefan Krah already proposed -Os (optimise for space) for the cases where
> > you want to reduce the size of the byte code file, e.g. by removing doc
> > strings. That could become the next .pyo file. Although it's unclear to
> me
> > why you would do that, instead of just compressing them.
>
> People who are really short on disk space (embedded devs?) probably do
> both: first strip docstrings and friends, then compress.
>

.pyo files also use less memory once loaded. -OO is definitely not going
away as at least an available option under some name.

-Brett
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From ned at nedbatchelder.com  Fri May 23 13:59:48 2014
From: ned at nedbatchelder.com (Ned Batchelder)
Date: Fri, 23 May 2014 07:59:48 -0400
Subject: [Python-ideas] Disabling optimizations
In-Reply-To: <20140523115709.58b6d016@fsol>
References: <537E9913.7070501@stoneleaf.us> <537EA7FB.8070806@nedbatchelder.com>
Message-ID: <537F3834.7050904@nedbatchelder.com>

On 5/23/14 5:57 AM, Antoine Pitrou wrote:
> On Fri, 23 May 2014 07:28:32 +0200
> Stefan Behnel wrote:
>>> As I already proposed, we could get rid of .pyo files all together and only
>>> write unoptimised .pyc files, and then apply the optimisations at load time
>>> based on the current interpreter config. I think that would give us a good
>>> tradeoff between fast (precompiled) code loading and differing requirements
>>> on byte code optimisations.
>> Stefan Krah already proposed -Os (optimise for space) for the cases where
>> you want to reduce the size of the byte code file, e.g. by removing doc
>> strings. That could become the next .pyo file. Although it's unclear to me
>> why you would do that, instead of just compressing them.
> People who are really short on disk space (embedded devs?)
probably do > both: first strip docstrings and friends, then compress. > > For the same reason, optimizing in-memory would be detrimental: > optimizations can usually reduce the size of pyc files. > > (besides, optimizing at compile-time allows us to do more costly > optimizations without caring *too much* about their overhead) Optimizing at compile time also lets you do optimizations that are not bytecode->bytecode transformations. Most of the recent discussion about new optimizations is focused on AST manipulations. Although I started this discussion with the word "peephole", those types of optimizations also affect the source->bytecode mapping, and should be controlled by the levers we're discussing. --Ned. > > Regards > > Antoine. > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ From ncoghlan at gmail.com Fri May 23 18:33:28 2014 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 24 May 2014 02:33:28 +1000 Subject: [Python-ideas] Disable all peephole optimizations In-Reply-To: References: <537C888D.7060903@nedbatchelder.com> <20140521121319.GG10355@ando> <537C9A5E.1060502@sotecware.net> <537CB6ED.2050906@nedbatchelder.com> <537D5A46.4040200@nedbatchelder.com> <537DFBCA.2070006@nedbatchelder.com> <20140522175910.GM10355@ando> <537E5D67.90101@nedbatchelder.com> Message-ID: On 23 May 2014 19:30, Victor Stinner wrote: > 2014-05-23 11:11 GMT+02:00 Nick Coghlan : >> Given how far away 3.5 is, I'd actually be interested in seeing a full >> write-up of Eric's proposal, comparing it to the "let's just add some more >> technical debt to the pile" -X option based approach. > > The discussion in now splitted in 4 places: 3 threads on this mailing > list, 1 issue in the bug tracker. And there are some old discussions > on python-dev. > > It's maybe time to use the power of the PEP process to summarize this > in a clear document? (Write a PEP.) Yes, I think so. One key thing this discussion made me realise is that we haven't taken a serious look at the compilation behaviour since PEP 3147 was implemented. The introduction of the cache tag and the source<->cache conversion functions provides an opportunity to actually clean up the handling of the different optimisation levels, and potentially make docstring stripping an independent setting. It may be that the end result of that process is to declare "-X nopeephole" a good enough solution and proceed with implementing that. I just think it's worth exploring what would be involved in fixing things properly before making a decision. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From guido at python.org Fri May 23 18:49:30 2014 From: guido at python.org (Guido van Rossum) Date: Fri, 23 May 2014 09:49:30 -0700 Subject: [Python-ideas] Disable all peephole optimizations In-Reply-To: References: <537C888D.7060903@nedbatchelder.com> <20140521121319.GG10355@ando> <537C9A5E.1060502@sotecware.net> <537CB6ED.2050906@nedbatchelder.com> <537D5A46.4040200@nedbatchelder.com> <537DFBCA.2070006@nedbatchelder.com> <20140522175910.GM10355@ando> <537E5D67.90101@nedbatchelder.com> Message-ID: I'm not happy with the direction this is taking. 
I would prefer an approach that *first* implements the minimal thing (an internal flag, set by an environment variable, to disable the peephole optimizer) and *then* perhaps revisits the greater UI for specifying optimization levels and the consequences this has for pyc/pyo files. I would also like to remind people the reason why there are separate pyc and pyo files: they are separate to support precompilation of the standard library and installed 3rd party packages for different optimization levels. While it may be okay for a developer that their pyc files all get invalidated when they change the optimization level, the stdlib and site-packages may require root access to write, so if your optimization level means you have to ignore the precompiled stdlib or site packages, that would be a major drag on your startup time (and memory usage will also spike at import time, since the AST is rather large). Looking at my own (frequent) use of coverage.py, I would be totally fine if disabling peephole optimization only affected my app's code, and kept using the precompiled stdlib. (How exactly this would work is left as an exercise for the reader.) On Fri, May 23, 2014 at 9:33 AM, Nick Coghlan wrote: > On 23 May 2014 19:30, Victor Stinner wrote: > > 2014-05-23 11:11 GMT+02:00 Nick Coghlan : > >> Given how far away 3.5 is, I'd actually be interested in seeing a full > >> write-up of Eric's proposal, comparing it to the "let's just add some > more > >> technical debt to the pile" -X option based approach. > > > > The discussion in now splitted in 4 places: 3 threads on this mailing > > list, 1 issue in the bug tracker. And there are some old discussions > > on python-dev. > > > > It's maybe time to use the power of the PEP process to summarize this > > in a clear document? (Write a PEP.) > > Yes, I think so. One key thing this discussion made me realise is that > we haven't taken a serious look at the compilation behaviour since PEP > 3147 was implemented. The introduction of the cache tag and the > source<->cache conversion functions provides an opportunity to > actually clean up the handling of the different optimisation levels, > and potentially make docstring stripping an independent setting. > > It may be that the end result of that process is to declare "-X > nopeephole" a good enough solution and proceed with implementing that. > I just think it's worth exploring what would be involved in fixing > things properly before making a decision. > > Cheers, > Nick. > > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... 
URL: 

From donald at stufft.io  Fri May 23 19:08:30 2014
From: donald at stufft.io (Donald Stufft)
Date: Fri, 23 May 2014 13:08:30 -0400
Subject: [Python-ideas] Disable all peephole optimizations
In-Reply-To: 
References: <537C888D.7060903@nedbatchelder.com> <20140521121319.GG10355@ando> <537C9A5E.1060502@sotecware.net> <537CB6ED.2050906@nedbatchelder.com> <537D5A46.4040200@nedbatchelder.com> <537DFBCA.2070006@nedbatchelder.com> <20140522175910.GM10355@ando> <537E5D67.90101@nedbatchelder.com>
Message-ID: <7477C87E-FB8F-42A8-B825-9F5F04B4A884@stufft.io>

On May 23, 2014, at 12:49 PM, Guido van Rossum wrote:

> I'm not happy with the direction this is taking. I would prefer an approach that *first* implements the minimal thing (an internal flag, set by an environment variable, to disable the peephole optimizer) and *then* perhaps revisits the greater UI for specifying optimization levels and the consequences this has for pyc/pyo files.

I agree with this I think.

>
> I would also like to remind people the reason why there are separate pyc and pyo files: they are separate to support precompilation of the standard library and installed 3rd party packages for different optimization levels.

Sadly enough it doesn't go far enough since you can't have (as far as I know) a .pyo for both -O and -OO. Perhaps the PEP isn't the worst idea in order to make all of that work with the __pycache__ directories and the pyc tagging.

> While it may be okay for a developer that their pyc files all get invalidated when they change the optimization level, the stdlib and site-packages may require root access to write, so if your optimization level means you have to ignore the precompiled stdlib or site packages, that would be a major drag on your startup time (and memory usage will also spike at import time, since the AST is rather large).
>
> Looking at my own (frequent) use of coverage.py, I would be totally fine if disabling peephole optimization only affected my app's code, and kept using the precompiled stdlib. (How exactly this would work is left as an exercise for the reader.)
>
>
> On Fri, May 23, 2014 at 9:33 AM, Nick Coghlan wrote:
> On 23 May 2014 19:30, Victor Stinner wrote:
> > 2014-05-23 11:11 GMT+02:00 Nick Coghlan :
> >> Given how far away 3.5 is, I'd actually be interested in seeing a full
> >> write-up of Eric's proposal, comparing it to the "let's just add some more
> >> technical debt to the pile" -X option based approach.
> >
> > The discussion in now splitted in 4 places: 3 threads on this mailing
> > list, 1 issue in the bug tracker. And there are some old discussions
> > on python-dev.
> >
> > It's maybe time to use the power of the PEP process to summarize this
> > in a clear document? (Write a PEP.)
>
> Yes, I think so. One key thing this discussion made me realise is that
> we haven't taken a serious look at the compilation behaviour since PEP
> 3147 was implemented. The introduction of the cache tag and the
> source<->cache conversion functions provides an opportunity to
> actually clean up the handling of the different optimisation levels,
> and potentially make docstring stripping an independent setting.
>
> It may be that the end result of that process is to declare "-X
> nopeephole" a good enough solution and proceed with implementing that.
> I just think it's worth exploring what would be involved in fixing
> things properly before making a decision.
>
> Cheers,
> Nick.
>
> --
> Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
>
>
>
> --
> --Guido van Rossum (python.org/~guido)
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/

-----------------
Donald Stufft
PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 801 bytes
Desc: Message signed with OpenPGP using GPGMail
URL: 

From ericsnowcurrently at gmail.com  Fri May 23 19:11:59 2014
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Fri, 23 May 2014 11:11:59 -0600
Subject: [Python-ideas] Disabling optimizations
In-Reply-To: <537F3947.6000401@nedbatchelder.com>
References: <537E9913.7070501@stoneleaf.us> <537EA7FB.8070806@nedbatchelder.com> <537F3947.6000401@nedbatchelder.com>
Message-ID: 

On Fri, May 23, 2014 at 6:04 AM, Ned Batchelder wrote:
> The .pyc file already has metadata that indicates the source timestamp and
> the version of the Python interpreter.  If those numbers don't mesh well
> with the Python source and interpreter that finds the pyc file, then the
> file is discarded transparently.  We could put the compilation options into
> the pyc file as well, and automatically discard the file if it had been made
> with different options than the running interpreter.

Adjusting the cache tag (sys.implementation.cache_tag) to reflect the
optimization level would be pretty straightforward and relatively
easy.  I'd like that better than putting that information into the
.pyc header.

-eric

From guido at python.org  Fri May 23 19:12:37 2014
From: guido at python.org (Guido van Rossum)
Date: Fri, 23 May 2014 10:12:37 -0700
Subject: [Python-ideas] Disable all peephole optimizations
In-Reply-To: <7477C87E-FB8F-42A8-B825-9F5F04B4A884@stufft.io>
References: <537C888D.7060903@nedbatchelder.com> <20140521121319.GG10355@ando> <537C9A5E.1060502@sotecware.net> <537CB6ED.2050906@nedbatchelder.com> <537D5A46.4040200@nedbatchelder.com> <537DFBCA.2070006@nedbatchelder.com> <20140522175910.GM10355@ando> <537E5D67.90101@nedbatchelder.com> <7477C87E-FB8F-42A8-B825-9F5F04B4A884@stufft.io>
Message-ID: 

On Fri, May 23, 2014 at 10:08 AM, Donald Stufft wrote:

>
> On May 23, 2014, at 12:49 PM, Guido van Rossum wrote:
>
> I'm not happy with the direction this is taking. I would prefer an
> approach that *first* implements the minimal thing (an internal flag, set
> by an environment variable, to disable the peephole optimizer) and *then*
> perhaps revisits the greater UI for specifying optimization levels and the
> consequences this has for pyc/pyo files.
>
>
> I agree with this I think.
>
>
> I would also like to remind people the reason why there are separate pyc
> and pyo files: they are separate to support precompilation of the standard
> library and installed 3rd party packages for different optimization levels.
>
>
> Sadly enough it doesn't go far enough since you can't have (as far as I
> know) a .pyo for both -O and -OO.
Perhaps the PEP isn't the worst idea in
> order to make all of that work with the __pycache__ directories and the pyc
> tagging.
>

Agreed (though I think that -OO is a very niche feature) and I think
deciding on what to do about this (if anything) should not hold the
peephole disabling feature hostage. (The latter of course has to decide
what to do about pyc files, but there should be a suitable answer that
doesn't require solving the general problem nor prevent the general
problem from being solved.)

--
--Guido van Rossum (python.org/~guido)
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From ericsnowcurrently at gmail.com  Fri May 23 19:17:03 2014
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Fri, 23 May 2014 11:17:03 -0600
Subject: [Python-ideas] Disable all peephole optimizations
In-Reply-To: 
References: <537C888D.7060903@nedbatchelder.com> <20140521121319.GG10355@ando> <537C9A5E.1060502@sotecware.net> <537CB6ED.2050906@nedbatchelder.com> <537D5A46.4040200@nedbatchelder.com> <537DFBCA.2070006@nedbatchelder.com> <20140522175910.GM10355@ando> <537E5D67.90101@nedbatchelder.com>
Message-ID: 

On Fri, May 23, 2014 at 10:49 AM, Guido van Rossum wrote:
> I'm not happy with the direction this is taking. I would prefer an approach
> that *first* implements the minimal thing (an internal flag, set by an
> environment variable, to disable the peephole optimizer) and *then* perhaps
> revisits the greater UI for specifying optimization levels and the
> consequences this has for pyc/pyo files.

Yeah, that's exactly what I was trying to convey in the followup to my
longer message about revamping the optimization levels.

> Looking at my own (frequent) use of coverage.py, I would be totally fine if
> disabling peephole optimization only affected my app's code, and kept using
> the precompiled stdlib. (How exactly this would work is left as an exercise
> for the reader.)

Would it be a problem if .pyc files weren't generated or used (a la -B
or PYTHONDONTWRITEBYTECODE) when you ran coverage?

-eric

From ncoghlan at gmail.com  Fri May 23 19:22:25 2014
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 24 May 2014 03:22:25 +1000
Subject: [Python-ideas] Disable all peephole optimizations
In-Reply-To: 
References: <537C888D.7060903@nedbatchelder.com> <20140521121319.GG10355@ando> <537C9A5E.1060502@sotecware.net> <537CB6ED.2050906@nedbatchelder.com> <537D5A46.4040200@nedbatchelder.com> <537DFBCA.2070006@nedbatchelder.com> <20140522175910.GM10355@ando> <537E5D67.90101@nedbatchelder.com>
Message-ID: 

On 24 May 2014 02:49, "Guido van Rossum" wrote:
>
> I'm not happy with the direction this is taking. I would prefer an approach that *first* implements the minimal thing (an internal flag, set by an environment variable, to disable the peephole optimizer) and *then* perhaps revisits the greater UI for specifying optimization levels and the consequences this has for pyc/pyo files.

Sure, that sounds like a reasonable approach, too. My perspective is mainly
coloured by the fact that we're still in the "eh, feature freeze is still
more than a year away" low urgency period for 3.5 :)

Cheers,
Nick.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: From guido at python.org Fri May 23 19:22:43 2014 From: guido at python.org (Guido van Rossum) Date: Fri, 23 May 2014 10:22:43 -0700 Subject: [Python-ideas] Disable all peephole optimizations In-Reply-To: References: <537C888D.7060903@nedbatchelder.com> <20140521121319.GG10355@ando> <537C9A5E.1060502@sotecware.net> <537CB6ED.2050906@nedbatchelder.com> <537D5A46.4040200@nedbatchelder.com> <537DFBCA.2070006@nedbatchelder.com> <20140522175910.GM10355@ando> <537E5D67.90101@nedbatchelder.com> Message-ID: On Fri, May 23, 2014 at 10:17 AM, Eric Snow wrote: > On Fri, May 23, 2014 at 10:49 AM, Guido van Rossum > wrote: > > Looking at my own (frequent) use of coverage.py, I would be totally fine > if > > disabling peephole optimization only affected my app's code, and kept > using > > the precompiled stdlib. (How exactly this would work is left as an > exercise > > for the reader.) > > Would it be a problem if .pyc files weren't generated or used (a la -B > or PYTHONDONTWRITEBYTECODE) when you ran coverage? > In first approximation that would probably be okay, although it would make coverage even slower. I was envisioning something where it would still use, but not write, pyc files for the stdlib or site-packages, because the code in whose coverage I am interested is puny compared to the stdlib code it imports. -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Fri May 23 19:23:41 2014 From: guido at python.org (Guido van Rossum) Date: Fri, 23 May 2014 10:23:41 -0700 Subject: [Python-ideas] Disable all peephole optimizations In-Reply-To: References: <537C888D.7060903@nedbatchelder.com> <20140521121319.GG10355@ando> <537C9A5E.1060502@sotecware.net> <537CB6ED.2050906@nedbatchelder.com> <537D5A46.4040200@nedbatchelder.com> <537DFBCA.2070006@nedbatchelder.com> <20140522175910.GM10355@ando> <537E5D67.90101@nedbatchelder.com> Message-ID: On Fri, May 23, 2014 at 10:22 AM, Nick Coghlan wrote: > > On 24 May 2014 02:49, "Guido van Rossum" wrote: > > > > I'm not happy with the direction this is taking. I would prefer an > approach that *first* implements the minimal thing (an internal flag, set > by an environment variable, to disable the peephole optimizer) and *then* > perhaps revisits the greater UI for specifying optimization levels and the > consequences this has for pyc/pyo files. > > Sure, that sounds like a reasonable approach, too. My perspective is > mainly coloured by the fact that we're still in the "eh, feature freeze is > still more than a year away" low urgency period for 3.5 :) > Yeah, and I'm countering that not every project needs to land a week before the feature freeze. :-) -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... 
URL: 

From ncoghlan at gmail.com  Fri May 23 19:36:32 2014
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 24 May 2014 03:36:32 +1000
Subject: [Python-ideas] Disable all peephole optimizations
In-Reply-To: 
References: <537C888D.7060903@nedbatchelder.com> <20140521121319.GG10355@ando> <537C9A5E.1060502@sotecware.net> <537CB6ED.2050906@nedbatchelder.com> <537D5A46.4040200@nedbatchelder.com> <537DFBCA.2070006@nedbatchelder.com> <20140522175910.GM10355@ando> <537E5D67.90101@nedbatchelder.com>
Message-ID: 

On 24 May 2014 03:24, "Guido van Rossum" wrote:
>
> On Fri, May 23, 2014 at 10:22 AM, Nick Coghlan wrote:
>>
>>
>> On 24 May 2014 02:49, "Guido van Rossum" wrote:
>> >
>> > I'm not happy with the direction this is taking. I would prefer an approach that *first* implements the minimal thing (an internal flag, set by an environment variable, to disable the peephole optimizer) and *then* perhaps revisits the greater UI for specifying optimization levels and the consequences this has for pyc/pyo files.
>>
>> Sure, that sounds like a reasonable approach, too. My perspective is mainly coloured by the fact that we're still in the "eh, feature freeze is still more than a year away" low urgency period for 3.5 :)
>
> Yeah, and I'm countering that not every project needs to land a week before the feature freeze. :-)

But that approach makes Larry's life far more exciting! :)

Happily-on-the-other-side-of-the-Pacific-from-Larry-while-saying-that'ly
yours,
Nick.

>
> --
> --Guido van Rossum (python.org/~guido)
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From dw+python-ideas at hmmz.org  Fri May 23 21:22:20 2014
From: dw+python-ideas at hmmz.org (dw+python-ideas at hmmz.org)
Date: Fri, 23 May 2014 19:22:20 +0000
Subject: [Python-ideas] Faster PyArg_ParseTupleAndKeywords kwargs
Message-ID: <20140523192220.GA20596@k2>

Early on while working on py-lmdb I noticed that a huge proportion of
runtime was being lost to PyArg_ParseTupleAndKeywords, and so I
subsequently wrote a specialization for this extension module.

In the current code[0], parse_args() is much faster than
ParseTupleAndKeywords, responsible for a doubling of performance in
several of the library's faster code paths (e.g.
Cursor.put(append=True)). Ever since adding the rewrite I've wanted to
go back and either remove it or at least reduce the amount of custom
code, but it seems there really isn't a better approach to fast argument
parsing using the bare Python C API at the moment.

[0] https://github.com/dw/py-lmdb/blob/master/lmdb/cpython.c#L833

In the append=True path, parse_args() yields a method that can complete
1.1m insertions/sec on my crappy Core 2 laptop, compared to 592k/sec
using the same method rewritten with PyArg_ParseTupleAndKeywords.

Looking to other 'fast' projects for precedent, and studying Cython's
output in particular, it seems that Cython completely ignores the
standard APIs and expends a huge amount of .text on using almost every
imaginable C performance trick to speed up parsing (actually Cython's
output is a sheer marvel of trickery, it's worth study). So it's clear
the standard APIs are somewhat non-ideal, and those concerned with
performance are taking other approaches.

ParseTupleAndKeywords is competitive for positional arguments (1.2m/sec
vs 1.5m/sec for "Cursor.put(k, v)"), but things go south when a kwarg
dict is provided.
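The kwarg-path overhead is visible even from pure Python, using any C-implemented method that accepts keywords; absolute numbers and ratios will vary by machine and build:

    import timeit

    # Same call, positional vs. keyword form; the keyword form has to go
    # through the kwargs-parsing path in the C implementation.
    print(timeit.timeit("'x'.encode('ascii')"))
    print(timeit.timeit("'x'.encode(encoding='ascii')"))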
The primary goal of parse_args() was to avoid the continuous temporary
allocations and hashing done by PyArg_ParseTupleAndKeywords, by way of
PyDict_GetItemString(), which invokes PyString_FromString() internally,
which subsequently causes alloc / strlen() and memcpy(), one for each
possible kwarg, on every function call.

The rewrite has been hacked over time, and honestly I'm not sure which
bits are responsible for the speed improvement, and which are totally
redundant. The tricks are:

* Intern keyword arg strings once at startup, avoiding the temporary
  PyString creation and also causing their hash() to be cached across
  calls. This uses an incredibly ugly pair of enum/const char *[]
  static globals.[3]

  [3] https://github.com/dw/py-lmdb/blob/master/lmdb/cpython.c#L79

* Use a per-function 'static const' array of structures to describe the
  expected set of arguments. Since these arrays are built at compile
  time, they cannot directly reference the runtime-generated interned
  PyStrings, thus the use of an enum. A nice side effect of the array's
  contents being purely small integers is that each array element is
  small and thus quite cache-efficient. In the current code array
  elements are 4 bytes each.

* Avoid use of variable-length argument lists. I'm not sure if this
  helps at all, but certainly it simplifies the parsing code and makes
  the call sites much more compact. Instead of a va_arg list of
  destination pointers, parsed output is represented as a per-function
  structure[1][2] definition, whose offsets are encoded into the above
  argspec array at build time.

  [1] https://github.com/dw/py-lmdb/blob/master/lmdb/cpython.c#L1265
  [2] https://github.com/dw/py-lmdb/blob/master/lmdb/cpython.c#L704

  This might hurt the compiler's ability to optimize the placement of
  what were previously small stack variables (e.g. I'm not sure if it
  prevents the compiler making more use of registers). In any case the
  overall result is much faster than before.

And most recently, giving a further 20% boost to append=True:

* Cache a dict that maps interned kwarg -> argspec array offset,
  allowing the per-call kwarg dict to be iterated, and causing only one
  hash lookup per supplied kwarg. Prior to the cache, presence of kwargs
  would cause one hash lookup per argspec entry (e.g. potentially 15
  lookups instead of 1 or 2).

It's obvious this approach isn't generally useful, and looking at the
CPython source we can see the interning trick is already known, and
presumably not exposed in the CPython API because the method is quite
ugly. Still it seems there is room to improve the public API to include
something like this interning trick, and that's what this mail is about.

My initial thought is for a horribly macro-heavy API like:

    PyObject *
    my_func(PyObject *self, PyObject *args, PyObject *kwargs)
    {
        Py_ssize_t foo;
        const char *some_buf;
        PyObject *list;

        Py_BEGIN_ARGS
            PY_ARG("foo", PY_ARG_SSIZE_T, NULL, PY_ARG_REQUIRED),
            PY_ARG("some_buf", PY_ARG_BUFFER, NULL, PY_ARG_REQUIRED),
            PY_ARG("list", PY_ARG_OBJECT, &PyList_Type, 0)
        Py_END_ARGS

        if(Py_PARSE_ARGS(args, kwargs, &foo, &some_buf, &list)) {
            return NULL;
        }
        /* do stuff */
    }

Where:

    struct py_arg_info;   /* Opaque */
    struct py_arg_spec {
        const char *name;
        enum { ... } type;
        PyTypeObject *type_obj;
        int options;
    };

    #define PY_BEGIN_ARGS \
        static struct py_arg_info *_py_arg_info; \
        if(! _py_arg_info) { \
            static const struct py_arg_spec _py_args[] = {

    #define PY_END_ARGS \
            }; \
            _Py_InitArgInfo(&_py_arg_info, _py_args, \
                            sizeof _py_args / sizeof _py_args[0]); \
        }

    #define PY_ARG(name, type, type2, opts) {name, type, type2, opts}

    #define Py_PARSE_ARGS(a, k, ...) \
        _Py_ParseArgsFromInfo(&_py_arg_info, a, k, __VA_ARGS__);

Here some implementation-internal py_arg_info structure is built up on
first function invocation, producing the cached mapping of argument
keywords to array index, and storing a reference to the py_arg_spec
array, or some version of it that has been internally transformed to a
more useful format.

You may notice this depends on va_arg macros, which breaks at least
Visual Studio, so at the very least that part is broken. The above also
doesn't deal with all the cases supported by the existing PyArg_
routines, such as setting the function name and custom error message, or
unpacking tuples (is this still even supported in Python 3?)

Another approach might be to use a PyArg_ParseTupleAndKeywords-alike
API, so that something like this was possible:

    static PyObject *
    my_method(PyObject *self, PyObject *args, PyObject *kwds)
    {
        Py_ssize_t foo;
        const char *some_buf;
        Py_ssize_t some_buf_size;
        PyObject *list;

        static PyArgInfo arg_info;
        static char *keywords[] = { "foo", "some_buf", "list", NULL };

        if(! PyArg_FastParse(&arg_info, args, kwds, "ns#|O!", keywords,
                             &foo, &some_buf, &some_buf_size,
                             &PyList_Type, &list)) {
            return NULL;
        }
        /* do stuff */
    }

In that case that API is very familiar, and PyArg_FastParse() builds the
cache on first invocation itself, but the supplied va_list is full of
noise that needs to be carefully skipped somehow. The work involved in
doing the skipping might introduce complexity that slows things down all
over again.

Any thoughts on a better API? Is there a need here? I'm obviously not
the first to notice PyArg_ParseTupleAndKeywords is slow, and so I wonder
how many people have sighed and brushed off the fact their module is
slower than it could be.

David

From tjreedy at udel.edu  Fri May 23 21:55:49 2014
From: tjreedy at udel.edu (Terry Reedy)
Date: Fri, 23 May 2014 15:55:49 -0400
Subject: [Python-ideas] Python-ideas Digest, Vol 90, Issue 30
In-Reply-To: <537F05F9.7070406@egenix.com>
References: <0B902BFA-920E-4383-A09C-08A558168509@gmail.com> <20140521232414.2720b5ed@fsol> <537D1CA0.7070106@stoneleaf.us> <537D311A.2010101@nedbatchelder.com> <20140522105220.70b360fe@fsol> <537E1891.5050808@nedbatchelder.com> <537E1A88.8080501@egenix.com> <537F05F9.7070406@egenix.com>
Message-ID: 

On 5/23/2014 4:25 AM, M.-A. Lemburg wrote:
>> I believe that Python has always had an 'as if' rule that allows more or less 'hidden'
>> optimizations, as long as the net effect of a statement is as defined.
>
> I was referring to the times before the peephole optimizer was
> introduced (Python 2.3 and earlier).
>
> What's important here is to look at the difference between what
> the compiler generates by simply following its rule book and the
> version of the byte code which is the result of running an
> optimizer on the byte code or even on the AST before running the
> transform to byte code.

I have tried to say that the 'rule book' at a particular stage is not a
fixed thing. There are several transformations from source to CPython
bytecode. The order and grouping is somewhat a matter of convenience.
However, leave that aside. What Ned wants and what Guido has supported
is that there be an option to get bytecode that is friendly to
execution analysis.
They can decide what constraints that places on the end product and
therefore on the multiple transformation processes.

> For me, a key argument for having a runtime mode without
> compiler optimizations is that the compiler gains
> more freedom in applying more aggressive optimizations.
>
> Tools will no longer have to adapt to whatever optimizations
> are added with each new Python release, since there will be
> a defined non-optimized runtime mode they can use as a basis for
> their work.

Stability is certainly a useful constraint.

> The net result would be faster Pythons and better working debugging
> tools (well, at least that's the hope ;-).

Good point. It appears that rethinking the current -O, -OO will help.

--
Terry Jan Reedy

From tjreedy at udel.edu  Fri May 23 22:05:11 2014
From: tjreedy at udel.edu (Terry Reedy)
Date: Fri, 23 May 2014 16:05:11 -0400
Subject: [Python-ideas] Python-ideas Digest, Vol 90, Issue 30
In-Reply-To: <537F257A.5000509@nedbatchelder.com>
References: <0B902BFA-920E-4383-A09C-08A558168509@gmail.com> <20140521232414.2720b5ed@fsol> <537D2FDA.6030907@nedbatchelder.com> <537DFA95.4040000@nedbatchelder.com> <20140523115357.435531a4@fsol> <537F257A.5000509@nedbatchelder.com>
Message-ID: 

On 5/23/2014 6:39 AM, Ned Batchelder wrote:
> On 5/23/14 5:53 AM, Antoine Pitrou wrote:
>> On Thu, 22 May 2014 22:53:28 -0400
>> Terry Reedy wrote:
>>>> I'd like to understand why we think the Python compiler is different in
>>>> this regard than a C compiler.
>>> Python is a different language. But let us not get sidetracked on that.
>> The number one difference is that people don't compile code explicitly
>> when writing Python code (well, except packagers who call compileall(),
>> and a few advanced uses). So "choosing compilation options" is really
>> not part of the standard workflow for developing in Python.
>
> That seems an odd distinction to make, given that we already do have
> ways to control how the compilation step happens,

They are not used much, and I doubt that anyone is joyous at the status
quo. Which is why your proposal looks more inviting (to me, and I think
to some others) as part of a reworking of the clumsy status quo than as
a clumsy add-on.

--
Terry Jan Reedy

From dw+python-ideas at hmmz.org  Fri May 23 22:07:17 2014
From: dw+python-ideas at hmmz.org (dw+python-ideas at hmmz.org)
Date: Fri, 23 May 2014 20:07:17 +0000
Subject: [Python-ideas] Faster PyArg_ParseTupleAndKeywords kwargs
In-Reply-To: <20140523192220.GA20596@k2>
References: <20140523192220.GA20596@k2>
Message-ID: <20140523200717.GA22671@k2>

On Fri, May 23, 2014 at 07:22:20PM +0000, dw+python-ideas at hmmz.org wrote:

>         if(! PyArg_FastParse(&arg_info, args, kwds, "ns#|O!", keywords,
>                              &foo, &some_buf, &some_buf_size,
>                              &PyList_Type, &list)) {
>             return NULL;
>         }

Perhaps the most off-the-wall approach would be to completely preserve
the existing interface, by using a dollop of assembly to fetch the
return address, and use that to maintain some internal hash table.

That's incredibly nasty, but as a systemic speedup it might be worth it?
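In Python terms, both caching ideas amount to keying a spec cache on the identity of the call site's static arguments instead of re-parsing them every call; a rough sketch only, where build_spec() stands in for the real parsing work:

    _spec_cache = {}

    def build_spec(fmt, keywords):
        # Stand-in for the expensive parse of a format string + keyword list.
        return dict(zip(keywords, fmt))

    def cached_spec(fmt, keywords):
        # id() mirrors using the C pointers as a hash key; this is only safe
        # because a call site's format string and keyword table are static,
        # so their identities never change.
        key = (id(fmt), id(keywords))
        spec = _spec_cache.get(key)
        if spec is None:
            spec = _spec_cache[key] = build_spec(fmt, keywords)
        return spec

    FMT = "ns#|O!"
    KEYWORDS = ("foo", "some_buf", "list")
    assert cached_spec(FMT, KEYWORDS) is cached_spec(FMT, KEYWORDS)  # parsed once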
David From njs at pobox.com Fri May 23 22:08:28 2014 From: njs at pobox.com (Nathaniel Smith) Date: Fri, 23 May 2014 21:08:28 +0100 Subject: [Python-ideas] Faster PyArg_ParseTupleAndKeywords kwargs In-Reply-To: <20140523192220.GA20596@k2> References: <20140523192220.GA20596@k2> Message-ID: On Fri, May 23, 2014 at 8:22 PM, wrote: > Early while working on py-lmdb I noticed that a huge proportion of > runtime was being lost to PyArg_ParseTupleAndKeywords, and so I > subsequently wrote a specialization for this extension module. > > In the current code[0], parse_args() is much faster than > ParseTupleAndKeywords, responsible for a doubling of performance in > several of the library's faster code paths (e.g. > Cursor.put(append=True)). Ever since adding the rewrite I've wanted to > go back and either remove it or at least reduce the amount of custom > code, but it seems there really isn't a better approach to fast argument > parsing using the bare Python C API at the moment. > > [0] https://github.com/dw/py-lmdb/blob/master/lmdb/cpython.c#L833 > > In the append=True path, parse_args() yields a method that can complete > 1.1m insertions/sec on my crappy Core 2 laptop, compared to 592k/sec > using the same method rewritten with PyArg_ParseTupleAndKeywords. > > Looking to other 'fast' projects for precedent, and studying Cython's > output in particular, it seems that Cython completely ignores the > standard APIs and expends a huge amount of .text on using almost every > imagineable C performance trick to speed up parsing (actually Cython's > output is a sheer marvel of trickery, it's worth study). So it's clear > the standard APIs are somewhat non-ideal, and those concerned with > performance are taking other approaches. As another data point about PyArg_ParseTupleAndKeywords slowness, Numpy has tons of barely-maintainable hand-written argument parsing code. I haven't read the proposal below in detail, but anything that helps us clean that up is ok with me... You should check out Argument Clinic (PEP 436) if you haven't seen it. -n -- Nathaniel J. Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org From dw+python-ideas at hmmz.org Fri May 23 22:22:48 2014 From: dw+python-ideas at hmmz.org (dw+python-ideas at hmmz.org) Date: Fri, 23 May 2014 20:22:48 +0000 Subject: [Python-ideas] Faster PyArg_ParseTupleAndKeywords kwargs In-Reply-To: References: <20140523192220.GA20596@k2> Message-ID: <20140523202248.GB22671@k2> On Fri, May 23, 2014 at 09:08:28PM +0100, Nathaniel Smith wrote: > You should check out Argument Clinic (PEP 436) if you haven't seen it. Thanks! I'd seen this but forgotten about it. The use of a preprocessor seems excessive, and a potential PITA when combined with other preprocessors - e.g. Qt's moc, but the language is a very cool idea. If the DSL definition was expressed as a string constant, that pointer could key an internal hash table. Still not as fast as specialized code, but perhaps an interesting middleground. David From njs at pobox.com Fri May 23 22:38:40 2014 From: njs at pobox.com (Nathaniel Smith) Date: Fri, 23 May 2014 21:38:40 +0100 Subject: [Python-ideas] Faster PyArg_ParseTupleAndKeywords kwargs In-Reply-To: <20140523202248.GB22671@k2> References: <20140523192220.GA20596@k2> <20140523202248.GB22671@k2> Message-ID: On Fri, May 23, 2014 at 9:22 PM, wrote: > On Fri, May 23, 2014 at 09:08:28PM +0100, Nathaniel Smith wrote: > >> You should check out Argument Clinic (PEP 436) if you haven't seen it. > > Thanks! 
I'd seen this but forgotten about it. The use of a preprocessor
seems excessive, and a potential PITA when combined with other
preprocessors - e.g. Qt's moc, but the language is a very cool idea.

If the DSL definition was expressed as a string constant, that pointer
could key an internal hash table. Still not as fast as specialized code,
but perhaps an interesting middle ground.

David

From njs at pobox.com  Fri May 23 22:38:40 2014
From: njs at pobox.com (Nathaniel Smith)
Date: Fri, 23 May 2014 21:38:40 +0100
Subject: [Python-ideas] Faster PyArg_ParseTupleAndKeywords kwargs
In-Reply-To: <20140523202248.GB22671@k2>
References: <20140523192220.GA20596@k2> <20140523202248.GB22671@k2>
Message-ID: 

On Fri, May 23, 2014 at 9:22 PM, wrote:
> On Fri, May 23, 2014 at 09:08:28PM +0100, Nathaniel Smith wrote:
>
>> You should check out Argument Clinic (PEP 436) if you haven't seen it.
>
> Thanks! I'd seen this but forgotten about it. The use of a preprocessor
> seems excessive, and a potential PITA when combined with other
> preprocessors - e.g. Qt's moc, but the language is a very cool idea.

Yes, but OTOH it's working and shipping code with a substantial user
base (lots of the CPython implementation), so making it fast and
usable in third-party libraries might still be the most efficient
approach. And IIRC it's not (necessarily) a build-time thing, the
usual mode is for it to update your checked-in source directly, so
integration with other preprocessors might be a non-issue.

A preprocessor approach might make it easier to support older Pythons
in the generated code, compared to a library approach. (It's easier to
say "developers/the person making the source release must have Python
3 installed, but the generated code works everywhere" than to say
"this library only works on Python 3.5+ because that's the first
version that ships the new argument parsing API".)

-n

--
Nathaniel J. Smith
Postdoctoral researcher - Informatics - University of Edinburgh
http://vorpus.org

From dw+python-ideas at hmmz.org  Fri May 23 23:41:35 2014
From: dw+python-ideas at hmmz.org (dw+python-ideas at hmmz.org)
Date: Fri, 23 May 2014 21:41:35 +0000
Subject: [Python-ideas] Faster PyArg_ParseTupleAndKeywords kwargs
In-Reply-To: <20140523200717.GA22671@k2>
References: <20140523192220.GA20596@k2> <20140523200717.GA22671@k2>
Message-ID: <20140523214135.GA24056@k2>

On Fri, May 23, 2014 at 08:07:17PM +0000, dw+python-ideas at hmmz.org wrote:

> Perhaps the most off-the-wall approach would be to completely preserve
> the existing interface, by using a dollop of assembly to fetch the
> return address, and use that to maintain some internal hash table.
>
> That's incredibly nasty, but as a systemic speedup it might be worth it?

Final (obvious in hindsight) suggestion: mix 'fmt' and 'keywords'
argument pointers together for use as a hash key into a table within
getargs.c, no nasty asm or interface changes necessary.

David

From greg.ewing at canterbury.ac.nz  Sat May 24 01:15:10 2014
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sat, 24 May 2014 11:15:10 +1200
Subject: [Python-ideas] Python-ideas Digest, Vol 90, Issue 30
In-Reply-To: 
References: <0B902BFA-920E-4383-A09C-08A558168509@gmail.com> <20140521232414.2720b5ed@fsol> <537D1CA0.7070106@stoneleaf.us> <537D311A.2010101@nedbatchelder.com> <20140522105220.70b360fe@fsol> <537E1891.5050808@nedbatchelder.com> <537E5B4E.8050905@nedbatchelder.com>
Message-ID: <537FD67E.2000000@canterbury.ac.nz>

Nathaniel Smith wrote:
> "Compactly encoding a dict of sets of
> ints" is not the sort of challenge that we should find daunting and
> impossible.

I'd question whether it's even worth going to heroic lengths to
compress the lnotab these days, especially if it could be lazily
loaded from the pyc when needed.

--
Greg

From ncoghlan at gmail.com  Sat May 24 03:36:29 2014
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 24 May 2014 11:36:29 +1000
Subject: [Python-ideas] Faster PyArg_ParseTupleAndKeywords kwargs
In-Reply-To: 
References: <20140523192220.GA20596@k2> <20140523202248.GB22671@k2>
Message-ID: 

On 24 May 2014 06:39, "Nathaniel Smith" wrote:
>
> On Fri, May 23, 2014 at 9:22 PM, wrote:
> > On Fri, May 23, 2014 at 09:08:28PM +0100, Nathaniel Smith wrote:
> >
> >> You should check out Argument Clinic (PEP 436) if you haven't seen it.
> >
> > Thanks!
The use of a preprocessor > > seems excessive, and a potential PITA when combined with other > > preprocessors - e.g. Qt's moc, but the language is a very cool idea. > > Yes, but OTOH it's working and shipping code with a substantial user > base (lots of the CPython implementation), so making it fast and > usable in third-party libraries might still be the most efficient > approach. And IIRC it's not (necessarily) a build-time thing, the > usual mode is for it to update your checked-in source directly, so > integration with other preprocessors might be a non-issue. Note there are two key goals behind Argument Clinic: 1. Add introspection metadata to functions implemented in C without further reducing maintainability (adding an arg to a C function already touched 6 places, signature metadata would have been a 7th) 2. Eventually switch the generated code to something faster than PyArg_ParseTupleAndKeywords. What phase 2 actually looks like hasn't been defined yet (enabling phase 1 ended up being a big enough challenge for 3.4), but the ideas in this thread would definitely be worth exploring further in that context. As Nathaniel noted, once checked in, Argument Clinic code is just ordinary C code with some funny comments, so it introduces no additional build time dependencies. Cheers, Nick. -------------- next part -------------- An HTML attachment was scrubbed... URL: From dw+python-ideas at hmmz.org Sat May 24 03:52:11 2014 From: dw+python-ideas at hmmz.org (dw+python-ideas at hmmz.org) Date: Sat, 24 May 2014 01:52:11 +0000 Subject: [Python-ideas] Faster PyArg_ParseTupleAndKeywords kwargs In-Reply-To: References: <20140523192220.GA20596@k2> <20140523202248.GB22671@k2> Message-ID: <20140524015211.GA28050@k2> On Sat, May 24, 2014 at 11:36:29AM +1000, Nick Coghlan wrote: > 1. Add introspection metadata to functions implemented in C without further > reducing maintainability (adding an arg to a C function already touched 6 > places, signature metadata would have been a 7th) > As Nathaniel noted, once checked in, Argument Clinic code is just ordinary C > code with some funny comments, so it introduces no additional build time > dependencies. Hadn't realized it was already in use! It isn't nearly as intrusive as I might have expected, it seems 'preprocessor' is just a scary word. :) > 2. Eventually switch the generated code to something faster than > PyArg_ParseTupleAndKeywords. > What phase 2 actually looks like hasn't been defined yet (enabling phase 1 > ended up being a big enough challenge for 3.4), but the ideas in this thread > would definitely be worth exploring further in that context. The previous mail's hint led to thinking about how to actually implement a no-API-changes internal cache for PyArg_ParseTupleAndKeywords. While not a perfect solution, that approach has the tremendous benefit of backwards compatibility with every existing extension. It seems after 20 years' evolution, getargs.c is quite resistant to change (read: it afflicts headaches and angst on the unweary), so instead I've spent my Friday evening exploring a rewrite. 
David

From z at etiol.net  Sat May 24 05:57:18 2014
From: z at etiol.net (Zero Piraeus)
Date: Fri, 23 May 2014 23:57:18 -0400
Subject: [Python-ideas] Disable all peephole optimizations
In-Reply-To: 
References: <537C9A5E.1060502@sotecware.net> <537CB6ED.2050906@nedbatchelder.com> <537D5A46.4040200@nedbatchelder.com> <537DFBCA.2070006@nedbatchelder.com> <20140522175910.GM10355@ando>
Message-ID: <20140524035718.GA13305@piedra>

:

On Thu, May 22, 2014 at 12:49:32PM -0600, Eric Snow wrote:
>
> -O0 -- no optimizations at all
> [...]
> -OO -- same as -Os (deprecate)

Making no optimization so easily visually confused with maximum
optimization isn't terribly good UI ...

 -[]z.

--
Zero Piraeus: inter caetera
http://etiol.net/pubkey.asc

From ned at nedbatchelder.com  Tue May 27 02:27:20 2014
From: ned at nedbatchelder.com (Ned Batchelder)
Date: Mon, 26 May 2014 20:27:20 -0400
Subject: [Python-ideas] Disable all peephole optimizations
In-Reply-To: 
References: <537C888D.7060903@nedbatchelder.com> <537D5A46.4040200@nedbatchelder.com> <537DFBCA.2070006@nedbatchelder.com> <20140522175910.GM10355@ando> <537E5D67.90101@nedbatchelder.com>
Message-ID: <5383DBE8.6020309@nedbatchelder.com>

On 5/23/14 1:22 PM, Guido van Rossum wrote:
> On Fri, May 23, 2014 at 10:17 AM, Eric Snow
> wrote:
>
>     On Fri, May 23, 2014 at 10:49 AM, Guido van Rossum
>     wrote:
>     > Looking at my own (frequent) use of coverage.py, I would be
>     totally fine if
>     > disabling peephole optimization only affected my app's code, and
>     kept using
>     > the precompiled stdlib. (How exactly this would work is left as
>     an exercise
>     > for the reader.)
>
>     Would it be a problem if .pyc files weren't generated or used (a la -B
>     or PYTHONDONTWRITEBYTECODE) when you ran coverage?
>
>
> In first approximation that would probably be okay, although it would
> make coverage even slower. I was envisioning something where it would
> still use, but not write, pyc files for the stdlib or site-packages,
> because the code in whose coverage I am interested is puny compared to
> the stdlib code it imports.

I was concerned about losing any time in test suites that are already
considered too slow. But I tried to do some controlled measurements of
these scenarios, and found the worst case (no .pyc available, and none
written) was only 2.8% slower than with full .pyc files available. When
I tried to measure stdlib .pyc's available, and no .pyc's for my code,
the results were actually very slightly faster than the typical case. I
think this points to the difficulty of controlling all the variables!

In any case, it seems that the penalty for avoiding the .pyc files is
not burdensome.

>
> --
> --Guido van Rossum (python.org/~guido)
>
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
-------------- next part --------------
An HTML attachment was scrubbed...
From ncoghlan at gmail.com  Tue May 27 04:40:37 2014
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 27 May 2014 12:40:37 +1000
Subject: [Python-ideas] Disable all peephole optimizations
In-Reply-To: <5383DBE8.6020309@nedbatchelder.com>
References: <537C888D.7060903@nedbatchelder.com>
 <537D5A46.4040200@nedbatchelder.com> <537DFBCA.2070006@nedbatchelder.com>
 <20140522175910.GM10355@ando> <537E5D67.90101@nedbatchelder.com>
 <5383DBE8.6020309@nedbatchelder.com>
Message-ID:

On 27 May 2014 10:28, "Ned Batchelder" wrote:
> I was concerned about losing any time in test suites that are already
> considered too slow. But I tried to do some controlled measurements
> of these scenarios, and found that the worst case (no .pyc available,
> and none written) was only 2.8% slower than having full .pyc files
> available.
>
> In any case, it seems that the penalty for avoiding the .pyc files is
> not burdensome.

Along these lines, how about making the environment variable something
like "PYTHONANALYSINGSOURCE", with the effects:

- bytecode files are neither read nor written
- all bytecode and AST optimisations are disabled

A use-case-oriented flag like that lets us tweak the definition as
needed in the future, unlike an option that is specific to turning off
the CPython peephole optimiser (e.g. we don't have an AST optimiser
yet, but disabling one would still be covered by an "analysing source"
flag).

Cheers,
Nick.

From haoyi.sg at gmail.com  Tue May 27 04:45:21 2014
From: haoyi.sg at gmail.com (Haoyi Li)
Date: Mon, 26 May 2014 19:45:21 -0700
Subject: [Python-ideas] Disable all peephole optimizations
In-Reply-To:
References: <537C888D.7060903@nedbatchelder.com>
 <537D5A46.4040200@nedbatchelder.com> <537DFBCA.2070006@nedbatchelder.com>
 <20140522175910.GM10355@ando> <537E5D67.90101@nedbatchelder.com>
 <5383DBE8.6020309@nedbatchelder.com>
Message-ID:

> - bytecode files are neither read nor written

Yay! That would be amazing...
On Mon, May 26, 2014 at 7:40 PM, Nick Coghlan wrote:
> Along these lines, how about making the environment variable
> something like "PYTHONANALYSINGSOURCE", with the effects:
>
> - bytecode files are neither read nor written
> - all bytecode and AST optimisations are disabled
>
> A use-case-oriented flag like that lets us tweak the definition as
> needed in the future, unlike an option that is specific to turning
> off the CPython peephole optimiser.
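For comparison, the closest existing knob only gives half of that
first bullet (a quick sketch; run it in a fresh interpreter):

    import sys

    # -B / PYTHONDONTWRITEBYTECODE (or setting this flag before your
    # own imports run) stops bytecode files from being *written*:
    sys.dont_write_bytecode = True

    import json  # no new .pyc is cached for this or later imports
    print(sys.dont_write_bytecode)

Existing .pyc files are still *read* if present, though, which is
precisely the extra behaviour the proposed flag would disable.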
From barry at python.org  Tue May 27 21:08:58 2014
From: barry at python.org (Barry Warsaw)
Date: Tue, 27 May 2014 15:08:58 -0400
Subject: [Python-ideas] Disable all peephole optimizations
References: <537C888D.7060903@nedbatchelder.com>
 <537C9A5E.1060502@sotecware.net> <537CB6ED.2050906@nedbatchelder.com>
 <537D5A46.4040200@nedbatchelder.com> <537DFBCA.2070006@nedbatchelder.com>
 <20140522175910.GM10355@ando> <537E5D67.90101@nedbatchelder.com>
Message-ID: <20140527150858.2ba75c1c@anarchist.wooz.org>

On May 23, 2014, at 09:49 AM, Guido van Rossum wrote:

>I would also like to remind people the reason why there are separate
>pyc and pyo files: they are separate to support precompilation of the
>standard library and installed 3rd party packages for different
>optimization levels.

In fact, Debian (and I'm sure other OSes with package managers)
precompiles source files at installation time. We have a couple of
bugs languishing to provide -OO optimization levels as an option when
doing this precompilation. I haven't pushed this forward because I got
side-tracked by the overloading of .pyo files for both the -O and -OO
optimization levels.

I agree that the flags, mechanisms, and semantics should be worked out
first, but I also think that PEP 3147 tagging will provide a nice UI
for the file system representation of the optimization levels.

death-to-pyo-files-ly y'rs,
-Barry
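P.S. The current PEP 3147 naming is easy to inspect from Python, and
shows exactly the .pyo overloading in question (output shown for a
CPython 3.4 interpreter; the tag varies with implementation and
version):

    import importlib.util

    # The interpreter tag is baked into the cache file name:
    print(importlib.util.cache_from_source('spam.py'))
    # -> '__pycache__/spam.cpython-34.pyc'

    # ...but the optimization level is not: -O and -OO both collapse
    # onto the same '.pyo' suffix.
    print(importlib.util.cache_from_source('spam.py',
                                            debug_override=False))
    # -> '__pycache__/spam.cpython-34.pyo'

Extending that tag to spell out the optimization level is the "nice
UI" referred to above.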