From tjreedy at udel.edu Sun Jun 1 00:21:00 2014 From: tjreedy at udel.edu (Terry Reedy) Date: Sat, 31 May 2014 18:21:00 -0400 Subject: [Python-Dev] Updating turtle.py In-Reply-To: <538A1A03.4020405@v.loewis.de> References: <538A1A03.4020405@v.loewis.de> Message-ID: On 5/31/2014 2:05 PM, "Martin v. Löwis" wrote: > Am 31.05.14 05:32, schrieb Terry Reedy: >> I have two areas of questions about updating turtle.py. First the module >> itself, then a turtle tracker issue versus code cleanup policies. >> >> A. Unlike most stdlib modules, turtle is copyrighted and licensed by an >> individual. >> ''' >> # turtle.py: a Tkinter based turtle graphics module for Python >> # Version 1.1b - 4. 5. 2009 >> # Copyright (C) 2006 - 2010 Gregor Lingl >> # email: glingl at aon.at >> ''' >> I am not sure what the copyright covers other than the exact text >> contributed, with updates, by Gregor. It certainly does not cover the >> API and whatever code he copied from the previous version (unless that >> was also by him, and I have no idea how much he copied when >> reimplementing). I don't think it should cover additions made by others >> either. Should there be another line to cover these? > > He should provide a contributor form, covering his past contributions. > Would you like to contact him about this? Thank you for the advice. I emailed him about the contributor form, the change notice in the file, and maintenance. > Adding a license up-front (as you propose) is counter-productive; the > author may not agree to your specific licensing terms. If he was > unwilling to agree to the contributor form (which I doubt, knowing > him personally), the only option would be to remove the code from the > distribution. > >> Responding today, I cautioned that clean-up only patches, such as she >> apparently would like to start with, are not in favor. > > I would not say that. 
I recall that I asked Gregor to make a number of > style changes before he submitted the code, and eventually agreed to the > code when I thought it was "good enough". However, continuing on that > path sounds reasonable to me. I am not sure what you mean by 'that path', to be continued on. > It is the mixing of clean-up patches with functional changes that is not > in favor. What I have understood from Guido is that 'blind' format changes, not part of working on the file, are not good as they could cause harm without direct benefit. On the other hand, you are saying that if the code is reviewed, then the format changes should be separate, possibly with a commit note that they are not 'blind'. >> Since she only marked the issue for 3.5, I also cautioned that 3.5-only >> cleanups would make fixing bugs in other issues harder. Is the code >> clean-up policy the same for all branches? > > I don't think that we should be taken hostage by merging restrictions > of the DVCS - we switched to the DVCS precisely with the promise that > merging would be easier. Given the number of bug fixes that the turtle > module has seen, which is minuscule in the last few years... I ran diff on the 3.4 and 3.5 versions of turtle.py and did not see any differences. So at the moment, forward porting is trivial. > I'd suggest that it is less work to restrict cleanup > to 3.5, and then deal with any forward-porting of bug fixing when it > actually happens. This would make it non-trivial for any patch hitting a difference. -- Terry Jan Reedy From steve at pearwood.info Sun Jun 1 10:11:39 2014 From: steve at pearwood.info (Steven D'Aprano) Date: Sun, 1 Jun 2014 18:11:39 +1000 Subject: [Python-Dev] Should standard library modules optimize for CPython? Message-ID: <20140601081139.GO10355@ando> I think I know the answer to this, but I'm going to ask it anyway... I know that there is a general policy of trying to write code in the standard library that does not disadvantage other implementations. 
How far does that go the other way? Should the standard library accept slower code because it will be much faster in other implementations? Briefly, I have a choice of algorithm for the median function in the statistics module. If I target CPython, I will use a naive but simple O(N log N) implementation based on sorting the list and returning the middle item. (That's what the module currently does.) But if I target PyPy, I will use an O(N) algorithm which knocks the socks off the naive version even for smaller lists. In CPython that's typically 2-5 times slower; in PyPy it's typically 3-8 times faster, and the bigger the data set the more the advantage. For the specific details, see http://bugs.python.org/issue21592 My feeling is that the CPython standard library should be written for CPython, that is, it should stick to the current naive implementation of median, and if PyPy wants to speed the function up, they can provide their own version of the module. I should *not* complicate the implementation by trying to detect which Python the code is running under and changing algorithms accordingly. However, I should put a comment in the module pointing at the tracker issue. Does this sound right to others? Thanks, -- Steve From stefan_ml at behnel.de Sun Jun 1 11:02:56 2014 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sun, 01 Jun 2014 11:02:56 +0200 Subject: [Python-Dev] Should standard library modules optimize for CPython? In-Reply-To: <20140601081139.GO10355@ando> References: <20140601081139.GO10355@ando> Message-ID: Steven D'Aprano, 01.06.2014 10:11: > Briefly, I have a choice of algorithm for the median function in the > statistics module. If I target CPython, I will use a naive but simple > O(N log N) implementation based on sorting the list and returning the > middle item. (That's what the module currently does.) But if I target > PyPy, I will use an O(N) algorithm which knocks the socks off the naive > version even for smaller lists. 
In CPython that's typically 2-5 times > slower; in PyPy it's typically 3-8 times faster, and the bigger the data > set the more the advantage. > > For the specific details, see http://bugs.python.org/issue21592 > > My feeling is that the CPython standard library should be written for > CPython, that is, it should stick to the current naive implementation of > median, and if PyPy wants to speed the function up, they can provide > their own version of the module. Note that if you compile the module with Cython, CPython heavily benefits from the new implementation, too, by a factor of 2-5x. So there isn't really a reason to choose between two implementations because of the two runtimes, just use the new one for both and compile it for CPython. I added the necessary bits to the ticket. Stefan From ncoghlan at gmail.com Sun Jun 1 14:31:17 2014 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 1 Jun 2014 22:31:17 +1000 Subject: [Python-Dev] Should standard library modules optimize for CPython? In-Reply-To: <20140601081139.GO10355@ando> References: <20140601081139.GO10355@ando> Message-ID: On 1 Jun 2014 18:13, "Steven D'Aprano" wrote: > > My feeling is that the CPython standard library should be written for > CPython, that is, it should stick to the current naive implementation of > median, and if PyPy wants to speed the function up, they can provide > their own version of the module. I should *not* complicate the > implementation by trying to detect which Python the code is running > under and changing algorithms accordingly. However, I should put a > comment in the module pointing at the tracker issue. Does this sound > right to others? One option is to set the pure Python module up to be paired with an accelerator module (and update the test suite accordingly), even if we *don't provide* an accelerator in CPython. 
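The pairing Nick describes is usually spelled as a pure-Python module that lets an optional accelerator shadow its definitions. A schematic sketch (module and accelerator names here are hypothetical, not actual stdlib layout):

```python
# Sketch of the pure-Python-module-plus-optional-accelerator pattern.
# "_median_accel" is a hypothetical accelerator module; in Nick's inverted
# case, CPython simply would not ship it, while another implementation could.

def median(data):
    """Naive pure-Python median: sort and take the middle item(s)."""
    data = sorted(data)
    n = len(data)
    if n == 0:
        raise ValueError("no median for empty data")
    if n % 2 == 1:
        return data[n // 2]
    return (data[n // 2 - 1] + data[n // 2]) / 2

# If an accelerator is present, its definitions replace the ones above;
# otherwise the pure-Python versions remain in effect.
try:
    from _median_accel import median  # hypothetical accelerator module
except ImportError:
    pass
```

The test suite can then be parameterized to exercise both the pure-Python and (where available) accelerated versions, which is what "update the test suite accordingly" refers to.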
That just inverts the more common case (where we have an accelerator written in C, but another implementation either doesn't need one, or just doesn't have one yet). Cheers, Nick. > > > Thanks, > > > > -- > Steve > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ncoghlan%40gmail.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From solipsis at pitrou.net Sun Jun 1 18:17:22 2014 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sun, 01 Jun 2014 18:17:22 +0200 Subject: [Python-Dev] Should standard library modules optimize for CPython? In-Reply-To: <20140601081139.GO10355@ando> References: <20140601081139.GO10355@ando> Message-ID: Le 01/06/2014 10:11, Steven D'Aprano a écrit : > > My feeling is that the CPython standard library should be written for > CPython, that is, it should stick to the current naive implementation of > median, and if PyPy wants to speed the function up, they can provide > their own version of the module. I should *not* complicate the > implementation by trying to detect which Python the code is running > under and changing algorithms accordingly. However, I should put a > comment in the module pointing at the tracker issue. Does this sound > right to others? It sounds ok to me. Regards Antoine. From benjamin at python.org Mon Jun 2 01:02:03 2014 From: benjamin at python.org (Benjamin Peterson) Date: Sun, 01 Jun 2014 16:02:03 -0700 Subject: [Python-Dev] [RELEASE] Python 2.7.7 Message-ID: <1401663723.32188.123962509.15B80988@webmail.messagingengine.com> I'm happy to announce the immediate availability of Python 2.7.7. Python 2.7.7 is a regularly scheduled bugfix release for the Python 2.7 series. This release includes months of accumulated bugfixes. 
All the changes in Python 2.7.7 are described in detail in the Misc/NEWS file of the source tarball. You can view it online at http://hg.python.org/cpython/raw-file/f89216059edf/Misc/NEWS The 2.7.7 release also contains fixes for two severe, if arcane, potential security vulnerabilities. The first was the possibility of reading arbitrary process memory using JSONDecoder.raw_decode. [1] (No other json APIs are affected.) The second security issue is an integer overflow in the strop module. [2] (You actually have no reason whatsoever to use the strop module.) Another security note for 2.7.7 is that the release includes a backport from Python 3 of hmac.compare_digest. This begins the implementation of PEP 466, Network Security Enhancements for Python 2.7.x. Downloads are at https://python.org/download/releases/2.7.7/ This is a production release. As always, please report bugs to http://bugs.python.org/ Build great things, Benjamin Peterson 2.7 Release Manager (on behalf of all of Python's contributors) [1] http://bugs.python.org/issue21529 [2] http://bugs.python.org/issue21530 From raymond.hettinger at gmail.com Mon Jun 2 02:13:54 2014 From: raymond.hettinger at gmail.com (Raymond Hettinger) Date: Sun, 1 Jun 2014 17:13:54 -0700 Subject: [Python-Dev] Updating turtle.py In-Reply-To: References: Message-ID: <455F20E5-429E-49A2-A652-CED483334BA1@gmail.com> On May 30, 2014, at 8:32 PM, Terry Reedy wrote: > B. Let's assume that turtle.py is, at least to some degree, fair game for fixes and enhancements. PSF Python PyLadies (Jessica Keller, Lynn Root) are participating in the 2014 GNOME Outreach Program for Women (OPW) https://wiki.python.org/moin/OPW/2014 . One of the projects (bottom of that page) is Graphical Python, in particular Turtle. > > A few days ago, Jessica posted > http://bugs.python.org/issue21573 Clean up turtle.py code formatting > "Lib/turtle.py has some code formatting issues. 
Let's clean them up to make the module easier to read as interns start working on it this summer." She wants to follow cleanup with code examination, fixes, and enhancements. If these modules are going to change (and Gregor gives us the go-ahead), I suggest we do real clean-ups, not shallow pep8/pylint micro-changes. I use these modules as part of a program to teach adults how to teach programming to children. I've had good success but think the code for several of the modules needs to be simplified. At some point, kids wrote some of this code but along the way it got "adultified", making it less useful for teaching younger kids. I would like to be involved in helping to improve these modules in a substantive way and would be happy to coach anyone who wants to undertake the effort and bring a useful patch to fruition. One thing I would not like to see happen is telling interns that their time is being well spent by pep-8 checking code in the standard library. It sends the wrong message about what constitutes an actual contribution to the core. There are plenty of useful things to do instead (we have an "easy" tag on tracker to highlight a few of them). Another thought is that there are tons of python projects that could use real help and those would likely be a better place to start than trying to patch mature standard library code (where the chance of regression, code churn, or rejection is much higher). Over the past few years, I've taught Python to over three thousand programmers and have gotten a number of them started in open source (a number of them are now active contributors to OpenStack for example), but I almost never direct them to take their baby steps in the Python core (unless they've found an actual defect or room for improvement). It's a bummer, but in mature code, almost every idea that occurs to a beginner is something that makes the code worse in some way -- that isn't always true but it happens often enough to be discouraging. 
Raymond -------------- next part -------------- An HTML attachment was scrubbed... URL: From raymond.hettinger at gmail.com Mon Jun 2 02:19:06 2014 From: raymond.hettinger at gmail.com (Raymond Hettinger) Date: Sun, 1 Jun 2014 17:19:06 -0700 Subject: [Python-Dev] Should standard library modules optimize for CPython? In-Reply-To: References: <20140601081139.GO10355@ando> Message-ID: <8556310D-E314-466E-9BED-E66FCD4C04F1@gmail.com> On Jun 1, 2014, at 9:17 AM, Antoine Pitrou wrote: > Le 01/06/2014 10:11, Steven D'Aprano a écrit : >> >> My feeling is that the CPython standard library should be written for >> CPython, that is, it should stick to the current naive implementation of >> median, and if PyPy wants to speed the function up, they can provide >> their own version of the module. I should *not* complicate the >> implementation by trying to detect which Python the code is running >> under and changing algorithms accordingly. However, I should put a >> comment in the module pointing at the tracker issue. Does this sound >> right to others? > > It sounds ok to me. That makes sense. Raymond -------------- next part -------------- An HTML attachment was scrubbed... URL: From stephen at xemacs.org Mon Jun 2 06:03:09 2014 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Mon, 02 Jun 2014 13:03:09 +0900 Subject: [Python-Dev] Updating turtle.py In-Reply-To: <455F20E5-429E-49A2-A652-CED483334BA1@gmail.com> References: <455F20E5-429E-49A2-A652-CED483334BA1@gmail.com> Message-ID: <878upg556q.fsf@uwakimon.sk.tsukuba.ac.jp> Raymond Hettinger writes: > One thing I would not like to see happen is telling interns that > their time is being well spent by pep-8 checking code in the > standard library. It sends the wrong message about what > constitutes an actual contribution to the core. There are plenty > of useful things to do instead (we have an "easy" tag on tracker to > highlight a few of them). 
I have to ask for a qualification here, at least in the case of GSoC interns. Of course the intern should contribute to the code, but they are also supposed to become developers in the community. Spending a few hours checking code for PEP-8-correctness is useful training in writing good core code going forward. I agree with you that if they don't move on after a day or so, they should be told to do so. OTOH, I haven't yet met an intern who was willing and able to write in good PEP 8 style to start with, let alone one who was willing to "waste his time" doing style-checking on existing code -- is this really a problem? I agree that they should be told that this is an investment in *their* skills, and at best of marginal value to *Python*, of course. As you point out, directing them away from core code to other projects requiring PEP 8 in their style guides is usually a good idea, too. > It's a bummer, but in mature code, almost every idea that occurs to > a beginner is something that makes the code worse in some way -- > that isn't always true but it happens often enough to be > discouraging. This is precisely why style-checking in the core may be a good idea for interns: assume the code is *good* code (it probably is), don't mess with the algorithms, but make the code "look right" according to project standards. The risk you cite is still there, but much less. It shows them what Pythonicity looks like at a deeper level than the relatively superficial[1] guidelines in PEP 8. Footnotes: [1] Not deprecatory. Consistent good looks are important. From ncoghlan at gmail.com Mon Jun 2 09:12:47 2014 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 2 Jun 2014 17:12:47 +1000 Subject: [Python-Dev] Updating turtle.py In-Reply-To: <878upg556q.fsf@uwakimon.sk.tsukuba.ac.jp> References: <455F20E5-429E-49A2-A652-CED483334BA1@gmail.com> <878upg556q.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: On 2 June 2014 14:03, Stephen J. 
Turnbull wrote: > Raymond Hettinger writes: > > It's a bummer, but in mature code, almost every idea that occurs to > > a beginner is something that makes the code worse in some way -- > > that isn't always true but it happens often enough to be > > discouraging. > > This is precisely why style-checking in the core may be a good idea > for interns: assume the code is *good* code (it probably is), don't > mess with the algorithms, but make the code "look right" according to > project standards. The risk you cite is still there, but much less. > It shows them what Pythonicity looks like at a deeper level than the > relatively superficial[1] guidelines in PEP 8. The problem from my perspective is that the standard library contains code where it's either old enough to predate the evolution of the conventions now documented in PEP 8, or else we declared some code (especially test code) "good enough" for inclusion because we *really* wanted the functionality it provided (the original ipaddr tests come to mind - I suspect that tracker issue is one of the specific cases Raymond is thinking of as well). Even if we had unlimited reviewer resources (which we don't), mechanical code cleanups tend to fall under the "if it ain't broke, don't fix it" guideline. That then sets us up for a conflict between folks just getting started and trying to be helpful, and those of us that are of the school of thought that sees a difference between "cleaning code up to make it easier to work on a subsequent bug fix or feature request" and "cleaning code up for the sake of cleaning it up". The latter is generally a bad idea, while the former may be a good idea, but it can be hard to explain the difference to folks that are more familiar with code bases started in the modern era where the ability to easily run automated tests and code analysis on every commit is almost assumed, rather than being seen as an exceptional situation. 
There's a reason the desire to "throw it out and start again with a clean slate" is a common trait amongst developers: green field programming is genuinely *more fun* than maintenance programming in most cases. I believe Raymond's concern (and mine) is that if the challenges of maintenance programming aren't made clear to potential contributors up front, they're going to be disappointed when their patches that might be fine for a green field project, or as part of the development of a particular feature or fix, are instead rejected as imposing too much risk for not enough gain when considered in isolation. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From martin at v.loewis.de Mon Jun 2 09:14:11 2014 From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=) Date: Mon, 02 Jun 2014 09:14:11 +0200 Subject: [Python-Dev] Updating turtle.py In-Reply-To: References: <538A1A03.4020405@v.loewis.de> Message-ID: <538C2443.3070702@v.loewis.de> Am 01.06.14 00:21, schrieb Terry Reedy: >>> Responding today, I cautioned that clean-up only patches, such as she >>> apparently would like to start with, are not in favor. >> >> I would not say that. I recall that I asked Gregor to make a number of >> style changes before he submitted the code, and eventually agreed to the >> code when I thought it was "good enough". However, continuing on that >> path sounds reasonable to me. > > I am not sure what you mean by 'that path', to be continued on. The path of improving the coding style of the turtle module. >> I'd suggest that it is less work to restrict cleanup >> to 3.5, and then deal with any forward-porting of bug fixing when it >> actually happens. > > This would make it non-trivial for any patch hitting a difference. Indeed. OTOH, it's simpler for anybody doing the code cleanup to do it only on one branch. 
Regards, Martin From victor.stinner at gmail.com Mon Jun 2 10:43:47 2014 From: victor.stinner at gmail.com (Victor Stinner) Date: Mon, 2 Jun 2014 10:43:47 +0200 Subject: [Python-Dev] Should standard library modules optimize for CPython? In-Reply-To: <20140601081139.GO10355@ando> References: <20140601081139.GO10355@ando> Message-ID: 2014-06-01 10:11 GMT+02:00 Steven D'Aprano : > My feeling is that the CPython standard library should be written for > CPython, Right. PyPy, Jython and IronPython already have their "own" standard library when they need a different implementation. PyPy: "lib_pypy" directory (lib-python is the CPython stdlib): https://bitbucket.org/pypy/pypy/src/ac52eb7bbbb059d0b8d001a2103774917cf7396f/lib_pypy/?at=default Jython: "Lib" directory (lib-python is the CPython stdlib): https://bitbucket.org/jython/jython/src/9cd9ab75eadea898e2e74af82ae414925d6a1135/Lib/?at=default IronPython: "IronPython.Modules" directory: http://ironpython.codeplex.com/SourceControl/latest#IronPython_Main/Languages/IronPython/IronPython.Modules/ See for example the _fsum.py module of Jython: https://bitbucket.org/jython/jython/src/9cd9ab75eadea898e2e74af82ae414925d6a1135/Lib/_fsum.py?at=default Victor From stephen at xemacs.org Mon Jun 2 10:46:40 2014 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Mon, 02 Jun 2014 17:46:40 +0900 Subject: [Python-Dev] Updating turtle.py In-Reply-To: References: <455F20E5-429E-49A2-A652-CED483334BA1@gmail.com> <878upg556q.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <87zjhv4s27.fsf@uwakimon.sk.tsukuba.ac.jp> Nick Coghlan writes: > Even if we had unlimited reviewer resources (which we don't), Raymond said "interns". We at least have a mentor. > There's a reason the desire to "throw it out and start again with a > clean slate" is a common trait amongst developers: You mean the Cascade of Attention-Deficit Teenagers development model? 
> I believe Raymond's concern (and mine) is that if the challenges of > maintenance programming aren't made clear to potential contributors > up front, So make it clear when the assignment is given. Remember, the point I'm making is that it's an investment for the intern, not for Python. If their code eventually gets relegated to a branch the may never ever get merged, that's a learning experience too -- they may have been told, and *thought* they signed up for that up front, but it's different when you actually get told, "it could be useful, but on balance let's not touch this code" or even "the 'owner' of the code doesn't have time to look at changes". It's not something I suggest as a "rite of initiation" for *all* interns. I just think it would be overkill to prohibit it in principle -- I have a couple of (non-Python) interns who would benefit from the exercise (their projects are greenfield code, so they have no "model code" to start from). It wasn't clear to me whether Raymond meant to go that far as a general prohibition. Regards, From fijall at gmail.com Mon Jun 2 10:48:20 2014 From: fijall at gmail.com (Maciej Fijalkowski) Date: Mon, 2 Jun 2014 10:48:20 +0200 Subject: [Python-Dev] Should standard library modules optimize for CPython? In-Reply-To: References: <20140601081139.GO10355@ando> Message-ID: On Mon, Jun 2, 2014 at 10:43 AM, Victor Stinner wrote: > 2014-06-01 10:11 GMT+02:00 Steven D'Aprano : >> My feeling is that the CPython standard library should be written for >> CPython, > > Right. PyPy, Jython and IronPython already have their "own" standard > library when they need a different implement. > > PyPy: "lib_pypy" directory (lib-python is the CPython stdlib): > https://bitbucket.org/pypy/pypy/src/ac52eb7bbbb059d0b8d001a2103774917cf7396f/lib_pypy/?at=default it's for stuff that's in CPython implemented in C, not a reimplementation of python stuff. 
we patched the most obvious CPython-specific hacks, but it's a losing battle, you guys will go way out of your way to squeeze an extra 2% by doing very obscure hacks. > > Jython: "Lib" directory (lib-python is the CPython stdlib): > https://bitbucket.org/jython/jython/src/9cd9ab75eadea898e2e74af82ae414925d6a1135/Lib/?at=default > > IronPython: "IronPython.Modules" directory: > http://ironpython.codeplex.com/SourceControl/latest#IronPython_Main/Languages/IronPython/IronPython.Modules/ > > See for example the _fsum.py module of Jython: > https://bitbucket.org/jython/jython/src/9cd9ab75eadea898e2e74af82ae414925d6a1135/Lib/_fsum.py?at=default > > Victor > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/fijall%40gmail.com From tjreedy at udel.edu Mon Jun 2 11:17:37 2014 From: tjreedy at udel.edu (Terry Reedy) Date: Mon, 02 Jun 2014 05:17:37 -0400 Subject: [Python-Dev] Updating turtle.py In-Reply-To: References: <455F20E5-429E-49A2-A652-CED483334BA1@gmail.com> <878upg556q.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: On 6/2/2014 3:12 AM, Nick Coghlan wrote: > Even if we had unlimited reviewer resources (which we don't), > mechanical code cleanups tend to fall under the "if it ain't broke, > don't fix it" guideline. That then sets us up for a conflict between > folks just getting started and trying to be helpful, and those of us > that are of the school of thought that sees a difference between > "cleaning code up to make it easier to work on a subsequent bug fix or In the case of turtle, Jessica said from the beginning that code cleanup would be for the purpose of understanding the code and making it easier to do bug fixes and enhancements. > feature request" and "cleaning code up for the sake of cleaning it > up". As you know, many outsiders think that we take PEP 8 more seriously than we do. 
The latter is generally a bad idea, while the former may be a > good idea, Lita seemed to quickly understand that being able to test a bug fix is more important than making it look pretty. In any case, I believe she is doing something else until we hear from Gregor or otherwise decide how to proceed with turtle. > but it can be hard to explain the difference to folks that > are more familiar with code bases started in the modern era where the > ability to easily run automated tests and code analysis on every > commit is almost assumed, rather than being seen as an exceptional > situation. -- Terry Jan Reedy From stefan_ml at behnel.de Mon Jun 2 12:32:36 2014 From: stefan_ml at behnel.de (Stefan Behnel) Date: Mon, 02 Jun 2014 12:32:36 +0200 Subject: [Python-Dev] Should standard library modules optimize for CPython? In-Reply-To: References: <20140601081139.GO10355@ando> Message-ID: Maciej Fijalkowski, 02.06.2014 10:48: > On Mon, Jun 2, 2014 at 10:43 AM, Victor Stinner wrote: >> 2014-06-01 10:11 GMT+02:00 Steven D'Aprano : >>> My feeling is that the CPython standard library should be written for >>> CPython, >> >> Right. PyPy, Jython and IronPython already have their "own" standard >> library when they need a different implement. >> >> PyPy: "lib_pypy" directory (lib-python is the CPython stdlib): >> https://bitbucket.org/pypy/pypy/src/ac52eb7bbbb059d0b8d001a2103774917cf7396f/lib_pypy/?at=default > > it's for stuff that's in CPython implemented in C, not a > reimplementation of python stuff. we patched the most obvious > CPython-specific hacks, but it's a loosing battle, you guys will go > way out of your way to squeeze extra 2% by doing very obscure hacks. Thus my proposal to compile the modules in CPython with Cython, rather than duplicating their code or making/keeping them CPython specific. I think reducing the urge to reimplement something in C is a good thing. 
Stefan From michael.haubenwallner at ssi-schaefer.com Mon Jun 2 20:11:15 2014 From: michael.haubenwallner at ssi-schaefer.com (Michael Haubenwallner) Date: Mon, 02 Jun 2014 20:11:15 +0200 Subject: [Python-Dev] use cases for "python-config" versus "pkg-config python" In-Reply-To: <5385F7E7.9090408@ssi-schaefer.com> References: <5385F7E7.9090408@ssi-schaefer.com> Message-ID: <538CBE43.7070303@ssi-schaefer.com> Hi, following up myself with a patch proposal: On 05/28/2014 04:51 PM, Michael Haubenwallner wrote: > Stumbling over problems on AIX (Modules/python.exp not found) building libxml2 as python module > let me wonder about the intended use-cases for 'python-config' and 'pkg-config python'. > > FWIW, I can see these distinct use cases here, and I'm kindly asking if I got them right: > > * Build an application containing a python interpreter (like python$EXE itself): > + link against libpython.so > + re-export symbols from libpython.so for python-modules (platform-specific) > + This is similar to build against any other library, thus > = 'python.pc' is installed (for 'pkg-config python'). > > * Build a python-module (like build/lib.-/*.so): > + no need to link against libpython.so, instead > + expect symbols from libpython.so to be available at runtime, platform-specific either as > + undefined symbols at build-time (Linux, others), or > + a list of symbols to import from "the main executable" (AIX) > + This is specific to python-modules, thus > = 'python-config' is installed. > Based on these use-cases, I'm on a trip towards a patch improving AIX support here, where the attached one is a draft against python-tip (next step is to have python-config not print $LIBS, but $LINKFORMODULE only). Thoughts? Thank you! 
/haubi/
-------------- next part --------------
diff -r dc3afbee4ad1 Makefile.pre.in
--- a/Makefile.pre.in   Mon Jun 02 01:32:23 2014 -0700
+++ b/Makefile.pre.in   Mon Jun 02 19:57:54 2014 +0200
@@ -87,6 +87,9 @@
 SGI_ABI=       @SGI_ABI@
 CCSHARED=      @CCSHARED@
 LINKFORSHARED= @LINKFORSHARED@
+BLINKFORSHARED=        @BLINKFORSHARED@
+LINKFORMODULE= @LINKFORMODULE@
+BLINKFORMODULE=        @BLINKFORMODULE@
 ARFLAGS=       @ARFLAGS@
 # Extra C flags added for building the interpreter object files.
 CFLAGSFORSHARED=@CFLAGSFORSHARED@
@@ -540,7 +543,7 @@
 # Build the interpreter
 $(BUILDPYTHON): Modules/python.o $(LIBRARY) $(LDLIBRARY) $(PY3LIBRARY)
-       $(LINKCC) $(PY_LDFLAGS) $(LINKFORSHARED) -o $@ Modules/python.o $(BLDLIBRARY) $(LIBS) $(MODLIBS) $(SYSLIBS) $(LDLAST)
+       $(LINKCC) $(PY_LDFLAGS) $(BLINKFORSHARED) -o $@ Modules/python.o $(BLDLIBRARY) $(LIBS) $(MODLIBS) $(SYSLIBS) $(LDLAST)

 platform: $(BUILDPYTHON) pybuilddir.txt
        $(RUNSHARED) $(PYTHON_FOR_BUILD) -c 'import sys ; from sysconfig import get_platform ; print(get_platform()+"-"+sys.version[0:3])' >platform
@@ -666,7 +669,7 @@
        fi

 Modules/_testembed: Modules/_testembed.o $(LIBRARY) $(LDLIBRARY) $(PY3LIBRARY)
-       $(LINKCC) $(PY_LDFLAGS) $(LINKFORSHARED) -o $@ Modules/_testembed.o $(BLDLIBRARY) $(LIBS) $(MODLIBS) $(SYSLIBS) $(LDLAST)
+       $(LINKCC) $(PY_LDFLAGS) $(BLINKFORSHARED) -o $@ Modules/_testembed.o $(BLDLIBRARY) $(LIBS) $(MODLIBS) $(SYSLIBS) $(LDLAST)

 ############################################################################
 # Importlib
@@ -1310,7 +1313,7 @@
 # pkgconfig directory
 LIBPC=         $(LIBDIR)/pkgconfig

-libainstall:   all python-config
+libainstalldirs:
        @for i in $(LIBDIR) $(LIBPL) $(LIBPC); \
        do \
                if test ! -d $(DESTDIR)$$i; then \
@@ -1319,6 +1322,16 @@
                else    true; \
                fi; \
        done
+
+# resolve Makefile variables eventually found in configured python.pc values
+$(DESTDIR)$(LIBPC)/python-$(VERSION).pc: Misc/python.pc Makefile libainstalldirs
+       @echo "Resolving more values for $(LIBPC)/python-$(VERSION).pc"; \
+       if test set = "$${PYTHON_PC_CONTENT:+set}"; \
+       then echo '$(PYTHON_PC_CONTENT)' | tr '@' '\n' > $@; \
+       else PYTHON_PC_CONTENT="`awk -v ORS='@' '{print $0}' < Misc/python.pc`" $(MAKE) $@ `grep = Misc/python.pc`; \
+       fi
+
+libainstall:   all python-config libainstalldirs $(DESTDIR)$(LIBPC)/python-$(VERSION).pc
        @if test -d $(LIBRARY); then :; else \
                if test "$(PYTHONFRAMEWORKDIR)" = no-framework; then \
                        if test "$(SHLIB_SUFFIX)" = .dll; then \
@@ -1338,7 +1351,6 @@
        $(INSTALL_DATA) Modules/Setup $(DESTDIR)$(LIBPL)/Setup
        $(INSTALL_DATA) Modules/Setup.local $(DESTDIR)$(LIBPL)/Setup.local
        $(INSTALL_DATA) Modules/Setup.config $(DESTDIR)$(LIBPL)/Setup.config
-       $(INSTALL_DATA) Misc/python.pc $(DESTDIR)$(LIBPC)/python-$(VERSION).pc
        $(INSTALL_SCRIPT) $(srcdir)/Modules/makesetup $(DESTDIR)$(LIBPL)/makesetup
        $(INSTALL_SCRIPT) $(srcdir)/install-sh $(DESTDIR)$(LIBPL)/install-sh
        $(INSTALL_SCRIPT) python-config.py $(DESTDIR)$(LIBPL)/python-config.py
@@ -1540,6 +1552,7 @@
        -rm -rf build platform
        -rm -rf $(PYTHONFRAMEWORKDIR)
        -rm -f python-config.py python-config
+       -rm -f Misc/python.pc

 # Make things extra clean, before making a distribution:
 # remove all generated files, even Makefile[.pre]
@@ -1612,7 +1625,7 @@
 .PHONY: frameworkinstallmaclib frameworkinstallapps frameworkinstallunixtools
 .PHONY: frameworkaltinstallunixtools recheck autoconf clean clobber distclean
 .PHONY: smelly funny patchcheck touch altmaninstall commoninstall
-.PHONY: gdbhooks
+.PHONY: gdbhooks libainstalldirs

 # IF YOU PUT ANYTHING HERE IT WILL GO AWAY
 
 # Local Variables:
diff -r dc3afbee4ad1 Misc/python-config.in
--- a/Misc/python-config.in     Mon Jun 02 01:32:23 2014 -0700
+++ b/Misc/python-config.in     Mon Jun 02 19:57:54 2014 +0200
@@ -55,7 +55,7 @@
         if not getvar('Py_ENABLE_SHARED'):
             libs.insert(0, '-L' + getvar('LIBPL'))
         if not getvar('PYTHONFRAMEWORK'):
-            libs.extend(getvar('LINKFORSHARED').split())
+            libs.extend(getvar('LINKFORMODULE').split())
         print(' '.join(libs))

     elif opt == '--extension-suffix':
diff -r dc3afbee4ad1 Misc/python-config.sh.in
--- a/Misc/python-config.sh.in  Mon Jun 02 01:32:23 2014 -0700
+++ b/Misc/python-config.sh.in  Mon Jun 02 19:57:54 2014 +0200
@@ -43,7 +43,6 @@
 LIBS="@LIBS@ $SYSLIBS -lpython${VERSION}${ABIFLAGS}"
 BASECFLAGS="@BASECFLAGS@"
 LDLIBRARY="@LDLIBRARY@"
-LINKFORSHARED="@LINKFORSHARED@"
 OPT="@OPT@"
 PY_ENABLE_SHARED="@PY_ENABLE_SHARED@"
 LDVERSION="@LDVERSION@"
@@ -53,6 +52,7 @@
 PYTHONFRAMEWORK="@PYTHONFRAMEWORK@"
 INCDIR="-I$includedir/python${VERSION}${ABIFLAGS}"
 PLATINCDIR="-I$includedir/python${VERSION}${ABIFLAGS}"
+LINKFORMODULE="@LINKFORMODULE@"

 # Scan for --help or unknown argument.
 for ARG in $*
@@ -88,15 +88,15 @@
         echo "$LIBS"
         ;;
     --ldflags)
-        LINKFORSHAREDUSED=
+        LINKFORMODULEUSED=
         if [ -z "$PYTHONFRAMEWORK" ] ; then
-            LINKFORSHAREDUSED=$LINKFORSHARED
+            LINKFORMODULEUSED=$LINKFORMODULE
         fi
         LIBPLUSED=
         if [ "$PY_ENABLE_SHARED" = "0" ] ; then
             LIBPLUSED="-L$LIBPL"
         fi
-        echo "$LIBPLUSED -L$libdir $LIBS $LINKFORSHAREDUSED"
+        echo "$LIBPLUSED -L$libdir $LIBS $LINKFORMODULEUSED"
         ;;
     --extension-suffix)
         echo "$SO"
diff -r dc3afbee4ad1 Misc/python.pc.in
--- a/Misc/python.pc.in Mon Jun 02 01:32:23 2014 -0700
+++ b/Misc/python.pc.in Mon Jun 02 19:57:54 2014 +0200
@@ -9,5 +9,5 @@
 Requires:
 Version: @VERSION@
 Libs.private: @LIBS@
-Libs: -L${libdir} -lpython@VERSION@@ABIFLAGS@
+Libs: -L${libdir} -lpython@VERSION@@ABIFLAGS@ @LINKFORSHARED@
 Cflags: -I${includedir}/python@VERSION@@ABIFLAGS@
diff -r dc3afbee4ad1 configure.ac
--- a/configure.ac      Mon Jun 02 01:32:23 2014 -0700
+++ b/configure.ac      Mon Jun 02 19:57:54 2014 +0200
@@ -1948,6 +1948,9 @@
 AC_SUBST(BLDSHARED)
 AC_SUBST(CCSHARED)
 AC_SUBST(LINKFORSHARED)
+AC_SUBST(BLINKFORSHARED)
+AC_SUBST(LINKFORMODULE) +AC_SUBST(BLINKFORMODULE) # SHLIB_SUFFIX is the extension of shared libraries `(including the dot!) # -- usually .so, .sl on HP-UX, .dll on Cygwin @@ -1975,8 +1978,8 @@ then case $ac_sys_system/$ac_sys_release in AIX*) - BLDSHARED="Modules/ld_so_aix \$(CC) -bI:Modules/python.exp" - LDSHARED="\$(BINLIBDEST)/config/ld_so_aix \$(CC) -bI:\$(BINLIBDEST)/config/python.exp" + BLDSHARED="Modules/ld_so_aix \$(CC) \$(BLINKFORMODULE)" + LDSHARED="\$(LIBPL)/ld_so_aix \$(CC) \$(LINKFORMODULE)" ;; IRIX/5*) LDSHARED="ld -shared";; IRIX*/6*) LDSHARED="ld ${SGI_ABI} -shared -all";; @@ -2136,13 +2139,21 @@ esac fi AC_MSG_RESULT($CCSHARED) -# LINKFORSHARED are the flags passed to the $(CC) command that links -# the python executable -- this is only needed for a few systems +# LINKFORSHARED are the flags passed to the $(CC) command that links an +# application using a python interpreter -- this is only needed for a few systems +# BLINKFORSHARED is for the python executable -- defaults to LINKFORSHARED +# LINKFORMODULE are the flags passed to the $(CC) command that links a +# modules to be imported by the python interpreter of such an application. +# BLINKFORMODULE is for modules built in this python's Modules/ directory. +# Use ${} here if necessary, as these end up in python-config.sh too. 
AC_MSG_CHECKING(LINKFORSHARED) if test -z "$LINKFORSHARED" then case $ac_sys_system/$ac_sys_release in - AIX*) LINKFORSHARED='-Wl,-bE:Modules/python.exp -lld';; + AIX*) BLINKFORSHARED='-Wl,-bE:Modules/python.exp -lld' + LINKFORSHARED='-Wl,-bE:${LIBPL}/python.exp -lld' + BLINKFORMODULE='-Wl,-bI:Modules/python.exp' + LINKFORMODULE='-Wl,-bI:${LIBPL}/python.exp';; hp*|HP*) LINKFORSHARED="-Wl,-E -Wl,+s";; # LINKFORSHARED="-Wl,-E -Wl,+s -Wl,+b\$(BINLIBDEST)/lib-dynload";; @@ -2193,6 +2204,9 @@ fi AC_MSG_RESULT($LINKFORSHARED) +test -n "${BLINKFORSHARED}" || BLINKFORSHARED="${LINKFORSHARED}" +test -n "${LINKFORMODULE}" || LINKFORMODULE="${LINKFORSHARED}" +test -n "${BLINKFORMODULE}" || BLINKFORMODULE="${LINKFORMODULE}" AC_SUBST(CFLAGSFORSHARED) AC_MSG_CHECKING(CFLAGSFORSHARED) From bcannon at gmail.com Mon Jun 2 20:28:40 2014 From: bcannon at gmail.com (Brett Cannon) Date: Mon, 02 Jun 2014 18:28:40 +0000 Subject: [Python-Dev] use cases for "python-config" versus "pkg-config python" References: <5385F7E7.9090408@ssi-schaefer.com> <538CBE43.7070303@ssi-schaefer.com> Message-ID: Patches sent to python-dev are typically ignored. Could you open an issue on bugs.python.org and upload it there? On Mon Jun 02 2014 at 2:20:43 PM, Michael Haubenwallner < michael.haubenwallner at ssi-schaefer.com> wrote: > Hi, > > following up myself with a patch proposal: > > On 05/28/2014 04:51 PM, Michael Haubenwallner wrote: > > Stumbling over problems on AIX (Modules/python.exp not found) building > libxml2 as python module > > let me wonder about the intended use-cases for 'python-config' and > 'pkg-config python'. 
> > > > FWIW, I can see these distinct use cases here, and I'm kindly asking if > I got them right: > > > > * Build an application containing a python interpreter (like python$EXE > itself): > > + link against libpython.so > > + re-export symbols from libpython.so for python-modules > (platform-specific) > > + This is similar to build against any other library, thus > > = 'python.pc' is installed (for 'pkg-config python'). > > > > * Build a python-module (like build/lib.-/*.so): > > + no need to link against libpython.so, instead > > + expect symbols from libpython.so to be available at runtime, > platform-specific either as > > + undefined symbols at build-time (Linux, others), or > > + a list of symbols to import from "the main executable" (AIX) > > + This is specific to python-modules, thus > > = 'python-config' is installed. > > > > Based on these use-cases, I'm on a trip towards a patch improving AIX > support here, > where the attached one is a draft against python-tip (next step is to have > python-config > not print $LIBS, but $LINKFORMODULE only). > > Thoughts? > > Thank you! > /haubi/ > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ > brett%40python.org > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From doko at ubuntu.com Mon Jun 2 21:57:31 2014 From: doko at ubuntu.com (Matthias Klose) Date: Mon, 02 Jun 2014 21:57:31 +0200 Subject: [Python-Dev] use cases for "python-config" versus "pkg-config python" In-Reply-To: <538CBE43.7070303@ssi-schaefer.com> References: <5385F7E7.9090408@ssi-schaefer.com> <538CBE43.7070303@ssi-schaefer.com> Message-ID: <538CD72B.7030402@ubuntu.com> Am 02.06.2014 20:11, schrieb Michael Haubenwallner: > Hi, > > following up myself with a patch proposal: > > On 05/28/2014 04:51 PM, Michael Haubenwallner wrote: >> Stumbling over problems on AIX (Modules/python.exp not found) building libxml2 as python module >> let me wonder about the intended use-cases for 'python-config' and 'pkg-config python'. >> >> FWIW, I can see these distinct use cases here, and I'm kindly asking if I got them right: >> >> * Build an application containing a python interpreter (like python$EXE itself): >> + link against libpython.so >> + re-export symbols from libpython.so for python-modules (platform-specific) >> + This is similar to build against any other library, thus >> = 'python.pc' is installed (for 'pkg-config python'). >> >> * Build a python-module (like build/lib.-/*.so): >> + no need to link against libpython.so, instead >> + expect symbols from libpython.so to be available at runtime, platform-specific either as >> + undefined symbols at build-time (Linux, others), or >> + a list of symbols to import from "the main executable" (AIX) >> + This is specific to python-modules, thus >> = 'python-config' is installed. >> > > Based on these use-cases, I'm on a trip towards a patch improving AIX support here, > where the attached one is a draft against python-tip (next step is to have python-config > not print $LIBS, but $LINKFORMODULE only). > > Thoughts? there is http://bugs.python.org/issue15590 I think it is worth improving, together with adding documentation, and maybe distinguishing the two use cases linking for a module or an embedded interpreter. 
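One way to make that distinction concrete would be to ship two pkg-config files, one per use case; a purely hypothetical sketch (no such split existed in 3.4, and the file names and fields here are invented for illustration):

```
# hypothetical python-embed.pc: for applications embedding an interpreter
Name: Python (embedding)
Description: Embed the Python interpreter
Libs: -L${libdir} -lpython3.4m @LINKFORSHARED@
Cflags: -I${includedir}/python3.4m

# hypothetical python.pc: for building extension modules
Name: Python (extensions)
Description: Build Python extension modules
Cflags: -I${includedir}/python3.4m
```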
Matthias From sturla.molden at gmail.com Tue Jun 3 17:13:11 2014 From: sturla.molden at gmail.com (Sturla Molden) Date: Tue, 3 Jun 2014 15:13:11 +0000 (UTC) Subject: [Python-Dev] Should standard library modules optimize for CPython? References: <20140601081139.GO10355@ando> Message-ID: <1521177704423500642.020210sturla.molden-gmail.com@news.gmane.org> Stefan Behnel wrote: > Thus my proposal to compile the modules in CPython with Cython, rather than > duplicating their code or making/keeping them CPython specific. I think > reducing the urge to reimplement something in C is a good thing. For algorithmic and numerical code, Numba has already proven that Python can be JIT compiled comparable to -O2 in C. For non-algorithmic code, the speed determinants are usually outside Python (e.g. the network connection). Numba is becoming what the "dead swallow" should have been. The question is rather should the standard library use a JIT compiler like Numba? Cython is great for writing C extensions while avoiding all the details of the Python C API. But for speeding up algorithmic code, Numba is easier to use. Sturla From stefan_ml at behnel.de Tue Jun 3 19:00:16 2014 From: stefan_ml at behnel.de (Stefan Behnel) Date: Tue, 03 Jun 2014 19:00:16 +0200 Subject: [Python-Dev] Should standard library modules optimize for CPython? In-Reply-To: <1521177704423500642.020210sturla.molden-gmail.com@news.gmane.org> References: <20140601081139.GO10355@ando> <1521177704423500642.020210sturla.molden-gmail.com@news.gmane.org> Message-ID: Sturla Molden, 03.06.2014 17:13: > Stefan Behnel wrote: > >> Thus my proposal to compile the modules in CPython with Cython, rather than >> duplicating their code or making/keeping them CPython specific. I think >> reducing the urge to reimplement something in C is a good thing. > > For algorithmic and numerical code, Numba has already proven that Python > can be JIT compiled comparable to -O2 in C.
For non-algorithmic code, the > speed determinants are usually outside Python (e.g. the network > connection). Numba is becoming what the "dead swallow" should have been. > The question is rather should the standard library use a JIT compiler like > Numba? Cython is great for writing C extensions while avoiding all the > details of the Python C API. But for speeding up algorithmic code, Numba is > easier to use. I certainly agree that a JIT compiler can do much better optimisations on Python code than a static compiler, especially data driven optimisations. However, Numba comes with major dependencies, even runtime dependencies. From previous discussions on this list, I gathered that there are major objections against adding such a large dependency to CPython since it can also just be installed as an external package if users want to have it. Static compilation, on the other hand, is a build time thing that adds no dependencies that CPython doesn't have already. Distributions can even package up the compiled .so files separately from the original .py/.pyc files, if they feel like it, to make them selectively installable. So the argument in favour is mostly a pragmatic one. If you can have 2-5x faster code essentially for free, why not just go for it? Stefan From sturla.molden at gmail.com Tue Jun 3 22:51:30 2014 From: sturla.molden at gmail.com (Sturla Molden) Date: Tue, 3 Jun 2014 20:51:30 +0000 (UTC) Subject: [Python-Dev] Should standard library modules optimize for CPython? References: <20140601081139.GO10355@ando> <1521177704423500642.020210sturla.molden-gmail.com@news.gmane.org> Message-ID: <1437293580423521164.103866sturla.molden-gmail.com@news.gmane.org> Stefan Behnel wrote: > So the > argument in favour is mostly a pragmatic one. If you can have 2-5x faster > code essentially for free, why not just go for it? It would be easier if the GIL or Cython's use of it was redesigned. Cython just grabs the GIL and holds on to it until it is manually released.
The standard lib cannot have packages that hold the GIL forever, as a Cython compiled module would do. Cython has to start sharing access to the GIL like the interpreter does. Sturla From rosuav at gmail.com Tue Jun 3 23:38:00 2014 From: rosuav at gmail.com (Chris Angelico) Date: Wed, 4 Jun 2014 07:38:00 +1000 Subject: [Python-Dev] %x formatting of floats - behaviour change since 3.4 Message-ID: I'm helping out with the micropython project and am finding that one of their tests fails on CPython 3.5 (fresh build from Mercurial this morning). It comes down to this: Python 3.4.1rc1 (default, May 5 2014, 14:28:34) [GCC 4.8.2] on linux Type "help", "copyright", "credits" or "license" for more information. >>> "%x"%16.0 '10' Python 3.5.0a0 (default:88814d1f8c32, Jun 4 2014, 07:29:32) [GCC 4.7.2] on linux Type "help", "copyright", "credits" or "license" for more information. >>> "%x"%16.0 Traceback (most recent call last): File "<stdin>", line 1, in <module> TypeError: %x format: an integer is required, not float Is this an intentional change? And if so, is it formally documented somewhere? I don't recall seeing anything about it, but my recollection doesn't mean much. ChrisA From victor.stinner at gmail.com Wed Jun 4 00:03:07 2014 From: victor.stinner at gmail.com (Victor Stinner) Date: Wed, 4 Jun 2014 00:03:07 +0200 Subject: [Python-Dev] %x formatting of floats - behaviour change since 3.4 In-Reply-To: References: Message-ID: Hi, 2014-06-03 23:38 GMT+02:00 Chris Angelico : > Is this an intentional change? And if so, is it formally documented > somewhere? I don't recall seeing anything about it, but my > recollection doesn't mean much. Yes, it's intentional. See the issue for the rationale: http://bugs.python.org/issue19995 Victor From eric at trueblade.com Wed Jun 4 00:02:31 2014 From: eric at trueblade.com (Eric V.
Smith) Date: Tue, 03 Jun 2014 18:02:31 -0400 Subject: [Python-Dev] %x formatting of floats - behaviour change since 3.4 In-Reply-To: References: Message-ID: <538E45F7.6060209@trueblade.com> On 6/3/2014 5:38 PM, Chris Angelico wrote: > I'm helping out with the micropython project and am finding that one > of their tests fails on CPython 3.5 (fresh build from Mercurial this > morning). It comes down to this: > > Python 3.4.1rc1 (default, May 5 2014, 14:28:34) > [GCC 4.8.2] on linux > Type "help", "copyright", "credits" or "license" for more information. >>>> "%x"%16.0 > '10' > > Python 3.5.0a0 (default:88814d1f8c32, Jun 4 2014, 07:29:32) > [GCC 4.7.2] on linux > Type "help", "copyright", "credits" or "license" for more information. >>>> "%x"%16.0 > Traceback (most recent call last): > File "<stdin>", line 1, in <module> > TypeError: %x format: an integer is required, not float > > Is this an intentional change? And if so, is it formally documented > somewhere? I don't recall seeing anything about it, but my > recollection doesn't mean much. http://bugs.python.org/issue19995 From rosuav at gmail.com Wed Jun 4 00:05:58 2014 From: rosuav at gmail.com (Chris Angelico) Date: Wed, 4 Jun 2014 08:05:58 +1000 Subject: [Python-Dev] %x formatting of floats - behaviour change since 3.4 In-Reply-To: References: Message-ID: On Wed, Jun 4, 2014 at 8:03 AM, Victor Stinner wrote: > 2014-06-03 23:38 GMT+02:00 Chris Angelico : >> Is this an intentional change? And if so, is it formally documented >> somewhere? I don't recall seeing anything about it, but my >> recollection doesn't mean much. > > Yes, it's intentional. See the issue for the rationale: > http://bugs.python.org/issue19995 Thanks! I'll fix (in this case, simply remove) the test and cite that issue.
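One way to keep such a test while staying version-agnostic is to convert integral floats explicitly before formatting; a hypothetical sketch (the helper name is invented here, not from micropython's test suite):

```python
def hex_format(value):
    # Hypothetical helper: make a %x-based test give the same output on
    # 3.4 and 3.5 by converting integral floats explicitly, since 3.5
    # no longer does the conversion implicitly (see issue 19995).
    if isinstance(value, float):
        if not value.is_integer():
            raise TypeError("%x needs an integral value")
        value = int(value)
    return "%x" % value

print(hex_format(16))     # same result on both 3.4 and 3.5
print(hex_format(16.0))   # works, instead of a TypeError on 3.5
```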
ChrisA From v+python at g.nevcal.com Wed Jun 4 00:26:00 2014 From: v+python at g.nevcal.com (Glenn Linderman) Date: Tue, 03 Jun 2014 15:26:00 -0700 Subject: [Python-Dev] %x formatting of floats - behaviour change since 3.4 In-Reply-To: References: Message-ID: <538E4B78.4070503@g.nevcal.com> On 6/3/2014 3:05 PM, Chris Angelico wrote: > On Wed, Jun 4, 2014 at 8:03 AM, Victor Stinner wrote: >> 2014-06-03 23:38 GMT+02:00 Chris Angelico : >>> Is this an intentional change? And if so, is it formally documented >>> somewhere? I don't recall seeing anything about it, but my >>> recollection doesn't mean much. >> Yes, it's intentional. See the issue for the rationale: >> http://bugs.python.org/issue19995 > Thanks! I'll fix (in this case, simply remove) the test and cite that issue. Wouldn't it be better to keep the test, but expect the operation to fail? -------------- next part -------------- An HTML attachment was scrubbed... URL: From rosuav at gmail.com Wed Jun 4 00:41:30 2014 From: rosuav at gmail.com (Chris Angelico) Date: Wed, 4 Jun 2014 08:41:30 +1000 Subject: [Python-Dev] %x formatting of floats - behaviour change since 3.4 In-Reply-To: <538E4B78.4070503@g.nevcal.com> References: <538E4B78.4070503@g.nevcal.com> Message-ID: On Wed, Jun 4, 2014 at 8:26 AM, Glenn Linderman wrote: > On 6/3/2014 3:05 PM, Chris Angelico wrote: > > On Wed, Jun 4, 2014 at 8:03 AM, Victor Stinner > wrote: > > 2014-06-03 23:38 GMT+02:00 Chris Angelico : > > Is this an intentional change? And if so, is it formally documented > somewhere? I don't recall seeing anything about it, but my > recollection doesn't mean much. > > Yes, it's intentional. See the issue for the rationale: > http://bugs.python.org/issue19995 > > Thanks! I'll fix (in this case, simply remove) the test and cite that issue. > > > Wouldn't it be better to keep the test, but expect the operation to fail? The way micropython does its tests is: Run CPython on a script, then run micropython on the same script. 
If the output differs, it's an error. The problem is, CPython 3.3 and CPython 3.5 give different output (one gives an exception, the other works as if int(x) had been given), so it's impossible for the test to be done right. My question was mainly to ascertain whether it's the tests or my system that needed fixing. ChrisA From steve at pearwood.info Wed Jun 4 03:01:43 2014 From: steve at pearwood.info (Steven D'Aprano) Date: Wed, 4 Jun 2014 11:01:43 +1000 Subject: [Python-Dev] Should standard library modules optimize for CPython? In-Reply-To: <20140601081139.GO10355@ando> References: <20140601081139.GO10355@ando> Message-ID: <20140604010143.GC10355@ando> On Sun, Jun 01, 2014 at 06:11:39PM +1000, Steven D'Aprano wrote: > I think I know the answer to this, but I'm going to ask it anyway... > > I know that there is a general policy of trying to write code in the > standard library that does not disadvantage other implementations. How > far does that go the other way? Should the standard library accept > slower code because it will be much faster in other implementations? [...] Thanks to everyone who replied! I just wanted to make a brief note to say that although I haven't been very chatty in this thread, I have been reading it, so thanks for the advice, it is appreciated. -- Steven From steve at pearwood.info Wed Jun 4 03:17:18 2014 From: steve at pearwood.info (Steven D'Aprano) Date: Wed, 4 Jun 2014 11:17:18 +1000 Subject: [Python-Dev] Internal representation of strings and Micropython Message-ID: <20140604011718.GD10355@ando> There is a discussion over at MicroPython about the internal representation of Unicode strings. Micropython is aimed at embedded devices, and so minimizing memory use is important, possibly even more important than performance. (I'm not speaking on their behalf, just commenting as an interested outsider.) At the moment, their Unicode support is patchy. 
They are talking about either: * Having a build-time option to restrict all strings to ASCII-only. (I think what they mean by that is that strings will be like Python 2 strings, ASCII-plus-arbitrary-bytes, not actually ASCII.) * Implementing Unicode internally as UTF-8, and giving up O(1) indexing operations. https://github.com/micropython/micropython/issues/657 Would either of these trade-offs be acceptable while still claiming "Python 3.4 compatibility"? My own feeling is that O(1) string indexing operations are a quality of implementation issue, not a deal breaker to call it a Python. I can't see any requirement in the docs that str[n] must take O(1) time, but perhaps I have missed something. -- Steven From donald at stufft.io Wed Jun 4 03:46:22 2014 From: donald at stufft.io (Donald Stufft) Date: Tue, 3 Jun 2014 21:46:22 -0400 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: <20140604011718.GD10355@ando> References: <20140604011718.GD10355@ando> Message-ID: <7B966E20-909B-4DC6-9DCC-2206A93763E9@stufft.io> I think UTF8 is the best option. > On Jun 3, 2014, at 9:17 PM, Steven D'Aprano wrote: > > There is a discussion over at MicroPython about the internal > representation of Unicode strings. Micropython is aimed at embedded > devices, and so minimizing memory use is important, possibly even > more important than performance. > > (I'm not speaking on their behalf, just commenting as an interested > outsider.) > > At the moment, their Unicode support is patchy. They are talking about > either: > > * Having a build-time option to restrict all strings to ASCII-only. > > (I think what they mean by that is that strings will be like Python 2 > strings, ASCII-plus-arbitrary-bytes, not actually ASCII.) > > * Implementing Unicode internally as UTF-8, and giving up O(1) > indexing operations. 
> > https://github.com/micropython/micropython/issues/657 > > > Would either of these trade-offs be acceptable while still claiming > "Python 3.4 compatibility"? > > My own feeling is that O(1) string indexing operations are a quality of > implementation issue, not a deal breaker to call it a Python. I can't > see any requirement in the docs that str[n] must take O(1) time, but > perhaps I have missed something. > > > > > -- > Steven > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/donald%40stufft.io From rosuav at gmail.com Wed Jun 4 04:32:12 2014 From: rosuav at gmail.com (Chris Angelico) Date: Wed, 4 Jun 2014 12:32:12 +1000 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: <20140604011718.GD10355@ando> References: <20140604011718.GD10355@ando> Message-ID: On Wed, Jun 4, 2014 at 11:17 AM, Steven D'Aprano wrote: > * Having a build-time option to restrict all strings to ASCII-only. > > (I think what they mean by that is that strings will be like Python 2 > strings, ASCII-plus-arbitrary-bytes, not actually ASCII.) What I was actually suggesting along those lines was that the str type still be notionally a Unicode string, but that any codepoints >127 would either raise an exception or blow an assertion, and all the code to handle multibyte representations would be compiled out. So there'd still be a difference between strings of text and streams of bytes, but all encoding and decoding to/from ASCII-compatible encodings would just point to the same bytes in RAM. Risk: Someone would implement that with assertions, then compile with assertions disabled, test only with ASCII, and have lurking bugs. 
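For reference, the linear scan that makes indexing O(N) under the UTF-8 option can be sketched in a few lines of Python (the helper name is invented, and a real implementation would be C, but the boundary-counting logic is the same):

```python
def utf8_codepoint_at(data, index):
    # Count lead bytes (anything except a 0b10xxxxxx continuation byte)
    # until the wanted code point is reached; this linear walk is the
    # O(N) cost of indexing a UTF-8 representation.
    seen = -1
    for pos, byte in enumerate(data):
        if byte & 0xC0 != 0x80:          # lead byte: starts a code point
            seen += 1
            if seen == index:
                end = pos + 1
                while end < len(data) and data[end] & 0xC0 == 0x80:
                    end += 1
                return data[pos:end].decode("utf-8")
    raise IndexError("string index out of range")

buf = "héllo".encode("utf-8")            # 6 bytes, but only 5 code points
print(utf8_codepoint_at(buf, 1))         # é
```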
ChrisA From ncoghlan at gmail.com Wed Jun 4 07:17:00 2014 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 4 Jun 2014 15:17:00 +1000 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: <20140604011718.GD10355@ando> References: <20140604011718.GD10355@ando> Message-ID: On 4 June 2014 11:17, Steven D'Aprano wrote: > My own feeling is that O(1) string indexing operations are a quality of > implementation issue, not a deal breaker to call it a Python. If string indexing & iteration is still presented to the user as "an array of code points", it should still avoid the bugs that plagued both Python 2 narrow builds and direct use of UTF-8 encoded Py2 strings. If they don't try to offer C API compatibility, it should be feasible to do it that way. If they *do* try to offer C API compatibility, they may have a problem. > I can't > see any requirement in the docs that str[n] must take O(1) time, but > perhaps I have missed something. There's a general expectation that indexing will be O(1) because all the builtin containers that support that syntax use it for O(1) lookup operations. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From guido at python.org Wed Jun 4 07:23:07 2014 From: guido at python.org (Guido van Rossum) Date: Tue, 3 Jun 2014 22:23:07 -0700 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: References: <20140604011718.GD10355@ando> Message-ID: On Tue, Jun 3, 2014 at 7:32 PM, Chris Angelico wrote: > On Wed, Jun 4, 2014 at 11:17 AM, Steven D'Aprano > wrote: > > * Having a build-time option to restrict all strings to ASCII-only. > > > > (I think what they mean by that is that strings will be like Python 2 > > strings, ASCII-plus-arbitrary-bytes, not actually ASCII.) 
> > What I was actually suggesting along those lines was that the str type > still be notionally a Unicode string, but that any codepoints >127 > would either raise an exception or blow an assertion, and all the code > to handle multibyte representations would be compiled out. That would be a pretty lousy option. So there'd > still be a difference between strings of text and streams of bytes, > but all encoding and decoding to/from ASCII-compatible encodings would > just point to the same bytes in RAM. > I suppose this is why you propose to reject 128-255? > Risk: Someone would implement that with assertions, then compile with > assertions disabled, test only with ASCII, and have lurking bugs. > Never mind disabling assertions -- even with enabled assertions you'd have to expect most Python programs to fail with non-ASCII input. Then again the UTF-8 option would be pretty devastating too for anything manipulating strings (especially since many Python APIs are defined using indexes, e.g. the re module). Why not support variable-width strings like CPython 3.4? -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From rosuav at gmail.com Wed Jun 4 08:51:12 2014 From: rosuav at gmail.com (Chris Angelico) Date: Wed, 4 Jun 2014 16:51:12 +1000 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: References: <20140604011718.GD10355@ando> Message-ID: On Wed, Jun 4, 2014 at 3:17 PM, Nick Coghlan wrote: > On 4 June 2014 11:17, Steven D'Aprano wrote: >> My own feeling is that O(1) string indexing operations are a quality of >> implementation issue, not a deal breaker to call it a Python. > > If string indexing & iteration is still presented to the user as "an > array of code points", it should still avoid the bugs that plagued > both Python 2 narrow builds and direct use of UTF-8 encoded Py2 > strings. It would. 
The downsides of a UTF-8 representation would be slower iteration and much slower (O(N)) indexing/slicing. ChrisA From martin at v.loewis.de Wed Jun 4 09:02:13 2014 From: martin at v.loewis.de (martin at v.loewis.de) Date: Wed, 04 Jun 2014 09:02:13 +0200 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: <20140604011718.GD10355@ando> References: <20140604011718.GD10355@ando> Message-ID: <20140604090213.Horde.iGDQDjno1ZQixW4P-6T4Mw1@webmail.df.eu> Zitat von Steven D'Aprano : > * Having a build-time option to restrict all strings to ASCII-only. > > (I think what they mean by that is that strings will be like Python 2 > strings, ASCII-plus-arbitrary-bytes, not actually ASCII.) An ASCII-plus-arbitrary-bytes type called "str" would prevent claiming "Python 3.4 compatibility" for sure. Restricting strings to ASCII (as Chris apparently actually suggested) would allow to claim compatibility with a stretch: existing Python code might not run on such an implementation. However, since a lot of existing Python code wouldn't run on MicroPython, anyway, one might claim to implement a Python 3.4 subset. > * Implementing Unicode internally as UTF-8, and giving up O(1) > indexing operations. > > Would either of these trade-offs be acceptable while still claiming > "Python 3.4 compatibility"? > > My own feeling is that O(1) string indexing operations are a quality of > implementation issue, not a deal breaker to call it a Python. I can't > see any requirement in the docs that str[n] must take O(1) time, but > perhaps I have missed something. I agree. It's an open question whether such an implementation would be practical, both in terms of existing Python code, and in terms of existing C extension modules that people might want to port to MicroPython. There are more things to consider for the internal implementation, in particular how the string length is implemented. Several alternatives exist: 1. store the UTF-8 length (i.e. memory size) 2. 
store the number of code points (i.e. Python len()) 3. store both 4. store neither, but use null termination instead Variant 3 is most run-time efficient, but could easily use 8 bytes just for the length, which could outweigh the storage of the actual data. Variants 1 and 2 lose on some operations (1 loses on computing len(), 2 loses on string concatenation). 4 would add the restriction of not allowing U+0000 in a string (which would be reasonable IMO), and make all length computations inefficient. However, it wouldn't be worse than standard C. Regards, Martin From rosuav at gmail.com Wed Jun 4 09:03:22 2014 From: rosuav at gmail.com (Chris Angelico) Date: Wed, 4 Jun 2014 17:03:22 +1000 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: References: <20140604011718.GD10355@ando> Message-ID: On Wed, Jun 4, 2014 at 3:23 PM, Guido van Rossum wrote: > On Tue, Jun 3, 2014 at 7:32 PM, Chris Angelico wrote: >> >> On Wed, Jun 4, 2014 at 11:17 AM, Steven D'Aprano >> wrote: >> > * Having a build-time option to restrict all strings to ASCII-only. >> > >> > (I think what they mean by that is that strings will be like Python 2 >> > strings, ASCII-plus-arbitrary-bytes, not actually ASCII.) >> >> What I was actually suggesting along those lines was that the str type >> still be notionally a Unicode string, but that any codepoints >127 >> would either raise an exception or blow an assertion, and all the code >> to handle multibyte representations would be compiled out. > > > That would be a pretty lousy option. > >> So there'd >> still be a difference between strings of text and streams of bytes, >> but all encoding and decoding to/from ASCII-compatible encodings would >> just point to the same bytes in RAM. > > I suppose this is why you propose to reject 128-255? Correct.
It would allow small devices to guarantee that strings are compact (MicroPython is aimed primarily at an embedded controller), guarantee identity transformations in several common encodings (and maybe this sort of build wouldn't ship with any non-ASCII-compat encodings at all), and never demonstrate behaviour different from CPython's except by explicitly failing. >> Risk: Someone would implement that with assertions, then compile with >> assertions disabled, test only with ASCII, and have lurking bugs. > > > Never mind disabling assertions -- even with enabled assertions you'd have > to expect most Python programs to fail with non-ASCII input. Right, which is why I don't like the idea. But you don't need non-ASCII characters to blink an LED or turn a servo, and there is significant resistance to the notion that appending a non-ASCII character to a long ASCII-only string requires the whole string to be copied and doubled in size (lots of heap space used). > Then again the UTF-8 option would be pretty devastating too for anything > manipulating strings (especially since many Python APIs are defined using > indexes, e.g. the re module). That's what I thought, too, but a quick poll on python-list suggests that indexing isn't nearly as common as I had thought it to be. On a smallish device, you won't have megabytes of string to index, so even O(N) indexing can't get pathological. (This would be an acknowledged limitation of micropython as a Unix Python - "it's designed for small programs, and it's performance-optimized for small programs, so it might get pathologically slow on certain large data manipulations".) > Why not support variable-width strings like CPython 3.4? That was my first recommendation, and in fact I started writing code to implement parts of PEP 393, with a view to basically doing it the same way in both Pythons. 
But discussion on the tracker issue showed a certain amount of hostility toward the potential expansion of strings, particularly in the worst-case example of appending a single SMP character onto a long ASCII string. ChrisA From rosuav at gmail.com Wed Jun 4 09:06:25 2014 From: rosuav at gmail.com (Chris Angelico) Date: Wed, 4 Jun 2014 17:06:25 +1000 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: <20140604090213.Horde.iGDQDjno1ZQixW4P-6T4Mw1@webmail.df.eu> References: <20140604011718.GD10355@ando> <20140604090213.Horde.iGDQDjno1ZQixW4P-6T4Mw1@webmail.df.eu> Message-ID: On Wed, Jun 4, 2014 at 5:02 PM, wrote: > There are more things to consider for the internal implementation, > in particular how the string length is implemented. Several alternatives > exist: > 1. store the UTF-8 length (i.e. memory size) > 2. store the number of code points (i.e. Python len()) > 3. store both > 4. store neither, but use null termination instead > > Variant 3 is most run-time efficient, but could easily use 8 bytes > just for the length, which could outweigh the storage of the actual > data. Variants 1 and 2 lose on some operations (1 loses on computing > len(), 2 loses on string concatenation). 3 would add the restriction > of not allowing U+0000 in a string (which would be reasonable IMO), > and make all length computations inefficient. However, it wouldn't > be worse than standard C. The current implementation stores a 16-bit length, which is both the memory size and the len(). As far as I can see, the memory size is never needed, so I'd just go for option 2; string concatenation is already known to be one of those operations that can be slow if you do it badly, and an optimized str.join() would cover the recommended use-case. 
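To make the trade-off in "option 2" concrete, here is a rough sketch (not MicroPython's actual code) of string operations over raw UTF-8 bytes: len() comes from a stored code-point count, while indexing has to scan for lead bytes, giving the O(N) behaviour discussed above.

```python
# Sketch only: in UTF-8, a code point starts at every byte that is NOT a
# continuation byte (continuation bytes match the bit pattern 10xxxxxx).

def utf8_len(data: bytes) -> int:
    # "Option 2": the Python-level len(), computable in one O(N) pass
    # at construction time (then stored, so len() itself stays O(1)).
    return sum(1 for b in data if b & 0xC0 != 0x80)

def utf8_index(data: bytes, index: int) -> str:
    # O(N) indexing: walk the bytes to find where character `index` starts.
    starts = [i for i, b in enumerate(data) if b & 0xC0 != 0x80]
    if index < 0:
        index += len(starts)
    if not 0 <= index < len(starts):
        raise IndexError("string index out of range")
    end = starts[index + 1] if index + 1 < len(starts) else len(data)
    return data[starts[index]:end].decode("utf-8")

s = "héllo ωorld"
data = s.encode("utf-8")
assert utf8_len(data) == len(s)      # 11 code points in 13 bytes
assert utf8_index(data, 6) == s[6]   # 'ω'
```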
ChrisA From dw+python-dev at hmmz.org Wed Jun 4 07:39:04 2014 From: dw+python-dev at hmmz.org (dw+python-dev at hmmz.org) Date: Wed, 4 Jun 2014 05:39:04 +0000 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: References: <20140604011718.GD10355@ando> Message-ID: <20140604053904.GA5309@k2> On Wed, Jun 04, 2014 at 03:17:00PM +1000, Nick Coghlan wrote: > There's a general expectation that indexing will be O(1) because all > the builtin containers that support that syntax use it for O(1) lookup > operations. Depending on your definition of built in, there is at least one standard library container that does not - collections.deque. Given the specialized kinds of application this Python implementation is targeted at, it seems UTF-8 is ideal considering the huge memory savings resulting from the compressed representation, and the reduced likelihood of there being any real need for serious text processing on the device. It is also unlikely that software or libraries like Django or Werkzeug would run on a microcontroller; more likely all the Python code would be custom, in which case replacing string indexing with iteration, or temporary conversion to a list, is easily done. In this context, while a fixed-width encoding may be the "correct" choice, it would also likely be the wrong choice. David From ja.py at farowl.co.uk Wed Jun 4 09:41:12 2014 From: ja.py at farowl.co.uk (Jeff Allen) Date: Wed, 04 Jun 2014 08:41:12 +0100 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: <20140604011718.GD10355@ando> References: <20140604011718.GD10355@ando> Message-ID: <538ECD98.5030309@farowl.co.uk> Jython uses UTF-16 internally -- probably the only sensible choice in a Python that can call Java. Indexing is O(N), fundamentally. By "fundamentally", I mean for those strings that have not yet noticed that they contain no supplementary (>0xffff) characters. I've toyed with making this O(1) universally.
Like Steven, I understand this to be a freedom afforded to implementers, rather than an issue of conformity. Jeff Allen On 04/06/2014 02:17, Steven D'Aprano wrote: > There is a discussion over at MicroPython about the internal > representation of Unicode strings. ... > My own feeling is that O(1) string indexing operations are a quality of > implementation issue, not a deal breaker to call it a Python. I can't > see any requirement in the docs that str[n] must take O(1) time, but > perhaps I have missed something. > From stephen at xemacs.org Wed Jun 4 11:36:20 2014 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Wed, 04 Jun 2014 18:36:20 +0900 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: <20140604053904.GA5309@k2> References: <20140604011718.GD10355@ando> <20140604053904.GA5309@k2> Message-ID: <87bnu93tkb.fsf@uwakimon.sk.tsukuba.ac.jp> dw+python-dev at hmmz.org writes: > Given the specialized kinds of application this Python > implementation is targetted at, it seems UTF-8 is ideal considering > the huge memory savings resulting from the compressed > representation, I think you really need to check what the applications are in detail. UTF-8 costs about 35% more storage for Japanese, and even more for Chinese, than does UTF-16. So if you might be using a lot of Asian localized strings, it might even be worth implementing PEP-393 to get the best of both worlds for most strings. From juraj.sukop at gmail.com Wed Jun 4 11:53:43 2014 From: juraj.sukop at gmail.com (Juraj Sukop) Date: Wed, 4 Jun 2014 11:53:43 +0200 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: <87bnu93tkb.fsf@uwakimon.sk.tsukuba.ac.jp> References: <20140604011718.GD10355@ando> <20140604053904.GA5309@k2> <87bnu93tkb.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: On Wed, Jun 4, 2014 at 11:36 AM, Stephen J. Turnbull wrote: > > I think you really need to check what the applications are in detail. 
> UTF-8 costs about 35% more storage for Japanese, and even more for > Chinese, than does UTF-16. "UTF-8 can be smaller even for Asian languages, e.g.: front page of Wikipedia Japan: 83 kB in UTF-8, 144 kB in UTF-16" From http://www.lua.org/wshop12/Ierusalimschy.pdf (p. 12) From dholth at gmail.com Wed Jun 4 12:41:05 2014 From: dholth at gmail.com (Daniel Holth) Date: Wed, 4 Jun 2014 06:41:05 -0400 Subject: [Python-Dev] Some notes about MicroPython from an observer Message-ID: - micropython is designed to run on a machine with 192 kilobytes of RAM and perhaps a megabyte of FLASH. The controller can execute read-only code directly from FLASH. There is no dynamic linker in this environment. (It also has a UNIX port). - However it does include a full Python parser and REPL, so the board can be programmed without a separate computer as opposed to, say, having to upload bytecode compiled on a regular computer. - It's definitely going to be a subset of Python. For example, func.__name__ is not supported - to make it more micro? - They have a C API. It is much different than the CPython C API. - It has more than one code emitter. A certain decorator causes a function to be compiled to ARM Thumb code instead of bytecode. - It even has an inline assembler that translates Python-syntax ARM assembly (to re-use the same parser) into machine code.
Most information from https://www.kickstarter.com/projects/214379695/micro-python-python-for-microcontrollers/posts and http://micropython.org/ From rosuav at gmail.com Wed Jun 4 12:51:36 2014 From: rosuav at gmail.com (Chris Angelico) Date: Wed, 4 Jun 2014 20:51:36 +1000 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: <20140604133857.13a0f0b9@x34f> References: <20140604011718.GD10355@ando> <20140604133857.13a0f0b9@x34f> Message-ID: On Wed, Jun 4, 2014 at 8:38 PM, Paul Sokolovsky wrote: > That's another reason why people don't like Unicode enforced upon them > - all the talk about supporting all languages and scripts is demagogy > and hypocrisy, given a choice, Unicode zealots would rather limit > people to Latin script then give up on their arbitrarily chosen, > one-among-thousands, > soon-to-be-replaced-by-apples'-and-microsofts'-"exciting-new" encoding. Wrong. I use and recommend Unicode, with UTF-8 for transmission, and I do not ever want to limit people to Latin-1 or any other such subset. Even though English is the only language I speak, I am *frequently* using non-ASCII characters (eg when I discuss mathematics on a MUD), and if I could be absolutely sure that everyone in the conversation correctly comprehended Unicode, I could do this with a lot more confidence. Unfortunately, the server I use just passes bytes in and out, and some clients assume CP-1252, others assume Latin-1, and others (including my Gypsum) try UTF-8 first and fall back on an eight-bit encoding (currently CP-1252 because of the first group). But in an ideal world, server and clients would all speak Unicode everywhere, and transmit and receive UTF-8. This is not hypocrisy, this is the way to work reliably. > Once again, my claim is what MicroPython implements now is more correct > - in a sense wider than technical - handling. We don't provide Unicode > encoding support, because it's highly bloated, but let people use any > encoding they like. 
That comes at some price, like length of strings in > characters are not know to runtime, only in bytes, but quite a lot of > applications can be written by having just that. The current implementation is flat-out lying, actually. It claims that it's storing Unicode codepoints (as per the Python spec) while actually storing bytes, and then it transmits those bytes to the console etc as-is. This is a bug. It needs to be fixed. The only question is, what form will the fix take? Will it be PEP 393's flexible fixed-width representation? UTF-8? UTF-16 (I hope not!)? A hybrid of Latin-1 where possible and UTF-8 otherwise? But something has to be done. ChrisA From rosuav at gmail.com Wed Jun 4 12:53:46 2014 From: rosuav at gmail.com (Chris Angelico) Date: Wed, 4 Jun 2014 20:53:46 +1000 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: <20140604133857.13a0f0b9@x34f> References: <20140604011718.GD10355@ando> <20140604133857.13a0f0b9@x34f> Message-ID: On Wed, Jun 4, 2014 at 8:38 PM, Paul Sokolovsky wrote: > And I'm saying that not to discourage Unicode addition to MicroPython, > but to hint that "force-force" approach implemented by CPython3 and > causing rage and split in the community is not appreciated. FWIW, it's Python 3 (the language) and not CPython 3.x (the implementation) that specifies Unicode strings in this way. I don't know why it has to cause a split in the community; this is the one way to make sure *everyone's* strings work perfectly, rather than having ASCII strings work fine and others start tripping over problems in various APIs. 
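The mismatch Chris describes is easy to demonstrate from the CPython side: a str that merely wraps its raw bytes stops agreeing with Unicode semantics at the first non-ASCII character.

```python
# Length and indexing diverge between code points and raw bytes as soon
# as a multi-byte character appears.
text = "naïve"
raw = text.encode("utf-8")

assert len(text) == 5       # 5 code points
assert len(raw) == 6        # 6 bytes: 'ï' encodes as 0xC3 0xAF
assert text[2] == "ï"       # indexing yields the character...
assert raw[2] == 0xC3       # ...or half of its byte sequence
```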
ChrisA From pmiscml at gmail.com Wed Jun 4 12:38:57 2014 From: pmiscml at gmail.com (Paul Sokolovsky) Date: Wed, 4 Jun 2014 13:38:57 +0300 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: References: <20140604011718.GD10355@ando> Message-ID: <20140604133857.13a0f0b9@x34f> Hello, On Wed, 4 Jun 2014 12:32:12 +1000 Chris Angelico wrote: > On Wed, Jun 4, 2014 at 11:17 AM, Steven D'Aprano > wrote: > > * Having a build-time option to restrict all strings to ASCII-only. > > > > (I think what they mean by that is that strings will be like > > Python 2 strings, ASCII-plus-arbitrary-bytes, not actually ASCII.) > > What I was actually suggesting along those lines was that the str type > still be notionally a Unicode string, but that any codepoints >127 > would either raise an exception or blow an assertion, That's another reason why people don't like Unicode enforced upon them - all the talk about supporting all languages and scripts is demagogy and hypocrisy, given a choice, Unicode zealots would rather limit people to Latin script then give up on their arbitrarily chosen, one-among-thousands, soon-to-be-replaced-by-apples'-and-microsofts'-"exciting-new" encoding. Once again, my claim is what MicroPython implements now is more correct - in a sense wider than technical - handling. We don't provide Unicode encoding support, because it's highly bloated, but let people use any encoding they like. That comes at some price, like length of strings in characters are not know to runtime, only in bytes, but quite a lot of applications can be written by having just that. And I'm saying that not to discourage Unicode addition to MicroPython, but to hint that "force-force" approach implemented by CPython3 and causing rage and split in the community is not appreciated. > and all the code > to handle multibyte representations would be compiled out. 
So there'd > still be a difference between strings of text and streams of bytes, > but all encoding and decoding to/from ASCII-compatible encodings would > just point to the same bytes in RAM. > > Risk: Someone would implement that with assertions, then compile with > assertions disabled, test only with ASCII, and have lurking bugs. > > ChrisA -- Best regards, Paul mailto:pmiscml at gmail.com From pmiscml at gmail.com Wed Jun 4 12:53:14 2014 From: pmiscml at gmail.com (Paul Sokolovsky) Date: Wed, 4 Jun 2014 13:53:14 +0300 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: References: <20140604011718.GD10355@ando> Message-ID: <20140604135314.4bb31d75@x34f> Hello, On Tue, 3 Jun 2014 22:23:07 -0700 Guido van Rossum wrote: [] > Never mind disabling assertions -- even with enabled assertions you'd > have to expect most Python programs to fail with non-ASCII input. > > Then again the UTF-8 option would be pretty devastating too for > anything manipulating strings (especially since many Python APIs are > defined using indexes, e.g. the re module). If the Unicode is slow (*), then obvious choice is not using Unicode when not needed. Too bad that's a bit hard in Python3, as it enforces Unicode everywhere, and dealing with efficient strings requires prefixing them with funny characters like "b", etc. * If Unicode if slow because it causes heap to bloat and go swap, the choice is still the same. > > Why not support variable-width strings like CPython 3.4? Because, like good deal of community, we hope that Python4 will get back to reality, and strings will be efficient (both for processing and storage) by default, and niche and marginal "Unicode string" type will be used explicitly (using funny prefixes, etc.), only when really needed. Ah, all these not so funny geek jokes about internals of language implementation, hope they didn't make somebody's day dull! 
> > -- > --Guido van Rossum (python.org/~guido) -- Best regards, Paul mailto:pmiscml at gmail.com From pmiscml at gmail.com Wed Jun 4 13:12:31 2014 From: pmiscml at gmail.com (Paul Sokolovsky) Date: Wed, 4 Jun 2014 14:12:31 +0300 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: References: <20140604011718.GD10355@ando> Message-ID: <20140604141231.3cdd4fdd@x34f> Hello, On Wed, 4 Jun 2014 17:03:22 +1000 Chris Angelico wrote: [] > > Why not support variable-width strings like CPython 3.4? > > That was my first recommendation, and in fact I started writing code > to implement parts of PEP 393, with a view to basically doing it the > same way in both Pythons. But discussion on the tracker issue showed a > certain amount of hostility toward the potential expansion of strings, > particularly in the worst-case example of appending a single SMP > character onto a long ASCII string. An alternative view is that the discussion on the tracker showed Python developers' mind-fixation on implementing something the way CPython does it. And I didn't yet go to that argument, but in the end, MicroPython does not try to rewrite CPython or compete with it. So, having few choices with pros and cons leading approximately to the tie among them, it's the least productive to make the same choice as CPython did. Even having "rule of thumb" of choosing not-a-CPython way would be more productive than having the same rule of thumb for blindly choosing CPython way. (Of course, actually it should be technical discussion based on the target requirements, like we hopefully did, with strong arguments against using something else but the de-facto standard transfer encoding for Unicode). 
> > ChrisA -- Best regards, Paul mailto:pmiscml at gmail.com From rosuav at gmail.com Wed Jun 4 13:17:12 2014 From: rosuav at gmail.com (Chris Angelico) Date: Wed, 4 Jun 2014 21:17:12 +1000 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: <20140604141231.3cdd4fdd@x34f> References: <20140604011718.GD10355@ando> <20140604141231.3cdd4fdd@x34f> Message-ID: On Wed, Jun 4, 2014 at 9:12 PM, Paul Sokolovsky wrote: > An alternative view is that the discussion on the tracker showed Python > developers' mind-fixation on implementing something the way CPython does > it. And I didn't yet go to that argument, but in the end, MicroPython > does not try to rewrite CPython or compete with it. So, having few > choices with pros and cons leading approximately to the tie among them, > it's the least productive to make the same choice as CPython did. I'm not a CPython dev, nor a Python dev, and I don't think any of the big names of CPython or Python has showed up on that tracker as yet. But why is "be different from CPython" such a valuable choice? CPython works. It's had many hours of dev time put into it. Problems have been identified and avoided. Throwing that out means throwing away a freely-given shoulder to stand on, in an Isaac Newton way. http://www.joelonsoftware.com/articles/fog0000000069.html ChrisA From dholth at gmail.com Wed Jun 4 13:35:28 2014 From: dholth at gmail.com (Daniel Holth) Date: Wed, 4 Jun 2014 07:35:28 -0400 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: References: <20140604011718.GD10355@ando> <20140604141231.3cdd4fdd@x34f> Message-ID: Can of worms, opened. On Jun 4, 2014 7:20 AM, "Chris Angelico" wrote: > On Wed, Jun 4, 2014 at 9:12 PM, Paul Sokolovsky wrote: > > An alternative view is that the discussion on the tracker showed Python > > developers' mind-fixation on implementing something the way CPython does > > it. 
And I didn't yet go to that argument, but in the end, MicroPython > > does not try to rewrite CPython or compete with it. So, having few > > choices with pros and cons leading approximately to the tie among them, > > it's the least productive to make the same choice as CPython did. > > I'm not a CPython dev, nor a Python dev, and I don't think any of the > big names of CPython or Python has showed up on that tracker as yet. > But why is "be different from CPython" such a valuable choice? CPython > works. It's had many hours of dev time put into it. Problems have been > identified and avoided. Throwing that out means throwing away a > freely-given shoulder to stand on, in an Isaac Newton way. > > http://www.joelonsoftware.com/articles/fog0000000069.html > > ChrisA > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/dholth%40gmail.com > -------------- next part -------------- An HTML attachment was scrubbed... URL: From kristjan at ccpgames.com Wed Jun 4 13:15:34 2014 From: kristjan at ccpgames.com (=?iso-8859-1?Q?Kristj=E1n_Valur_J=F3nsson?=) Date: Wed, 4 Jun 2014 11:15:34 +0000 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: <7B966E20-909B-4DC6-9DCC-2206A93763E9@stufft.io> References: <20140604011718.GD10355@ando> <7B966E20-909B-4DC6-9DCC-2206A93763E9@stufft.io> Message-ID: For those that haven't seen this: http://www.utf8everywhere.org/ > -----Original Message----- > From: Python-Dev [mailto:python-dev- > bounces+kristjan=ccpgames.com at python.org] On Behalf Of Donald Stufft > Sent: 4. j?n? 2014 01:46 > To: Steven D'Aprano > Cc: python-dev at python.org > Subject: Re: [Python-Dev] Internal representation of strings and > Micropython > > I think UTF8 is the best option. 
> From pmiscml at gmail.com Wed Jun 4 13:49:33 2014 From: pmiscml at gmail.com (Paul Sokolovsky) Date: Wed, 4 Jun 2014 14:49:33 +0300 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: References: <20140604011718.GD10355@ando> <20140604133857.13a0f0b9@x34f> Message-ID: <20140604144933.66e6c2f4@x34f> Hello, On Wed, 4 Jun 2014 20:53:46 +1000 Chris Angelico wrote: > On Wed, Jun 4, 2014 at 8:38 PM, Paul Sokolovsky > wrote: > > And I'm saying that not to discourage Unicode addition to > > MicroPython, but to hint that "force-force" approach implemented by > > CPython3 and causing rage and split in the community is not > > appreciated. > > FWIW, it's Python 3 (the language) and not CPython 3.x (the > implementation) that specifies Unicode strings in this way. Yeah, but it's CPython what dictates how language evolves (some people even think that it dictates how language should be implemented!), so all good parts belong to Python3, and all bad parts - to CPython3, right? ;-) > I don't > know why it has to cause a split in the community; this is the one way > to make sure *everyone's* strings work perfectly, rather than having > ASCII strings work fine and others start tripping over problems in > various APIs. It did cause split in the community, that's the fact, that's why Python2 and Python3 are at the respective positions. Anyway, I'm not interested in participating in that split, I did not yet uttered my opinion on that publicly enough, so I seized a chance to drop some witty remarks, but I don't want to start yet another Unicode flame. So, let's please be back to Unicode storage representation in MicroPython. So, https://github.com/micropython/micropython/issues/657 discussed technical aspects, in a recent mail on this list I expressed my opinion why following CPython way is not productive (for development satisfaction and evolution of Python community, to be explicit). 
Final argument I would have is that you certainly can implement Unicode support the PEP393 way - it would be enormous help and would be gladly accepted. The question, how useful it will be for MicroPython. It certainly will be useful to report passing of testsuites. But will it be *really* used? For microcontroller board, it might be too heavy (put simple, with it, people will be able to do less (== heap running out sooner)), than without it, so one may expect it to be disabled by default. Then POSIX port is there surely not to let people replace "python" command with "micropython" and run Django, but to let people develop and debug their apps with more comfort than on embedded board. So, it should behave close to MCU version, and would follow with MCU choice re: Unicode. That's actually the reason why I keep up this discussion - not for the sake of argument or to bash Python3's Unicode choices. With recent MicroPython announcement, we surely looked for more people to contribute to its development. But then we (or at least I can speak for myself), would like to make sure that these contribution are actually the most useful ones (for both MicroPython, and Python community in general, which gets more choices, rather than just getting N% smaller CPython rewrite). So, you're not sure how O(N) string indexing will work? But MicroPython offers a great opportunity to try! And it's something new and exciting, which surely will be useful (== will save people memory), not just something old and boring ;-). 
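For a rough, implementation-neutral sense of the memory argument, one can compare plain byte counts of the candidate representations (the fixed-width column below mimics PEP 393's rule of sizing the whole string for its widest character; allocator and header overhead is ignored):

```python
# Byte counts only; no claims about any particular implementation.
samples = {
    "ascii":    "blink the LED",
    "latin-1":  "naïve café",
    "japanese": "こんにちは世界",
    "smp":      "a long ascii string plus \U0001F40D",  # snake emoji
}

for name, s in samples.items():
    utf8 = len(s.encode("utf-8"))
    utf16 = len(s.encode("utf-16-le"))
    widest = max(map(ord, s))
    # PEP 393: 1, 2, or 4 bytes per code point for the *whole* string.
    fixed = len(s) * (1 if widest < 0x100 else 2 if widest < 0x10000 else 4)
    print(f"{name:9} utf-8={utf8:3}  utf-16={utf16:3}  fixed={fixed:3}")
```

Note how the Japanese sample reproduces Stephen Turnbull's point (UTF-8 costs roughly half again as much as UTF-16 there), while the SMP sample reproduces Chris's worst case (one astral character quadruples the fixed-width size of an otherwise-ASCII string).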
> > ChrisA -- Best regards, Paul mailto:pmiscml at gmail.com From dholth at gmail.com Wed Jun 4 14:17:16 2014 From: dholth at gmail.com (Daniel Holth) Date: Wed, 4 Jun 2014 08:17:16 -0400 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: <20140604144933.66e6c2f4@x34f> References: <20140604011718.GD10355@ando> <20140604133857.13a0f0b9@x34f> <20140604144933.66e6c2f4@x34f> Message-ID: If we're voting I think representing Unicode internally in micropython as utf-8 with O(N) indexing is a great idea, partly because I'm not sure indexing into strings is a good idea - lots of Unicode code points don't make sense by themselves; see also grapheme clusters. It would probably work great. On Wed, Jun 4, 2014 at 7:49 AM, Paul Sokolovsky wrote: > Hello, > > On Wed, 4 Jun 2014 20:53:46 +1000 > Chris Angelico wrote: > >> On Wed, Jun 4, 2014 at 8:38 PM, Paul Sokolovsky >> wrote: >> > And I'm saying that not to discourage Unicode addition to >> > MicroPython, but to hint that "force-force" approach implemented by >> > CPython3 and causing rage and split in the community is not >> > appreciated. >> >> FWIW, it's Python 3 (the language) and not CPython 3.x (the >> implementation) that specifies Unicode strings in this way. > > Yeah, but it's CPython what dictates how language evolves (some people > even think that it dictates how language should be implemented!), so all > good parts belong to Python3, and all bad parts - to CPython3, > right? ;-) > >> I don't >> know why it has to cause a split in the community; this is the one way >> to make sure *everyone's* strings work perfectly, rather than having >> ASCII strings work fine and others start tripping over problems in >> various APIs. > > It did cause split in the community, that's the fact, that's why > Python2 and Python3 are at the respective positions. 
Anyway, I'm not > interested in participating in that split, I did not yet uttered my > opinion on that publicly enough, so I seized a chance to drop some > witty remarks, but I don't want to start yet another Unicode flame. > > > > So, let's please be back to Unicode storage representation in > MicroPython. So, https://github.com/micropython/micropython/issues/657 > discussed technical aspects, in a recent mail on this list I expressed > my opinion why following CPython way is not productive (for development > satisfaction and evolution of Python community, to be explicit). > > Final argument I would have is that you certainly can implement Unicode > support the PEP393 way - it would be enormous help and would be gladly > accepted. The question, how useful it will be for MicroPython. It > certainly will be useful to report passing of testsuites. But will it > be *really* used? > > For microcontroller board, it might be too heavy (put simple, with it, > people will be able to do less (== heap running out sooner)), than > without it, so one may expect it to be disabled by default. Then POSIX > port is there surely not to let people replace "python" command > with "micropython" and run Django, but to let people develop and debug > their apps with more comfort than on embedded board. So, it should > behave close to MCU version, and would follow with MCU choice > re: Unicode. > > That's actually the reason why I keep up this discussion - not for the > sake of argument or to bash Python3's Unicode choices. With recent > MicroPython announcement, we surely looked for more people to > contribute to its development. But then we (or at least I can speak for > myself), would like to make sure that these contribution are actually > the most useful ones (for both MicroPython, and Python community in > general, which gets more choices, rather than just getting N% smaller > CPython rewrite). > > So, you're not sure how O(N) string indexing will work? 
But MicroPython > offers a great opportunity to try! And it's something new and exciting, > which surely will be useful (== will save people memory), not just > something old and boring ;-). > > >> >> ChrisA > > > -- > Best regards, > Paul mailto:pmiscml at gmail.com > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/dholth%40gmail.com From pmiscml at gmail.com Wed Jun 4 14:18:01 2014 From: pmiscml at gmail.com (Paul Sokolovsky) Date: Wed, 4 Jun 2014 15:18:01 +0300 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: References: <20140604011718.GD10355@ando> <20140604141231.3cdd4fdd@x34f> Message-ID: <20140604151801.4a08d40d@x34f> Hello, On Wed, 4 Jun 2014 21:17:12 +1000 Chris Angelico wrote: > On Wed, Jun 4, 2014 at 9:12 PM, Paul Sokolovsky > wrote: > > An alternative view is that the discussion on the tracker showed > > Python developers' mind-fixation on implementing something the way > > CPython does it. And I didn't yet go to that argument, but in the > > end, MicroPython does not try to rewrite CPython or compete with > > it. So, having few choices with pros and cons leading approximately > > to the tie among them, it's the least productive to make the same > > choice as CPython did. > > I'm not a CPython dev, nor a Python dev, and I don't think any of the > big names of CPython or Python has showed up on that tracker as yet. > But why is "be different from CPython" such a valuable choice? CPython > works. It's had many hours of dev time put into it. Exactly, CPython (already) exists, and it works, so people can just use it. MicroPython's aim is to go where CPython didn't, and couldn't, go. For that, it's got to be different, or it literally won't fit there, like CPython doesn't. 
[] -- Best regards, Paul mailto:pmiscml at gmail.com From Steve.Dower at microsoft.com Wed Jun 4 15:14:04 2014 From: Steve.Dower at microsoft.com (Steve Dower) Date: Wed, 4 Jun 2014 13:14:04 +0000 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: References: <20140604011718.GD10355@ando> <20140604133857.13a0f0b9@x34f> <20140604144933.66e6c2f4@x34f>, Message-ID: I'm agree with Daniel. Directly indexing into text suggests an attempted optimization that is likely to be incorrect for a set of strings. Splitting, regex, concatenation and formatting are really the main operations that matter, and MicroPython can optimize their implementation of these easily enough for O(N) indexing. Cheers, Steve Top-posted from my Windows Phone ________________________________ From: Daniel Holth Sent: ?6/?4/?2014 5:17 To: Paul Sokolovsky Cc: python-dev Subject: Re: [Python-Dev] Internal representation of strings and Micropython If we're voting I think representing Unicode internally in micropython as utf-8 with O(N) indexing is a great idea, partly because I'm not sure indexing into strings is a good idea - lots of Unicode code points don't make sense by themselves; see also grapheme clusters. It would probably work great. On Wed, Jun 4, 2014 at 7:49 AM, Paul Sokolovsky wrote: > Hello, > > On Wed, 4 Jun 2014 20:53:46 +1000 > Chris Angelico wrote: > >> On Wed, Jun 4, 2014 at 8:38 PM, Paul Sokolovsky >> wrote: >> > And I'm saying that not to discourage Unicode addition to >> > MicroPython, but to hint that "force-force" approach implemented by >> > CPython3 and causing rage and split in the community is not >> > appreciated. >> >> FWIW, it's Python 3 (the language) and not CPython 3.x (the >> implementation) that specifies Unicode strings in this way. 
> > Yeah, but it's CPython that dictates how the language evolves (some people > even think that it dictates how the language should be implemented!), so all > good parts belong to Python3, and all bad parts - to CPython3, > right? ;-) > >> I don't >> know why it has to cause a split in the community; this is the one way >> to make sure *everyone's* strings work perfectly, rather than having >> ASCII strings work fine and others start tripping over problems in >> various APIs. > > It did cause a split in the community, that's the fact, that's why > Python2 and Python3 are at the respective positions. Anyway, I'm not > interested in participating in that split, I have not yet uttered my > opinion on that publicly enough, so I seized a chance to drop some > witty remarks, but I don't want to start yet another Unicode flame. > > > > So, let's please get back to Unicode storage representation in > MicroPython. So, https://github.com/micropython/micropython/issues/657 > discussed technical aspects, in a recent mail on this list I expressed > my opinion why following the CPython way is not productive (for development > satisfaction and evolution of the Python community, to be explicit). > > The final argument I would have is that you certainly can implement Unicode > support the PEP393 way - it would be an enormous help and would be gladly > accepted. The question is how useful it will be for MicroPython. It > certainly will be useful to report passing of testsuites. But will it > be *really* used? > > For a microcontroller board, it might be too heavy (put simply, with it, > people will be able to do less (== heap running out sooner)), than > without it, so one may expect it to be disabled by default. Then the POSIX > port is there surely not to let people replace the "python" command > with "micropython" and run Django, but to let people develop and debug > their apps with more comfort than on an embedded board. So, it should > behave close to the MCU version, and would follow the MCU choice > re: Unicode.
> > That's actually the reason why I keep up this discussion - not for the > sake of argument or to bash Python3's Unicode choices. With the recent > MicroPython announcement, we surely looked for more people to > contribute to its development. But then we (or at least I can speak for > myself) would like to make sure that these contributions are actually > the most useful ones (for both MicroPython and the Python community in > general, which gets more choices, rather than just getting an N% smaller > CPython rewrite). > > So, you're not sure how O(N) string indexing will work? But MicroPython > offers a great opportunity to try! And it's something new and exciting, > which surely will be useful (== will save people memory), not just > something old and boring ;-). > > >> >> ChrisA > > > -- > Best regards, > Paul mailto:pmiscml at gmail.com > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/dholth%40gmail.com _______________________________________________ Python-Dev mailing list Python-Dev at python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/steve.dower%40microsoft.com -------------- next part -------------- An HTML attachment was scrubbed...
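[Editorial note: Paul's heap concern above is easy to make concrete on CPython itself. Under PEP 393, appending a single non-ASCII character to a long ASCII string forces the whole string into a wider representation, while a UTF-8 byte string grows only by the bytes of the new character. A small sketch follows; the exact `sys.getsizeof` numbers vary by CPython version, so only the relative sizes matter:]

```python
import sys

ascii_s = "a" * 1000
wide_s = ascii_s + "\u20ac"   # one euro sign widens every character's storage

# Under PEP 393 the widened copy roughly doubles in size...
print(sys.getsizeof(ascii_s), sys.getsizeof(wide_s))

# ...while a UTF-8 encoding grows by just the 3 bytes of U+20AC.
print(len(ascii_s.encode("utf-8")), len(wide_s.encode("utf-8")))
```

On a heap measured in single-digit kilobytes, the difference between "add 3 bytes" and "recopy everything at two bytes per character" is precisely the trade-off being argued here.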
URL: From breamoreboy at yahoo.co.uk Wed Jun 4 15:29:51 2014 From: breamoreboy at yahoo.co.uk (Mark Lawrence) Date: Wed, 04 Jun 2014 14:29:51 +0100 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: <20140604135314.4bb31d75@x34f> References: <20140604011718.GD10355@ando> <20140604135314.4bb31d75@x34f> Message-ID: On 04/06/2014 11:53, Paul Sokolovsky wrote: > Hello, > > On Tue, 3 Jun 2014 22:23:07 -0700 > Guido van Rossum wrote: > > [] >> Never mind disabling assertions -- even with enabled assertions you'd >> have to expect most Python programs to fail with non-ASCII input. >> >> Then again the UTF-8 option would be pretty devastating too for >> anything manipulating strings (especially since many Python APIs are >> defined using indexes, e.g. the re module). > > If the Unicode is slow (*), then obvious choice is not using Unicode > when not needed. Too bad that's a bit hard in Python3, as it enforces > Unicode everywhere, and dealing with efficient strings requires > prefixing them with funny characters like "b", etc. > > * If Unicode if slow because it causes heap to bloat and go swap, the > choice is still the same. Where is your evidence that (presumably) CPython unicode is slow? What is your response to this message http://bugs.python.org/issue16061#msg171413 from the bug tracker? > >> >> Why not support variable-width strings like CPython 3.4? > > Because, like good deal of community, we hope that Python4 will get > back to reality, and strings will be efficient (both for processing and > storage) by default, and niche and marginal "Unicode string" type will > be used explicitly (using funny prefixes, etc.), only when really > needed. Where is your evidence that supports the above claim? > > > Ah, all these not so funny geek jokes about internals of language > implementation, hope they didn't make somebody's day dull! 
> >> >> -- >> --Guido van Rossum (python.org/~guido) > > > -- My fellow Pythonistas, ask not what our language can do for you, ask what you can do for our language. Mark Lawrence --- This email is free from viruses and malware because avast! Antivirus protection is active. http://www.avast.com From ncoghlan at gmail.com Wed Jun 4 15:33:01 2014 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 4 Jun 2014 23:33:01 +1000 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: <20140604053904.GA5309@k2> References: <20140604011718.GD10355@ando> <20140604053904.GA5309@k2> Message-ID: On 4 June 2014 15:39, wrote: > On Wed, Jun 04, 2014 at 03:17:00PM +1000, Nick Coghlan wrote: > >> There's a general expectation that indexing will be O(1) because all >> the builtin containers that support that syntax use it for O(1) lookup >> operations. > > Depending on your definition of built in, there is at least one standard > library container that does not - collections.deque. > > Given the specialized kinds of application this Python implementation is > targetted at, it seems UTF-8 is ideal considering the huge memory > savings resulting from the compressed representation, and the reduced > likelihood of there being any real need for serious text processing on > the device. Right - I wasn't clear that I think storing text internally as UTF-8 sounds fine for MicroPython. Anything where the O(N) nature of indexing by code point matters probably won't be run in that environment anyway. Cheers, Nick. 
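[Editorial note: the O(N) cost Nick mentions comes straight from UTF-8's self-describing lead bytes: to find the n-th code point you must walk past the first n. A pure-Python sketch of that walk over a `bytes` buffer, standing in for a hypothetical internal representation, assuming valid UTF-8 input:]

```python
def utf8_width(lead: int) -> int:
    """Number of bytes in a UTF-8 sequence, read off its lead byte."""
    if lead < 0x80:
        return 1          # 0xxxxxxx: ASCII
    if lead < 0xE0:
        return 2          # 110xxxxx: 2-byte sequence
    if lead < 0xF0:
        return 3          # 1110xxxx: 3-byte sequence
    return 4              # 11110xxx: 4-byte sequence

def utf8_index(buf: bytes, n: int) -> str:
    """Return the n-th code point of valid UTF-8 data in buf: an O(n) scan."""
    pos = 0
    for _ in range(n):
        pos += utf8_width(buf[pos])
    return buf[pos:pos + utf8_width(buf[pos])].decode("utf-8")
```

Iterating a whole string this way is still O(N) overall, since each step just advances the byte position; it is only *random* access that pays the scan each time.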
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From storchaka at gmail.com Wed Jun 4 15:39:46 2014 From: storchaka at gmail.com (Serhiy Storchaka) Date: Wed, 04 Jun 2014 16:39:46 +0300 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: <20140604011718.GD10355@ando> References: <20140604011718.GD10355@ando> Message-ID: 04.06.14 04:17, Steven D'Aprano wrote: > Would either of these trade-offs be acceptable while still claiming > "Python 3.4 compatibility"? > > My own feeling is that O(1) string indexing operations are a quality of > implementation issue, not a deal breaker to call it a Python. I can't > see any requirement in the docs that str[n] must take O(1) time, but > perhaps I have missed something. I think that breaking the O(1) expectation for indexing makes the implementation significantly incompatible with Python. Virtually all string operations in Python operate with indices. O(1) indexing operations can be kept with minimal memory requirements if Unicode is implemented internally as modified UTF-8 plus an optional array of offsets for every, say, 32nd character (which can even be compressed to an array of 16-bit or 32-bit integers). From dholth at gmail.com Wed Jun 4 16:01:12 2014 From: dholth at gmail.com (Daniel Holth) Date: Wed, 4 Jun 2014 10:01:12 -0400 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: References: <20140604011718.GD10355@ando> Message-ID: MicroPython is going to be significantly incompatible with Python anyway. But you should be able to run your mp code on regular Python. On Wed, Jun 4, 2014 at 9:39 AM, Serhiy Storchaka wrote: > 04.06.14 04:17, Steven D'Aprano wrote: > >> Would either of these trade-offs be acceptable while still claiming >> "Python 3.4 compatibility"? >> >> My own feeling is that O(1) string indexing operations are a quality of >> implementation issue, not a deal breaker to call it a Python.
I can't >> see any requirement in the docs that str[n] must take O(1) time, but >> perhaps I have missed something. > > > I think than breaking O(1) expectation for indexing makes the implementation > significant incompatible with Python. Virtually all string operations in > Python operates with indices. > > O(1) indexing operations can be kept with minimal memory requirements if > implement Unicode internally as modified UTF-8 plus optional array of > offsets for every, say, 32th character (which even can be compressed to an > array of 16-bit or 32-bit integers). > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/dholth%40gmail.com From p.f.moore at gmail.com Wed Jun 4 16:02:48 2014 From: p.f.moore at gmail.com (Paul Moore) Date: Wed, 4 Jun 2014 15:02:48 +0100 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: References: <20140604011718.GD10355@ando> Message-ID: On 4 June 2014 14:39, Serhiy Storchaka wrote: > I think than breaking O(1) expectation for indexing makes the implementation > significant incompatible with Python. Virtually all string operations in > Python operates with indices. I don't use indexing on strings except in rare situations. Sure I use lots of operations that may well use indexing *internally* but that's the point. MicroPython can optimise those operations without needing to guarantee O(1) indexing, and I'd be fine with that. 
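[Editorial note: Serhiy's compromise quoted above - UTF-8 storage plus a sparse array of offsets - is straightforward to sketch. With a byte-offset checkpoint every K code points, `s[n]` costs at most K-1 forward steps instead of n, at the price of one stored offset per K characters. The class and names below are illustrative, not any real implementation:]

```python
def _width(lead: int) -> int:
    # Width in bytes of a UTF-8 sequence, read off its lead byte.
    return 1 if lead < 0x80 else 2 if lead < 0xE0 else 3 if lead < 0xF0 else 4

class IndexedUTF8:
    """UTF-8 text with a byte-offset checkpoint every K code points."""
    K = 32

    def __init__(self, text: str):
        self._buf = text.encode("utf-8")
        self._len = len(text)
        self._offsets = []          # byte offsets of code points 0, K, 2K, ...
        pos = 0
        for i, ch in enumerate(text):
            if i % self.K == 0:
                self._offsets.append(pos)
            pos += len(ch.encode("utf-8"))

    def __len__(self):
        return self._len

    def __getitem__(self, n: int) -> str:
        if not 0 <= n < self._len:
            raise IndexError(n)
        pos = self._offsets[n // self.K]   # jump to the nearest checkpoint
        for _ in range(n % self.K):        # then walk at most K-1 characters
            pos += _width(self._buf[pos])
        return self._buf[pos:pos + _width(self._buf[pos])].decode("utf-8")
```

For an all-ASCII string the offset array is pure overhead, which is presumably why Serhiy makes it optional; a flag bit saying "no multi-byte sequences seen" would restore plain O(1) arithmetic in that common case.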
Paul From steve at pearwood.info Wed Jun 4 16:12:45 2014 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 5 Jun 2014 00:12:45 +1000 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: References: <20140604011718.GD10355@ando> <20140604133857.13a0f0b9@x34f> Message-ID: <20140604141245.GF10355@ando> On Wed, Jun 04, 2014 at 01:14:04PM +0000, Steve Dower wrote: > I'm agree with Daniel. Directly indexing into text suggests an > attempted optimization that is likely to be incorrect for a set of > strings. I'm afraid I don't understand this argument. The language semantics says that a string is an array of code points. Every index relates to a single code point, no code point extends over two or more indexes. There's a 1:1 relationship between code points and indexes. How is direct indexing "likely to be incorrect"? e.g. s = "---?---" offset = s.index('?') assert s[offset] == '?' That cannot fail with Python's semantics. [Aside: it does fail in Python 2, showing that the idea that "strings are bytes" is fatally broken. Fortunately Python has moved beyond that.] > Splitting, regex, concatenation and formatting are really the > main operations that matter, and MicroPython can optimize their > implementation of these easily enough for O(N) indexing. Really? Well, it will be a nice experiment. Fortunately MicroPython runs under Linux as well as on embedded systems (a clever decision, by the way) so I look forward to seeing how their internal-utf8 implementation stacks up against CPython's FSR implementation. Out of curiosity, when the FSR was proposed, did anyone consider an internal UTF-8 representation? If so, why was it rejected? 
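[Editorial note: the invariant Steven states holds even outside the Basic Multilingual Plane on Python 3.3+, which is exactly where byte-oriented indexing (and, before PEP 393, narrow-build UTF-16 indexing) used to go wrong. A small demonstration with a four-byte character:]

```python
s = "---\U0001F600---"          # U+1F600 lies outside the BMP

assert len(s) == 7              # one code point, one index
offset = s.index("\U0001F600")
assert s[offset] == "\U0001F600"

# The equivalent *byte* view is where the 1:1 relationship breaks down:
b = s.encode("utf-8")
assert len(b) == 10             # the emoji occupies bytes 3..6
assert b[3:7].decode("utf-8") == "\U0001F600"
```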
-- Steven From storchaka at gmail.com Wed Jun 4 16:17:29 2014 From: storchaka at gmail.com (Serhiy Storchaka) Date: Wed, 04 Jun 2014 17:17:29 +0300 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: References: <20140604011718.GD10355@ando> Message-ID: 04.06.14 10:03, Chris Angelico wrote: > Right, which is why I don't like the idea. But you don't need > non-ASCII characters to blink an LED or turn a servo, and there is > significant resistance to the notion that appending a non-ASCII > character to a long ASCII-only string requires the whole string to be > copied and doubled in size (lots of heap space used). But you need non-ASCII characters to display a title of MP3 track. From rosuav at gmail.com Wed Jun 4 16:26:10 2014 From: rosuav at gmail.com (Chris Angelico) Date: Thu, 5 Jun 2014 00:26:10 +1000 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: References: <20140604011718.GD10355@ando> Message-ID: On Thu, Jun 5, 2014 at 12:17 AM, Serhiy Storchaka wrote: > 04.06.14 10:03, Chris Angelico wrote: > >> Right, which is why I don't like the idea. But you don't need >> non-ASCII characters to blink an LED or turn a servo, and there is >> significant resistance to the notion that appending a non-ASCII >> character to a long ASCII-only string requires the whole string to be >> copied and doubled in size (lots of heap space used). > > > But you need non-ASCII characters to display a title of MP3 track. Agreed. IMO, any Python, no matter how micro, needs full Unicode support; but there is resistance from uPy's devs.
ChrisA From storchaka at gmail.com Wed Jun 4 16:40:14 2014 From: storchaka at gmail.com (Serhiy Storchaka) Date: Wed, 04 Jun 2014 17:40:14 +0300 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: References: <20140604011718.GD10355@ando> Message-ID: 04.06.14 17:02, Paul Moore wrote: > On 4 June 2014 14:39, Serhiy Storchaka wrote: >> I think that breaking the O(1) expectation for indexing makes the implementation >> significantly incompatible with Python. Virtually all string operations in >> Python operate with indices. > > I don't use indexing on strings except in rare situations. Sure I use > lots of operations that may well use indexing *internally* but that's > the point. MicroPython can optimise those operations without needing > to guarantee O(1) indexing, and I'd be fine with that. Any non-trivial text parsing uses indices or regular expressions (and regular expressions themselves use indices internally). It would be interesting to collect a statistic about how many indexing operations happen during the life of a string in a typical (Micro)Python program. From steve at pearwood.info Wed Jun 4 16:40:53 2014 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 5 Jun 2014 00:40:53 +1000 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: <20140604133857.13a0f0b9@x34f> References: <20140604011718.GD10355@ando> <20140604133857.13a0f0b9@x34f> Message-ID: <20140604144053.GG10355@ando> On Wed, Jun 04, 2014 at 01:38:57PM +0300, Paul Sokolovsky wrote: > That's another reason why people don't like Unicode enforced upon them Enforcing design and language decisions is the job of the programming language. You might as well complain that Python forces C doubles as the floating point type, or that it forces Bignums as the integer type, or that it forces significant indentation, or "class" as a keyword. Or that C forces you to use braces and manage your own memory.
That's the purpose of the language, to make those decisions as to what features to provide and what not to provide. > - all the talk about supporting all languages and scripts is demagogy > and hypocrisy, given a choice, Unicode zealots would rather limit > people to Latin script I have no words to describe how ridiculous this accusation is. > than give up on their arbitrarily chosen, one-among-thousands, > soon-to-be-replaced-by-apples'-and-microsofts'-"exciting-new" encoding. > Once again, my claim is that what MicroPython implements now is more correct > - in a sense wider than technical - handling. We don't provide Unicode > encoding support, because it's highly bloated, but let people use any > encoding they like. That comes at some price, like lengths of strings in > characters not being known at runtime, only in bytes What does uPy return for the length of '?'? If the answer is anything but 1, that's a bug. -- Steven From pmiscml at gmail.com Wed Jun 4 16:49:30 2014 From: pmiscml at gmail.com (Paul Sokolovsky) Date: Wed, 4 Jun 2014 17:49:30 +0300 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: References: <20140604011718.GD10355@ando> Message-ID: <20140604174930.3a5af45f@x34f> Hello, On Thu, 5 Jun 2014 00:26:10 +1000 Chris Angelico wrote: > On Thu, Jun 5, 2014 at 12:17 AM, Serhiy Storchaka > wrote: > > 04.06.14 10:03, Chris Angelico wrote: > > > >> Right, which is why I don't like the idea. But you don't need > >> non-ASCII characters to blink an LED or turn a servo, and there is > >> significant resistance to the notion that appending a non-ASCII > >> character to a long ASCII-only string requires the whole string to > >> be copied and doubled in size (lots of heap space used). > > > > > > But you need non-ASCII characters to display a title of MP3 track.
Yes, but to display a title, you don't need to do codepoint access at random - you need to either take a block of memory (length in bytes) and do something with it (pass to a C function, transfer over some bus, etc.), or *iterate in order* over codepoints in a string. All these operations are as efficient (O-notation) for UTF-8 as for UTF-32. Some operations are not going to be as fast, so - oops - avoid doing them without good reason. And kindly drop expectations that doing arbitrary operations on *Unicode* are as efficient as you imagined. (Note the *Unicode* in general, not particular flavor of which you got used to, up to thinking it's the one and only "right" flavor.) > Agreed. IMO, any Python, no matter how micro, needs full Unicode > support; but there is resistance from uPy's devs. FUD ;-). > > ChrisA -- Best regards, Paul mailto:pmiscml at gmail.com From rosuav at gmail.com Wed Jun 4 17:00:52 2014 From: rosuav at gmail.com (Chris Angelico) Date: Thu, 5 Jun 2014 01:00:52 +1000 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: <20140604174930.3a5af45f@x34f> References: <20140604011718.GD10355@ando> <20140604174930.3a5af45f@x34f> Message-ID: On Thu, Jun 5, 2014 at 12:49 AM, Paul Sokolovsky wrote: >> > But you need non-ASCII characters to display a title of MP3 track. > > Yes, but to display a title, you don't need to do codepoint access at > random - you need to either take a block of memory (length in bytes) and > do something with it (pass to a C function, transfer over some bus, > etc.), or *iterate in order* over codepoints in a string. All these > operations are as efficient (O-notation) for UTF-8 as for UTF-32. Suppose you have a long title, and you need to abbreviate it by dropping out words (delimited by whitespace), such that you keep the first word (always) and the last (if possible) and as many as possible in between. How are you going to write that? 
With PEP 393 or UTF-32 strings, you can simply record the index of every whitespace you find, count off lengths, and decide what to keep and what to ellipsize. > Some operations are not going to be as fast, so - oops - avoid doing > them without good reason. And kindly drop expectations that doing > arbitrary operations on *Unicode* are as efficient as you imagined. > (Note the *Unicode* in general, not particular flavor of which you got > used to, up to thinking it's the one and only "right" flavor.) Not sure what you mean by flavors of Unicode. Unicode is a mapping of codepoints to characters, not an in-memory representation. And I've been working with Python 3.3 since before it came out, and with Pike (which has a very similar model) for longer, and in both of them, I casually perform operations on Unicode strings in the same way that I used to perform operations on REXX strings (which were eight-bit in the current system codepage - 437 for us). I do expect those operations to be efficient, and I get what I expect. Maybe they won't be in uPy, but that would be a limitation of uPy, not a fundamental problem with Unicode. ChrisA From dholth at gmail.com Wed Jun 4 17:31:17 2014 From: dholth at gmail.com (Daniel Holth) Date: Wed, 4 Jun 2014 11:31:17 -0400 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: <20140604141245.GF10355@ando> References: <20140604011718.GD10355@ando> <20140604133857.13a0f0b9@x34f> <20140604141245.GF10355@ando> Message-ID: On Wed, Jun 4, 2014 at 10:12 AM, Steven D'Aprano wrote: > On Wed, Jun 04, 2014 at 01:14:04PM +0000, Steve Dower wrote: >> I'm agree with Daniel. Directly indexing into text suggests an >> attempted optimization that is likely to be incorrect for a set of >> strings. > > I'm afraid I don't understand this argument. The language semantics says > that a string is an array of code points. Every index relates to a > single code point, no code point extends over two or more indexes. 
> There's a 1:1 relationship between code points and indexes. How is > direct indexing "likely to be incorrect"? "Useful" is probably a better word. When you get into the complicated languages and you want to know how wide something is, and you might have y with two dots on it as one code point or two and left-to-right and right-to-left indicators and who knows what else... then looking at individual code points only works sometimes. I get the slicing idea. I like the idea that encoding to utf-8 would be the fastest thing you can do with a string. You could consider doing regexps in that domain, and other implementation specific optimizations in exactly the same way that any Python implementation has them. None of this would make it harder to move a servo. From Steve.Dower at microsoft.com Wed Jun 4 17:32:25 2014 From: Steve.Dower at microsoft.com (Steve Dower) Date: Wed, 4 Jun 2014 15:32:25 +0000 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: References: <20140604011718.GD10355@ando> <20140604174930.3a5af45f@x34f> Message-ID: <02b9a61658c04b11a21317da5b78bad6@BLUPR03MB389.namprd03.prod.outlook.com> Steven D'Aprano wrote: > The language semantics says that a string is an array of code points. Every > index relates to a single code point, no code point extends over two or more > indexes. > There's a 1:1 relationship between code points and indexes. How is direct > indexing "likely to be incorrect"? We're discussing the behaviour under a different (hypothetical) design decision than a 1:1 relationship between code points and indexes, so arguing from that stance doesn't make much sense. > e.g. > > s = "---?---" > offset = s.index('?') > assert s[offset] == '?' > > That cannot fail with Python's semantics. Agreed, and it shouldn't (I was actually referring to the optimization being incorrect for the goal, not the language semantics). 
What you'd probably find is that sizeof('?') == sizeof(s[offset]) == 2, which may be surprising, but is also correct. But what are you trying to achieve (why are you writing this code)? All this example really shows is that you're only using indexing for trivial purposes. Chris's example of an actual case where it may look like a good idea to use indexing for optimization makes this more obvious IMHO: Chris Angelico wrote: > Suppose you have a long title, and you need to abbreviate it by dropping out > words (delimited by whitespace), such that you keep the first word (always) and > the last (if possible) and as many as possible in between. How are you going to > write that? With PEP 393 or UTF-32 strings, you can simply record the index of > every whitespace you find, count off lengths, and decide what to keep and what > to ellipsize. "Recording the index" is where the optimization comes in. With a variable-length encoding - heck, even with a fixed-length one - I'd just use str.split(' ') (or re.split('\\s', string), depending on how much I care about the type of delimiter) and manipulate the list. If copying into a separate list is a problem (memory-wise), re.finditer('\\S+', string) also provides the same behaviour and gives me the sliced string, so there's no need to index for anything. The downside is that it isn't as easy to teach as the 1:1 relationship, and currently it doesn't perform as well *in CPython*. But if MicroPython is focusing on size over speed, I don't see any reason why they shouldn't permit different performance characteristics and require a slightly different approach to highly-optimized coding. In any case, this is an interesting discussion with a genuine effect on the Python interpreter ecosystem. 
Jython and IronPython already have different string implementations from CPython - having official (and hopefully flexible) guidance on deviations from the reference implementation would I think help other implementations provide even more value, which is only a good thing for Python. Cheers, Steve From pmiscml at gmail.com Wed Jun 4 17:38:31 2014 From: pmiscml at gmail.com (Paul Sokolovsky) Date: Wed, 4 Jun 2014 18:38:31 +0300 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: References: <20140604011718.GD10355@ando> Message-ID: <20140604183831.7226448c@x34f> Hello, On Wed, 04 Jun 2014 17:40:14 +0300 Serhiy Storchaka wrote: > 04.06.14 17:02, Paul Moore wrote: > > On 4 June 2014 14:39, Serhiy Storchaka wrote: > >> I think that breaking the O(1) expectation for indexing makes the > >> implementation significantly incompatible with Python. Virtually all > >> string operations in Python operate with indices. > > > > I don't use indexing on strings except in rare situations. Sure I > > use lots of operations that may well use indexing *internally* but > > that's the point. MicroPython can optimise those operations without > > needing to guarantee O(1) indexing, and I'd be fine with that. > > Any non-trivial text parsing uses indices or regular expressions (and > regular expressions themselves use indices internally). I keep hearing this stuff, and unfortunately so far don't have enough time to collect all that stuff and provide a detailed response. So, here's a spur-of-the-moment response - hopefully we're in the same context so it is easy to understand. So, gentlemen, you keep mixing up character-by-character random access to a string and taking substrings of a string. Character-by-character random access implies that you would need to scan thru (possibly almost) all chars in a string. That's O(N) (N = length of string). With a varlength encoding (taking O(N) to index an arbitrary char), there's thus concern that this would be an O(N^2) op.
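[Editorial note: the O(N^2) concern above is specifically about loops that index every position. The two functions below compute the same thing; the first performs N independent indexing operations (quadratic under a plain UTF-8 representation with no offset table), the second is a single forward pass under any representation:]

```python
def count_upper_by_index(s: str) -> int:
    # N indexing operations: O(N**2) if each s[i] is itself an O(N) scan.
    return sum(1 for i in range(len(s)) if s[i].isupper())

def count_upper_by_iteration(s: str) -> int:
    # One pass over the string: O(N) regardless of internal encoding.
    return sum(1 for ch in s if ch.isupper())
```

Both give the same answer; only the second idiom is representation-friendly, which is the style Paul argues for below.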
But show me a real-world case for that. The common use case is scanning a string left-to-right, which should be done using an iterator and is thus O(N). Right-to-left scanning would be order(s) of magnitude less frequent, and is also handled by an iterator. What's next? You're doing some funky anagrams and need to swap every 2 adjacent chars? Sorry, the naive implementation will be slow. If you're in serious anagram business, you'll need to code a C extension. No, wait! Instead you should learn Python better. You should run a string windowing iterator which will return adjacent pairs and swap those constant-length strings. More cases, anyone? Implementing DES and doing arbitrary permutations? Kindly drop doing that on strings; do it on bytes or lists. Hopefully, the idea is clear - if you *scan* thru a string using indexes in *random* order, you're doing a weird thing and *want* weird performance. Doing things like s[0] or s[-1] - there's a finite (and small) number of such operations per string. Now about taking substrings of strings (which in Python is often expressed by slice indexing). Well, this is quite different from scanning each character of a string. Just like s[0]/s[-1], this usually happens a finite number of times for a particular string, independent of its length, i.e. O(1) times (e.g. you take a string and split it in 3 parts), or maybe the number of substrings is not bound-fixed, but has a different growth order, O(M) (for example, you split a string into tokens; tokens can be long, but there are usually external limits on how many it's sensible to have on one line). So, again, you're not going to get quadratic time unless you're unlucky or sloppy. And again, you should brush up your Python skills and use the regex functions which return iterators to get your parsed tokens, etc. (To clarify the obvious - "you" here is an abstract pronoun, not referring to the respectable Python developers who actually made it possible to write efficient Python programs).
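[Editorial note: the "string windowing iterator" Paul gestures at for the anagram case can be written in a few lines. The string is consumed strictly left-to-right in one pass, with no random indexing, so it is O(N) under a UTF-8 representation too:]

```python
from itertools import zip_longest

def swap_adjacent(s: str) -> str:
    """Swap each pair of adjacent characters in a single forward pass."""
    it = iter(s)
    out = []
    # zip_longest over the same iterator yields (s[0], s[1]), (s[2], s[3]), ...
    for a, b in zip_longest(it, it, fillvalue=""):
        out.append(b)
        out.append(a)
    return "".join(out)

print(swap_adjacent("abcdef"))  # -> 'badcfe'
print(swap_adjacent("abcde"))   # -> 'badce'
```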
So, hopefully the point is conveyed - you can write inefficient Python programs. CPython goes out of the way to hide many inefficiencies (using unbelievably bloated heap usage - from uPy's point of view, which starts up in 2K heap). You just shouldn't write inefficient programs, voila. But if you want, you can keep writing inefficient programs, they just will be inefficient. Peace. > It would be interesting to collect a statistic about how many > indexing operations happened during the life of a string in typical > (Micro)Python program. Yup. -- Best regards, Paul mailto:pmiscml at gmail.com From Steve.Dower at microsoft.com Wed Jun 4 17:51:38 2014 From: Steve.Dower at microsoft.com (Steve Dower) Date: Wed, 4 Jun 2014 15:51:38 +0000 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: <20140604183831.7226448c@x34f> References: <20140604011718.GD10355@ando> <20140604183831.7226448c@x34f> Message-ID: <2a9cca53b0be4b41b425a8239d2dea77@BLUPR03MB389.namprd03.prod.outlook.com> Paul Sokolovsky wrote: > You just shouldn't write inefficient programs, voila. But if you want, you can keep writing inefficient programs, they just will be inefficient. Peace. Can I nominate this for QOTD? :) Cheers, Steve From breamoreboy at yahoo.co.uk Wed Jun 4 17:52:26 2014 From: breamoreboy at yahoo.co.uk (Mark Lawrence) Date: Wed, 04 Jun 2014 16:52:26 +0100 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: <02b9a61658c04b11a21317da5b78bad6@BLUPR03MB389.namprd03.prod.outlook.com> References: <20140604011718.GD10355@ando> <20140604174930.3a5af45f@x34f> <02b9a61658c04b11a21317da5b78bad6@BLUPR03MB389.namprd03.prod.outlook.com> Message-ID: On 04/06/2014 16:32, Steve Dower wrote: > > If copying into a separate list is a problem (memory-wise), re.finditer('\\S+', string) also provides the same behaviour and gives me the sliced string, so there's no need to index for anything. 
> Out of idle curiosity is there anything that stops MicroPython, or any other implementation for that matter, from providing views of a string rather than copying every time? IIRC memoryviews in CPython rely on the buffer protocol at the C API level, so since strings don't support this protocol you can't take a memoryview of them. Could this actually be implemented in the future, is the underlying C code just too complicated, or what? -- My fellow Pythonistas, ask not what our language can do for you, ask what you can do for our language. Mark Lawrence From pmiscml at gmail.com Wed Jun 4 17:53:52 2014 From: pmiscml at gmail.com (Paul Sokolovsky) Date: Wed, 4 Jun 2014 18:53:52 +0300 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: References: <20140604011718.GD10355@ando> <20140604174930.3a5af45f@x34f> Message-ID: <20140604185352.36f52959@x34f> Hello, On Thu, 5 Jun 2014 01:00:52 +1000 Chris Angelico wrote: > On Thu, Jun 5, 2014 at 12:49 AM, Paul Sokolovsky > wrote: > >> > But you need non-ASCII characters to display a title of MP3 > >> > track. > > > > Yes, but to display a title, you don't need to do codepoint access > > at random - you need to either take a block of memory (length in > > bytes) and do something with it (pass to a C function, transfer > > over some bus, etc.), or *iterate in order* over codepoints in a > > string. All these > > operations are as efficient (O-notation) for > > UTF-8 as for UTF-32. > > Suppose you have a long title, and you need to abbreviate it by > dropping out words (delimited by whitespace), such that you keep the > first word (always) and the last (if possible) and as many as possible > in between. How are you going to write that?
With PEP 393 or UTF-32 > strings, you can simply record the index of every whitespace you find, > count off lengths, and decide what to keep and what to ellipsize. I'll submit angry bugreport along the lines of "WWWHAT, it's 3.5 and there's still no str.isplit()??!!11", then do it with re.finditer() (while submitting another report on inconsistent naming scheme). [] -- Best regards, Paul mailto:pmiscml at gmail.com From songofacandy at gmail.com Wed Jun 4 18:45:51 2014 From: songofacandy at gmail.com (INADA Naoki) Date: Thu, 5 Jun 2014 01:45:51 +0900 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: <538ECD98.5030309@farowl.co.uk> References: <20140604011718.GD10355@ando> <538ECD98.5030309@farowl.co.uk> Message-ID: For Jython and IronPython, UTF-16 may be best internal encoding. Recent languages (Swift, Golang, Rust) chose UTF-8 as internal encoding. Using utf-8 is simple and efficient. For example, no need for utf-8 copy of the string when writing to file and serializing to JSON. When implementing Python using these languages, UTF-8 will be best internal encoding. To allow Python implementations other than CPython can use UTF-8 or UTF-16 as internal encoding efficiently, I think adding internal position based API is the best solution. >>> s = "\U00100000x" >>> len(s) 2 >>> s[1:] 'x' >>> s.find('x') 1 >>> # s.isize() # Internal length. 5 for utf-8, 3 for utf-16 >>> # s.ifind('x') # Internal position, 4 for utf-8, 2 for utf-16 >>> # s.islice(s.ifind('x')) => 'x' (I like design of golang and Rust. I hope CPython uses utf-8 as internal encoding in the future. But this is off-topic.) On Wed, Jun 4, 2014 at 4:41 PM, Jeff Allen wrote: > Jython uses UTF-16 internally -- probably the only sensible choice in a > Python that can call Java. Indexing is O(N), fundamentally. By > "fundamentally", I mean for those strings that have not yet noticed that > they contain no supplementary (>0xffff) characters.
> > I've toyed with making this O(1) universally. Like Steven, I understand this > to be a freedom afforded to implementers, rather than an issue of > conformity. > > Jeff Allen > > > On 04/06/2014 02:17, Steven D'Aprano wrote: >> >> There is a discussion over at MicroPython about the internal >> representation of Unicode strings. > > ... > >> My own feeling is that O(1) string indexing operations are a quality of >> implementation issue, not a deal breaker to call it a Python. I can't >> see any requirement in the docs that str[n] must take O(1) time, but >> perhaps I have missed something. >> > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/songofacandy%40gmail.com -- INADA Naoki From storchaka at gmail.com Wed Jun 4 18:49:18 2014 From: storchaka at gmail.com (Serhiy Storchaka) Date: Wed, 04 Jun 2014 19:49:18 +0300 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: <20140604183831.7226448c@x34f> References: <20140604011718.GD10355@ando> <20140604183831.7226448c@x34f> Message-ID: 04.06.14 18:38, Paul Sokolovsky wrote: >> Any non-trivial text parsing uses indices or regular expressions (and >> regular expressions themself use indices internally). > > I keep hearing this stuff, and unfortunately so far don't have enough > time to collect all that stuff and provide detailed response. So, > here's spur of the moment response - hopefully we're in the same > context so it is easy to understand. > > So, gentlemen, you keep mixing up character-by-character random access > to string and taking substrings of a string. > > Character-by-character random access imply that you would need to scan > thru (possibly almost) all chars in a string. That's O(N) (N-length of string).
With varlength encoding (taking O(N) to index arbitrary char), > there's thus concern that this would be O(N^2) op. > > But show me real-world case for that. Common usecase is scanning string > left-to-right, that should be done using iterator and thus O(N). > Right-to-left scanning would be order(s) of magnitude less frequent, as > and also handled by iterator. html.HTMLParser, json.JSONDecoder, re.compile, tokenize.tokenize don't use iterators. They use indices, str.find and/or regular expressions. Common use case is quickly find substring starting from current position using str.find or re.search, process found token, advance position and repeat. From python at mrabarnett.plus.com Wed Jun 4 18:52:17 2014 From: python at mrabarnett.plus.com (MRAB) Date: Wed, 04 Jun 2014 17:52:17 +0100 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: References: <20140604011718.GD10355@ando> <20140604053904.GA5309@k2> Message-ID: <538F4EC1.1030509@mrabarnett.plus.com> On 2014-06-04 14:33, Nick Coghlan wrote: > On 4 June 2014 15:39, wrote: >> On Wed, Jun 04, 2014 at 03:17:00PM +1000, Nick Coghlan wrote: >> >>> There's a general expectation that indexing will be O(1) because >>> all the builtin containers that support that syntax use it for >>> O(1) lookup operations. >> >> Depending on your definition of built in, there is at least one >> standard library container that does not - collections.deque. >> >> Given the specialized kinds of application this Python >> implementation is targetted at, it seems UTF-8 is ideal considering >> the huge memory savings resulting from the compressed >> representation, and the reduced likelihood of there being any real >> need for serious text processing on the device. > > Right - I wasn't clear that I think storing text internally as UTF-8 > sounds fine for MicroPython. Anything where the O(N) nature of > indexing by code point matters probably won't be run in that > environment anyway. 
> In order to avoid indexing, you could use some kind of 'cursor' class to > step forwards and backwards along strings. The cursor could include > both the codepoint index and the byte index. From pmiscml at gmail.com Wed Jun 4 19:05:20 2014 From: pmiscml at gmail.com (Paul Sokolovsky) Date: Wed, 4 Jun 2014 20:05:20 +0300 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: References: <20140604011718.GD10355@ando> <20140604183831.7226448c@x34f> Message-ID: <20140604200520.1d432329@x34f> Hello, On Wed, 04 Jun 2014 19:49:18 +0300 Serhiy Storchaka wrote: [] > > But show me real-world case for that. Common usecase is scanning > > string left-to-right, that should be done using iterator and thus > > O(N). Right-to-left scanning would be order(s) of magnitude less > > frequent, as and also handled by iterator. > > html.HTMLParser, json.JSONDecoder, re.compile, tokenize.tokenize > don't use iterators. They use indices, str.find and/or regular > expressions. Common use case is quickly find substring starting from > current position using str.find or re.search, process found token, > advance position and repeat. That's sad, I agree. -- Best regards, Paul mailto:pmiscml at gmail.com From storchaka at gmail.com Wed Jun 4 19:11:11 2014 From: storchaka at gmail.com (Serhiy Storchaka) Date: Wed, 04 Jun 2014 20:11:11 +0300 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: <538F4EC1.1030509@mrabarnett.plus.com> References: <20140604011718.GD10355@ando> <20140604053904.GA5309@k2> <538F4EC1.1030509@mrabarnett.plus.com> Message-ID: 04.06.14 19:52, MRAB wrote: > In order to avoid indexing, you could use some kind of 'cursor' class to > step forwards and backwards along strings. The cursor could include > both the codepoint index and the byte index. So you need different string library and different regular expression library.
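A minimal sketch of MRAB's cursor idea, assuming a hypothetical `Utf8Cursor` class (none of these names exist in any real implementation): the cursor tracks the codepoint index and the byte index together, so stepping one character in either direction is O(1) even though random indexing into UTF-8 is not.

```python
class Utf8Cursor:
    """Hypothetical cursor over a valid UTF-8 byte buffer."""

    def __init__(self, data):
        self.data = data      # bytes, assumed to be valid UTF-8
        self.byte_index = 0   # current offset in bytes
        self.char_index = 0   # current offset in codepoints

    def advance(self):
        """Step one codepoint forward; return False at end of buffer."""
        if self.byte_index >= len(self.data):
            return False
        b = self.data[self.byte_index]
        # The length of a UTF-8 sequence is determined by its lead byte.
        if b < 0x80:
            self.byte_index += 1
        elif b < 0xE0:
            self.byte_index += 2
        elif b < 0xF0:
            self.byte_index += 3
        else:
            self.byte_index += 4
        self.char_index += 1
        return True

    def retreat(self):
        """Step one codepoint back; return False at start of buffer."""
        if self.byte_index == 0:
            return False
        self.byte_index -= 1
        # Continuation bytes are 0b10xxxxxx, so scan back to the lead byte.
        while self.data[self.byte_index] & 0xC0 == 0x80:
            self.byte_index -= 1
        self.char_index -= 1
        return True
```

Serhiy's objection still stands, of course: to benefit from this, string methods and the regex engine would have to accept and return cursors rather than plain integer indices.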
From storchaka at gmail.com Wed Jun 4 19:35:06 2014 From: storchaka at gmail.com (Serhiy Storchaka) Date: Wed, 04 Jun 2014 20:35:06 +0300 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: <20140604174930.3a5af45f@x34f> References: <20140604011718.GD10355@ando> <20140604174930.3a5af45f@x34f> Message-ID: 04.06.14 17:49, Paul Sokolovsky wrote: > On Thu, 5 Jun 2014 00:26:10 +1000 > Chris Angelico wrote: >> On Thu, Jun 5, 2014 at 12:17 AM, Serhiy Storchaka >> wrote: >>> 04.06.14 10:03, Chris Angelico wrote: >>>> Right, which is why I don't like the idea. But you don't need >>>> non-ASCII characters to blink an LED or turn a servo, and there is >>>> significant resistance to the notion that appending a non-ASCII >>>> character to a long ASCII-only string requires the whole string to >>>> be copied and doubled in size (lots of heap space used). >>> But you need non-ASCII characters to display a title of MP3 track. > > Yes, but to display a title, you don't need to do codepoint access at > random - you need to either take a block of memory (length in bytes) and > do something with it (pass to a C function, transfer over some bus, > etc.), or *iterate in order* over codepoints in a string. All these > operations are as efficient (O-notation) for UTF-8 as for UTF-32. Several previous comments discuss first option, ASCII-only strings. From storchaka at gmail.com Wed Jun 4 19:52:14 2014 From: storchaka at gmail.com (Serhiy Storchaka) Date: Wed, 04 Jun 2014 20:52:14 +0300 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: <20140604200520.1d432329@x34f> References: <20140604011718.GD10355@ando> <20140604183831.7226448c@x34f> <20140604200520.1d432329@x34f> Message-ID: 04.06.14 20:05, Paul Sokolovsky wrote: > On Wed, 04 Jun 2014 19:49:18 +0300 > Serhiy Storchaka wrote: >> html.HTMLParser, json.JSONDecoder, re.compile, tokenize.tokenize >> don't use iterators.
They use indices, str.find and/or regular >> expressions. Common use case is quickly find substring starting from >> current position using str.find or re.search, process found token, >> advance position and repeat. > > That's sad, I agree. Other languages (Go, Rust) can be happy without O(1) indexing of strings. All string and regex operations work with iterators or cursors, and I believe this approach is not significant worse than implementing strings as O(1)-indexable arrays of characters (for some applications it can be worse, for other it can be better). But Python is different language, it has different operations for strings and different idioms. A language which doesn't support O(1) indexing is not Python, it is only Python-like language. From stephen at xemacs.org Wed Jun 4 19:57:39 2014 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Thu, 05 Jun 2014 02:57:39 +0900 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: References: <20140604011718.GD10355@ando> Message-ID: <87vbsg36cs.fsf@uwakimon.sk.tsukuba.ac.jp> Serhiy Storchaka writes: > It would be interesting to collect a statistic about how many indexing > operations happened during the life of a string in typical (Micro)Python > program. Probably irrelevant (I doubt anybody is going to be writing programmers' editors in MicroPython), but by far the most frequently called functions in XEmacs are byte_to_char_index and its inverse. From guido at python.org Wed Jun 4 20:25:51 2014 From: guido at python.org (Guido van Rossum) Date: Wed, 4 Jun 2014 11:25:51 -0700 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: <87vbsg36cs.fsf@uwakimon.sk.tsukuba.ac.jp> References: <20140604011718.GD10355@ando> <87vbsg36cs.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: This thread has devolved into a flame war. I think we should trust the Micropython implementers (whoever they are -- are they participating here?) 
to know their users and let them do what feels right to them. We should just ask them not to claim full compatibility with any particular Python version -- that seems the most contentious point. Realistically, most Python code that works on Python 3.4 won't work on Micropython (for various reasons, not just the string behavior) and neither does it need to. -- --Guido van Rossum (python.org/~guido) From pmiscml at gmail.com Wed Jun 4 20:29:29 2014 From: pmiscml at gmail.com (Paul Sokolovsky) Date: Wed, 4 Jun 2014 21:29:29 +0300 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: References: <20140604011718.GD10355@ando> <20140604183831.7226448c@x34f> <20140604200520.1d432329@x34f> Message-ID: <20140604212929.7a1175f6@x34f> Hello, On Wed, 04 Jun 2014 20:52:14 +0300 Serhiy Storchaka wrote: [] > > That's sad, I agree. > > Other languages (Go, Rust) can be happy without O(1) indexing of > strings. All string and regex operations work with iterators or > cursors, and I believe this approach is not significant worse than > implementing strings as O(1)-indexable arrays of characters (for some > applications it can be worse, for other it can be better). But Python > is different language, it has different operations for strings and > different idioms. A language which doesn't support O(1) indexing is > not Python, it is only Python-like language. Sorry, but that's just your personal opinion, not shared by other developers, as this thread showed. And let's not pretend we live in happy-ever world of Python 1.5.2 which doesn't need anything more because it's perfect as it is. Somebody added all those iterators and iterator-returning functions to Pythons. And then the problem Python has is a typical "last mile" problem, that iterators were not applied completely everywhere. There's little choice but to move in that direction, though.
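The in-order scan Paul describes can be sketched as a plain generator over a UTF-8 buffer: each step is O(1), so a full left-to-right pass is O(N), the same complexity a fixed-width encoding gives. `iter_utf8` is a hypothetical helper (not an API of CPython or MicroPython) and assumes valid UTF-8 input.

```python
def iter_utf8(buf):
    """Yield the characters of a UTF-8 encoded buffer in order, O(N) total."""
    i = 0
    while i < len(buf):
        b = buf[i]
        # Sequence length is encoded in the lead byte.
        if b < 0x80:
            n = 1
        elif b < 0xE0:
            n = 2
        elif b < 0xF0:
            n = 3
        else:
            n = 4
        yield buf[i:i + n].decode("utf-8")
        i += n
```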
What you call "idioms", other people call "sloppy programming practices". There's common suggestion how to be at peace with Python's indentation for those who find it a problem - "get over it". Well, somehow it itches to say same for people who think that Python3 should be used the same way as Python1: Get over the fact that Python is no longer little funny language being laughed at by Perl crowd for being order of magnitude slower at processing text files. While you still can do little funny tricks we all love Python for, it now also offers framework to do it right, and it makes little sense saying that doing it little funny way is the definitive trait of Python. (And for me it's easy to be such categorical - the only way I could subscribe to idea of running Python on an MCU and not be laughable is by trusting Python to provide framework for being efficient. I quit working on another language because I have trusted that iterator, generator, buffer protocols are not little funny things but thoroughly engineered efficient concepts, and I don't feel betrayed.) -- Best regards, Paul mailto:pmiscml at gmail.com From stefan_ml at behnel.de Wed Jun 4 21:14:54 2014 From: stefan_ml at behnel.de (Stefan Behnel) Date: Wed, 04 Jun 2014 21:14:54 +0200 Subject: [Python-Dev] Should standard library modules optimize for CPython? In-Reply-To: <1437293580423521164.103866sturla.molden-gmail.com@news.gmane.org> References: <20140601081139.GO10355@ando> <1521177704423500642.020210sturla.molden-gmail.com@news.gmane.org> <1437293580423521164.103866sturla.molden-gmail.com@news.gmane.org> Message-ID: Sturla Molden, 03.06.2014 22:51: > Stefan Behnel wrote: >> So the >> argument in favour is mostly a pragmatic one. If you can have 2-5x faster >> code essentially for free, why not just go for it? > > It would be easier if the GIL or Cython's use of it was redesigned. Cython > just grabs the GIL and holds on to it until it is manually released.
The > standard lib cannot have packages that holds the GIL forever, as a Cython > compiled module would do. Cython has to start sharing access the GIL like > the interpreter does. Granted. This shouldn't be all that difficult to add as a special case when compiling .py (not .pyx) files. Properly tuning it (i.e. avoiding to inject the GIL release-acquire cycle in the wrong spots) may take a while, but that can be improved over time. (It's not required in .pyx files because users should rather explicitly write "with nogil: pass" there to manually enable thread switches in safe and desirable places.) Stefan From steve at pearwood.info Wed Jun 4 22:10:40 2014 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 5 Jun 2014 06:10:40 +1000 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: <02b9a61658c04b11a21317da5b78bad6@BLUPR03MB389.namprd03.prod.outlook.com> References: <20140604011718.GD10355@ando> <20140604174930.3a5af45f@x34f> <02b9a61658c04b11a21317da5b78bad6@BLUPR03MB389.namprd03.prod.outlook.com> Message-ID: <20140604201040.GI10355@ando> On Wed, Jun 04, 2014 at 03:32:25PM +0000, Steve Dower wrote: > Steven D'Aprano wrote: > > The language semantics says that a string is an array of code points. Every > > index relates to a single code point, no code point extends over two or more > > indexes. > > There's a 1:1 relationship between code points and indexes. How is direct > > indexing "likely to be incorrect"? > > We're discussing the behaviour under a different (hypothetical) design > decision than a 1:1 relationship between code points and indexes, so > arguing from that stance doesn't make much sense. I'm open to different implementations. I earlier even suggested that the choice of O(1) indexing versus O(N) indexing was a quality of implementation issue, not a make-or-break issue for whether something can call itself Python (or even 99% compatible with Python"). 
But I don't believe that exposing that implementation at the Python level is valid: regardless of whether it is efficient or not, I should be able to write code like this: a = [mystring[i] for i in range(len(mystring))] b = list(mystring) assert a == b That is not the case if you expose the underlying byte-level implementation at the Python level, and treat strings as an array of *bytes*. Paul seems to want to do this, or at least he wants Python 4 to do this. I think it is *completely* inappropriate to do so. I *think* you may agree with me, (correct me if I'm wrong) because you go on to agree with me: > > e.g. > > > > s = "---?---" > > offset = s.index('?') > > assert s[offset] == '?' > > > > That cannot fail with Python's semantics. > > Agreed, and it shouldn't but I'm not actually sure. > (I was actually referring to the optimization > being incorrect for the goal, not the language semantics). What you'd > probably find is that sizeof('?') == sizeof(s[offset]) == 2, which may > be surprising, but is also correct. You don't seem to be taking about sys.getsizeof, so I guess you're talking about something at the C level (or other underlying implementation), ignoring the object overhead. I don't know why you think I'd find that surprising -- one cannot fit 0x10FFFF Unicode code points in a single byte, so whether you use UTF-32, UTF-16, UTF-8, Python 3.3's FSR or some other implementation, at least some code points are going to use more than one byte. > But what are you trying to achieve (why are you writing this code)? > All this example really shows is that you're only using indexing for > trivial purposes. I'm trying to understand what point you are trying to make, because I'm afraid I don't quite get it. [...] > If copying into a separate list is a problem (memory-wise), > re.finditer('\\S+', string) also provides the same behaviour and gives > me the sliced string, so there's no need to index for anything. 
finditer returns a bunch of MatchObjects, which give you the indexes of the found substring. Whether you do it yourself, or get the re module to do it, you're indexing somewhere. > The downside is that it isn't as easy to teach as the 1:1 > relationship, and currently it doesn't perform as well *in CPython*. > But if MicroPython is focusing on size over speed, I don't see any > reason why they shouldn't permit different performance characteristics > and require a slightly different approach to highly-optimized coding. I don't have a problem with different implementations, so long as that implementation isn't exposed at the Python level with changes of semantics such as breaking the promise that a string is an array of code points, not of bytes. > In any case, this is an interesting discussion with a genuine effect > on the Python interpreter ecosystem. Jython and IronPython already > have different string implementations from CPython - having official > (and hopefully flexible) guidance on deviations from the reference > implementation would I think help other implementations provide even > more value, which is only a good thing for Python. Yes, agreed. -- Steven From v+python at g.nevcal.com Wed Jun 4 22:50:42 2014 From: v+python at g.nevcal.com (Glenn Linderman) Date: Wed, 04 Jun 2014 13:50:42 -0700 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: References: <20140604011718.GD10355@ando> <20140604133857.13a0f0b9@x34f> <20140604144933.66e6c2f4@x34f>, Message-ID: <538F86A2.4080802@g.nevcal.com> On 6/4/2014 6:14 AM, Steve Dower wrote: > I agree with Daniel. Directly indexing into text suggests an > attempted optimization that is likely to be incorrect for a set of > strings. Splitting, regex, concatenation and formatting are really the > main operations that matter, and MicroPython can optimize their > implementation of these easily enough for O(N) indexing.
> > Cheers, > Steve > > Top-posted from my Windows Phone > ------------------------------------------------------------------------ > From: Daniel Holth > Sent: ?6/?4/?2014 5:17 > To: Paul Sokolovsky > Cc: python-dev > Subject: Re: [Python-Dev] Internal representation of strings and > Micropython > > If we're voting I think representing Unicode internally in micropython > as utf-8 with O(N) indexing is a great idea, partly because I'm not > sure indexing into strings is a good idea - lots of Unicode code > points don't make sense by themselves; see also grapheme clusters. It > would probably work great. I think native UTF-8 support is the most promising route for a micropython Unicode support. It would be an interesting proof-of-concept to implement an alternative CPython with PEP-393 replaced by UTF-8 internally... doing conversions for APIs that require a different encoding, but always maintaining and computing with the UTF-8 representation. 1) The first proof-of-concept implementation should implement codepoint indexing as a O(N) operation, searching from the beginning of the string for the Nth codepoint. Other Proof-of-concept implementation could implement a codepoint boundary cache, there could be a variety of caching algorithms. 2) (Least space efficient) An array that could be indexed by codepoint position and result in byte position. (This would use more space than a UTF-32 representation!) 3) (Most space efficient) One cached entry, that caches the last codepoint/byte position referenced. UTF-8 is able to be traversed in either direction, so "next/previous" codepoint access would be relatively fast (and such are very common operations, even when indexing notation is used: "for ix in range( len( str_x )): func( str_x[ ix ])".) 4) (Fixed size caches) N entries, one for the last codepoint, and others at Codepoint_Length/N intervals. N could be tunable. 5) (Fixed size caches) Like 4, plus an extra entry like 3. 
6) (Variable size caches) Like 2, but only indexing every Nth code point. N could be tunable. 7) (Variable size caches) Like 6, plus an extra entry like 3. 8) (Content specific variable size caches) Index each codepoint that is a different byte size than the previous codepoint, allowing indexing to be used in the intervals. Worst case size is like 2, best case size is a single entry for the end, when all code points are represented by the same number of bytes. 9) (Content specific variable size caches) Like 8, only cache entries could indicate fixed or variable size characters in the next interval, with a scheme like 4 or 6 used to prevent one interval from covering the whole string. Other hybrid schemes may present themselves as useful once experience is gained with some of these. It might be surprising how few algorithms need more than algorithm 3 to get reasonable performance. Glenn From pmiscml at gmail.com Wed Jun 4 23:14:32 2014 From: pmiscml at gmail.com (Paul Sokolovsky) Date: Thu, 5 Jun 2014 00:14:32 +0300 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: References: <20140604011718.GD10355@ando> <87vbsg36cs.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <20140605001432.126a0a08@x34f> Hello, On Wed, 4 Jun 2014 11:25:51 -0700 Guido van Rossum wrote: > This thread has devolved into a flame war. I think we should trust the > Micropython implementers (whoever they are -- are they participating > here?) I'm a regular contributor. I'm not sure if the author, Damien George, is on the list. In either case, he's a nice guy who prefer to do development rather than participate in flame wars ;-). And for the record, all opinions expressed are solely mine, and not official position of MicroPython project. > to know their users and let them do what feels right to them.
> We should just ask them not to claim full compatibility with any > particular Python version -- that seems the most contentious point. "Full" compatibility is never claimed, and understanding it as such is optimistic, "between the lines" reading of some users. All of: announcement posted on python-list (which prompted current inflow of MicroPython-related discussions), README at https://github.com/micropython/micropython , and detailed differences doc https://github.com/micropython/micropython/wiki/Differences make it clear there's no talk about "full" compatibility, and only specific compatibility (and incompatibility) points are claimed. That said, and unlike previous attempts to develop a small Python implementations (which of course existed), we're striving to be exactly a Python language implementation, not a Python-like language implementation. As there's no formal, implementation-independent language spec, what constitutes a compatible language implementation is subject to opinions, and we welcome and appreciate independent review, like this thread did. > Realistically, most Python code that works on Python 3.4 won't work > on Micropython (for various reasons, not just the string behavior) > and neither does it need to. That's true. However, as was said, we're striving to provide a compatible implementation, and compatibility claims must be validated. While we have simple "in-house" testsuite, more serious compatibility validation requires running a testsuite for reference implementation (CPython), and that's gradually being approached. 
> > -- > --Guido van Rossum (python.org/~guido) -- Best regards, Paul mailto:pmiscml at gmail.com From tjreedy at udel.edu Wed Jun 4 23:19:29 2014 From: tjreedy at udel.edu (Terry Reedy) Date: Wed, 04 Jun 2014 17:19:29 -0400 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: <538ECD98.5030309@farowl.co.uk> References: <20140604011718.GD10355@ando> <538ECD98.5030309@farowl.co.uk> Message-ID: On 6/4/2014 3:41 AM, Jeff Allen wrote: > Jython uses UTF-16 internally -- probably the only sensible choice in a > Python that can call Java. Indexing is O(N), fundamentally. By > "fundamentally", I mean for those strings that have not yet noticed that > they contain no supplementary (>0xffff) characters. > > I've toyed with making this O(1) universally. Like Steven, I understand > this to be a freedom afforded to implementers, rather than an issue of > conformity. > > Jeff Allen > > On 04/06/2014 02:17, Steven D'Aprano wrote: >> There is a discussion over at MicroPython about the internal >> representation of Unicode strings. > ... >> My own feeling is that O(1) string indexing operations are a quality of >> implementation issue, not a deal breaker to call it a Python. I can't >> see any requirement in the docs that str[n] must take O(1) time, but >> perhaps I have missed something. >> > -- Terry Jan Reedy From tjreedy at udel.edu Wed Jun 4 23:21:20 2014 From: tjreedy at udel.edu (Terry Reedy) Date: Wed, 04 Jun 2014 17:21:20 -0400 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: <538ECD98.5030309@farowl.co.uk> References: <20140604011718.GD10355@ando> <538ECD98.5030309@farowl.co.uk> Message-ID: On 6/4/2014 3:41 AM, Jeff Allen wrote: > Jython uses UTF-16 internally -- probably the only sensible choice in a > Python that can call Java. Indexing is O(N), fundamentally. By > "fundamentally", I mean for those strings that have not yet noticed that > they contain no supplementary (>0xffff) characters. 
Indexing can be made O(log(k)) where k is the number of astral chars, and is usually small. -- Terry Jan Reedy From rosuav at gmail.com Wed Jun 4 23:28:00 2014 From: rosuav at gmail.com (Chris Angelico) Date: Thu, 5 Jun 2014 07:28:00 +1000 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: <538F86A2.4080802@g.nevcal.com> References: <20140604011718.GD10355@ando> <20140604133857.13a0f0b9@x34f> <20140604144933.66e6c2f4@x34f> <538F86A2.4080802@g.nevcal.com> Message-ID: On Thu, Jun 5, 2014 at 6:50 AM, Glenn Linderman wrote: > 8) (Content specific variable size caches) Index each codepoint that is a > different byte size than the previous codepoint, allowing indexing to be > used in the intervals. Worst case size is like 2, best case size is a single > entry for the end, when all code points are represented by the same number > of bytes. Conceptually interesting, and I'd love to know how well that'd perform in real-world usage. Would do very nicely on blocks of text that are all from the same range of codepoints, but if you intersperse high and low codepoints it'll be like 2 but with significantly more complicated lookups (imagine a "name=value\nname=value\n" stream where the names and values are all in the same language - you'll have a lot of transitions). ChrisA From rdmurray at bitdance.com Wed Jun 4 23:54:08 2014 From: rdmurray at bitdance.com (R.
David Murray) Date: Wed, 04 Jun 2014 17:54:08 -0400 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: <20140605001432.126a0a08@x34f> References: <20140604011718.GD10355@ando> <87vbsg36cs.fsf@uwakimon.sk.tsukuba.ac.jp> <20140605001432.126a0a08@x34f> Message-ID: <20140604215409.3C071250DE3@webabinitio.net> On Thu, 05 Jun 2014 00:14:32 +0300, Paul Sokolovsky wrote: > That said, and unlike previous attempts to develop a small Python > implementations (which of course existed), we're striving to be exactly > a Python language implementation, not a Python-like language > implementation. As there's no formal, implementation-independent > language spec, what constitutes a compatible language implementation is > subject to opinions, and we welcome and appreciate independent review, > like this thread did. The language reference is also the language specification. I don't know what you mean by 'formal', so presumably it doesn't qualify :) That said, if there are places that are not correctly marked as implementation specific, those are bugs in the reference and should be fixed. There almost certainly are still such bugs, and I suspect MicroPython can help us fix them, just as PyPy did/does. --David From v+python at g.nevcal.com Wed Jun 4 23:57:36 2014 From: v+python at g.nevcal.com (Glenn Linderman) Date: Wed, 04 Jun 2014 14:57:36 -0700 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: References: <20140604011718.GD10355@ando> <20140604133857.13a0f0b9@x34f> <20140604144933.66e6c2f4@x34f> <538F86A2.4080802@g.nevcal.com> Message-ID: <538F9650.106@g.nevcal.com> On 6/4/2014 2:28 PM, Chris Angelico wrote: > On Thu, Jun 5, 2014 at 6:50 AM, Glenn Linderman wrote: >> 8) (Content specific variable size caches) Index each codepoint that is a >> different byte size than the previous codepoint, allowing indexing to be >> used in the intervals. 
Worst case size is like 2, best case size is a single >> entry for the end, when all code points are represented by the same number >> of bytes. > Conceptually interesting, and I'd love to know how well that'd perform > in real-world usage. So would I :) > Would do very nicely on blocks of text that are > all from the same range of codepoints, but if you intersperse high and > low codepoints it'll be like 2 but with significantly more complicated > lookups (imagine a "name=value\nname=value\n" stream where the names > and values are all in the same language - you'll have a lot of > transitions). Lookup is binary search on code point index or a search for same in some tree structure, I would think. "like 2 but ..." well, the data structure would be bigger than for 2, but your example shows 4-5 high codepoints per low codepoint (for some languages). I did just think of another refinement to this technique (my list was not intended to be all-inclusive... just a bunch of variations I thought of then). 10) (Content specific variable size caches) Like 8, but the last character in a run is allowed (but not required) to be a different number of bytes than prior characters, because the offset calculation will still work for the first character of a different size. So #10 would halve the size of your imagined stream that intersperses one low-byte charater with each sequence of high-byte characters. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From tjreedy at udel.edu Thu Jun 5 00:04:52 2014 From: tjreedy at udel.edu (Terry Reedy) Date: Wed, 04 Jun 2014 18:04:52 -0400 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: <20140605001432.126a0a08@x34f> References: <20140604011718.GD10355@ando> <87vbsg36cs.fsf@uwakimon.sk.tsukuba.ac.jp> <20140605001432.126a0a08@x34f> Message-ID: On 6/4/2014 5:14 PM, Paul Sokolovsky wrote: > That said, and unlike previous attempts to develop a small Python > implementations (which of course existed), we're striving to be exactly > a Python language implementation, not a Python-like language > implementation. As there's no formal, implementation-independent > language spec, what constitutes a compatible language implementation is > subject to opinions, and we welcome and appreciate independent review, > like this thread did. > >> Realistically, most Python code that works on Python 3.4 won't work >> on Micropython (for various reasons, not just the string behavior) >> and neither does it need to. > > That's true. However, as was said, we're striving to provide a > compatible implementation, and compatibility claims must be validated. > While we have simple "in-house" testsuite, more serious compatibility > validation requires running a testsuite for reference implementation > (CPython), and that's gradually being approached. I would call what you are doing a 'Python 3.n subset, with limitations', where n should be a specific number, which I would urge should be at least 3, if not 4 ('yield from'). To me, that would mean that every Micropython program (that does not use a clearly non-Python addon like inline assembly) would run the same* on CPython 3.n. Conversely, a Python 3.n program should either run the same* on MicroPython as CPython, or raise. What most to avoid is giving different* answers. *'same' does not include timing differences or normal float variations or bug fixes in MicroPython not in CPython. 
As for unicode: I would see ascii-only (very limited codepoints) or bare utf-8 (limited speed == expanded time) as possibly fitting the definition above. Just be clear what the limitations are. And accept that there will be people who do not bother to read the limitations and then complain when they bang into them. PS. You do not seem to be aware of how well the current PEP393 implementation works. If you are going to write any more about it, I suggest you run Tools/Stringbench/stringbench.py for timings. -- Terry Jan Reedy From ericsnowcurrently at gmail.com Thu Jun 5 00:12:23 2014 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Wed, 4 Jun 2014 16:12:23 -0600 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: <20140605001432.126a0a08@x34f> References: <20140604011718.GD10355@ando> <87vbsg36cs.fsf@uwakimon.sk.tsukuba.ac.jp> <20140605001432.126a0a08@x34f> Message-ID: On Wed, Jun 4, 2014 at 3:14 PM, Paul Sokolovsky wrote: > That said, and unlike previous attempts to develop a small Python > implementations (which of course existed), we're striving to be exactly > a Python language implementation, not a Python-like language > implementation. As there's no formal, implementation-independent > language spec, what constitutes a compatible language implementation is > subject to opinions, and we welcome and appreciate independent review, > like this thread did. Actually, there is a "formal, implementation-independent language spec": https://docs.python.org/3/reference/ > >> Realistically, most Python code that works on Python 3.4 won't work >> on Micropython (for various reasons, not just the string behavior) >> and neither does it need to. > > That's true. However, as was said, we're striving to provide a > compatible implementation, and compatibility claims must be validated. 
> While we have simple "in-house" testsuite, more serious compatibility > validation requires running a testsuite for reference implementation > (CPython), and that's gradually being approached. To a large extent the test suite in http://hg.python.org/cpython/file/default/Lib/test effectively validates (full) compliance with the corresponding release (change "default" to the release branch of your choice). With that goal, no small effort has been made to mark implementation-specific tests as such. So uPy could consider using the test suite (and explicitly skip the tests for features that uPy doesn't support). -eric From pmiscml at gmail.com Thu Jun 5 00:52:53 2014 From: pmiscml at gmail.com (Paul Sokolovsky) Date: Thu, 5 Jun 2014 01:52:53 +0300 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: References: <20140604011718.GD10355@ando> <87vbsg36cs.fsf@uwakimon.sk.tsukuba.ac.jp> <20140605001432.126a0a08@x34f> Message-ID: <20140605015253.301e72e7@x34f> Hello, On Wed, 04 Jun 2014 18:04:52 -0400 Terry Reedy wrote: > On 6/4/2014 5:14 PM, Paul Sokolovsky wrote: > > > That said, and unlike previous attempts to develop a small Python > > implementations (which of course existed), we're striving to be > > exactly a Python language implementation, not a Python-like language > > implementation. As there's no formal, implementation-independent > > language spec, what constitutes a compatible language > > implementation is subject to opinions, and we welcome and > > appreciate independent review, like this thread did. > > > >> Realistically, most Python code that works on Python 3.4 won't work > >> on Micropython (for various reasons, not just the string behavior) > >> and neither does it need to. > > > > That's true. However, as was said, we're striving to provide a > > compatible implementation, and compatibility claims must be > > validated. 
While we have simple "in-house" testsuite, more serious > > compatibility validation requires running a testsuite for reference > > implementation (CPython), and that's gradually being approached. > > I would call what you are doing a 'Python 3.n subset, with Thanks, that's what we call it ourselves in the docs linked in the original message, and use n=4. Note that being a subset is not a design requirement, but there's higher-priority requirement of staying lean, so realistically uPy will always stay a subset. > limitations', where n should be a specific number, which I would urge > should be at least 3, if not 4 ('yield from'). To me, that would mean > that every Micropython program (that does not use a clearly > non-Python addon like inline assembly) would run the same* on CPython > 3.n. Conversely, a Python 3.n program should either run the same* on > MicroPython as CPython, or raise. What most to avoid is giving > different* answers. That's nice aim, to implement which we don't have enough resources, so would appreciate any help from interested parties. > *'same' does not include timing differences or normal float > variations or bug fixes in MicroPython not in CPython. > > As for unicode: I would see ascii-only (very limited codepoints) or > bare utf-8 (limited speed == expanded time) as possibly fitting the > definition above. Just be clear what the limitations are. And accept > that there will be people who do not bother to read the limitations > and then complain when they bang into them. > > PS. You do not seem to be aware of how well the current PEP393 > implementation works. If you are going to write any more about it, I > suggest you run Tools/Stringbench/stringbench.py for timings. "Well" is subjective (or should be defined formally based on the requirements). 
With my MicroPython hat on, an implementation which receives a string, transcodes it, leading to bigger size, just to immediately transcode back and send out - is awful, environment unfriendly implementation ;-). -- Best regards, Paul mailto:pmiscml at gmail.com From storchaka at gmail.com Thu Jun 5 00:43:59 2014 From: storchaka at gmail.com (Serhiy Storchaka) Date: Thu, 05 Jun 2014 01:43:59 +0300 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: References: <20140604011718.GD10355@ando> <87vbsg36cs.fsf@uwakimon.sk.tsukuba.ac.jp> <20140605001432.126a0a08@x34f> Message-ID: 05.06.14 01:04, Terry Reedy wrote: > PS. You do not seem to be aware of how well the current PEP393 > implementation works. If you are going to write any more about it, I > suggest you run Tools/Stringbench/stringbench.py for timings. AFAIK stringbench is ASCII-only, so it likely is compatible with current and any future MicroPython implementations, but unlikely will expose non-ASCII limitations or performance. From rosuav at gmail.com Thu Jun 5 01:05:33 2014 From: rosuav at gmail.com (Chris Angelico) Date: Thu, 5 Jun 2014 09:05:33 +1000 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: <20140605015253.301e72e7@x34f> References: <20140604011718.GD10355@ando> <87vbsg36cs.fsf@uwakimon.sk.tsukuba.ac.jp> <20140605001432.126a0a08@x34f> <20140605015253.301e72e7@x34f> Message-ID: On Thu, Jun 5, 2014 at 8:52 AM, Paul Sokolovsky wrote: > "Well" is subjective (or should be defined formally based on the > requirements). With my MicroPython hat on, an implementation which > receives a string, transcodes it, leading to bigger size, just to > immediately transcode back and send out - is awful, environment > unfriendly implementation ;-). Be careful of confusing correctness and performance, though. The transcoding you describe is inefficient, but (presumably) correct; something that's fast but wrong is straight-up buggy.
You can always fix inefficiency in a later release, but buggy behaviour sometimes is relied on (which is why ECMAScript still exposes UTF-16 to scripts, and why Windows window messages have a WPARAM and an LPARAM, and why Python's threading module has duplicate names for a lot of functions, because it's just not worth changing). I'd be much more comfortable releasing something where "everything works fine, but if you use astral characters in your strings, memory usage blows out by a factor of four" (or "... the len() function takes O(N) time") than one where "everything works fine as long as you use BMP only, but SMP characters result in tests failing". ChrisA From pmiscml at gmail.com Thu Jun 5 01:11:10 2014 From: pmiscml at gmail.com (Paul Sokolovsky) Date: Thu, 5 Jun 2014 02:11:10 +0300 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: References: <20140604011718.GD10355@ando> <87vbsg36cs.fsf@uwakimon.sk.tsukuba.ac.jp> <20140605001432.126a0a08@x34f> Message-ID: <20140605021110.485c0ca1@x34f> Hello, On Wed, 4 Jun 2014 16:12:23 -0600 Eric Snow wrote: > On Wed, Jun 4, 2014 at 3:14 PM, Paul Sokolovsky > wrote: > > That said, and unlike previous attempts to develop a small Python > > implementations (which of course existed), we're striving to be > > exactly a Python language implementation, not a Python-like language > > implementation. As there's no formal, implementation-independent > > language spec, what constitutes a compatible language > > implementation is subject to opinions, and we welcome and > > appreciate independent review, like this thread did. > > Actually, there is a "formal, implementation-independent language > spec": > > https://docs.python.org/3/reference/ Opening that link in browser, pressing Ctrl+F and pasting your quote gives zero hits, so it's not exactly what you claim it to be. It's also pretty far from being formal (unambiguous, covering all choices, etc.) and comprehensive. 
Also, please point me at "conformance" section. That said, all of us Pythoneers treat it as the best formal reference available, no news here. > >> Realistically, most Python code that works on Python 3.4 won't work > >> on Micropython (for various reasons, not just the string behavior) > >> and neither does it need to. > > > > That's true. However, as was said, we're striving to provide a > > compatible implementation, and compatibility claims must be > > validated. While we have simple "in-house" testsuite, more serious > > compatibility validation requires running a testsuite for reference > > implementation (CPython), and that's gradually being approached. > > To a large extent the test suite in > http://hg.python.org/cpython/file/default/Lib/test effectively > validates (full) compliance with the corresponding release (change > "default" to the release branch of your choice). With that goal, no > small effort has been made to mark implementation-specific tests as > such. So uPy could consider using the test suite (and explicitly skip > the tests for features that uPy doesn't support). That's exactly what we do, per the previous paragraph. And we face a lot of questionable tests, just like you say. Shameless plug: if anyone interested to run existing code on MicroPython, please help us with CPython testsuite! ;-) > > -eric -- Best regards, Paul mailto:pmiscml at gmail.com From storchaka at gmail.com Thu Jun 5 00:54:42 2014 From: storchaka at gmail.com (Serhiy Storchaka) Date: Thu, 05 Jun 2014 01:54:42 +0300 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: References: <20140604011718.GD10355@ando> <538ECD98.5030309@farowl.co.uk> Message-ID: 05.06.14 00:21, Terry Reedy wrote: > On 6/4/2014 3:41 AM, Jeff Allen wrote: >> Jython uses UTF-16 internally -- probably the only sensible choice in a >> Python that can call Java. Indexing is O(N), fundamentally.
By >> "fundamentally", I mean for those strings that have not yet noticed that >> they contain no supplementary (>0xffff) characters. > > Indexing can be made O(log(k)) where k is the number of astral chars, > and is usually small. I like your idea and think it would be great if Jython will implement it. Unfortunately it is too late to do this in CPython. From ericsnowcurrently at gmail.com Thu Jun 5 02:01:23 2014 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Wed, 4 Jun 2014 18:01:23 -0600 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: <20140605021110.485c0ca1@x34f> References: <20140604011718.GD10355@ando> <87vbsg36cs.fsf@uwakimon.sk.tsukuba.ac.jp> <20140605001432.126a0a08@x34f> <20140605021110.485c0ca1@x34f> Message-ID: On Wed, Jun 4, 2014 at 5:11 PM, Paul Sokolovsky wrote: > On Wed, 4 Jun 2014 16:12:23 -0600 > Eric Snow wrote: >> Actually, there is a "formal, implementation-independent language >> spec": >> >> https://docs.python.org/3/reference/ > > Opening that link in browser, pressing Ctrl+F and pasting your quote > gives zero hits, so it's not exactly what you claim it to be. It's also > pretty far from being formal (unambiguous, covering all choices, etc.) > and comprehensive. Also, please point me at "conformance" section. > > That said, all of us Pythoneers treat it as the best formal reference > available, no news here. It's not just the best formal reference. It's the official specification. I agree it is not so "formal" as other language specifications and it does not enumerate every facet of the language. However, underspecified parts are worth improving (as we've done with the import system portion in the last few years). Incidentally, the efforts of other Python implementors have often resulted in such improvements to the language reference. Those improvements typically come as a result of questions to this very list. :) That's essentially what this email thread is! 
-eric From greg.ewing at canterbury.ac.nz Thu Jun 5 02:03:17 2014 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 05 Jun 2014 12:03:17 +1200 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: References: <20140604011718.GD10355@ando> <20140604183831.7226448c@x34f> Message-ID: <538FB3C5.6010104@canterbury.ac.nz> Serhiy Storchaka wrote: > html.HTMLParser, json.JSONDecoder, re.compile, tokenize.tokenize don't > use iterators. They use indices, str.find and/or regular expressions. > Common use case is quickly find substring starting from current position > using str.find or re.search, process found token, advance position and > repeat. For that kind of thing, you don't need an actual character index, just some way of referring to a place in a string. Instead of an integer, str.find() etc. could return a StringPosition, which would be an opaque reference to a particular point in a particular string. You would be able to pass StringPositions to indexing and slicing operations to get fast indexing into the string that they were derived from. StringPositions could support the following operations: StringPosition + int --> StringPosition StringPosition - int --> StringPosition StringPosition - StringPosition --> int These would be computed by counting characters forwards or backwards in the string, which would be slower than int arithmetic but still faster than counting from the beginning of the string every time. In other contexts, StringPositions would coerce to ints (maybe being an int subclass?) allowing them to be used in any existing algorithm that slices strings using ints. 
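Greg's StringPosition arithmetic can be sketched in plain Python. This is a hypothetical illustration of the proposal only (none of these names are an existing API): a position subclasses int, so it coerces wherever an int is expected, while also carrying the owning UTF-8 buffer and the matching byte offset, so stepping costs O(distance walked) instead of O(len(string)).

```python
def utf8_char_len(lead_byte):
    # Length in bytes of a UTF-8 sequence, determined from its lead byte.
    if lead_byte < 0x80:
        return 1
    if lead_byte < 0xE0:
        return 2
    if lead_byte < 0xF0:
        return 3
    return 4

class StringPosition(int):
    """Hypothetical sketch of the proposal above: behaves as a codepoint
    index (it *is* an int), but also remembers the owning UTF-8 buffer
    and the byte offset of that codepoint, avoiding O(N) rescans."""

    def __new__(cls, cp_index, owner, byte_offset):
        self = super().__new__(cls, cp_index)
        self.owner = owner              # bytes: the UTF-8 buffer
        self.byte_offset = byte_offset  # byte offset of this codepoint
        return self

    def __add__(self, n):               # StringPosition + int -> StringPosition
        off = self.byte_offset
        for _ in range(n):              # walk forward n codepoints
            off += utf8_char_len(self.owner[off])
        return StringPosition(int(self) + n, self.owner, off)

    def __sub__(self, other):
        if isinstance(other, StringPosition):
            return int(self) - int(other)           # SP - SP -> plain int
        off = self.byte_offset                      # SP - int -> SP
        for _ in range(other):                      # walk backward
            off -= 1
            while self.owner[off] & 0xC0 == 0x80:   # skip continuation bytes
                off -= 1
        return StringPosition(int(self) - other, self.owner, off)
```

A position then lets the owner decode the codepoint it names in O(1) via `buf[p.byte_offset : p.byte_offset + utf8_char_len(buf[p.byte_offset])]`. Note that each position keeps a reference to its owner buffer alive, which has memory-retention implications.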
-- Greg From greg.ewing at canterbury.ac.nz Thu Jun 5 02:08:21 2014 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 05 Jun 2014 12:08:21 +1200 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: References: <20140604011718.GD10355@ando> <20140604183831.7226448c@x34f> <20140604200520.1d432329@x34f> Message-ID: <538FB4F5.9070500@canterbury.ac.nz> Serhiy Storchaka wrote: > A language which doesn't support O(1) indexing is not Python, it is only > Python-like language. That's debatable, but even if it's true, I don't think there's anything wrong with MicroPython being only a "Python-like language". As has been pointed out, fitting Python onto a small device is always going to necessitate some compromises. -- Greg From v+python at g.nevcal.com Thu Jun 5 02:08:33 2014 From: v+python at g.nevcal.com (Glenn Linderman) Date: Wed, 04 Jun 2014 17:08:33 -0700 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: <538FB3C5.6010104@canterbury.ac.nz> References: <20140604011718.GD10355@ando> <20140604183831.7226448c@x34f> <538FB3C5.6010104@canterbury.ac.nz> Message-ID: <538FB501.2040601@g.nevcal.com> On 6/4/2014 5:03 PM, Greg Ewing wrote: > Serhiy Storchaka wrote: >> html.HTMLParser, json.JSONDecoder, re.compile, tokenize.tokenize >> don't use iterators. They use indices, str.find and/or regular >> expressions. Common use case is quickly find substring starting from >> current position using str.find or re.search, process found token, >> advance position and repeat. > > For that kind of thing, you don't need an actual character > index, just some way of referring to a place in a string. I think you meant codepoint index, rather than character index. > > Instead of an integer, str.find() etc. could return a > StringPosition, which would be an opaque reference to a > particular point in a particular string. 
You would be > able to pass StringPositions to indexing and slicing > operations to get fast indexing into the string that > they were derived from. > > StringPositions could support the following operations: > > StringPosition + int --> StringPosition > StringPosition - int --> StringPosition > StringPosition - StringPosition --> int > > These would be computed by counting characters forwards > or backwards in the string, which would be slower than > int arithmetic but still faster than counting from the > beginning of the string every time. > > In other contexts, StringPositions would coerce to ints > (maybe being an int subclass?) allowing them to be used > in any existing algorithm that slices strings using ints. > This starts to diverge from Python codepoint indexing via integers. Calculating or caching the codepoint index to byte offset as part of the str implementation stays compatible with Python. Introducing StringPosition makes a Python-like language. Or so it seems to me. -------------- next part -------------- An HTML attachment was scrubbed... URL: From v+python at g.nevcal.com Thu Jun 5 02:13:37 2014 From: v+python at g.nevcal.com (Glenn Linderman) Date: Wed, 04 Jun 2014 17:13:37 -0700 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: <538FB501.2040601@g.nevcal.com> References: <20140604011718.GD10355@ando> <20140604183831.7226448c@x34f> <538FB3C5.6010104@canterbury.ac.nz> <538FB501.2040601@g.nevcal.com> Message-ID: <538FB631.4000401@g.nevcal.com> On 6/4/2014 5:08 PM, Glenn Linderman wrote: > On 6/4/2014 5:03 PM, Greg Ewing wrote: >> Serhiy Storchaka wrote: >>> html.HTMLParser, json.JSONDecoder, re.compile, tokenize.tokenize >>> don't use iterators. They use indices, str.find and/or regular >>> expressions. Common use case is quickly find substring starting from >>> current position using str.find or re.search, process found token, >>> advance position and repeat. 
>> >> For that kind of thing, you don't need an actual character >> index, just some way of referring to a place in a string. > > I think you meant codepoint index, rather than character index. > >> >> Instead of an integer, str.find() etc. could return a >> StringPosition, which would be an opaque reference to a >> particular point in a particular string. You would be >> able to pass StringPositions to indexing and slicing >> operations to get fast indexing into the string that >> they were derived from. >> >> StringPositions could support the following operations: >> >> StringPosition + int --> StringPosition >> StringPosition - int --> StringPosition >> StringPosition - StringPosition --> int >> >> These would be computed by counting characters forwards >> or backwards in the string, which would be slower than >> int arithmetic but still faster than counting from the >> beginning of the string every time. >> >> In other contexts, StringPositions would coerce to ints >> (maybe being an int subclass?) allowing them to be used >> in any existing algorithm that slices strings using ints. >> > This starts to diverge from Python codepoint indexing via integers. > Calculating or caching the codepoint index to byte offset as part of > the str implementation stays compatible with Python. Introducing > StringPosition makes a Python-like language. Or so it seems to me. Another thought is that StringPosition only works (quickly, at least), as you point out, for the string that they were derived from... so algorithms that walk two strings at a time cannot use the same StringPosition to do so... yep, this is quite divergent from CPython and Python. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From greg.ewing at canterbury.ac.nz Thu Jun 5 02:52:03 2014 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 05 Jun 2014 12:52:03 +1200 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: <538FB501.2040601@g.nevcal.com> References: <20140604011718.GD10355@ando> <20140604183831.7226448c@x34f> <538FB3C5.6010104@canterbury.ac.nz> <538FB501.2040601@g.nevcal.com> Message-ID: <538FBF33.6020607@canterbury.ac.nz> Glenn Linderman wrote: > >> For that kind of thing, you don't need an actual character >> index, just some way of referring to a place in a string. > > I think you meant codepoint index, rather than character index. Probably, but what I said is true either way. > This starts to diverge from Python codepoint indexing via integers. That's true, although most programs would have to go out of their way to tell the difference, especially if StringPosition were a subclass of int. I agree that cacheing indexes would be more transparent, though. -- Greg From greg.ewing at canterbury.ac.nz Thu Jun 5 02:57:16 2014 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 05 Jun 2014 12:57:16 +1200 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: <538FB631.4000401@g.nevcal.com> References: <20140604011718.GD10355@ando> <20140604183831.7226448c@x34f> <538FB3C5.6010104@canterbury.ac.nz> <538FB501.2040601@g.nevcal.com> <538FB631.4000401@g.nevcal.com> Message-ID: <538FC06C.2070802@canterbury.ac.nz> Glenn Linderman wrote: > > so algorithms that walk two strings at a time cannot use the same > StringPosition to do so... yep, this is quite divergent from CPython and > Python. They can, it's just that at most one of the indexing operations would be fast; the StringPosition would devolve into an int for the other one. Such an algorithm would be of dubious correctness anyway, since as you pointed out, codepoints and characters are not quite the same thing. 
A codepoint index in one string doesn't necessarily count off the same number of characters in another string. So to be safe, you should really walk each string individually. -- Greg From pmiscml at gmail.com Thu Jun 5 03:01:38 2014 From: pmiscml at gmail.com (Paul Sokolovsky) Date: Thu, 5 Jun 2014 04:01:38 +0300 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: <538FB3C5.6010104@canterbury.ac.nz> References: <20140604011718.GD10355@ando> <20140604183831.7226448c@x34f> <538FB3C5.6010104@canterbury.ac.nz> Message-ID: <20140605040138.4e5a944f@x34f> Hello, On Thu, 05 Jun 2014 12:03:17 +1200 Greg Ewing wrote: > Serhiy Storchaka wrote: > > html.HTMLParser, json.JSONDecoder, re.compile, tokenize.tokenize > > don't use iterators. They use indices, str.find and/or regular > > expressions. Common use case is quickly find substring starting > > from current position using str.find or re.search, process found > > token, advance position and repeat. > > For that kind of thing, you don't need an actual character > index, just some way of referring to a place in a string. > > Instead of an integer, str.find() etc. could return a > StringPosition, That's braver than I had in mind, but definitely shows what alternative implementations have in store to fight back if some performance problems are actually detected. My own thought was, for example, as a response to people who (quoting) "slice strings for living", some form of "extended slicing" like str[(0, 4, 6, 8, 15)]. But I really think that providing an iterator interface for common string operations would cover most real-world cases, and would actually be beneficial for the Python language in general.
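The iterator interface Paul suggests can be sketched on top of today's str. `ifind` below is a hypothetical helper, not an existing builtin; the point is that a consumer written against it never does index arithmetic itself, so an O(N)-indexing implementation (e.g. over UTF-8) could resume each scan from a cached internal byte offset.

```python
def ifind(s, sub):
    """Hypothetical iterator form of str.find: yield the position of
    each non-overlapping occurrence of sub in s.  Written here with
    integer indices; a UTF-8 implementation could instead carry a byte
    offset between yields and never pay for random access."""
    start = 0
    while True:
        i = s.find(sub, start)
        if i == -1:
            return
        yield i
        start = i + len(sub)

def parse_pairs(text):
    # Serhiy's scan-advance-repeat pattern, written without manual
    # index arithmetic: "name=value\nname=value\n" -> dict.
    return dict(line.split("=", 1) for line in text.splitlines() if "=" in line)
```

For example, `list(ifind("abcabcabc", "bc"))` walks the string once, left to right, exactly as the str.find/advance/repeat idiom does today.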
> > -- > Greg -- Best regards, Paul mailto:pmiscml at gmail.com From rosuav at gmail.com Thu Jun 5 03:17:04 2014 From: rosuav at gmail.com (Chris Angelico) Date: Thu, 5 Jun 2014 11:17:04 +1000 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: <538FB3C5.6010104@canterbury.ac.nz> References: <20140604011718.GD10355@ando> <20140604183831.7226448c@x34f> <538FB3C5.6010104@canterbury.ac.nz> Message-ID: On Thu, Jun 5, 2014 at 10:03 AM, Greg Ewing wrote: > StringPositions could support the following operations: > > StringPosition + int --> StringPosition > StringPosition - int --> StringPosition > StringPosition - StringPosition --> int > > These would be computed by counting characters forwards > or backwards in the string, which would be slower than > int arithmetic but still faster than counting from the > beginning of the string every time. The SP would have to keep track of which string it's associated with, which might make for some surprising retentions of large strings. (Imagine returning what you think is an integer, but actually turns out to be a SP, and you're trying to work out why your program is eating up so much more memory than it should. This int-like object is so much more than an int.) ChrisA From pmiscml at gmail.com Thu Jun 5 03:19:13 2014 From: pmiscml at gmail.com (Paul Sokolovsky) Date: Thu, 5 Jun 2014 04:19:13 +0300 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: <538FB4F5.9070500@canterbury.ac.nz> References: <20140604011718.GD10355@ando> <20140604183831.7226448c@x34f> <20140604200520.1d432329@x34f> <538FB4F5.9070500@canterbury.ac.nz> Message-ID: <20140605041913.14886264@x34f> Hello, On Thu, 05 Jun 2014 12:08:21 +1200 Greg Ewing wrote: > Serhiy Storchaka wrote: > > A language which doesn't support O(1) indexing is not Python, it is > > only Python-like language. 
> That's debatable, but even if it's true, I don't think > there's anything wrong with MicroPython being only a > "Python-like language". As has been pointed out, fitting > Python onto a small device is always going to necessitate > some compromises. Thanks. I mentioned in another mail that we are trying to develop exactly a minimalistic, but real, Python implementation, not a Python-like language. Here is what "Python-like" means to me. The other most well-known and mature (as in "started quite some time ago") "small Python" implementation is PyMite aka Python-on-a-chip https://code.google.com/p/python-on-a-chip/ . It implements a good deal of the Python 2 language. It doesn't implement exception handling (try/except). Can a Python be without exception handling? For me, the clear answer is "no". Please put that in perspective when raising alarms over O(1) indexing of an inherently problematic niche datatype. (Again, it's not my or MicroPython's fault that it was forced as the standard string type. Maybe if CPython had seriously considered the now-standard UTF-8 encoding, the resulting "str" type might be different. But CPython has gigabytes of heap to spare, and for MicroPython, every half-bit is precious).
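The size trade-off Paul is pointing at is easy to quantify with the encodings themselves; a small illustration (the byte counts below depend only on the encodings, not on any particular implementation's overhead):

```python
text = "héllo wörld " * 100          # 1200 codepoints, mostly ASCII

utf8 = text.encode("utf-8")          # é and ö take 2 bytes each -> 1400 bytes
utf16 = text.encode("utf-16-le")     # exactly 2 bytes/codepoint here -> 2400
utf32 = text.encode("utf-32-le")     # always 4 bytes/codepoint -> 4800

print(len(text), len(utf8), len(utf16), len(utf32))
```

For mostly-ASCII text like this, a UTF-8 representation stays close to one byte per codepoint, which is the kind of saving a heap-constrained port cares about; the price, as discussed throughout this thread, is O(N) codepoint indexing.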
> > -- > Greg > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/pmiscml%40gmail.com -- Best regards, Paul mailto:pmiscml at gmail.com From tjreedy at udel.edu Thu Jun 5 04:15:30 2014 From: tjreedy at udel.edu (Terry Reedy) Date: Wed, 04 Jun 2014 22:15:30 -0400 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: <20140605015253.301e72e7@x34f> References: <20140604011718.GD10355@ando> <87vbsg36cs.fsf@uwakimon.sk.tsukuba.ac.jp> <20140605001432.126a0a08@x34f> <20140605015253.301e72e7@x34f> Message-ID: On 6/4/2014 6:52 PM, Paul Sokolovsky wrote: > "Well" is subjective (or should be defined formally based on the > requirements). With my MicroPython hat on, an implementation which > receives a string, transcodes it, leading to bigger size, just to > immediately transcode back and send out - is awful, environment > unfriendly implementation ;-). I am not sure what you concretely mean by 'receive a string', but I think you are again batting at a strawman. If you mean 'read from a file', and all you want to do is read bytes from and write bytes to external 'files', then there is obviously no need to transcode and neither Python 2 nor 3 makes you do so. -- Terry Jan Reedy From tjreedy at udel.edu Thu Jun 5 04:25:03 2014 From: tjreedy at udel.edu (Terry Reedy) Date: Wed, 04 Jun 2014 22:25:03 -0400 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: References: <20140604011718.GD10355@ando> <538ECD98.5030309@farowl.co.uk> Message-ID: On 6/4/2014 6:54 PM, Serhiy Storchaka wrote: > 05.06.14 00:21, Terry Reedy wrote: >> On 6/4/2014 3:41 AM, Jeff Allen wrote: >>> Jython uses UTF-16 internally -- probably the only sensible choice in a >>> Python that can call Java. Indexing is O(N), fundamentally.
By >>> "fundamentally", I mean for those strings that have not yet noticed that >>> they contain no supplementary (>0xffff) characters. >> >> Indexing can be made O(log(k)) where k is the number of astral chars, >> and is usually small. > > I like your idea and think it would be great if Jython implemented > it. A proof of concept implementation in Python that handles both indexing and slicing is on the tracker. It is simpler than I initially expected. > Unfortunately it is too late to do this in CPython. I mentioned it as an alternative during the '393 discussion. I more than half agree that the FSR is the better choice for CPython, which had no particular attachment to UTF-16 in the way that I think Jython, for instance, does. -- Terry Jan Reedy From stephen at xemacs.org Thu Jun 5 09:00:01 2014 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Thu, 05 Jun 2014 16:00:01 +0900 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: <538F86A2.4080802@g.nevcal.com> References: <20140604011718.GD10355@ando> <20140604133857.13a0f0b9@x34f> <20140604144933.66e6c2f4@x34f> <538F86A2.4080802@g.nevcal.com> Message-ID: <87ppin3kpa.fsf@uwakimon.sk.tsukuba.ac.jp> Glenn Linderman writes: > 3) (Most space efficient) One cached entry, that caches the last > codepoint/byte position referenced. UTF-8 is able to be traversed in > either direction, so "next/previous" codepoint access would be > relatively fast (and such are very common operations, even when indexing > notation is used: "for ix in range( len( str_x )): func( str_x[ ix ])".) Been there, tried that (Emacsen). Either it's a YAGNI (moving forward or backward over UTF-8 by characters for short distances is plenty fast, especially if you've got a lot of ASCII you can move by words for somewhat longer distances), or it's not good enough. There *may* be a sweet spot, but it's definitely smaller than the one on Sharapova's racket.
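A minimal sketch of that single-entry cache idea, in Python (entirely hypothetical code, not from Emacsen or any existing implementation): one (codepoint index, byte offset) pair is remembered from the last access, and lookups walk forward or backward from it by skipping UTF-8 continuation bytes.

```python
# Hypothetical sketch: a UTF-8 string with a single cached
# (codepoint index, byte offset) pair, traversable in both directions.

class CachedUTF8Str:
    def __init__(self, s):
        self._buf = s.encode("utf-8")
        self._cache = (0, 0)  # (codepoint index, byte offset)

    @staticmethod
    def _is_cont(byte):
        # UTF-8 continuation bytes look like 0b10xxxxxx
        return byte & 0xC0 == 0x80

    def _seek(self, index):
        ci, bo = self._cache
        buf = self._buf
        while ci < index:          # walk forward one code point at a time
            bo += 1
            while bo < len(buf) and self._is_cont(buf[bo]):
                bo += 1
            ci += 1
        while ci > index:          # walk backward, skipping continuations
            bo -= 1
            while self._is_cont(buf[bo]):
                bo -= 1
            ci -= 1
        self._cache = (ci, bo)
        return bo

    def __getitem__(self, index):
        start = self._seek(index)
        end = start + 1
        while end < len(self._buf) and self._is_cont(self._buf[end]):
            end += 1
        return self._buf[start:end].decode("utf-8")
```

With this scheme, the sequential `for ix in range(len(str_x))` pattern quoted above costs O(1) amortized per step, while a cold random access degrades to a scan from the cached position.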
> 4) (Fixed size caches) N entries, one for the last codepoint, and > others at Codepoint_Length/N intervals. N could be tunable. To achieve space saving, the cache has to be quite small, and the bigger your integers, the smaller it gets. A naive implementation on a 64-bit machine would give you 16 bytes/cache entry. Using a non-native size will be a space win, but needs care in implementation. Initializing the cache is very expensive for small strings, so you need conditional and maybe lazy initialization (for large strings). By the way, there's also 10) Keep counts of the leading and trailing number of ASCII (one-octet) characters. This is often a *huge* win; it's quite common to encounter documents where size - lc - tc = 2 (ie, there's only one two-octet character in the document). 11) Keep a list (or tree) of most-recently-accessed positions. Despite my negative experience with multibyte encodings in Emacsen, I'm persuaded by the arguments that there probably aren't all that many places in core Python where indexing is used in an essential way, so MicroPython itself can probably optimize those "behind the scenes". Application programmers in the embedded context may be expected to deal with the need to avoid random access algorithms and use iterators and generators to accomplish most tasks. From storchaka at gmail.com Thu Jun 5 09:26:19 2014 From: storchaka at gmail.com (Serhiy Storchaka) Date: Thu, 05 Jun 2014 10:26:19 +0300 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: <538FB4F5.9070500@canterbury.ac.nz> References: <20140604011718.GD10355@ando> <20140604183831.7226448c@x34f> <20140604200520.1d432329@x34f> <538FB4F5.9070500@canterbury.ac.nz> Message-ID: 05.06.14 03:08, Greg Ewing wrote: > Serhiy Storchaka wrote: >> A language which doesn't support O(1) indexing is not Python, it is >> only Python-like language.
> > That's debatable, but even if it's true, I don't think > there's anything wrong with MicroPython being only a > "Python-like language". As has been pointed out, fitting > Python onto a small device is always going to necessitate > some compromises. Agreed, there's nothing wrong with that. I think that even limiting integers to 32 or 64 bits is an acceptable compromise for a Python-like language targeted at small devices. But programming in such a language requires different techniques and habits. From storchaka at gmail.com Thu Jun 5 09:39:18 2014 From: storchaka at gmail.com (Serhiy Storchaka) Date: Thu, 05 Jun 2014 10:39:18 +0300 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: <538FB3C5.6010104@canterbury.ac.nz> References: <20140604011718.GD10355@ando> <20140604183831.7226448c@x34f> <538FB3C5.6010104@canterbury.ac.nz> Message-ID: 05.06.14 03:03, Greg Ewing wrote: > Serhiy Storchaka wrote: >> html.HTMLParser, json.JSONDecoder, re.compile, tokenize.tokenize don't >> use iterators. They use indices, str.find and/or regular expressions. >> Common use case is quickly find substring starting from current >> position using str.find or re.search, process found token, advance >> position and repeat. > > For that kind of thing, you don't need an actual character > index, just some way of referring to a place in a string. Of course. But _existing_ Python interfaces all work with indices. And it is too late to change this, that train left 20 years ago. There is no need for yet another way to do string operations. One obvious way is enough.
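The scanning idiom described above can be sketched as follows (a toy example; the function name is mine, not from any stdlib parser module): find the next delimiter from the current index, slice out the token, advance past it, and repeat.

```python
# The index-based scanning idiom used by parsers such as json and html:
# str.find with an explicit start position, then advance and repeat.

def split_records(text, sep=";"):
    pos = 0
    records = []
    while True:
        nxt = text.find(sep, pos)   # search starts at the current index
        if nxt == -1:               # no more separators: emit the tail
            records.append(text[pos:])
            return records
        records.append(text[pos:nxt])
        pos = nxt + len(sep)        # advance past the separator
```

The point is that `pos` and `nxt` are code-point indices, so an implementation where indexing and `find` are O(N) makes every step of such a loop pay a scan from the start of the string unless it caches positions internally.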
From storchaka at gmail.com Thu Jun 5 09:54:03 2014 From: storchaka at gmail.com (Serhiy Storchaka) Date: Thu, 05 Jun 2014 10:54:03 +0300 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: <538F86A2.4080802@g.nevcal.com> References: <20140604011718.GD10355@ando> <20140604133857.13a0f0b9@x34f> <20140604144933.66e6c2f4@x34f>, <538F86A2.4080802@g.nevcal.com> Message-ID: 04.06.14 23:50, Glenn Linderman wrote: > 3) (Most space efficient) One cached entry, that caches the last > codepoint/byte position referenced. UTF-8 is able to be traversed in > either direction, so "next/previous" codepoint access would be > relatively fast (and such are very common operations, even when indexing > notation is used: "for ix in range( len( str_x )): func( str_x[ ix ])".) Great idea! It should cover most real-world cases. Note that we can scan a UTF-8 string left-to-right and right-to-left. From stephen at xemacs.org Thu Jun 5 09:54:11 2014 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Thu, 05 Jun 2014 16:54:11 +0900 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: <20140605041913.14886264@x34f> References: <20140604011718.GD10355@ando> <20140604183831.7226448c@x34f> <20140604200520.1d432329@x34f> <538FB4F5.9070500@canterbury.ac.nz> <20140605041913.14886264@x34f> Message-ID: <87oay73i70.fsf@uwakimon.sk.tsukuba.ac.jp> Paul Sokolovsky writes: > Please put that in perspective when alarming over O(1) indexing of > inherently problematic niche datatype. (Again, it's not my or > MicroPython's fault that it was forced as standard string type. Maybe > if CPython seriously considered now-standard UTF-8 encoding, results > of what is "str" type might be different. But CPython has gigabytes of > heap to spare, and for MicroPython, every half-bit is precious). Would you please stop trolling?
The reasons for adopting Unicode as a separate data type were good and sufficient in 2000, and they remain so today, even if you have been fortunate enough not to burn yourself on character-byte conflation yet. What matters to you is that str (unicode) is an opaque type -- there is no specification of the internal representation in the language reference, and in fact several different ones coexist happily across existing Python implementations -- and you're free to use a UTF-8 implementation if that suits the applications you expect for MicroPython. PEP 393 exists, of course, and specifies the current internal representation for CPython 3. But I don't see anything in it that suggests it's mandated for any other implementation. From storchaka at gmail.com Thu Jun 5 10:08:21 2014 From: storchaka at gmail.com (Serhiy Storchaka) Date: Thu, 05 Jun 2014 11:08:21 +0300 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: References: <20140604011718.GD10355@ando> <538ECD98.5030309@farowl.co.uk> Message-ID: 05.06.14 05:25, Terry Reedy wrote: > I mentioned it as an alternative during the '393 discussion. I more than > half agree that the FSR is the better choice for CPython, which had no > particular attachment to UTF-16 in the way that I think Jython, for > instance, does. Yes, I remember. I think that a hybrid FSR-UTF16 (like FSR, but UTF-16 is used instead of UCS4) is the better choice for CPython. I suppose that with emoticons and other icon characters becoming popular in the next 5 or 10 years, even English text will often contain astral characters. And spending 4 bytes per character when a long text contains one astral character looks too wasteful. From stephen at xemacs.org Thu Jun 5 12:00:01 2014 From: stephen at xemacs.org (Stephen J.
Turnbull) Date: Thu, 05 Jun 2014 19:00:01 +0900 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: References: <20140604011718.GD10355@ando> <538ECD98.5030309@farowl.co.uk> Message-ID: <87ha3z3cda.fsf@uwakimon.sk.tsukuba.ac.jp> Serhiy Storchaka writes: > Yes, I remember. I thing that hybrid FSR-UTF16 (like FSR, but UTF-16 is > used instead of UCS4) is the better choice for CPython. I suppose that > with populating emoticons and other icon characters in nearest 5 or 10 > years, even English text will often contain astral characters. And > spending 4 bytes per character if long text contains one astral > character looks too prodigally. Why use something that complex if you don't have to? For the use case you have in mind, just map them into private space. If you really want to be aggressive, use surrogate space, too (anything that cares what a scalar represents should be trapping on non-scalars, catch that exception and look up the char -- dangerous, though, because such exceptions are probably all over the place). From victor.stinner at gmail.com Thu Jun 5 12:03:15 2014 From: victor.stinner at gmail.com (Victor Stinner) Date: Thu, 5 Jun 2014 12:03:15 +0200 Subject: [Python-Dev] Request: new "Asyncio" component on the bug tracker Message-ID: Hi, Would it be possible to add a new "Asyncio" component on bugs.python.org? If this component is selected, the default nosy list for asyncio would be used (guido, yury and me, there is already such list in the nosy list completion). Full text search for "asyncio" returns too many results. 
Victor From pmiscml at gmail.com Thu Jun 5 12:10:39 2014 From: pmiscml at gmail.com (Paul Sokolovsky) Date: Thu, 5 Jun 2014 13:10:39 +0300 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: References: <20140604011718.GD10355@ando> <87vbsg36cs.fsf@uwakimon.sk.tsukuba.ac.jp> <20140605001432.126a0a08@x34f> <20140605015253.301e72e7@x34f> Message-ID: <20140605131039.4f5b74d6@x34f> Hello, On Wed, 04 Jun 2014 22:15:30 -0400 Terry Reedy wrote: > On 6/4/2014 6:52 PM, Paul Sokolovsky wrote: > > > "Well" is subjective (or should be defined formally based on the > > requirements). With my MicroPython hat on, an implementation which > > receives a string, transcodes it, leading to bigger size, just to > > immediately transcode back and send out - is awful, environment > > unfriendly implementation ;-). > > I am not sure what you concretely mean by 'receive a string', but I I (surely) mean an abstract input (as an Input/Output aka I/O) operation. > think you are again batting at a strawman. If you mean 'read from a > file', and all you want to do is read bytes from and write bytes to > external 'files', then there is obviously no need to transcode and > neither Python 2 or 3 make you do so. But most files and network protocols are text-based, and I (and many other people) don't want to artificially use the "binary data" type for them, with all the attached funny things, like the "b" prefix. And then Python2 indeed doesn't transcode anything, and Python3 does, without being asked, and for no good purpose, because in most cases, Input data will be Output as-is (maybe in byte-boundary-split chunks). So, it all goes in circles - ignoring the forced-Unicode problem (after a week of subscription to python-list, half of the traffic there appears to be dedicated to Unicode-related flames) on python-dev's behalf is not going to help the Python community.
[] -- Best regards, Paul mailto:pmiscml at gmail.com From pmiscml at gmail.com Thu Jun 5 13:25:28 2014 From: pmiscml at gmail.com (Paul Sokolovsky) Date: Thu, 5 Jun 2014 14:25:28 +0300 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: <87oay73i70.fsf@uwakimon.sk.tsukuba.ac.jp> References: <20140604011718.GD10355@ando> <20140604183831.7226448c@x34f> <20140604200520.1d432329@x34f> <538FB4F5.9070500@canterbury.ac.nz> <20140605041913.14886264@x34f> <87oay73i70.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <20140605142528.39e0e5fc@x34f> Hello, On Thu, 05 Jun 2014 16:54:11 +0900 "Stephen J. Turnbull" wrote: > Paul Sokolovsky writes: > > > Please put that in perspective when alarming over O(1) indexing of > > inherently problematic niche datatype. (Again, it's not my or > > MicroPython's fault that it was forced as standard string type. > > Maybe if CPython seriously considered now-standard UTF-8 encoding, > > results of what is "str" type might be different. But CPython has > > gigabytes of heap to spare, and for MicroPython, every half-bit is > > precious). > > Would you please stop trolling? If it had been kept as a "separate data type", there wouldn't be any problem. But it was made the "one and only string type", and all the strife started then. And there is going to be "trolling" as long as Python developers and decision-makers ignore (troll?) the outcry from the community (again, I was surprised and not surprised to see that ~50% of the traffic on python-list touches Unicode issues). Well, I understand the plan - hoping that people will "get over this". And I'm personally happy to stay away from this "trolling", but any discussion related to Unicode goes in circles and returns to the feeling that the central role Unicode was given by Python3 is misplaced.
Then for me, it's just a matter of job security and personal future - I don't want to spend the rest of my days as a javascript (or other idiotic language) monkey. And the message is clear in the air (http://lucumr.pocoo.org/2014/5/12/everything-about-unicode/ and elsewhere): if Python's old strings are now in Go, and Python itself now has Java's strings, all causing strife, why not go cruising around and see what's up, instead of staying a strong, and growing bigger, community. > so today, even if you have been fortunate enough not to burn yourself > on character-byte conflation yet. > > What matters to you is that str (unicode) is an opaque type -- there > is no specification of the internal representation in the language > reference, and in fact several different ones coexist happily across > existing Python implementations -- and you're free to use a UTF-8 > implementation if that suits the applications you expect for > MicroPython. > > PEP 393 exists, of course, and specifies the current internal > representation for CPython 3. But I don't see anything in it that > suggests it's mandated for any other implementation. I already knew all this very well. What's strange is that other developers don't know, or don't take seriously, all of the above. That's why the gentleman who was kindly interested in adding Unicode support to MicroPython started with the idea of dragging in the CPython implementation. And the only effect that persuading him that it's not necessarily the best solution had, was that he started to feel he was being manipulated into writing something ugly, instead of the bright idea he had. That's why another gentleman reduces it to: "O(1) on string indexing or not a Python!". And that's why another gentleman, who agrees with the UTF-8 arguments, still gives an excuse (https://mail.python.org/pipermail/python-dev/2014-June/134727.html): "In this context, while a fixed-width encoding may be the correct choice it would also likely be the wrong choice."
In this regard, I'm glad to participate in this mind-resetting discussion. So, let's reiterate - there's nothing like "the best", "the only right", "the only correct", "righter than", "more correct than" in CPython's implementation of Unicode storage. It is *arbitrary*. Well, sure, it's not arbitrary, but based on requirements, and these requirements match CPython's (implied) usage model well enough. But among all possible sets of requirements, CPython's requirements are no more valid than any other possible set. And another set of requirements fairly clearly leads to a situation where the CPython implementation is rejected as not correct for those requirements at all. -- Best regards, Paul mailto:pmiscml at gmail.com From ncoghlan at gmail.com Thu Jun 5 13:32:19 2014 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 5 Jun 2014 21:32:19 +1000 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: <87oay73i70.fsf@uwakimon.sk.tsukuba.ac.jp> References: <20140604011718.GD10355@ando> <20140604183831.7226448c@x34f> <20140604200520.1d432329@x34f> <538FB4F5.9070500@canterbury.ac.nz> <20140605041913.14886264@x34f> <87oay73i70.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: On 5 June 2014 17:54, Stephen J. Turnbull wrote: > What matters to you is that str (unicode) is an opaque type -- there > is no specification of the internal representation in the language > reference, and in fact several different ones coexist happily across > existing Python implementations -- and you're free to use a UTF-8 > implementation if that suits the applications you expect for > MicroPython.
That's what happened with narrow builds of Python 2 and pre-PEP-393 releases of Python 3 (effectively using UTF-16 internally), and it was the cause of a sufficiently large number of bugs that the Linux distributions tend to instead accept the memory cost of using wide builds (4 bytes for all code points) for affected versions. Preserving the "the Python 3 str type is an immutable array of code points" semantics matters significantly more than whether or not indexing by code point is O(1). The various caching tricks suggested in this thread (especially "leading ASCII characters", "trailing ASCII characters" and "position & index of last lookup") could keep the typical lookup performance well below O(N). > PEP 393 exists, of course, and specifies the current internal > representation for CPython 3. But I don't see anything in it that > suggests it's mandated for any other implementation. CPython is constrained by C API compatibility requirements, as well as implementation constraints due to the amount of internal code that would need to be rewritten to handle a variable width encoding as the canonical internal representation (since the problems with Python 2 narrow builds mean we already know variable width encodings aren't handled correctly by the current code). Implementations that share code with CPython, or try to mimic the C API especially closely, may face similar restrictions. Outside that, I think we're better off if alternative implementations are free to experiment with different internal string representations. Cheers, Nick. 
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Thu Jun 5 13:43:16 2014 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 5 Jun 2014 21:43:16 +1000 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: <20140605142528.39e0e5fc@x34f> References: <20140604011718.GD10355@ando> <20140604183831.7226448c@x34f> <20140604200520.1d432329@x34f> <538FB4F5.9070500@canterbury.ac.nz> <20140605041913.14886264@x34f> <87oay73i70.fsf@uwakimon.sk.tsukuba.ac.jp> <20140605142528.39e0e5fc@x34f> Message-ID: On 5 June 2014 21:25, Paul Sokolovsky wrote: > Well, I understand the plan - hoping that people will "get over this". > And I'm personally happy to stay away from this "trolling", but any > discussion related to Unicode goes in circles and returns to feeling > that Unicode at the central role as put there by Python3 is misplaced. Many of the challenges network programmers face in Python 3 are around binary data being more inconvenient to work with than it needs to be, not the fact we decentralised boundary code by offering a strict binary/text separation as the default mode of operation. Aside from some of the POSIX locale handling issues on Linux, many of the concerns are with the usability of bytes and bytearray, not with str - that's why binary interpolation is coming back in 3.5, and there will likely be other usability tweaks for those types as well. More on that at http://python-notes.curiousefficiency.org/en/latest/python3/questions_and_answers.html#what-actually-changed-in-the-text-model-between-python-2-and-python-3 Cheers, Nick. 
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From pmiscml at gmail.com Thu Jun 5 14:01:21 2014 From: pmiscml at gmail.com (Paul Sokolovsky) Date: Thu, 5 Jun 2014 15:01:21 +0300 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: References: <20140604011718.GD10355@ando> <20140604183831.7226448c@x34f> <20140604200520.1d432329@x34f> <538FB4F5.9070500@canterbury.ac.nz> <20140605041913.14886264@x34f> <87oay73i70.fsf@uwakimon.sk.tsukuba.ac.jp> <20140605142528.39e0e5fc@x34f> Message-ID: <20140605150121.286032df@x34f> Hello, On Thu, 5 Jun 2014 21:43:16 +1000 Nick Coghlan wrote: > On 5 June 2014 21:25, Paul Sokolovsky wrote: > > Well, I understand the plan - hoping that people will "get over > > this". And I'm personally happy to stay away from this "trolling", > > but any discussion related to Unicode goes in circles and returns > > to feeling that Unicode at the central role as put there by Python3 > > is misplaced. > > Many of the challenges network programmers face in Python 3 are around > binary data being more inconvenient to work with than it needs to be, > not the fact we decentralised boundary code by offering a strict > binary/text separation as the default mode of operation. Just to clarify - (many) other gentlemen and I (in that order, I'm not taking the lead) don't call for going back to Python2 behavior with implicit conversion between byte-oriented strings and Unicode, etc. They just point out that perhaps Python3 went too far with Unicode by making it the default string type. Strict separation is surely mostly a good thing (I may sigh that it leads to Java-like dichotomical bloat for all I/O classes, but well, I was able to put up with that in MicroPython already).
> Aside from > some of the POSIX locale handling issues on Linux, many of the > concerns are with the usability of bytes and bytearray, not with str - > that's why binary interpolation is coming back in 3.5, and there will > likely be other usability tweaks for those types as well. All these changes are what let me dream on and speculate on the possibility that Python4 could offer an encoding-neutral string type (which means based on bytes), while moving unicode back to an explicit type to be used explicitly only when needed (bloated frameworks like Django can force users to it anyway, but that will be forcing at the framework level, not the language level, against which people rebel.) People can dream, right? Thanks, Paul mailto:pmiscml at gmail.com From stefan at bytereef.org Thu Jun 5 14:10:54 2014 From: stefan at bytereef.org (Stefan Krah) Date: Thu, 5 Jun 2014 14:10:54 +0200 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: <20140605142528.39e0e5fc@x34f> References: <20140604183831.7226448c@x34f> <20140604200520.1d432329@x34f> <538FB4F5.9070500@canterbury.ac.nz> <20140605041913.14886264@x34f> <87oay73i70.fsf@uwakimon.sk.tsukuba.ac.jp> <20140605142528.39e0e5fc@x34f> Message-ID: <20140605121054.GA348@sleipnir.bytereef.org> Paul Sokolovsky wrote: > In this regard, I'm glad to participate in mind-resetting discussion. > So, let's reiterate - there's nothing like "the best", "the only right", > "the only correct", "righter than", "more correct than" in CPython's > implementation of Unicode storage. It is *arbitrary*. Well, sure, it's > not arbitrary, but based on requirements, and these requirements match > CPython's (implied) usage model well enough. But among all possible > sets of requirements, CPython's requirements are no more valid that > other possible. And other set of requirement fairly clearly lead to > situation where CPython implementation is rejected as not correct for > those requirements at all.
Several core-devs have said that using UTF-8 for MicroPython is perfectly okay. I also think it's the right choice and I hope that you guys come up with a very efficient implementation. Stefan Krah From ncoghlan at gmail.com Thu Jun 5 14:20:04 2014 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 5 Jun 2014 22:20:04 +1000 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: <20140605150121.286032df@x34f> References: <20140604011718.GD10355@ando> <20140604183831.7226448c@x34f> <20140604200520.1d432329@x34f> <538FB4F5.9070500@canterbury.ac.nz> <20140605041913.14886264@x34f> <87oay73i70.fsf@uwakimon.sk.tsukuba.ac.jp> <20140605142528.39e0e5fc@x34f> <20140605150121.286032df@x34f> Message-ID: On 5 June 2014 22:01, Paul Sokolovsky wrote: >> Aside from >> some of the POSIX locale handling issues on Linux, many of the >> concerns are with the usability of bytes and bytearray, not with str - >> that's why binary interpolation is coming back in 3.5, and there will >> likely be other usability tweaks for those types as well. > > All these changes are what let me dream on and speculate on > possibility that Python4 could offer an encoding-neutral string type > (which means based on bytes), while move unicode back to an explicit > type to be used explicitly only when needed (bloated frameworks like > Django can force users to it anyway, but that will be forcing on > framework level, not on language level, against which people rebel.) > People can dream, right? If you don't model strings as arrays of code points, or at least assume a particular universal encoding (like UTF-8), you have to give up string concatenation in order to tolerate arbitrary encodings - otherwise you end up with unintelligible data that nobody can decode because it switches encodings without notice. 
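The failure mode Nick describes is easy to demonstrate (a self-contained illustration; the sample strings are arbitrary): two chunks of text in different, unlabelled encodings, joined as raw bytes, can no longer be decoded by any single codec without either an error or silent mojibake.

```python
# Concatenating "strings" that are really bytes in different encodings.

latin1_chunk = "caf\u00e9".encode("latin-1")   # b'caf\xe9'
utf8_chunk = "\u65e5\u672c".encode("utf-8")    # UTF-8 bytes
mixed = latin1_chunk + utf8_chunk              # encoding switches mid-stream

def try_decode(data, encoding):
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return None

# UTF-8 fails outright: 0xE9 starts an invalid sequence here.
utf8_result = try_decode(mixed, "utf-8")

# Latin-1 "succeeds" (it maps every byte), but silently turns the
# UTF-8 portion into mojibake.
latin1_result = try_decode(mixed, "latin-1")
```

With a code-point-based str type, the equivalent concatenation is always well-defined; the encoding question is confined to the I/O boundary.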
That's a viable model if your OS guarantees it (Mac OS X does, for example, so Python 3 assumes UTF-8 for all OS interfaces there), but Linux currently has no such guarantee - many runtimes just decide they don't care, and assume UTF-8 anyway (Python 3 may even join them some day, due to the problems caused by trusting the locale encoding to be correct, but the startup code will need non-trivial changes for that to happen - the C.UTF-8 locale may even become widespread before we get there). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From timothy.c.delaney at gmail.com Thu Jun 5 14:21:30 2014 From: timothy.c.delaney at gmail.com (Tim Delaney) Date: Thu, 5 Jun 2014 22:21:30 +1000 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: <20140605150121.286032df@x34f> References: <20140604011718.GD10355@ando> <20140604183831.7226448c@x34f> <20140604200520.1d432329@x34f> <538FB4F5.9070500@canterbury.ac.nz> <20140605041913.14886264@x34f> <87oay73i70.fsf@uwakimon.sk.tsukuba.ac.jp> <20140605142528.39e0e5fc@x34f> <20140605150121.286032df@x34f> Message-ID: On 5 June 2014 22:01, Paul Sokolovsky wrote: > > All these changes are what let me dream on and speculate on > possibility that Python4 could offer an encoding-neutral string type > (which means based on bytes) > To me, an "encoding neutral string type" means roughly "characters are atomic", and the best representation we have for a "character" is a Unicode code point. Through any interface that provides "characters" each individual "character" (code point) is indivisible. To me, Python 3 has exactly an "encoding-neutral string type". It also has a bytes type that is just that - bytes which can represent anything at all. It might be the UTF-8 representation of a string, but you have the freedom to manipulate it however you like - including making it no longer valid UTF-8.
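That distinction can be shown in a few lines (a trivial, implementation-independent illustration): at the str level a code point is atomic, while bytes and bytearray expose - and let you corrupt - the underlying representation.

```python
# str: one indivisible code point; bytes: the mutable representation.

s = "\u03c0"                         # GREEK SMALL LETTER PI, one code point
assert len(s) == 1                   # indivisible at the str level

raw = bytearray(s.encode("utf-8"))   # two bytes: 0xCF 0x80
assert len(raw) == 2

del raw[1]                           # perfectly legal byte manipulation...

try:                                 # ...which leaves invalid UTF-8 behind
    bytes(raw).decode("utf-8")
    still_valid = True
except UnicodeDecodeError:
    still_valid = False
```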
Whilst I think O(1) indexing of strings is important, I don't think it's as important as the property that "characters" are indivisible and would be quite happy for MicroPython to use UTF-8 as the underlying string representation (or some more clever thing, several ideas in this thread) so long as: 1. It maintains a string type that presents code points as indivisible elements; 2. The performance consequences of using UTF-8 are documented, as well as any optimisations, tricks, etc that are used to overcome those consequences (and what impact if any they would have if code written for MicroPython was run in CPython). Cheers, Tim Delaney From pmiscml at gmail.com Thu Jun 5 14:37:08 2014 From: pmiscml at gmail.com (Paul Sokolovsky) Date: Thu, 5 Jun 2014 15:37:08 +0300 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: References: <20140604011718.GD10355@ando> <20140604183831.7226448c@x34f> <20140604200520.1d432329@x34f> <538FB4F5.9070500@canterbury.ac.nz> <20140605041913.14886264@x34f> <87oay73i70.fsf@uwakimon.sk.tsukuba.ac.jp> <20140605142528.39e0e5fc@x34f> <20140605150121.286032df@x34f> Message-ID: <20140605153708.7f27412e@x34f> Hello, On Thu, 5 Jun 2014 22:20:04 +1000 Nick Coghlan wrote: [] > problems caused by trusting the locale encoding to be correct, but the > startup code will need non-trivial changes for that to happen - the > C.UTF-8 locale may even become widespread before we get there). ... And until those golden times come, it would be nice if Python did not force its perfect world model, which unfortunately is not based on surrounding reality, and let users solve their encoding problems themselves - when they need to, because again, one can go quite a long way without dealing with encodings at all.
Whereas now Python3 forces users to deal with encoding almost universally, but forces a particular one for all strings (which, again, doesn't correspond to the state of surrounding reality). I already hear the response that it's good that users are taught to deal with encoding, that it will make them write correct programs, but that's a bit far away from the original aim of making it easy and pleasant to write "correct" programs. (And definitions of "correct" vary.) But all that is just an opinion. > > Cheers, > Nick. > > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -- Best regards, Paul mailto:pmiscml at gmail.com From ncoghlan at gmail.com Thu Jun 5 14:38:13 2014 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 5 Jun 2014 22:38:13 +1000 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: <20140605121054.GA348@sleipnir.bytereef.org> References: <20140604183831.7226448c@x34f> <20140604200520.1d432329@x34f> <538FB4F5.9070500@canterbury.ac.nz> <20140605041913.14886264@x34f> <87oay73i70.fsf@uwakimon.sk.tsukuba.ac.jp> <20140605142528.39e0e5fc@x34f> <20140605121054.GA348@sleipnir.bytereef.org> Message-ID: On 5 June 2014 22:10, Stefan Krah wrote: > Paul Sokolovsky wrote: >> In this regard, I'm glad to participate in mind-resetting discussion. >> So, let's reiterate - there's nothing like "the best", "the only right", >> "the only correct", "righter than", "more correct than" in CPython's >> implementation of Unicode storage. It is *arbitrary*. Well, sure, it's >> not arbitrary, but based on requirements, and these requirements match >> CPython's (implied) usage model well enough. But among all possible >> sets of requirements, CPython's requirements are no more valid that >> other possible. And other set of requirement fairly clearly lead to >> situation where CPython implementation is rejected as not correct for >> those requirements at all. > > Several core-devs have said that using UTF-8 for MicroPython is perfectly okay.
> I also think it's the right choice and I hope that you guys come up with a very > efficient implementation. Based on this discussion, I've also posted a draft patch aimed at clarifying the relevant aspects of the data model section of the language reference (http://bugs.python.org/issue21667). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Thu Jun 5 15:15:54 2014 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 5 Jun 2014 23:15:54 +1000 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: <20140605153708.7f27412e@x34f> References: <20140604011718.GD10355@ando> <20140604183831.7226448c@x34f> <20140604200520.1d432329@x34f> <538FB4F5.9070500@canterbury.ac.nz> <20140605041913.14886264@x34f> <87oay73i70.fsf@uwakimon.sk.tsukuba.ac.jp> <20140605142528.39e0e5fc@x34f> <20140605150121.286032df@x34f> <20140605153708.7f27412e@x34f> Message-ID: On 5 June 2014 22:37, Paul Sokolovsky wrote: > On Thu, 5 Jun 2014 22:20:04 +1000 > Nick Coghlan wrote: >> problems caused by trusting the locale encoding to be correct, but the >> startup code will need non-trivial changes for that to happen - the >> C.UTF-8 locale may even become widespread before we get there). > > ... And until those golden times come, it would be nice if Python did > not force its perfect world model, which unfortunately is not based on > surrounding reality, and let users solve their encoding problems > themselves - when they need, because again, one can go quite a long way > without dealing with encodings at all. Whereas now Python3 forces users > to deal with encoding almost universally, but forcing a particular for > all strings (which is again, doesn't correspond to the state of > surrounding reality).
I already hear response that it's good that users > taught to deal with encoding, that will make them write correct > programs, but that's a bit far away from the original aim of making it > write "correct" programs easy and pleasant. (And definition of > "correct" vary.) As I've said before in other contexts, find me Windows, Mac OS X and JVM developers, or educators and scientists that are as concerned by the text model changes as folks that are primarily focused on Linux system (including network) programming, and I'll be more willing to concede the point. Windows, Mac OS X, and the JVM are all opinionated about the text encodings to be used at platform boundaries (using UTF-16, UTF-8 and UTF-16, respectively). By contrast, Linux (or, more accurately, POSIX) says "well, it's configurable, but we won't provide a reliable mechanism for finding out what the encoding is. So either guess as best you can based on the info the OS *does* provide, assume UTF-8, assume 'some ASCII compatible encoding', or don't do anything that requires knowing the encoding of the data being exchanged with the OS, like, say, displaying file names to users or accepting arbitrary text as input, transforming it in a content aware fashion, and echoing it back in a console application". None of those options are perfectly good choices. 6(ish) years ago, we chose the first option, because it has the best chance of working properly on Linux systems that use ASCII incompatible encodings like ShiftJIS, ISO-2022, and various other East Asian codecs. For normal user space programming, Linux is pretty reliable when it comes to ensuring the locale encoding is set to something sensible, but the price we currently pay for that decision is interoperability issues with things like daemons not receiving any configuration settings and hence falling back to the POSIX locale, and ssh environment forwarding moving a client's encoding settings to a session on a server with different settings.
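The settings in question can be inspected from Python itself; here is a quick sketch (the printed values depend entirely on the host environment, which is exactly the problem):

```python
import locale
import sys

# Two of the values CPython derives from the environment on POSIX systems.
# In a stripped-down daemon or forwarded-ssh environment with no LANG/LC_*
# variables set, the preferred encoding falls back toward the POSIX
# locale's ASCII, regardless of what the data actually is.
print(sys.getfilesystemencoding())         # e.g. 'utf-8'
print(locale.getpreferredencoding(False))  # e.g. 'UTF-8' or 'ANSI_X3.4-1968'
```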
I still consider it preferable to impose inconveniences like that based on use case (situations where Linux systems don't provide sensible encoding settings) than geographic region (locales where ASCII incompatible encodings are likely to still be in common use). If I (or someone else) ever find the time to implement PEP 432 (or something like it) to address some of the limitations of the interpreter startup sequence that currently make it difficult to avoid relying on the POSIX locale encoding on Linux, then we'll be in a position to reassess that decision based on the increased adoption of UTF-8 by Linux distributions in recent years. As the major community Linux distributions complete the migration of their system utilities to Python 3, we'll get to see if they decide it's better to make their locale settings more reliable, or help make it easier for Python 3 to ignore them when they're wrong. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From steve at pearwood.info Thu Jun 5 15:23:12 2014 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 5 Jun 2014 23:23:12 +1000 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: <20140604011718.GD10355@ando> References: <20140604011718.GD10355@ando> Message-ID: <20140605132312.GK10355@ando> On Wed, Jun 04, 2014 at 11:17:18AM +1000, Steven D'Aprano wrote: > There is a discussion over at MicroPython about the internal > representation of Unicode strings. Micropython is aimed at embedded > devices, and so minimizing memory use is important, possibly even > more important than performance. [...] Wow! I'm amazed at the response here, since I expected it would have received a fairly brief "Yes" or "No" response, not this long thread. Here is a summary (as best as I am able) of a few points which I think are important: (1) I asked if it would be okay for MicroPython to *optionally* use nominally Unicode strings limited to ASCII. 
Pretty much the only response to this has been Guido saying "That would be a pretty lousy option", and since nobody has really defended the suggestion, I think we can assume that it's off the table.

(2) I asked if it would be okay for µPy to use a UTF-8 implementation even though it would lead to O(N) indexing operations instead of O(1). There's been some opposition to this, including Guido's:

    Then again the UTF-8 option would be pretty devastating too for
    anything manipulating strings (especially since many Python APIs are
    defined using indexes, e.g. the re module).

but unless Guido wants to say different, I think the consensus is that a UTF-8 implementation is allowed, even at the cost of O(N) indexing operations. Saving memory -- assuming that it does save memory, which I think is an assumption and not proven -- over time is allowed.

(3) It seems to me that there's been a lot of theorizing about what implementation will be obviously more efficient. Folks, how about some benchmarks before making claims about code efficiency? :-)

(4) Similarly, there have been many suggestions more suited in my opinion to python-ideas, or even python-list, for ways to implement O(1) indexing on top of UTF-8. Some of them involve per-string mutable state (e.g. the last index seen), or complicated int sub-classes that need to know what string they come from. Remember your Zen please:

    Simple is better than complex.
    Complex is better than complicated.
    ...
    If the implementation is hard to explain, it's a bad idea.

(5) I'm not convinced that UTF-8 internally is *necessarily* more efficient, but look forward to seeing the results of benchmarks. The rationale of internal UTF-8 is that the use of any other encoding internally will be inefficient since those strings will need to be transcoded to UTF-8 before they can be written or printed, so keeping them as UTF-8 in the first place saves the transcoding step.
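For concreteness, the O(N) indexing in point (2) can be sketched directly over a raw UTF-8 buffer (illustrative code, not from any actual implementation): finding the i-th code point means walking every leading byte before it.

```python
def utf8_index(buf, i):
    """Return the i-th code point of UTF-8 bytes buf by linear scan: O(N)."""
    def seq_len(b):
        # Width of a UTF-8 sequence, determined by its leading byte.
        return 1 if b < 0x80 else 2 if b < 0xE0 else 3 if b < 0xF0 else 4

    pos = 0
    for _ in range(i):
        pos += seq_len(buf[pos])  # skip one whole code point
    return buf[pos:pos + seq_len(buf[pos])].decode("utf-8")

buf = "a\u00e9\u20ac\U0001d11e".encode("utf-8")  # 1-, 2-, 3- and 4-byte code points
print([utf8_index(buf, i) for i in range(4)])    # ['a', 'é', '€', '𝄞']
```

A fixed-width representation makes the same lookup a single O(1) offset calculation, which is exactly the trade-off being debated here.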
Well, yes, but many strings may never be written out:

    print(prefix + s[1:].strip().lower().center(80) + suffix)

creates five strings that are never written out and one that is. So if the internal encoding of strings is more efficient than UTF-8, and most of them never need transcoding to UTF-8, a non-UTF-8 internal format might be a nett win. So I'm looking forward to seeing the results of µPy's experiments with it. Thanks to all who have commented. -- Steven From rdmurray at bitdance.com Thu Jun 5 17:05:19 2014 From: rdmurray at bitdance.com (R. David Murray) Date: Thu, 05 Jun 2014 11:05:19 -0400 Subject: [Python-Dev] Request: new "Asyncio" component on the bug tracker In-Reply-To: References: Message-ID: <20140605150519.9F40B250DE7@webabinitio.net> On Thu, 05 Jun 2014 12:03:15 +0200, Victor Stinner wrote: > Would it be possible to add a new "Asyncio" component on > bugs.python.org? If this component is selected, the default nosy list > for asyncio would be used (guido, yury and me, there is already such > list in the nosy list completion). Done. There are two other people in the nosy list (Giampaolo and Antoine). If either of those wish to be auto-nosy, let me know.
--David From p.f.moore at gmail.com Thu Jun 5 17:59:51 2014 From: p.f.moore at gmail.com (Paul Moore) Date: Thu, 5 Jun 2014 16:59:51 +0100 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: References: <20140604011718.GD10355@ando> <20140604183831.7226448c@x34f> <20140604200520.1d432329@x34f> <538FB4F5.9070500@canterbury.ac.nz> <20140605041913.14886264@x34f> <87oay73i70.fsf@uwakimon.sk.tsukuba.ac.jp> <20140605142528.39e0e5fc@x34f> <20140605150121.286032df@x34f> <20140605153708.7f27412e@x34f> Message-ID: On 5 June 2014 14:15, Nick Coghlan wrote: > As I've said before in other contexts, find me Windows, Mac OS X and > JVM developers, or educators and scientists that are as concerned by > the text model changes as folks that are primarily focused on Linux > system (including network) programming, and I'll be more willing to > concede the point. There is once again a strong selection bias in this discussion, by its very nature. People who like the new model don't have anything to complain about, and so are not heard. Just to support Nick's point, I for one find the Python 3 text model a huge benefit, both in practical terms of making my programs more robust, and educationally, as I have a far better understanding of encodings and their issues than I ever did under Python 2. Whenever a discussion like this occurs, I find it hard not to resent the people arguing that the new model should be taken away from me and replaced with a form of the old error-prone (for me) approach - as if it was in my best interests. Internal details don't bother me - using UTF8 and having indexing be potentially O(N) is of little relevance. But make me work with a string type that *doesn't* abstract a string as a sequence of Unicode code points and I'll get very upset. 
Paul From dholth at gmail.com Thu Jun 5 20:41:28 2014 From: dholth at gmail.com (Daniel Holth) Date: Thu, 5 Jun 2014 14:41:28 -0400 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: References: <20140604011718.GD10355@ando> <20140604183831.7226448c@x34f> <20140604200520.1d432329@x34f> <538FB4F5.9070500@canterbury.ac.nz> <20140605041913.14886264@x34f> <87oay73i70.fsf@uwakimon.sk.tsukuba.ac.jp> <20140605142528.39e0e5fc@x34f> <20140605150121.286032df@x34f> <20140605153708.7f27412e@x34f> Message-ID: On Thu, Jun 5, 2014 at 11:59 AM, Paul Moore wrote: > On 5 June 2014 14:15, Nick Coghlan wrote: >> As I've said before in other contexts, find me Windows, Mac OS X and >> JVM developers, or educators and scientists that are as concerned by >> the text model changes as folks that are primarily focused on Linux >> system (including network) programming, and I'll be more willing to >> concede the point. > > There is once again a strong selection bias in this discussion, by its > very nature. People who like the new model don't have anything to > complain about, and so are not heard. > > Just to support Nick's point, I for one find the Python 3 text model a > huge benefit, both in practical terms of making my programs more > robust, and educationally, as I have a far better understanding of > encodings and their issues than I ever did under Python 2. Whenever a > discussion like this occurs, I find it hard not to resent the people > arguing that the new model should be taken away from me and replaced > with a form of the old error-prone (for me) approach - as if it was in > my best interests. > > Internal details don't bother me - using UTF8 and having indexing be > potentially O(N) is of little relevance. But make me work with a > string type that *doesn't* abstract a string as a sequence of Unicode > code points and I'll get very upset. 
Once you get past whether str + bytes throws an exception which seems to be the divide most people focus on, you can discover new things like dance-encoded strings, bytes decoded using an incorrect encoding intended to be transcoded into the correct encoding later, surrogates that work perfectly until .encode(), str(bytes), APIs that disagree with you about whether the result should be str or bytes, APIs that return either string or bytes depending on their initializers and so on. Unicode can still be complicated in Python 3 independent of any judgement about whether it is worse, better, or different than Python 2. From v+python at g.nevcal.com Thu Jun 5 20:48:45 2014 From: v+python at g.nevcal.com (Glenn Linderman) Date: Thu, 05 Jun 2014 11:48:45 -0700 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: <20140605131039.4f5b74d6@x34f> References: <20140604011718.GD10355@ando> <87vbsg36cs.fsf@uwakimon.sk.tsukuba.ac.jp> <20140605001432.126a0a08@x34f> <20140605015253.301e72e7@x34f> <20140605131039.4f5b74d6@x34f> Message-ID: <5390BB8D.7030305@g.nevcal.com> On 6/5/2014 3:10 AM, Paul Sokolovsky wrote: > Hello, > > On Wed, 04 Jun 2014 22:15:30 -0400 > Terry Reedy wrote: > >> think you are again batting at a strawman. If you mean 'read from a >> file', and all you want to do is read bytes from and write bytes to >> external 'files', then there is obviously no need to transcode and >> neither Python 2 or 3 make you do so. > But most files, network protocols are text-based, and I (and many other > people) don't want to artificially use "binary data" type for them, > with all attached funny things, like "b" prefix. And then Python2 > indeed doesn't transcode anything, and Python3 does, without being > asked, and for no good purpose, because in most cases, Input data will > be Output as-is (maybe in byte-boundary-split chunks). 
> > So, it all goes in rounds - ignoring the forced-Unicode problem (after a > week of subscription to python-list, half of traffic there appear to be > dedicated to Unicode-related flames) on python-dev behalf is not > going to help (Python community). If all your program is doing is reading and writing data (input data will be output as-is), then use of binary doesn't require "b" prefix, because you aren't manipulating the data. Then you have no unnecessary transcoding. If you actually wish to examine or manipulate the content as it flows by, then there are choices.

1) If you need to examine/manipulate only a small fraction of the text data within the file, you can pay the small price of a few "b" prefixes to get high performance, and explicitly transcode only the portions that need to be manipulated.

2) If you are examining the bulk of the data as it flows by, but not manipulating it, just examining/extracting, then a full transcoding may be useful for that purpose... but you can perhaps do it explicitly, so that you keep the binary form for I/O. Careful of the block boundaries, in this case, however.

3) If you are actually manipulating the bulk of the data, then the double transcoding (once on input, and once on output) allows you to work in units of codepoints, rather than bytes, which generally makes the manipulation algorithms easier.

4) If you truly cannot afford the processor cost of the double transcoding, and need to do all your manipulations at the byte level, then you could avoid the need for "b" prefix by use of a preprocessor for those sections of code that are doing all and only bytes processing... and you'll have lots of arcane, error-prone code to write to manipulate the bytes rather than the codepoints.
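Option 1) might look like the following sketch (the header/body framing here is invented purely for illustration, not any real protocol):

```python
# Keep the bulk of the stream as bytes; transcode only the small portion
# that is actually examined.  The message layout is hypothetical.
raw = b"Content-Type: text/plain\r\n\r\n" + bytes(range(256)) * 4

head, _, body = raw.partition(b"\r\n\r\n")
headers = head.decode("ascii")                   # decode just the header text
content_type = headers.split(":", 1)[1].strip()  # -> 'text/plain'
# body stays bytes and can be written back out byte-for-byte, untranscoded
```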
On the other hand, if you can convince your data sources and sinks to deal in UTF-8, and implement a UTF-8 str in µPy, then you can both avoid transcoding, and make the arcane algorithms part of the implementation of µPy rather than of the application code, and support full Unicode. And it seems to me that the world is moving that way... towards UTF-8 as the standard interchange format. Encourage it. Glenn -------------- next part -------------- An HTML attachment was scrubbed... URL: From v+python at g.nevcal.com Thu Jun 5 21:11:51 2014 From: v+python at g.nevcal.com (Glenn Linderman) Date: Thu, 05 Jun 2014 12:11:51 -0700 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: References: <20140604011718.GD10355@ando> <20140604183831.7226448c@x34f> <20140604200520.1d432329@x34f> <538FB4F5.9070500@canterbury.ac.nz> <20140605041913.14886264@x34f> <87oay73i70.fsf@uwakimon.sk.tsukuba.ac.jp> <20140605142528.39e0e5fc@x34f> <20140605150121.286032df@x34f> Message-ID: <5390C0F7.7050709@g.nevcal.com> On 6/5/2014 11:41 AM, Daniel Holth wrote: > discover new things > like dance-encoded strings, bytes decoded using an incorrect encoding > intended to be transcoded into the correct encoding later, surrogates > that work perfectly until .encode(), str(bytes), APIs that disagree > with you about whether the result should be str or bytes, APIs that > return either string or bytes depending on their initializers and so > on. Unicode can still be complicated in Python 3 independent of any > judgement about whether it is worse, better, or different than Python > 2. Yes, people can find ways to write bad code in any language. -------------- next part -------------- An HTML attachment was scrubbed...
URL: From antoine at python.org Thu Jun 5 21:55:54 2014 From: antoine at python.org (Antoine Pitrou) Date: Thu, 05 Jun 2014 15:55:54 -0400 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: References: <20140604011718.GD10355@ando> Message-ID: On 04/06/2014 02:51, Chris Angelico wrote: > On Wed, Jun 4, 2014 at 3:17 PM, Nick Coghlan wrote: > > It would. The downsides of a UTF-8 representation would be slower > iteration and much slower (O(N)) indexing/slicing. There's no reason for iteration to be slower. Slicing would get O(slice offset + slice size) instead of O(slice size). Regards Antoine. From njs at pobox.com Thu Jun 5 22:51:41 2014 From: njs at pobox.com (Nathaniel Smith) Date: Thu, 5 Jun 2014 21:51:41 +0100 Subject: [Python-Dev] [numpy wishlist] Interpreter support for temporary elision in third-party classes Message-ID: Hi all, There's a very valuable optimization -- temporary elision -- which numpy can *almost* do. It gives something like a 10-30% speedup for lots of common real-world expressions. It would probably be useful for non-numpy code too. (In fact it generalizes the str += str special case that's currently hardcoded in ceval.c.) But it can't be done safely without help from the interpreter, and possibly not even then. So I thought I'd raise it here and see if we can get any consensus on whether and how CPython could support this.

=== The dream ===

Here's the idea. Take an innocuous expression like:

    result = (a + b + c) / c

This gets evaluated as:

    tmp1 = a + b
    tmp2 = tmp1 + c
    result = tmp2 / c

All these temporaries are very expensive. Suppose that a, b, c are arrays with N bytes each, and N is large. For simple arithmetic like this, the costs are dominated by memory access. Allocating an N byte array requires the kernel to clear the memory, which incurs N bytes of memory traffic.
If all the operands are already allocated, then performing a three-operand operation like tmp1 = a + b involves 3N bytes of memory traffic (reading the two inputs plus writing the output). In total our example does 3 allocations and has 9 operands, so it does 12N bytes of memory access. If our arrays are small, then the kernel doesn't get involved and some of these accesses will hit the cache, but OTOH the overhead of things like malloc won't be amortized out; the best case starting from a cold cache is 3 mallocs and 6N bytes worth of cache misses (or maybe 5N if we get lucky and malloc'ing 'result' returns the same memory that tmp1 used, and it's still in cache).

There's an obvious missed optimization in this code, though, which is that it keeps allocating new temporaries and throwing away old ones. It would be better to just allocate a temporary once and re-use it:

    tmp1 = a + b
    tmp1 += c
    tmp1 /= c
    result = tmp1

Now we have only 1 allocation and 7 operands, so we touch only 8N bytes of memory. For large arrays -- that don't fit into cache, and for which per-op overhead is amortized out -- this gives a theoretical 33% speedup, and we can realistically get pretty close to this. For smaller arrays, the re-use of tmp1 means that in the best case we have only 1 malloc and 4N bytes worth of cache misses, and we also have a smaller cache footprint, which means this best case will be achieved more often in practice. For small arrays it's harder to estimate the total speedup here, but 66% fewer mallocs and 33% fewer cache misses is certainly enough to make a practical difference. Such optimizations are important enough that numpy operations always give the option of explicitly specifying the output array (like in-place operators but more general and with clumsier syntax). Here's an example small-array benchmark that IIUC uses Jacobi iteration to solve Laplace's equation.
It's been written in both natural and hand-optimized formats (compare "num_update" to "num_inplace"): https://yarikoptic.github.io/numpy-vbench/vb_vb_app.html#laplace-inplace num_inplace is totally unreadable, but because we've manually elided temporaries, it's 10-15% faster than num_update. With our prototype automatic temporary elision turned on, this difference disappears -- the natural code gets 10-15% faster, *and* we remove the temptation to write horrible things like num_inplace.

What do I mean by "automatic temporary elision"? It's *almost* possible for numpy to automatically convert the first example into the second. The idea is: we want to replace

    tmp2 = tmp1 + c

with

    tmp1 += c
    tmp2 = tmp1

And we can do this by defining

    def __add__(self, other):
        if is_about_to_be_thrown_away(self):
            return self.__iadd__(other)
        else:
            ...

now tmp1.__add__(c) does an in-place add and returns tmp1, no allocation occurs, woohoo. The only little problem is implementing is_about_to_be_thrown_away().

=== The sneaky-but-flawed approach ===

The following implementation may make you cringe, but it comes tantalizingly close to working:

    bool is_about_to_be_thrown_away(PyObject *obj) {
        return (Py_REFCNT(obj) == 1);
    }

In fact, AFAICT it's 100% correct for libraries being called by regular python code (which is why I'm able to quote benchmarks at you :-)). The bytecode eval loop always holds a reference to all operands, and then immediately DECREFs them after the operation completes. If one of our arguments has no other references besides this one, then we can be sure that it is a dead obj walking, and steal its corpse. But this has a fatal flaw: people are unreasonable creatures, and sometimes they call Python libraries without going through ceval.c :-(. It's legal for random C code to hold an array object with a single reference count, and then call PyNumber_Add on it, and then expect the original array object to still be valid. But who writes code like that in practice?
Well, Cython does. So, this is no-go.

=== A better (?) approach ===

This is a pretty arcane bit of functionality that we need, and it interacts with ceval.c, so I'm not at all confident about the best way to do it. (We even have an implementation using libunwind to walk the C stack and make sure that we're being called from ceval.c, which... works, actually, but is unsatisfactory in other ways.) I do have an idea that I *think* might work and be acceptable, but you tell me:

Proposal: We add an API call PyEval_LastOpDefinitelyMatches(frame, optype, *args) which checks whether the last instruction executed in 'frame' was in fact an 'optype' instruction and did in fact have arguments 'args'. If it was, then it returns True. If it wasn't, or if we aren't sure, it returns False. The intention is that 'optype' is a semantic encoding of the instruction (like "+" or "function call") and thus can be preserved even if the bytecode details change.

Then, in numpy's __add__, we do:

1) fetch the current stack frame from TLS
2) check PyEval_LastOpDefinitelyMatches(frame, "+", arg1, arg2)
3) check for arguments with refcnt == 1
4) check that all arguments are base-class numpy array objects (i.e., PyArray_CheckExact)

The logic here is that step (2) tells us that someone did 'arg1 + arg2', so ceval.c is holding a temporary reference to the arguments, and step (3) tells us that at the time of the opcode evaluation there were no other references to these arguments, and step (4) tells us that 'arg1 + arg2' dispatched directly to ndarray.__add__ so there's no chance that anyone else has borrowed a reference in the mean time. AFAICT PyEval_LastOpDefinitelyMatches can *almost* be implemented now; the only problem is that stack_pointer is a local variable in PyEval_EvalFrameEx, and we would need it to be accessible via the frame object. The easy way would be to just move it in there.
I don't know if this would have any weird effects on speed due to cache effects, but I guess we could arrange to put it into the same cache line as f_lasti, which is also updated on every opcode? OTOH someone has gone to some trouble to make sure that f_stacktop usually *doesn't* point to the top of the stack, and I guess there must have been some reason for this. Alternatively we could stash a pointer to stack_pointer in the frame object, and that would only need to be updated once per entry/exit to PyEval_EvalFrameEx. Obviously there are a lot of details to work out here, like what the calling convention for PyEval_LastOpDefinitelyMatches should really be, but:

* Does this approach seem like it would successfully solve the problem?
* Does this approach seem like it would be acceptable in CPython?
* Is there a better idea I'm missing?

-n -- Nathaniel J. Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org From p.f.moore at gmail.com Thu Jun 5 23:37:26 2014 From: p.f.moore at gmail.com (Paul Moore) Date: Thu, 5 Jun 2014 22:37:26 +0100 Subject: [Python-Dev] [numpy wishlist] Interpreter support for temporary elision in third-party classes In-Reply-To: References: Message-ID: On 5 June 2014 21:51, Nathaniel Smith wrote: > Is there a better idea I'm missing? Just a thought, but the temporaries come from the stack manipulation done by the likes of the BINARY_ADD opcode. (After all the bytecode doesn't use temporaries, it's a stack machine). Maybe BINARY_ADD and friends could allow for an alternative fast calling convention for __add__ implementations that uses the stack slots directly? This may be something that's only plausible from C code, though. Or may not be plausible at all. I haven't looked at ceval.c for many years...
If this is an insane idea, please feel free to ignore me :-) Paul From njs at pobox.com Thu Jun 5 23:47:54 2014 From: njs at pobox.com (Nathaniel Smith) Date: Thu, 5 Jun 2014 22:47:54 +0100 Subject: [Python-Dev] [numpy wishlist] Interpreter support for temporary elision in third-party classes In-Reply-To: References: Message-ID: On Thu, Jun 5, 2014 at 10:37 PM, Paul Moore wrote: > On 5 June 2014 21:51, Nathaniel Smith wrote: >> Is there a better idea I'm missing? > > Just a thought, but the temporaries come from the stack manipulation > done by the likes of the BINARY_ADD opcode. (After all the bytecode > doesn't use temporaries, it's a stack machine). Maybe BINARY_ADD and > friends could allow for an alternative fast calling convention for > __add__implementations that uses the stack slots directly? This may be > something that's only plausible from C code, though. Or may not be > plausible at all. I haven't looked at ceval.c for many years... > > If this is an insane idea, please feel free to ignore me :-) To make sure I understand correctly, you're suggesting something like adding a new set of special method slots, __te_add__, __te_mul__, etc., which BINARY_ADD and friends would check for and if found, dispatch to without going through PyNumber_Add? And this way, a type like numpy's array could have a special implementation for __te_add__ that works the same as __add__, except with the added wrinkle that it knows that it will only be called by the interpreter and thus any arguments with refcnt 1 must be temporaries? -n -- Nathaniel J. 
Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org From p.f.moore at gmail.com Fri Jun 6 00:12:04 2014 From: p.f.moore at gmail.com (Paul Moore) Date: Thu, 5 Jun 2014 23:12:04 +0100 Subject: [Python-Dev] [numpy wishlist] Interpreter support for temporary elision in third-party classes In-Reply-To: References: Message-ID: On 5 June 2014 22:47, Nathaniel Smith wrote: > To make sure I understand correctly, you're suggesting something like > adding a new set of special method slots, __te_add__, __te_mul__, > etc. I wasn't thinking in that much detail, TBH. I'm not sure adding a whole set of new slots is sensible for such a specialised case. I think I was more assuming that the special method implementations could use an alternative calling convention, METH_STACK in place of METH_VARARGS, for example. That would likely only be viable for types implemented in C. But either way, it may be more complicated than the advantages would justify... Paul From tjreedy at udel.edu Fri Jun 6 00:57:32 2014 From: tjreedy at udel.edu (Terry Reedy) Date: Thu, 05 Jun 2014 18:57:32 -0400 Subject: [Python-Dev] [numpy wishlist] Interpreter support for temporary elision in third-party classes In-Reply-To: References: Message-ID: On 6/5/2014 4:51 PM, Nathaniel Smith wrote: > In fact, AFAICT it's 100% correct for libraries being called by > regular python code (which is why I'm able to quote benchmarks at you > :-)). The bytecode eval loop always holds a reference to all operands, > and then immediately DECREFs them after the operation completes. If > one of our arguments has no other references besides this one, then we > can be sure that it is a dead obj walking, and steal its corpse. > > But this has a fatal flaw: people are unreasonable creatures, and > sometimes they call Python libraries without going through ceval.c > :-(. 
It's legal for random C code to hold an array object with a > single reference count, and then call PyNumber_Add on it, and then > expect the original array object to still be valid. But who writes > code like that in practice? Well, Cython does. So, this is no-go. I understand that a lot of numpy/scipy code is compiled with Cython, so you really want the optimization to continue working when so compiled. Is there a simple change to Cython that would work, perhaps in coordination with a change to numpy? Is so, you could get the result before 3.5 comes out. I realized that there are other compilers than Cython and non-numpy code that could benefit, so that a more generic solution would also be good. In particular > Here's the idea. Take an innocuous expression like: > > result = (a + b + c) / c > > This gets evaluated as: > > tmp1 = a + b > tmp2 = tmp1 + c > result = tmp2 / c ... > There's an obvious missed optimization in this code, though, which is > that it keeps allocating new temporaries and throwing away old ones. > It would be better to just allocate a temporary once and re-use it: > tmp1 = a + b > tmp1 += c > tmp1 /= c > result = tmp1 Could this transformation be done in the ast? And would that help? A prolonged discussion might be better on python-ideas. See what others say. -- Terry Jan Reedy From njs at pobox.com Fri Jun 6 00:22:17 2014 From: njs at pobox.com (Nathaniel Smith) Date: Thu, 5 Jun 2014 23:22:17 +0100 Subject: [Python-Dev] [numpy wishlist] Interpreter support for temporary elision in third-party classes In-Reply-To: References: Message-ID: On Thu, Jun 5, 2014 at 11:12 PM, Paul Moore wrote: > On 5 June 2014 22:47, Nathaniel Smith wrote: >> To make sure I understand correctly, you're suggesting something like >> adding a new set of special method slots, __te_add__, __te_mul__, >> etc. > > I wasn't thinking in that much detail, TBH. I'm not sure adding a > whole set of new slots is sensible for such a specialised case. 
I > think I was more assuming that the special method implementations > could use an alternative calling convention, METH_STACK in place of > METH_VARARGS, for example. That would likely only be viable for types > implemented in C. > > But either way, it may be more complicated than the advantages would justify... Oh, I see, that's clever. But, unfortunately most __special__ methods at the C level don't use METH_*, they just have hard-coded calling conventions: https://docs.python.org/3/c-api/typeobj.html#number-structs -- Nathaniel J. Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org From ncoghlan at gmail.com Fri Jun 6 01:36:09 2014 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 6 Jun 2014 09:36:09 +1000 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: <5390C0F7.7050709@g.nevcal.com> References: <20140604011718.GD10355@ando> <20140604183831.7226448c@x34f> <20140604200520.1d432329@x34f> <538FB4F5.9070500@canterbury.ac.nz> <20140605041913.14886264@x34f> <87oay73i70.fsf@uwakimon.sk.tsukuba.ac.jp> <20140605142528.39e0e5fc@x34f> <20140605150121.286032df@x34f> <20140605153708.7f27412e@x34f> <5390C0F7.7050709@g.nevcal.com> Message-ID: On 6 Jun 2014 05:13, "Glenn Linderman" wrote: > > On 6/5/2014 11:41 AM, Daniel Holth wrote: >> >> discover new things >> like dance-encoded strings, bytes decoded using an incorrect encoding >> intended to be transcoded into the correct encoding later, surrogates >> that work perfectly until .encode(), str(bytes), APIs that disagree >> with you about whether the result should be str or bytes, APIs that >> return either string or bytes depending on their initializers and so >> on. Unicode can still be complicated in Python 3 independent of any >> judgement about whether it is worse, better, or different than Python >> 2. > > Yes, people can find ways to write bad code in any language. 
Note that several of the issues Daniel mentions here are due to the lack of reliable encoding settings on Linux and the challenges of the Py2->3 migration, rather than users writing bad code. Several of them represent bugs to be fixed or serve as indicators of missing features that would make it easier to work around an imperfect world. Cheers, Nick. > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ncoghlan%40gmail.com > -------------- next part -------------- An HTML attachment was scrubbed... URL: From greg.ewing at canterbury.ac.nz Fri Jun 6 02:51:11 2014 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 06 Jun 2014 12:51:11 +1200 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: <20140605132312.GK10355@ando> References: <20140604011718.GD10355@ando> <20140605132312.GK10355@ando> Message-ID: <5391107F.1010500@canterbury.ac.nz> Steven D'Aprano wrote: > (1) I asked if it would be okay for MicroPython to *optionally* use > nominally Unicode strings limited to ASCII. Pretty much the only > response to this as been Guido saying "That would be a pretty lousy > option", It would be limiting to have this as the *only* way of dealing with unicode, but I don't see anything wrong with having this available as an option for applications that truly don't need anything more than ascii. There must be plenty of those; the controller that runs my car engine, for example, doesn't exchange text with the outside world at all. > The > rationale of internal UTF-8 is that the use of any other encoding > internally will be inefficient since those strings will need to be > transcoded to UTF-8 before they can be written or printed, No, I think the rationale is that UTF-8 is likely to use less memory than UTF-16 or UTF-32. 
-- Greg From Nikolaus at rath.org Fri Jun 6 03:15:42 2014 From: Nikolaus at rath.org (Nikolaus Rath) Date: Thu, 05 Jun 2014 18:15:42 -0700 Subject: [Python-Dev] [numpy wishlist] Interpreter support for temporary elision in third-party classes In-Reply-To: (Nathaniel Smith's message of "Thu, 5 Jun 2014 21:51:41 +0100") References: Message-ID: <8761kevnwh.fsf@vostro.rath.org> Nathaniel Smith writes: > Such optimizations are important enough that numpy operations always > give the option of explicitly specifying the output array (like > in-place operators but more general and with clumsier syntax). Here's > an example small-array benchmark that IIUC uses Jacobi iteration to > solve Laplace's equation. It's been written in both natural and > hand-optimized formats (compare "num_update" to "num_inplace"): > > https://yarikoptic.github.io/numpy-vbench/vb_vb_app.html#laplace-inplace > > num_inplace is totally unreadable, but because we've manually elided > temporaries, it's 10-15% faster than num_update. Does it really have to be that ugly? Shouldn't using tmp += u[2:,1:-1] tmp *= dy2 instead of np.add(tmp, u[2:,1:-1], out=tmp) np.multiply(tmp, dy2, out=tmp) give the same performance? (yes, not as nice as what you're proposing, but I'm still curious). Best, -Nikolaus -- GPG encrypted emails preferred. Key id: 0xD113FCAC3C4E599F Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F "Time flies like an arrow, fruit flies like a Banana."
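Nikolaus's question — whether the `+=` spelling avoids the temporary just as well as the explicit `out=` form — can be checked with a small pure-Python stand-in for an array type. The `Buf` class below is hypothetical (not numpy; numpy's in-place operators dispatch similarly but through C-level ufunc machinery): a type that defines `__iadd__` lets `+=` mutate the existing buffer, so no new object is allocated.

```python
# Toy stand-in for an array type, counting how many buffers get allocated.
# (Hypothetical illustration; not numpy's actual implementation.)
class Buf:
    allocs = 0  # class-wide allocation counter

    def __init__(self, data):
        Buf.allocs += 1          # every new Buf "allocates" a buffer
        self.data = list(data)

    def __add__(self, other):
        # Out-of-place add: produces a fresh Buf for the result.
        return Buf(x + y for x, y in zip(self.data, other.data))

    def __iadd__(self, other):
        # In-place add: reuses self.data, so no new Buf is created.
        for i, y in enumerate(other.data):
            self.data[i] += y
        return self

a, b = Buf([1, 2, 3]), Buf([4, 5, 6])
before = Buf.allocs
a += b                            # dispatches to __iadd__: zero new allocations
assert Buf.allocs == before
c = a + b                         # dispatches to __add__: one new allocation
assert Buf.allocs == before + 1
assert c.data == [9, 12, 15]
```

So for a type with a true in-place `__iadd__`, the augmented-assignment spelling does elide the temporary; the residual difference discussed in the thread is only how numpy happens to implement its in-place operators internally.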
From njs at pobox.com Fri Jun 6 03:26:26 2014 From: njs at pobox.com (Nathaniel Smith) Date: Fri, 6 Jun 2014 02:26:26 +0100 Subject: [Python-Dev] [numpy wishlist] Interpreter support for temporary elision in third-party classes In-Reply-To: <8761kevnwh.fsf@vostro.rath.org> References: <8761kevnwh.fsf@vostro.rath.org> Message-ID: On 6 Jun 2014 02:16, "Nikolaus Rath" wrote: > > Nathaniel Smith writes: > > Such optimizations are important enough that numpy operations always > > give the option of explicitly specifying the output array (like > > in-place operators but more general and with clumsier syntax). Here's > > an example small-array benchmark that IIUC uses Jacobi iteration to > > solve Laplace's equation. It's been written in both natural and > > hand-optimized formats (compare "num_update" to "num_inplace"): > > > > https://yarikoptic.github.io/numpy-vbench/vb_vb_app.html#laplace-inplace > > > > num_inplace is totally unreadable, but because we've manually elided > > temporaries, it's 10-15% faster than num_update. > > Does it really have to be that ugly? Shouldn't using > > tmp += u[2:,1:-1] > tmp *= dy2 > > instead of > > np.add(tmp, u[2:,1:-1], out=tmp) > np.multiply(tmp, dy2, out=tmp) > > give the same performance? (yes, not as nice as what you're proposing, > but I'm still curious). Yes, only the last line actually requires the out= syntax, everything else could use in place operators instead (and automatic temporary elision wouldn't work for the last line anyway). I guess whoever wrote it did it that way for consistency (and perhaps in hopes of eking out a tiny bit more speed - in numpy currently the in-place operators are implemented by dispatching to function calls like those). Not sure how much difference it really makes in practice though. It'd still be 8 statements and two named temporaries to do the work of one infix expression, with order of operations implicit. -n -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From njs at pobox.com Fri Jun 6 03:47:50 2014 From: njs at pobox.com (Nathaniel Smith) Date: Fri, 6 Jun 2014 02:47:50 +0100 Subject: [Python-Dev] [numpy wishlist] Interpreter support for temporary elision in third-party classes In-Reply-To: References: Message-ID: On 5 Jun 2014 23:58, "Terry Reedy" wrote: > > On 6/5/2014 4:51 PM, Nathaniel Smith wrote: > >> In fact, AFAICT it's 100% correct for libraries being called by >> regular python code (which is why I'm able to quote benchmarks at you >> :-)). The bytecode eval loop always holds a reference to all operands, >> and then immediately DECREFs them after the operation completes. If >> one of our arguments has no other references besides this one, then we >> can be sure that it is a dead obj walking, and steal its corpse. >> >> But this has a fatal flaw: people are unreasonable creatures, and >> sometimes they call Python libraries without going through ceval.c >> :-(. It's legal for random C code to hold an array object with a >> single reference count, and then call PyNumber_Add on it, and then >> expect the original array object to still be valid. But who writes >> code like that in practice? Well, Cython does. So, this is no-go. > > > I understand that a lot of numpy/scipy code is compiled with Cython, so you really want the optimization to continue working when so compiled. Is there a simple change to Cython that would work, perhaps in coordination with a change to numpy? If so, you could get the result before 3.5 comes out. Unfortunately we don't actually know whether Cython is the only culprit (such code *could* be written by hand), and even if we fixed Cython it would take some unknowable amount of time before all downstream users upgraded their Cythons. (It's pretty common for projects to check in Cython-generated .c files, and only regenerate when the Cython source actually gets modified.) Pretty risky for an optimization.
> I realized that there are other compilers than Cython and non-numpy code that could benefit, so that a more generic solution would also be good. In particular > > > Here's the idea. Take an innocuous expression like: > > > > result = (a + b + c) / c > > > > This gets evaluated as: > > > > tmp1 = a + b > > tmp2 = tmp1 + c > > result = tmp2 / c > ... > > > There's an obvious missed optimization in this code, though, which is > > that it keeps allocating new temporaries and throwing away old ones. > > It would be better to just allocate a temporary once and re-use it: > > tmp1 = a + b > > tmp1 += c > > tmp1 /= c > > result = tmp1 > > Could this transformation be done in the ast? And would that help? I don't think it could be done in the ast because I don't think you can work with anonymous temporaries there. But, now that you mention it, it could be done on the fly in the implementation of the relevant opcodes. I.e., BIN_ADD could do if (Py_REFCNT(left) == 1) result = PyNumber_InPlaceAdd(left, right); else result = PyNumber_Add(left, right) Upside: all packages automagically benefit! Potential downsides to consider: - Subtle but real and user-visible change in Python semantics. I'd be a little nervous about whether anyone has implemented, say, an iadd with side effects such that you can tell whether a copy was made, even if the object being copied is immediately destroyed. Maybe this doesn't make sense though. - Only works when left operand is the temporary ("remember that a*b+c is faster than c+a*b"), and only for arithmetic (no benefit for np.sin(a + b)). Probably does cover the majority of cases though. > A prolonged discussion might be better on python-ideas. See what others say. Yeah, I wasn't sure which list to use for this one, happy to move if it would work better. -n -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From rosuav at gmail.com Fri Jun 6 03:51:13 2014 From: rosuav at gmail.com (Chris Angelico) Date: Fri, 6 Jun 2014 11:51:13 +1000 Subject: [Python-Dev] [numpy wishlist] Interpreter support for temporary elision in third-party classes In-Reply-To: References: Message-ID: On Fri, Jun 6, 2014 at 11:47 AM, Nathaniel Smith wrote: > Unfortunately we don't actually know whether Cython is the only culprit > (such code *could* be written by hand), and even if we fixed Cython it would > take some unknowable amount of time before all downstream users upgraded > their Cythons. (It's pretty common for projects to check in Cython-generated > .c files, and only regenerate when the Cython source actually gets > modified.) Pretty risky for an optimization. But code will still work, right? I mean, you miss out on an optimization, but it won't actually be wrong code? It should be possible to say "After upgrading to Cython version x.y, regenerate all your .c files to take advantage of this new optimization". ChrisA From greg.ewing at canterbury.ac.nz Fri Jun 6 04:17:20 2014 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 06 Jun 2014 14:17:20 +1200 Subject: [Python-Dev] [numpy wishlist] Interpreter support for temporary elision in third-party classes In-Reply-To: References: Message-ID: <539124B0.1070701@canterbury.ac.nz> Nathaniel Smith wrote: > I.e., BIN_ADD could do > > if (Py_REFCNT(left) == 1) > result = PyNumber_InPlaceAdd(left, right); > else > result = PyNumber_Add(left, right) > > Upside: all packages automagically benefit! > > Potential downsides to consider: > - Subtle but real and user-visible change in Python semantics. That would be a real worry. Even if such cases were rare, they'd be damnably difficult to debug when they did occur. I think for safety's sake this should only be done if the type concerned opts in somehow, perhaps by a tp_flag indicating that the type is eligible for temporary elision. 
-- Greg From sturla.molden at gmail.com Fri Jun 6 04:18:05 2014 From: sturla.molden at gmail.com (Sturla Molden) Date: Fri, 06 Jun 2014 04:18:05 +0200 Subject: [Python-Dev] [numpy wishlist] Interpreter support for temporary elision in third-party classes In-Reply-To: References: Message-ID: On 05/06/14 22:51, Nathaniel Smith wrote: > This gets evaluated as: > > tmp1 = a + b > tmp2 = tmp1 + c > result = tmp2 / c > > All these temporaries are very expensive. Suppose that a, b, c are > arrays with N bytes each, and N is large. For simple arithmetic like > this, then costs are dominated by memory access. Allocating an N byte > array requires the kernel to clear the memory, which incurs N bytes of > memory traffic. It seems to be the case that a large portion of the run-time in Python code using NumPy can be spent in the kernel zeroing pages (which the kernel does for security reasons). I think this can also be seen as a 'malloc problem'. It comes about because each new NumPy array starts with a fresh buffer allocated by malloc. Perhaps buffers can be reused? Sturla From greg.ewing at canterbury.ac.nz Fri Jun 6 04:26:35 2014 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 06 Jun 2014 14:26:35 +1200 Subject: [Python-Dev] [numpy wishlist] Interpreter support for temporary elision in third-party classes In-Reply-To: References: Message-ID: <539126DB.8010306@canterbury.ac.nz> Nathaniel Smith wrote: > I'd be a > little nervous about whether anyone has implemented, say, an iadd with > side effects such that you can tell whether a copy was made, even if the > object being copied is immediately destroyed. I can think of at least one plausible scenario where this could occur: the operand is a view object that wraps another object, and its __iadd__ method updates that other object. In fact, now that I think about it, exactly this kind of thing happens in numpy when you slice an array! 
So the opt-in indicator would need to be dynamic, on a per-object basis, rather than a type flag. -- Greg From Nikolaus at rath.org Fri Jun 6 04:27:20 2014 From: Nikolaus at rath.org (Nikolaus Rath) Date: Thu, 05 Jun 2014 19:27:20 -0700 Subject: [Python-Dev] [numpy wishlist] Interpreter support for temporary elision in third-party classes In-Reply-To: (Nathaniel Smith's message of "Fri, 6 Jun 2014 02:47:50 +0100") References: Message-ID: <87tx7yu60n.fsf@vostro.rath.org> Nathaniel Smith writes: >> > tmp1 = a + b >> > tmp1 += c >> > tmp1 /= c >> > result = tmp1 >> >> Could this transformation be done in the ast? And would that help? > > I don't think it could be done in the ast because I don't think you can > work with anonymous temporaries there. But, now that you mention it, it > could be done on the fly in the implementation of the relevant opcodes. > I.e., BIN_ADD could do > > if (Py_REFCNT(left) == 1) > result = PyNumber_InPlaceAdd(left, right); > else > result = PyNumber_Add(left, right) > > Upside: all packages automagically benefit! > > Potential downsides to consider: > - Subtle but real and user-visible change in Python semantics. I'd be a > little nervous about whether anyone has implemented, say, an iadd with side > effects such that you can tell whether a copy was made, even if the object > being copied is immediately destroyed. Maybe this doesn't make sense > though. Hmm. I don't think this is as unlikely as it may sound. Consider eg the h5py module: with h5py.File('database.h5') as fh: result = fh['key'] + np.ones(42) if this were transformed to with h5py.File('database.h5') as fh: tmp = fh['key'] tmp += np.ones(42) result = tmp then the database.h5 file would get modified, *and* result would be of type h5py.Dataset rather than np.array. Best, -Nikolaus -- GPG encrypted emails preferred. Key id: 0xD113FCAC3C4E599F Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F "Time flies like an arrow, fruit flies like a Banana."
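The hazard Nikolaus describes can be reduced to a few lines of pure Python. The `Dataset` class below is a hypothetical stand-in for something like h5py (not the real library — the side effect is a write counter instead of file I/O): any type whose `__iadd__` has an observable side effect, or returns a different type than `__add__`, can tell whether `x + y` was silently rewritten as `x += y`.

```python
# Hypothetical stand-in for an h5py-like type whose __iadd__ has a
# visible side effect (a write counter standing in for file writes).
class Dataset:
    def __init__(self, data):
        self.data = list(data)
        self.writes = 0

    def __add__(self, other):
        # Out-of-place: returns a plain list, the way h5py addition
        # yields an ndarray rather than another Dataset.
        return [x + y for x, y in zip(self.data, other)]

    def __iadd__(self, other):
        # In-place: mutates the underlying storage ("writes to the file").
        self.writes += 1
        self.data = [x + y for x, y in zip(self.data, other)]
        return self

ds = Dataset([1, 2, 3])
result = ds + [10, 10, 10]       # out-of-place: new list, no write happens
assert result == [11, 12, 13]
assert ds.writes == 0 and ds.data == [1, 2, 3]

tmp = ds
tmp += [10, 10, 10]              # the "elided" rewrite: mutates ds itself
assert tmp is ds and ds.writes == 1 and ds.data == [11, 12, 13]
```

Because `result` differs in type and `ds.writes` changes, an interpreter-level refcount trick that substituted the in-place slot for the binary one would be user-visible here — which is why the thread converges on an explicit opt-in rather than applying the rewrite blindly.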
From greg.ewing at canterbury.ac.nz Fri Jun 6 01:06:57 2014 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 06 Jun 2014 11:06:57 +1200 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: <20140605150121.286032df@x34f> References: <20140604011718.GD10355@ando> <20140604183831.7226448c@x34f> <20140604200520.1d432329@x34f> <538FB4F5.9070500@canterbury.ac.nz> <20140605041913.14886264@x34f> <87oay73i70.fsf@uwakimon.sk.tsukuba.ac.jp> <20140605142528.39e0e5fc@x34f> <20140605150121.286032df@x34f> Message-ID: <5390F811.90300@canterbury.ac.nz> Paul Sokolovsky wrote: > All these changes are what let me dream on and speculate on > possibility that Python4 could offer an encoding-neutral string type > (which means based on bytes) Can you elaborate on exactly what you have in mind? You seem to want something different from Python 3 str, Python 3 bytes and Python 2 str, but it's far from clear what you want this type to be like. -- Greg From jimjjewett at gmail.com Fri Jun 6 05:54:55 2014 From: jimjjewett at gmail.com (Jim J. Jewett) Date: Thu, 05 Jun 2014 20:54:55 -0700 (PDT) Subject: [Python-Dev] Internal representation of strings and Micropython (Steven D'Aprano's summary) In-Reply-To: <20140605132312.GK10355@ando> Message-ID: <53913b8f.4d16e00a.2ba2.44f0@mx.google.com> Steven D'Aprano wrote: > (1) I asked if it would be okay for MicroPython to *optionally* use > nominally Unicode strings limited to ASCII. Pretty much the only > response to this as been Guido saying "That would be a pretty lousy > option", and since nobody has really defended the suggestion, I think we > can assume that it's off the table. Lousy is not quite the same as forbidden. Doing it in good faith would require making the limit prominent in the documentation, and raising some sort of CharacterNotSupported exception (or at least a warning) whenever there is an attempt to create a non-ASCII string, even via the C API. > (2) I asked if it would be okay ... 
to use an UTF-8 implementation > even though it would lead to O(N) indexing operations instead of O(1). > There's been some opposition to this, including Guido's: [Non-ASCII character removed.] It is bad when quirks -- even good quirks -- of one implementation lead people to write code that will perform badly on a different Python implementation. Cpython has at least delayed obvious optimizations for this reason. Changing idiomatic operations from O(1) to O(N) is big enough to cause a concern. That said, the target environment itself apparently limits N to small enough that the problem should be mostly theoretical. If you want to be good citizens, then do put a note in the documentation warning that particularly long strings are likely to cause performance issues unique to the MicroPython implementation. (Frankly, my personal opinion is that if you're really optimizing for space, then long strings will start getting awkward long before N is big enough for algorithmic complexity to overcome constant factors.) > ... those strings will need to be transcoded to UTF-8 before they > can be written or printed, so keeping them as UTF-8 ... That all assumes that the external world is using UTF-8 anyhow. Which is more likely to be true if you document it as a limitation of MicroPython. > ... but many strings may never be written out: print(prefix + s[1:].strip().lower().center(80) + suffix) > creates five strings that are never written out and one that is. But looking at the actual strings -- UTF-8 doesn't really hurt much. Only the slice and center() are more complex, and for a string less than 80 characters long, O(N) is irrelevant. -jJ -- If there are still threading problems with my replies, please email me with details, so that I can try to resolve them. 
-jJ From steve at pearwood.info Fri Jun 6 08:37:57 2014 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 6 Jun 2014 16:37:57 +1000 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: <5391107F.1010500@canterbury.ac.nz> References: <20140604011718.GD10355@ando> <20140605132312.GK10355@ando> <5391107F.1010500@canterbury.ac.nz> Message-ID: <20140606063757.GM10355@ando> On Fri, Jun 06, 2014 at 12:51:11PM +1200, Greg Ewing wrote: > Steven D'Aprano wrote: > >(1) I asked if it would be okay for MicroPython to *optionally* use > >nominally Unicode strings limited to ASCII. Pretty much the only > >response to this as been Guido saying "That would be a pretty lousy > >option", > > It would be limiting to have this as the *only* way of > dealing with unicode, but I don't see anything wrong with > having this available as an option for applications that > truly don't need anything more than ascii. There must be > plenty of those; the controller that runs my car engine, > for example, doesn't exchange text with the outside world > at all. I don't know about car engine controllers, but presumably they have diagnostic ports, and they may sometimes output text. If they output text, then at least hypothetically car mechanics in Russia might prefer their car to output "??????" and "??????" rather than "true" and "false". I think that opportunities for ASCII-only optimizations are shrinking, not getting bigger, as more people come to expect that their computing devices speak their language rather than Foreign. > >The > >rationale of internal UTF-8 is that the use of any other encoding > >internally will be inefficient since those strings will need to be > >transcoded to UTF-8 before they can be written or printed, > > No, I think the rationale is that UTF-8 is likely to use > less memory than UTF-16 or UTF-32. Right. I was talking about memory efficiency. 
Instead of this, which requires two copies of the string at one time: 1) accept UTF-8 bytes 2) transcode to internal representation 3) discard UTF-8 bytes you could have: 1) accept UTF-8 bytes and be done. -- Steve From breamoreboy at yahoo.co.uk Fri Jun 6 10:32:25 2014 From: breamoreboy at yahoo.co.uk (Mark Lawrence) Date: Fri, 06 Jun 2014 09:32:25 +0100 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: References: <20140604011718.GD10355@ando> <20140604174930.3a5af45f@x34f> <02b9a61658c04b11a21317da5b78bad6@BLUPR03MB389.namprd03.prod.outlook.com> Message-ID: On 04/06/2014 16:52, Mark Lawrence wrote: > On 04/06/2014 16:32, Steve Dower wrote: >> >> If copying into a separate list is a problem (memory-wise), >> re.finditer('\\S+', string) also provides the same behaviour and gives >> me the sliced string, so there's no need to index for anything. >> > > Out of idle curiosity is there anything that stops MicroPython, or any > other implementation for that matter, from providing views of a string > rather than copying every time? IIRC memoryviews in CPython rely on the > buffer protocol at the C API level, so since strings don't support this > protocol you can't take a memoryview of them. Could this actually be > implemented in the future, is the underlying C code just too > complicated, or what? > Anybody? -- My fellow Pythonistas, ask not what our language can do for you, ask what you can do for our language. Mark Lawrence
From jtaylor.debian at googlemail.com Fri Jun 6 10:01:17 2014 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Fri, 06 Jun 2014 10:01:17 +0200 Subject: [Python-Dev] [numpy wishlist] Interpreter support for temporary elision in third-party classes In-Reply-To: References: Message-ID: <5391754D.8000607@googlemail.com> On 06.06.2014 04:18, Sturla Molden wrote: > On 05/06/14 22:51, Nathaniel Smith wrote: > >> This gets evaluated as: >> >> tmp1 = a + b >> tmp2 = tmp1 + c >> result = tmp2 / c >> >> All these temporaries are very expensive. Suppose that a, b, c are >> arrays with N bytes each, and N is large. For simple arithmetic like >> this, then costs are dominated by memory access. Allocating an N byte >> array requires the kernel to clear the memory, which incurs N bytes of >> memory traffic. > > It seems to be the case that a large portion of the run-time in Python > code using NumPy can be spent in the kernel zeroing pages (which the > kernel does for security reasons). > > I think this can also be seen as a 'malloc problem'. It comes about > because each new NumPy array starts with a fresh buffer allocated by > malloc. Perhaps buffers can be reused? > > Sturla > > Caching memory inside of numpy would indeed solve this issue too. There has even been a paper written on this which contains some more serious benchmarks than the laplace case which runs on very old hardware (and the inplace and out of place cases are actually not the same, one computes array/scalar the other array * (1 / scalar)): hiperfit.dk/pdf/Doubling.pdf "The result is an improvement of as much as 2.29 times speedup, on average 1.32 times speedup across a benchmark suite of 15 applications" The problem with this approach is that it is already difficult enough to handle memory in numpy. Having a cache that potentially stores gigabytes of memory out of the users sight will just make things worse.
This would not be needed if we can come up with a way on how python can help out numpy in eliding the temporaries. From hrvoje.niksic at avl.com Fri Jun 6 10:53:50 2014 From: hrvoje.niksic at avl.com (Hrvoje Niksic) Date: Fri, 6 Jun 2014 10:53:50 +0200 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: References: <20140604011718.GD10355@ando> <20140604174930.3a5af45f@x34f> <02b9a61658c04b11a21317da5b78bad6@BLUPR03MB389.namprd03.prod.outlook.com> Message-ID: <5391819E.3060300@avl.com> On 06/04/2014 05:52 PM, Mark Lawrence wrote: > On 04/06/2014 16:32, Steve Dower wrote: >> >> If copying into a separate list is a problem (memory-wise), re.finditer('\\S+', string) also provides the same behaviour and gives me the sliced string, so there's no need to index for anything. >> > > Out of idle curiosity is there anything that stops MicroPython, or any > other implementation for that matter, from providing views of a string > rather than copying every time? IIRC memoryviews in CPython rely on the > buffer protocol at the C API level, so since strings don't support this > protocol you can't take a memoryview of them. Could this actually be > implemented in the future, is the underlying C code just too > complicated, or what? > Memory view of Unicode strings is controversial for two reasons: 1. It exposes the internal representation of the string. If memoryviews of strings were supported in Python 3, PEP 393 would not have been possible (without breaking that feature). 2. Even if it were OK to expose the internal representation, it might not be what the users expect. For example, memoryview("Hrvoje") would return a view of a 6-byte buffer, while memoryview("Nikšić") would return a view of a 12-byte UCS-2 buffer. The user of a memory view might expect to get UCS-2 (or UCS-4, or even UTF-8) in all cases.
An implementation that decided to export strings as memory views might be forced to make a decision about internal representation of strings, and then stick to it. The byte objects don't have these issues, which is why in Python 2.7 memoryview("foo") works just fine, as does memoryview(b"foo") in Python 3. From pmiscml at gmail.com Fri Jun 6 11:13:06 2014 From: pmiscml at gmail.com (Paul Sokolovsky) Date: Fri, 6 Jun 2014 12:13:06 +0300 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: References: <20140604011718.GD10355@ando> <20140604183831.7226448c@x34f> <20140604200520.1d432329@x34f> <538FB4F5.9070500@canterbury.ac.nz> <20140605041913.14886264@x34f> <87oay73i70.fsf@uwakimon.sk.tsukuba.ac.jp> <20140605142528.39e0e5fc@x34f> <20140605150121.286032df@x34f> Message-ID: <20140606121306.06783df6@x34f> Hello, On Thu, 5 Jun 2014 22:21:30 +1000 Tim Delaney wrote: > On 5 June 2014 22:01, Paul Sokolovsky wrote: > > > > > All these changes are what let me dream on and speculate on > > possibility that Python4 could offer an encoding-neutral string type > > (which means based on bytes) > > > > To me, an "encoding neutral string type" means roughly "characters are > atomic", and the best representation we have for a "character" is a And for me it means exactly what "encoding neutral string type" moniker promises - that you should not make any assumption about its encoding. That kinda means "string is atomic", instead of your "characters are atomic". That's the most basic level, and you can write a big enough set of applications using it - for example, get some information from user, store in database, then show back to user at later time. 
[] > > Cheers, > > Tim Delaney -- Best regards, Paul mailto:pmiscml at gmail.com From victor.stinner at gmail.com Fri Jun 6 11:31:23 2014 From: victor.stinner at gmail.com (Victor Stinner) Date: Fri, 6 Jun 2014 11:31:23 +0200 Subject: [Python-Dev] asyncio/Tulip: use CPython as the new upstream Message-ID: Hi, I added a new BaseEventLoop.is_closed() method to Tulip and Python 3.5 to fix an issue (see Tulip issue 169 for the detail). The problem is that I don't want to add this method to Python 3.4 because usually we don't add new methods in minor versions of Python (future version 3.4.2 in this case). Guido just wrote in the issue: "Actually for asyncio we have special dispensation to push new features to minor releases (until 3.5). Please push to 3.4 so the source code is the same everywhere (except selectors.py, which is not covered by the exception)." I disagree with Guido. I would prefer to start to maintain a different branch for Python 3.4, because I consider that only bugfixes should be applied to Python 3.4. It's not the first change that cannot be applied on Python 3.4 (only in Tulip and Python 3.5): the selectors module now also supports devpoll on Solaris. It's annoying because the Tulip script "update_stdlib.sh" used to synchronize Tulip and Python wants to replace Lib/selectors.py in Python 3.4. I have to revert the change each time. I propose a new workflow: use Python default (future version 3.5) as the new asyncio "upstream". Bugfixes would be applied as other Python bugfixes: first in Python 3.4, then in Python 3.5. The "update_stdlib.sh" script of Tulip should be modified to copy files from Python default to Tulip (opposite of the current direction). Workflow: New feature: Python 3.5 => Tulip => Trollius Bugfix: Python 3.4 => Python 3.5 => Tulip => Trollius I don't think that Tulip should have minor releases just for bugfixes, it would be a pain to maintain. Tulip is a third party module, it doesn't have the same constraints as the Python stdlib.
What do you think? Victor From pmiscml at gmail.com Fri Jun 6 12:15:31 2014 From: pmiscml at gmail.com (Paul Sokolovsky) Date: Fri, 6 Jun 2014 13:15:31 +0300 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: References: <20140604183831.7226448c@x34f> <20140604200520.1d432329@x34f> <538FB4F5.9070500@canterbury.ac.nz> <20140605041913.14886264@x34f> <87oay73i70.fsf@uwakimon.sk.tsukuba.ac.jp> <20140605142528.39e0e5fc@x34f> <20140605121054.GA348@sleipnir.bytereef.org> Message-ID: <20140606131531.2f8431c1@x34f> Hello, On Thu, 5 Jun 2014 22:38:13 +1000 Nick Coghlan wrote: > On 5 June 2014 22:10, Stefan Krah wrote: > > Paul Sokolovsky wrote: > >> In this regard, I'm glad to participate in mind-resetting > >> discussion. So, let's reiterate - there's nothing like "the best", > >> "the only right", "the only correct", "righter than", "more > >> correct than" in CPython's implementation of Unicode storage. It > >> is *arbitrary*. Well, sure, it's not arbitrary, but based on > >> requirements, and these requirements match CPython's (implied) > >> usage model well enough. But among all possible sets of > >> requirements, CPython's requirements are no more valid that other > >> possible. And other set of requirement fairly clearly lead to > >> situation where CPython implementation is rejected as not correct > >> for those requirements at all. > > > > Several core-devs have said that using UTF-8 for MicroPython is > > perfectly okay. I also think it's the right choice and I hope that > > you guys come up with a very efficient implementation. > > Based on this discussion , I've also posted a draft patch aimed at > clarifying the relevant aspects of the data model section of the > language reference (http://bugs.python.org/issue21667). Thanks, it's very much appreciated. Though, the discussion there opened another can of worms. 
I'm sorry if I was somehow related to that; my bringing in the formal language spec was more a rhetorical figure, a response to people claiming an O(1) requirement. So, it should either be in the spec, or the spec should be treated as such - something not specified is underspecified and implementation-dependent. I'm glad that the last point is now explicitly pronounced by the BDFL in the last comment of that ticket (http://bugs.python.org/issue21667#msg219824) > > Cheers, > Nick. > > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/pmiscml%40gmail.com -- Best regards, Paul mailto:pmiscml at gmail.com From greg.ewing at canterbury.ac.nz Fri Jun 6 12:48:59 2014 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 06 Jun 2014 22:48:59 +1200 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: <20140606063757.GM10355@ando> References: <20140604011718.GD10355@ando> <20140605132312.GK10355@ando> <5391107F.1010500@canterbury.ac.nz> <20140606063757.GM10355@ando> Message-ID: <53919C9B.6000100@canterbury.ac.nz> Steven D'Aprano wrote: > I don't know about car engine controllers, but presumably they have > diagnostic ports, and they may sometimes output text. If they output > text, then at least hypothetically car mechanics in Russia might prefer > their car to output "??????" and "??????" rather than "true" and > "false". From a bit of googling, it seems that engine controller diagnostic ports typically speak some kind of binary protocol. So it would be up to the software running on whatever was plugged into the port to display the information in the user's native language. E.g.
this document lists a big pile of hex byte values and little or no text that I can see: https://law.resource.org/pub/us/cfr/ibr/005/sae.j1979.2002.pdf -- Greg From rdmurray at bitdance.com Fri Jun 6 13:00:40 2014 From: rdmurray at bitdance.com (R. David Murray) Date: Fri, 06 Jun 2014 07:00:40 -0400 Subject: [Python-Dev] asyncio/Tulip: use CPython as the new upstream In-Reply-To: References: Message-ID: <20140606110041.08C03250DE6@webabinitio.net> On Fri, 06 Jun 2014 11:31:23 +0200, Victor Stinner wrote: > Hi, > > I added a new BaseEventLoop.is_closed() method to Tulip and Python 3.5 > to fix an issue (see Tulip issue 169 for the detail). The problem is > that I don't want to add this method to Python 3.4 because usually we > don't add new methods in minor versions of Python (future version > 3.4.2 in this case). > > Guido just wrote in the issue: "Actually for asyncio we have special > dispensation to push new features to minor releases (until 3.5). > Please push to 3.4 so the source code is the same everywhere (except > selectors.py, which is not covered by the exception)." > > I disagree with Guido. I would prefer to start to maintain a different > branch for Python 3.4, because I consider that only bugfixes should be > applied to Python 3.4. > > It's not the first change that cannot be applied on Python 3.4 (only > in Tulip and Python 3.5): the selectors module now also supports > devpoll on Solaris. It's annoying because the Tulip script > "update_stdlib.sh" used to synchronize Tulip and Python wants to > replace Lib/selectors.py in Python 3.4. I have to revert the change each time. > > I propose a new workflow: use Python default (future version 3.5) as > the new asyncio "upstream". Bugfixes would be applied as other Python > bugfixes: first in Python 3.4, than in Python 3.5. The > "update_stdlib.sh" script of Tulip should be modified to copy files > from Python default to Tulip (opposite of the current direction). 
> > Workflow: > > New feature: Python 3.5 => Tulip => Trollius > Bugfix: Python 3.4 => Python 3.5 => Tulip => Trollius > > I don't think that Tulip should have minor release just for bugfixes, > it would be a pain to maintain. Tulip is a third party module, it > doesn't have the same constraints than Python stdlib. > > What do you think? I don't have any opinion on the workflow. My understanding is that part of the purpose of the "provisional" designation is to allow faster evolution (read: fixing) of an API before the library becomes non-provisional. Thus I agree with Guido here, and will be doing something similar with at least one of the minor provisional email API features in 3.4.2 (unless I miss the cutoff again ... :( --David From ncoghlan at gmail.com Fri Jun 6 13:10:49 2014 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 6 Jun 2014 21:10:49 +1000 Subject: [Python-Dev] asyncio/Tulip: use CPython as the new upstream In-Reply-To: References: Message-ID: On 6 June 2014 19:31, Victor Stinner wrote: > Guido just wrote in the issue: "Actually for asyncio we have special > dispensation to push new features to minor releases (until 3.5). > Please push to 3.4 so the source code is the same everywhere (except > selectors.py, which is not covered by the exception)." > > I disagree with Guido. I would prefer to start to maintain a different > branch for Python 3.4, because I consider that only bugfixes should be > applied to Python 3.4. This is why PEP 411 was thrashed out: to let us split the dates of "make broadly available in the standard library" and "get ultra conservative with API changes". asyncio was added as a provisional module, so it can still get new features in 3.4.x maintenance releases - that's a far more minor change than the backwards compatibility breaks permitted by the PEP. The difference with selectors is that it was *not* added as a provisional module - it's subject to all the normal stability requirements. Cheers, Nick. 
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From stephen at xemacs.org Fri Jun 6 13:11:27 2014 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Fri, 06 Jun 2014 20:11:27 +0900 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: <20140606121306.06783df6@x34f> References: <20140604011718.GD10355@ando> <20140604183831.7226448c@x34f> <20140604200520.1d432329@x34f> <538FB4F5.9070500@canterbury.ac.nz> <20140605041913.14886264@x34f> <87oay73i70.fsf@uwakimon.sk.tsukuba.ac.jp> <20140605142528.39e0e5fc@x34f> <20140605150121.286032df@x34f> <20140606121306.06783df6@x34f> Message-ID: <8761ke2syo.fsf@uwakimon.sk.tsukuba.ac.jp> Paul Sokolovsky writes: > That kinda means "string is atomic", instead of your "characters are > atomic". I would be very surprised if a language that behaved that way was called a "Python subset". No indexing, no slicing, no regexps, no .split(), no .startswith(), no sorted() or .sort(), ...!? If that's not what you mean by "string is atomic", I think you're using very confusing terminology. 
From pmiscml at gmail.com Fri Jun 6 13:15:35 2014 From: pmiscml at gmail.com (Paul Sokolovsky) Date: Fri, 6 Jun 2014 14:15:35 +0300 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: References: <20140604011718.GD10355@ando> <20140604183831.7226448c@x34f> <20140604200520.1d432329@x34f> <538FB4F5.9070500@canterbury.ac.nz> <20140605041913.14886264@x34f> <87oay73i70.fsf@uwakimon.sk.tsukuba.ac.jp> <20140605142528.39e0e5fc@x34f> <20140605150121.286032df@x34f> <20140605153708.7f27412e@x34f> Message-ID: <20140606141535.69e8bab0@x34f> Hello, On Thu, 5 Jun 2014 23:15:54 +1000 Nick Coghlan wrote: > On 5 June 2014 22:37, Paul Sokolovsky wrote: > > On Thu, 5 Jun 2014 22:20:04 +1000 > > Nick Coghlan wrote: > >> problems caused by trusting the locale encoding to be correct, but > >> the startup code will need non-trivial changes for that to happen > >> - the C.UTF-8 locale may even become widespread before we get > >> there). > > > > ... And until those golden times come, it would be nice if Python > > did not force its perfect world model, which unfortunately is not > > based on surrounding reality, and let users solve their encoding > > problems themselves - when they need, because again, one can go > > quite a long way without dealing with encodings at all. Whereas now > > Python3 forces users to deal with encoding almost universally, but > > forcing a particular for all strings (which is again, doesn't > > correspond to the state of surrounding reality). I already hear > > response that it's good that users taught to deal with encoding, > > that will make them write correct programs, but that's a bit far > > away from the original aim of making it write "correct" programs > > easy and pleasant. (And definition of "correct" vary.) 
> > As I've said before in other contexts, find me Windows, Mac OS X and > JVM developers, or educators and scientists that are as concerned by > the text model changes as folks that are primarily focused on Linux > system (including network) programming, and I'll be more willing to > concede the point. Well, but this question reduces to finding out (or specifying) who the target audiences of Python are. It has always been (with a bow to Guido) an outpost of scientific users (and it will probably remain prominent in that role even if there were a mass exodus of other categories of users). But Python has always had its share as a system scripting language among Perl-haters, and with Perl going flatline, I guess it's fair to say that Python is a major system scripting and service implementation language. To whom do all features like memoryview, array.array, in-place input operations, etc. cater? To scientists? I'm sure most of them are just happy with stuffing "@jit" on their kernel functions. And scientists who bother with memoryviews for their data structures are system-level-ish programmers too. So, no wonder that the Linux crowd cries at Python3 - it makes doing simple things unnecessarily complicated. > Windows, Mac OS X, and the JVM are all opinionated about the text > encodings to be used at platform boundaries (using UTF-16, UTF-8 and > UTF-16, respectively). By contrast, Linux (or, more accurately, POSIX) > says "well, it's configurable, but we won't provide a reliable > mechanism for finding out what the encoding is. So either guess as [] Yes, I understand the complexity of developing a cross-platform language with advanced features. But I may offer another look at all this activity: Python3 was brave enough to stage a revolution in its own world (catching a lot of its users by surprise), but surely not brave enough to stage a revolution around itself, by saying something like "We choose ONE, the most right, and even the most used (per bytes transferred) encoding as our standard I/O encoding.
Grow up or explicitly specify the encoding which you personally need.". Surely, it didn't do that - it makes no sense to fight the world. But then Python3 is sympathetic about Java's desire to use "UTF-16" instead of the "right" encoding, and not so about the Unix desire to treat encodings as a level separate from content (and to treat Unicode as nothing else than yet another arbitrary encoding, which it is formally, and will be for a long time de facto, however sad it is). So, maybe "cross-platform" should have meant "don't do implicit conversions". Because see, Python2 had a problem with implicit encoding conversion when str and unicode objects were mixed, and Python3 has a problem with implicit conversions whenever str is used at all. Anyway, I appreciate the detailed responses, and understand what you (Python3 developers) are trying to achieve, and appreciate your work, and hope it all works out. Each user has their own concerns about Unicode. Mine are efficiency and layering. But once MicroPython has UTF-8 support I will be much more relaxed about it. Layering is harder to accept, but hopefully can be tackled too, on both the mental and technical sides. I hope other users will find their peace with Unicode too! [] -- Best regards, Paul mailto:pmiscml at gmail.com From pmiscml at gmail.com Fri Jun 6 13:34:01 2014 From: pmiscml at gmail.com (Paul Sokolovsky) Date: Fri, 6 Jun 2014 14:34:01 +0300 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: <8761ke2syo.fsf@uwakimon.sk.tsukuba.ac.jp> References: <20140604011718.GD10355@ando> <20140604183831.7226448c@x34f> <20140604200520.1d432329@x34f> <538FB4F5.9070500@canterbury.ac.nz> <20140605041913.14886264@x34f> <87oay73i70.fsf@uwakimon.sk.tsukuba.ac.jp> <20140605142528.39e0e5fc@x34f> <20140605150121.286032df@x34f> <20140606121306.06783df6@x34f> <8761ke2syo.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <20140606143401.79a7b0ee@x34f> Hello, On Fri, 06 Jun 2014 20:11:27 +0900 "Stephen J.
Turnbull" wrote: > Paul Sokolovsky writes: > > > That kinda means "string is atomic", instead of your "characters > > are atomic". > > I would be very surprised if a language that behaved that way was > called a "Python subset". No indexing, no slicing, no regexps, no > .split(), no .startswith(), no sorted() or .sort(), ...!? > > If that's not what you mean by "string is atomic", I think you're > using very confusing terminology. I'm sorry if I didn't mention it, or didn't make it clear enough - it's all about layering. On level 0, you treat strings verbatim, and can write some subset of apps (my point is that even this level allows to write lot enough apps). Let's call this set A0. On level 1, you accept that there's some universal enough conventions for some chars, like space or newline. And you can write set of apps A1 > A0. On level 2, you add len(), and - oh magic - you now can center a string within fixed-size field, something you probably to as often as once a month, so hopefully that will keep you busy for few. On level 3, it indeed starts to smell Unicode, we get isdigit(), isalpha(), which require long boring tables, which hopefully can be compressed enough to fit in your pocket. On level 4, it's pumping up, with tolower() and friends, tables for which you carry around in suitcase. On level 5, everything is Unicode, what a bliss! You can even start pretending that no other levels exist (God created Unicode on a second day). On level 6, there're mind-boggling, ugly manual-use utilities to deal with internals of "magic" "working on its own for everyone" encoding to deal with stuff like code-point vs charecters vs surrogate pair vs grapheme separation, etc. So, once again, for me and some other people, it's not that bright idea to shoot for level 5 if levels 0-4 exist and well-proven pragmatic model. And level 6 is still there anyway. 
-- Best regards, Paul mailto:pmiscml at gmail.com From ncoghlan at gmail.com Fri Jun 6 13:35:49 2014 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 6 Jun 2014 21:35:49 +1000 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: <20140606141535.69e8bab0@x34f> References: <20140604011718.GD10355@ando> <20140604183831.7226448c@x34f> <20140604200520.1d432329@x34f> <538FB4F5.9070500@canterbury.ac.nz> <20140605041913.14886264@x34f> <87oay73i70.fsf@uwakimon.sk.tsukuba.ac.jp> <20140605142528.39e0e5fc@x34f> <20140605150121.286032df@x34f> <20140605153708.7f27412e@x34f> <20140606141535.69e8bab0@x34f> Message-ID: On 6 June 2014 21:15, Paul Sokolovsky wrote: > Hello, > > On Thu, 5 Jun 2014 23:15:54 +1000 > Nick Coghlan wrote: > >> On 5 June 2014 22:37, Paul Sokolovsky wrote: >> > On Thu, 5 Jun 2014 22:20:04 +1000 >> > Nick Coghlan wrote: >> >> problems caused by trusting the locale encoding to be correct, but >> >> the startup code will need non-trivial changes for that to happen >> >> - the C.UTF-8 locale may even become widespread before we get >> >> there). >> > >> > ... And until those golden times come, it would be nice if Python >> > did not force its perfect world model, which unfortunately is not >> > based on surrounding reality, and let users solve their encoding >> > problems themselves - when they need, because again, one can go >> > quite a long way without dealing with encodings at all. Whereas now >> > Python3 forces users to deal with encoding almost universally, but >> > forcing a particular for all strings (which is again, doesn't >> > correspond to the state of surrounding reality). I already hear >> > response that it's good that users taught to deal with encoding, >> > that will make them write correct programs, but that's a bit far >> > away from the original aim of making it write "correct" programs >> > easy and pleasant. (And definition of "correct" vary.) 
>> >> As I've said before in other contexts, find me Windows, Mac OS X and >> JVM developers, or educators and scientists that are as concerned by >> the text model changes as folks that are primarily focused on Linux >> system (including network) programming, and I'll be more willing to >> concede the point. > > Well, but this question reduces to finding out (or specifying) who are > target audiences of Python. It always has been (with a bow to Guido) > forpost of scientific users (and probably even if there was mass exodus > of other categories of users will remain prominent in that role). But > Python has always had its share as system scripting language among > Perl-haters, and with Perl going flatline, I guess it's fair to say > that Python is major system scripting and service implementation > language. Correct - and the efforts of a number of core developers are focused on getting the Linux distros and major projects like OpenStack migrated. If other Linux users say "I'm not switching to Python 3 until after my distro has switched their own Python applications over", that's a perfectly reasonable course of action for them to take. After all, that approach to the adoption of new Python versions is a large part of why Python 2.6 is still so widely supported by library and framework developers: enterprise Linux distros haven't even finished migrating to Python 2.7 yet, let alone Python 3. (The other reason is that the language moratorium that was applied to Python 2.7 and 3.2 means that supporting back to Python 2.6 isn't that much harder than supporting 2.7 at this point in time). That said, the feedback from the early adopters of Python 3 on Linux is proving invaluable, and Linux users in general will benefit from their work as the distros move their infrastructure applications over. Cheers, Nick. 
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From timothy.c.delaney at gmail.com Fri Jun 6 13:48:41 2014 From: timothy.c.delaney at gmail.com (Tim Delaney) Date: Fri, 6 Jun 2014 21:48:41 +1000 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: <20140606143401.79a7b0ee@x34f> References: <20140604011718.GD10355@ando> <20140604183831.7226448c@x34f> <20140604200520.1d432329@x34f> <538FB4F5.9070500@canterbury.ac.nz> <20140605041913.14886264@x34f> <87oay73i70.fsf@uwakimon.sk.tsukuba.ac.jp> <20140605142528.39e0e5fc@x34f> <20140605150121.286032df@x34f> <20140606121306.06783df6@x34f> <8761ke2syo.fsf@uwakimon.sk.tsukuba.ac.jp> <20140606143401.79a7b0ee@x34f> Message-ID: On 6 June 2014 21:34, Paul Sokolovsky wrote: > > On Fri, 06 Jun 2014 20:11:27 +0900 > "Stephen J. Turnbull" wrote: > > > Paul Sokolovsky writes: > > > > > That kinda means "string is atomic", instead of your "characters > > > are atomic". > > > > I would be very surprised if a language that behaved that way was > > called a "Python subset". No indexing, no slicing, no regexps, no > > .split(), no .startswith(), no sorted() or .sort(), ...!? > > > > If that's not what you mean by "string is atomic", I think you're > > using very confusing terminology. > > I'm sorry if I didn't mention it, or didn't make it clear enough - it's > all about layering. > > On level 0, you treat strings verbatim, and can write some subset of > apps (my point is that even this level allows to write lot enough > apps). Let's call this set A0. > > On level 1, you accept that there's some universal enough conventions > for some chars, like space or newline. And you can write set of > apps A1 > A0. > At heart, this is exactly what the Python 3 "str" type is. The universal convention is "code points". It's got nothing to do with encodings, or bytes. A Python string is simply a finite sequence of atomic code points - it is indexable, and it has a length. 
Once you have that, everything is layered on top of it. How the code points themselves are implemented is opaque and irrelevant other than the memory and performance consequences of the implementation decisions (for example, a string could be indexable by iterating from the start until you find the nth code point). Similarly the "bytes" type is a sequence of 8-bit bytes. Encodings are simply a way to transport code points via a byte-oriented transport. Tim Delaney -------------- next part -------------- An HTML attachment was scrubbed... URL: From pmiscml at gmail.com Fri Jun 6 15:18:38 2014 From: pmiscml at gmail.com (Paul Sokolovsky) Date: Fri, 6 Jun 2014 16:18:38 +0300 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: References: <20140604011718.GD10355@ando> <20140604174930.3a5af45f@x34f> <02b9a61658c04b11a21317da5b78bad6@BLUPR03MB389.namprd03.prod.outlook.com> Message-ID: <20140606161838.2fe38114@x34f> Hello, On Fri, 06 Jun 2014 09:32:25 +0100 Mark Lawrence wrote: > On 04/06/2014 16:52, Mark Lawrence wrote: > > On 04/06/2014 16:32, Steve Dower wrote: > >> > >> If copying into a separate list is a problem (memory-wise), > >> re.finditer('\\S+', string) also provides the same behaviour and > >> gives me the sliced string, so there's no need to index for > >> anything. > >> > > > > Out of idle curiosity is there anything that stops MicroPython, or > > any other implementation for that matter, from providing views of a > > string rather than copying every time? IIRC memoryviews in CPython > > rely on the buffer protocol at the C API level, so since strings > > don't support this protocol you can't take a memoryview of them. > > Could this actually be implemented in the future, is the underlying > > C code just too complicated, or what? > > > > Anybody? I'd like to address this, and other, buffer manipulation optimization ideas I have for MicroPython at some time later. 
But as you suggest, it would be possible to transparently have "strings-by-reference". The reason MicroPython doesn't have such so far (and why I'm, as a uPy contributor, not ready to discuss them) is that they're an optimization, and everyone knows what premature optimization is. [] -- Best regards, Paul mailto:pmiscml at gmail.com From breamoreboy at yahoo.co.uk Fri Jun 6 15:30:18 2014 From: breamoreboy at yahoo.co.uk (Mark Lawrence) Date: Fri, 06 Jun 2014 14:30:18 +0100 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: <5391819E.3060300@avl.com> References: <20140604011718.GD10355@ando> <20140604174930.3a5af45f@x34f> <02b9a61658c04b11a21317da5b78bad6@BLUPR03MB389.namprd03.prod.outlook.com> <5391819E.3060300@avl.com> Message-ID: On 06/06/2014 09:53, Hrvoje Niksic wrote: > On 06/04/2014 05:52 PM, Mark Lawrence wrote: >> On 04/06/2014 16:32, Steve Dower wrote: >>> >>> If copying into a separate list is a problem (memory-wise), >>> re.finditer('\\S+', string) also provides the same behaviour and >>> gives me the sliced string, so there's no need to index for anything. >>> >> >> Out of idle curiosity is there anything that stops MicroPython, or any >> other implementation for that matter, from providing views of a string >> rather than copying every time? IIRC memoryviews in CPython rely on the >> buffer protocol at the C API level, so since strings don't support this >> protocol you can't take a memoryview of them. Could this actually be >> implemented in the future, is the underlying C code just too >> complicated, or what? >> > > Memory view of Unicode strings is controversial for two reasons: > > 1. It exposes the internal representation of the string. If memoryviews > of strings were supported in Python 3, PEP 393 would not have been > possible (without breaking that feature). > > 2. Even if it were OK to expose the internal representation, it might > not be what the users expect.
For example, memoryview("Hrvoje") would > return a view of a 6-byte buffer, while memoryview("Nikšić") would > return a view of a 12-byte UCS-2 buffer. The user of a memory view might > expect to get UCS-2 (or UCS-4, or even UTF-8) in all cases. > > An implementation that decided to export strings as memory views might > be forced to make a decision about internal representation of strings, > and then stick to it. > > The byte objects don't have these issues, which is why in Python 2.7 > memoryview("foo") works just fine, as does memoryview(b"foo") in Python 3. > Thanks for the explanation :) -- My fellow Pythonistas, ask not what our language can do for you, ask what you can do for our language. Mark Lawrence --- This email is free from viruses and malware because avast! Antivirus protection is active. http://www.avast.com From antoine at python.org Fri Jun 6 16:05:52 2014 From: antoine at python.org (Antoine Pitrou) Date: Fri, 06 Jun 2014 10:05:52 -0400 Subject: [Python-Dev] asyncio/Tulip: use CPython as the new upstream In-Reply-To: <20140606110041.08C03250DE6@webabinitio.net> References: <20140606110041.08C03250DE6@webabinitio.net> Message-ID: Le 06/06/2014 07:00, R. David Murray a écrit : > > I don't have any opinion on the workflow. > > My understanding is that part of the purpose of the "provisional" > designation is to allow faster evolution (read: fixing) of an API before > the library becomes non-provisional. Thus I agree with Guido here, and > will be doing something similar with at least one of the minor provisional > email API features in 3.4.2 (unless I miss the cutoff again ... :( I would personally distinguish API fixes (compatibility-breaking changes) from feature additions (new APIs). Regards Antoine. From rdmurray at bitdance.com Fri Jun 6 16:37:39 2014 From: rdmurray at bitdance.com (R.
David Murray) Date: Fri, 06 Jun 2014 10:37:39 -0400 Subject: [Python-Dev] asyncio/Tulip: use CPython as the new upstream In-Reply-To: References: <20140606110041.08C03250DE6@webabinitio.net> Message-ID: <20140606143739.67FEB250DE6@webabinitio.net> On Fri, 06 Jun 2014 10:05:52 -0400, Antoine Pitrou wrote: > Le 06/06/2014 07:00, R. David Murray a écrit : > > > > I don't have any opinion on the workflow. > > > > My understanding is that part of the purpose of the "provisional" > > designation is to allow faster evolution (read: fixing) of an API before > > the library becomes non-provisional. Thus I agree with Guido here, and > > will be doing something similar with at least one of the minor provisional > > email API features in 3.4.2 (unless I miss the cutoff again ... :( > > I would personally distinguish API fixes (compatibility-breaking > changes) from feature additions (new APIs). It doesn't look like the PEP directly addresses API changes in maintenance releases, and I suppose that should be fixed. I specifically want to fix this API before someone depends on it working the wrong way, which they would have to if I left it alone for the whole of the 3.4 series. (Issue 21091 for the curious.)
--David From pmiscml at gmail.com Fri Jun 6 16:52:17 2014 From: pmiscml at gmail.com (Paul Sokolovsky) Date: Fri, 6 Jun 2014 17:52:17 +0300 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: References: <20140604011718.GD10355@ando> <20140604183831.7226448c@x34f> <20140604200520.1d432329@x34f> <538FB4F5.9070500@canterbury.ac.nz> <20140605041913.14886264@x34f> <87oay73i70.fsf@uwakimon.sk.tsukuba.ac.jp> <20140605142528.39e0e5fc@x34f> <20140605150121.286032df@x34f> <20140606121306.06783df6@x34f> <8761ke2syo.fsf@uwakimon.sk.tsukuba.ac.jp> <20140606143401.79a7b0ee@x34f> Message-ID: <20140606175217.766b781c@x34f> Hello, On Fri, 6 Jun 2014 21:48:41 +1000 Tim Delaney wrote: > On 6 June 2014 21:34, Paul Sokolovsky wrote: > > > > > On Fri, 06 Jun 2014 20:11:27 +0900 > > "Stephen J. Turnbull" wrote: > > > > > Paul Sokolovsky writes: > > > > > > > That kinda means "string is atomic", instead of your > > > > "characters are atomic". > > > > > > I would be very surprised if a language that behaved that way was > > > called a "Python subset". No indexing, no slicing, no regexps, no > > > .split(), no .startswith(), no sorted() or .sort(), ...!? > > > > > > If that's not what you mean by "string is atomic", I think you're > > > using very confusing terminology. > > > > I'm sorry if I didn't mention it, or didn't make it clear enough - > > it's all about layering. > > > > On level 0, you treat strings verbatim, and can write some subset of > > apps (my point is that even this level allows to write lot enough > > apps). Let's call this set A0. > > > > On level 1, you accept that there's some universal enough > > conventions for some chars, like space or newline. And you can > > write set of apps A1 > A0. > > > > At heart, this is exactly what the Python 3 "str" type is. The > universal convention is "code points". Yes. Except for one small detail - Python3 specifies these code points to be Unicode code points. And Unicode is a very bloated thing. 
But if we drop that "Unicode" stipulation, then it's also exactly what MicroPython implements. Its "str" type consists of code points; we don't have pet names for them yet, like Unicode does, but their numeric values are 0-255. Note that this in no way limits the encodings, characters, or scripts which can be used with MicroPython, because just like Unicode, it supports a concept of "surrogate pairs" (though we don't call it that) - specifically, smaller code points may comprise bigger groupings. But unlike Unicode, we don't stipulate format, value or other constraints on how these "surrogate pairs"-alikes are formed, leaving that to users. -- Best regards, Paul mailto:pmiscml at gmail.com From rosuav at gmail.com Fri Jun 6 17:14:30 2014 From: rosuav at gmail.com (Chris Angelico) Date: Sat, 7 Jun 2014 01:14:30 +1000 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: <20140606131531.2f8431c1@x34f> References: <20140604183831.7226448c@x34f> <20140604200520.1d432329@x34f> <538FB4F5.9070500@canterbury.ac.nz> <20140605041913.14886264@x34f> <87oay73i70.fsf@uwakimon.sk.tsukuba.ac.jp> <20140605142528.39e0e5fc@x34f> <20140605121054.GA348@sleipnir.bytereef.org> <20140606131531.2f8431c1@x34f> Message-ID: On Fri, Jun 6, 2014 at 8:15 PM, Paul Sokolovsky wrote: > I'm sorry if I was somehow related to that, my > bringing in the formal language spec was more a rhetorical figure, a > response to people claiming O(1) requirement. This was exactly why this whole discussion came up, though. We were debating on the uPy bug tracker about how important O(1) indexing is; I then came to python-list to try to get some solid data from which to debate; and then the discussion jumped here to python-dev for more solid explanations. The spec wasn't perfectly clear, and now it's being made clearer: O(N) indexing does not violate Python's spec, ergo uPy is allowed to use UTF-8 as its internal representation, as long as script-visible behaviour is correct.
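The O(N) code-point indexing over a UTF-8 buffer that this exchange keeps returning to can be sketched as follows - an editorial illustration in Python, not MicroPython's actual implementation:

```python
def codepoint_at(buf: bytes, index: int) -> str:
    """Return the index-th code point of UTF-8 data by linear scan:
    O(index) time, no side tables - the trade-off under discussion.
    (Editorial sketch; the real uPy code is C and more careful.)"""
    i = 0
    for _ in range(index):
        i += 1
        # Skip continuation bytes (0b10xxxxxx) of the current character.
        while i < len(buf) and buf[i] & 0xC0 == 0x80:
            i += 1
    j = i + 1
    while j < len(buf) and buf[j] & 0xC0 == 0x80:
        j += 1
    return buf[i:j].decode("utf-8")

data = "Nik\u0161i\u0107".encode("utf-8")  # 6 code points, 8 bytes
print(codepoint_at(data, 3))               # U+0161, found by scanning
```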
It'll be interesting to see when it's done (I'm currently working on that implementation, bit by bit) and to run the CPython benchmarks on it. It's been a fruitful and interesting discussion, and the formal language spec is key to it. No need to apologize! ChrisA From regex at mrabarnett.plus.com Fri Jun 6 17:47:24 2014 From: regex at mrabarnett.plus.com (MRAB) Date: Fri, 06 Jun 2014 16:47:24 +0100 Subject: [Python-Dev] asyncio/Tulip: use CPython as the new upstream In-Reply-To: References: Message-ID: <5391E28C.1000406@mrabarnett.plus.com> On 2014-06-06 10:31, Victor Stinner wrote: > Hi, > > I added a new BaseEventLoop.is_closed() method to Tulip and Python > 3.5 to fix an issue (see Tulip issue 169 for the detail). The problem > is that I don't want to add this method to Python 3.4 because usually > we don't add new methods in minor versions of Python (future version > 3.4.2 in this case). > > Guido just wrote in the issue: "Actually for asyncio we have special > dispensation to push new features to minor releases (until 3.5). > Please push to 3.4 so the source code is the same everywhere (except > selectors.py, which is not covered by the exception)." > > I disagree with Guido. I would prefer to start to maintain a > different branch for Python 3.4, because I consider that only > bugfixes should be applied to Python 3.4. > [snip] Isn't this a little like when bool, True and False were added to Python 2.2.1, a bugfix release, an act that is, I believe, now regarded as a mistake not to be repeated? From Steve.Dower at microsoft.com Fri Jun 6 17:41:22 2014 From: Steve.Dower at microsoft.com (Steve Dower) Date: Fri, 6 Jun 2014 15:41:22 +0000 Subject: [Python-Dev] Moving Python 3.5 on Windows to a new compiler Message-ID: <529cffa5961d4b5bb57d554affe9643c@BLUPR03MB389.namprd03.prod.outlook.com> Hi all I would like to propose moving Python 3.5 to use Visual C++ 14.0 as the main compiler. 
The first CTP of Visual Studio "14" was released earlier this week: http://blogs.msdn.com/b/vcblog/archive/2014/06/03/visual-studio-14-ctp.aspx The major feature of interest in this version of MSVC is a new policy to maintain binary compatibility for the CRT into the future. (There will be a blog about this soon, but I didn't want to hold up getting the discussion started here.) What this means for Python is that C extensions for Python 3.5 and later can be built using any version of MSVC from 14.0 and later. Those who are aware of the current state of affairs where you need to use a matching compiler will hopefully see how big an improvement this will be. It is also likely that other compilers will have an easier time providing compatibility with this new CRT, making it simpler and more reliable to build extensions with LLVM or GCC against an MSVC CPython. The other major benefit is that both products are at points in their development where changes can be made. Being a Microsoft employee, I have the ability to test Python builds regularly against the daily MSVC builds and to file bugs directly to the VC team (crashes, incorrect code generation, incorrect linking, performance regressions, etc.). This is a great opportunity to make sure that our needs are covered by the compiler team - it's also a good chance to raise any particular missing features that would be beneficial. My internal testing shows that the core code is almost fully compatible and builds successfully with only trivial modifications (some CRT variables are now macros with a leading underscore). The project files need updating, but I am willing to do this as part of any migration. There may also be some work required for external dependencies, since I did not test these, but I am also willing to do that. 
Basically, what I am offering to do is: * Update the files in PCBuild to work with Visual Studio "14" * Make any code changes necessary to build with VC14 * Regularly test the latest Python source with the latest MSVC builds and report issues/suggestions to the MSVC team * Keep all changes in a separate (public) repo until early next year when we're getting close to the final VS "14" release What I am asking anyone else to do is: * Nothing Thoughts/comments/concerns? Cheers, Steve From tjreedy at udel.edu Fri Jun 6 17:59:31 2014 From: tjreedy at udel.edu (Terry Reedy) Date: Fri, 06 Jun 2014 11:59:31 -0400 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: <5391819E.3060300@avl.com> References: <20140604011718.GD10355@ando> <20140604174930.3a5af45f@x34f> <02b9a61658c04b11a21317da5b78bad6@BLUPR03MB389.namprd03.prod.outlook.com> <5391819E.3060300@avl.com> Message-ID: On 6/6/2014 4:53 AM, Hrvoje Niksic wrote: > On 06/04/2014 05:52 PM, Mark Lawrence wrote: >> Out of idle curiosity is there anything that stops MicroPython, or any >> other implementation for that matter, from providing views of a string >> rather than copying every time? IIRC memoryviews in CPython rely on the >> buffer protocol at the C API level, so since strings don't support this >> protocol you can't take a memoryview of them. Could this actually be >> implemented in the future, is the underlying C code just too >> complicated, or what? >> > > Memory view of Unicode strings is controversial for two reasons: > > 1. It exposes the internal representation of the string. If memoryviews > of strings were supported in Python 3, PEP 393 would not have been > possible (without breaking that feature). > > 2. Even if it were OK to expose the internal representation, it might > not be what the users expect. For example, memoryview("Hrvoje") would > return a view of a 6-byte buffer, while memoryview("Nikšić") would > return a view of a 12-byte UCS-2 buffer.
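The asymmetry behind this point is easy to check in Python 3, where bytes implement the buffer protocol and str deliberately does not (a quick sketch):

```python
# bytes support the buffer protocol, so a zero-copy view works:
v = memoryview(b"Hrvoje")
assert len(v) == 6            # a view over the 6 raw bytes
assert v[0] == ord("H")

# str exposes no buffer, so there is nothing to view:
try:
    memoryview("Hrvoje")
except TypeError as exc:
    print("str is not viewable:", exc)
```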
The user of a memory view might > expect to get UCS-2 (or UCS-4, or even UTF-8) in all cases. > > An implementation that decided to export strings as memory views might > be forced to make a decision about internal representation of strings, > and then stick to it. > > The byte objects don't have these issues, which is why in Python 2.7 > memoryview("foo") works just fine, as does memoryview(b"foo") in Python 3. The other problem is that a small slice view of a large object keeps the large object alive, so a view user needs to think carefully about whether to make a copy or create a view, and later to copy views to delete the base object. This is not for beginners. -- Terry Jan Reedy From rosuav at gmail.com Fri Jun 6 18:01:08 2014 From: rosuav at gmail.com (Chris Angelico) Date: Sat, 7 Jun 2014 02:01:08 +1000 Subject: [Python-Dev] Moving Python 3.5 on Windows to a new compiler In-Reply-To: <529cffa5961d4b5bb57d554affe9643c@BLUPR03MB389.namprd03.prod.outlook.com> References: <529cffa5961d4b5bb57d554affe9643c@BLUPR03MB389.namprd03.prod.outlook.com> Message-ID: On Sat, Jun 7, 2014 at 1:41 AM, Steve Dower wrote: > What this means for Python is that C extensions for Python 3.5 and later can be built using any version of MSVC from 14.0 and later. Oh, if only this had been available for 2.7!! Actually... this means that 14.0 would be a good target for a compiler change for 2.7.x, if such a change is ever acceptable. To what extent is this compatibility going to be maintained? Is there a guarantee that there'll be X versions (or X years) of cross-compilation support? 
ChrisA From donald at stufft.io Fri Jun 6 18:01:52 2014 From: donald at stufft.io (Donald Stufft) Date: Fri, 6 Jun 2014 12:01:52 -0400 Subject: [Python-Dev] Moving Python 3.5 on Windows to a new compiler In-Reply-To: <529cffa5961d4b5bb57d554affe9643c@BLUPR03MB389.namprd03.prod.outlook.com> References: <529cffa5961d4b5bb57d554affe9643c@BLUPR03MB389.namprd03.prod.outlook.com> Message-ID: <40554C37-5379-4D35-8EB8-93481436A8D0@stufft.io> On Jun 6, 2014, at 11:41 AM, Steve Dower wrote: > words +1 from me. ----------------- Donald Stufft PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 801 bytes Desc: Message signed with OpenPGP using GPGMail URL: From guido at python.org Fri Jun 6 18:04:38 2014 From: guido at python.org (Guido van Rossum) Date: Fri, 6 Jun 2014 09:04:38 -0700 Subject: [Python-Dev] asyncio/Tulip: use CPython as the new upstream In-Reply-To: <5391E28C.1000406@mrabarnett.plus.com> References: <5391E28C.1000406@mrabarnett.plus.com> Message-ID: On Fri, Jun 6, 2014 at 8:47 AM, MRAB wrote: > On 2014-06-06 10:31, Victor Stinner wrote: > >> Hi, >> >> I added a new BaseEventLoop.is_closed() method to Tulip and Python >> 3.5 to fix an issue (see Tulip issue 169 for the detail). The problem >> is that I don't want to add this method to Python 3.4 because usually >> we don't add new methods in minor versions of Python (future version >> 3.4.2 in this case). >> >> Guido just wrote in the issue: "Actually for asyncio we have special >> dispensation to push new features to minor releases (until 3.5). >> Please push to 3.4 so the source code is the same everywhere (except >> selectors.py, which is not covered by the exception)." >> >> I disagree with Guido. I would prefer to start to maintain a >> different branch for Python 3.4, because I consider that only >> bugfixes should be applied to Python 3.4. 
>> >> [snip] > > Isn't this a little like when bool, True and False were added to > Python 2.2.1, a bugfix release, an act that is, I believe, now regarded > as a mistake not to be repeated? > It's a little like that, but it's also a little unlike that -- asyncio is explicitly accepted in the stdlib with "provisional" status which allows changes like this. Regarding the workflow, I'd really like asyncio to be able to move faster than the rest of the stdlib, at least until 3.5 is fixed. Working in the Tulip repo is much easier for me than working in the CPython repo, so I'd like to keep the workflow of Tulip -> 3.4 -> 3.5 as long as possible. I also specifically consider selectors.py subject to a *different* workflow -- for that module the workflow should be 3.5 -> Tulip. If Tulip's update_stdlib.sh script's prompts to copy this file are too distracting, I can hack the script to be silent about this file if it detects that the CPython repo is 3.4. -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From sturla.molden at gmail.com Fri Jun 6 18:06:18 2014 From: sturla.molden at gmail.com (Sturla Molden) Date: Fri, 6 Jun 2014 16:06:18 +0000 (UTC) Subject: [Python-Dev] [numpy wishlist] Interpreter support for temporary elision in third-party classes References: <5391754D.8000607@googlemail.com> Message-ID: <1842263445423761298.493568sturla.molden-gmail.com@news.gmane.org> Julian Taylor wrote: > The problem with this approach is that it is already difficult enough to > handle memory in numpy. I would not do this in a way that complicates memory management in NumPy. I would just replace malloc and free with temporarily cached versions. From the perspective of NumPy the API should be the same. > Having a cache that potentially stores gigabytes > of memory out of the user's sight will just make things worse. Buffers don't need to stay in the cache forever, just long enough to allow reuse within an expression.
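Such a temporarily-cached malloc/free pair might be sketched like this (an illustration of the idea only, with made-up names; not actual or proposed NumPy code):

```python
# Sketch: freed buffers linger briefly in a pool so the next temporary
# of the same size can reuse them; a background thread flushes the
# pool shortly after each free.
import threading
import time
from collections import defaultdict

_lock = threading.Lock()
_cond = threading.Condition(_lock)
_pool = defaultdict(list)      # buffer size -> recently freed buffers
FLUSH_DELAY = 0.05             # the "N microseconds" magic number

def cached_malloc(size):
    with _lock:
        if _pool[size]:
            return _pool[size].pop()   # reuse a recent temporary
    return bytearray(size)             # fall back to the system allocator

def cached_free(buf):
    with _cond:
        _pool[len(buf)].append(buf)    # delay the real free
        _cond.notify()                 # wake the flusher thread

def _flusher():
    while True:
        with _cond:
            _cond.wait()               # sleep until something is freed
        time.sleep(FLUSH_DELAY)        # grace period for reuse
        with _lock:
            _pool.clear()              # return everything to the system

threading.Thread(target=_flusher, daemon=True).start()

a = cached_malloc(1024)
cached_free(a)
assert cached_malloc(1024) is a        # reused, no new allocation
```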
We are probably talking about delaying the call to free with just a few microseconds. We could e.g. have a setup like this: NumPy thread on "malloc": - tries to grab memory off the internal heap - calls system malloc on failure NumPy thread on "free": - returns a buffer to the internal heap - signals a condition Background daemonic GC thread: - wakes after sleeping on the condition - sleeps for another N microseconds (N = magic number) - flushes or shrinks the internal heap with system free - goes back to sleeping on the condition It can be implemented with the same API as malloc and free, and plugged directly into the existing NumPy code. We would in total need two mutexes, one condition variable, a pthread, and a heap. Sturla From status at bugs.python.org Fri Jun 6 18:07:55 2014 From: status at bugs.python.org (Python tracker) Date: Fri, 6 Jun 2014 18:07:55 +0200 (CEST) Subject: [Python-Dev] Summary of Python tracker Issues Message-ID: <20140606160755.1EFB156A46@psf.upfronthosting.co.za> ACTIVITY SUMMARY (2014-05-30 - 2014-06-06) Python tracker at http://bugs.python.org/ To view or respond to any of the issues listed below, click on the issue. Do NOT respond to this message. Issues counts and deltas: open 4650 (+15) closed 28802 (+52) total 33452 (+67) Open issues with patches: 2127 Issues opened (48) ================== #21614: Case sensitivity problem in multiprocessing. 
http://bugs.python.org/issue21614 opened by ColinPDavidson #21615: Curses bug report for Python 2.7 and Python 3.2 http://bugs.python.org/issue21615 opened by eclectic9509 #21616: argparse explodes with nargs='*' and a tuple metavar http://bugs.python.org/issue21616 opened by vvas #21617: importlib reload can fail with AttributeError if module remove http://bugs.python.org/issue21617 opened by ned.deily #21619: Cleaning up a subprocess with a broken pipe http://bugs.python.org/issue21619 opened by vadmium #21621: Add note to 3.x What's New re Idle changes in bugfix releases http://bugs.python.org/issue21621 opened by terry.reedy #21622: ctypes.util incorrectly fails for libraries without DT_SONAME http://bugs.python.org/issue21622 opened by Jeremy.Huntwork #21623: build ssl failed use vs2010 express http://bugs.python.org/issue21623 opened by Mo.Jia #21624: Idle: polish htests http://bugs.python.org/issue21624 opened by terry.reedy #21625: help()'s more-mode is frustrating http://bugs.python.org/issue21625 opened by nedbat #21626: Add options width and compact to pickle cli http://bugs.python.org/issue21626 opened by barcc #21627: Concurrently closing files and iterating over the open files d http://bugs.python.org/issue21627 opened by sstewartgallus #21629: clinic.py --converters fails http://bugs.python.org/issue21629 opened by serhiy.storchaka #21632: Idle: sychronize text files across versions as appropriate. 
http://bugs.python.org/issue21632 opened by terry.reedy #21633: Argparse does not propagate HelpFormatter class to subparsers http://bugs.python.org/issue21633 opened by Michael.Cohen #21635: difflib.SequenceMatcher stores matching blocks as tuples, not http://bugs.python.org/issue21635 opened by drevicko #21642: "_ if 1else _" does not compile http://bugs.python.org/issue21642 opened by Joshua.Landau #21644: Optimize bytearray(int) constructor to use calloc() http://bugs.python.org/issue21644 opened by haypo #21645: test_read_all_from_pipe_reader() of test_asyncio hangs on Free http://bugs.python.org/issue21645 opened by haypo #21646: Add tests for turtle.ScrolledCanvas http://bugs.python.org/issue21646 opened by ingrid #21647: Idle unittests: make gui, mock switching easier. http://bugs.python.org/issue21647 opened by terry.reedy #21648: urllib urlopener leaves open sockets for FTP connection http://bugs.python.org/issue21648 opened by Claudiu.Popa #21649: Mention "Recommendations for Secure Use of TLS and DTLS" http://bugs.python.org/issue21649 opened by pitrou #21650: add json.tool option to avoid alphabetic sort of fields http://bugs.python.org/issue21650 opened by Pavel.Kazlou #21652: Python 2.7.7 regression in mimetypes module on Windows http://bugs.python.org/issue21652 opened by foom #21655: Write Unit Test for Vec2 class in the Turtle Module http://bugs.python.org/issue21655 opened by Lita.Cho #21656: Create test coverage for TurtleScreenBase in Turtle http://bugs.python.org/issue21656 opened by Lita.Cho #21657: pip.get_installed_distributions() Does not return packages in http://bugs.python.org/issue21657 opened by Adam.Matan #21658: __m128, can't build 3.4.1 with intel 14.0.0 http://bugs.python.org/issue21658 opened by aom #21659: IDLE: One corner calltip case http://bugs.python.org/issue21659 opened by serhiy.storchaka #21660: Substitute @TOKENS@ from sysconfig variables, for python-confi http://bugs.python.org/issue21660 opened by haubi #21664: 
multiprocessing leaks temporary directories pymp-xxx http://bugs.python.org/issue21664 opened by yjhong #21665: 2.7.7 ttk widgets not themed http://bugs.python.org/issue21665 opened by les.bothwell #21666: Argparse exceptions should include which argument has a proble http://bugs.python.org/issue21666 opened by v+python #21667: Clarify status of O(1) indexing semantics of str objects http://bugs.python.org/issue21667 opened by ncoghlan #21668: The select and time modules uses libm functions without linkin http://bugs.python.org/issue21668 opened by fornwall #21669: Custom error messages when print & exec are used as statements http://bugs.python.org/issue21669 opened by ncoghlan #21670: Add repr to shelve.Shelf http://bugs.python.org/issue21670 opened by Claudiu.Popa #21671: CVE-2014-0224: OpenSSL upgrade to 1.0.1h on Windows required http://bugs.python.org/issue21671 opened by lambacck #21672: Python for Windows 2.7.7: Path Configuration File No Longer Wo http://bugs.python.org/issue21672 opened by jblairpdx #21673: Idle: hilite search terms in hits in Find in Files output wind http://bugs.python.org/issue21673 opened by terry.reedy #21674: Idle: Add 'find all' in current file http://bugs.python.org/issue21674 opened by terry.reedy #21675: Library - Introduction - paragraph 5 - wrong ordering http://bugs.python.org/issue21675 opened by AnthonyBartoli #21676: IDLE - Test Replace Dialog http://bugs.python.org/issue21676 opened by sahutd #21677: Exception context set to string by BufferedWriter.close() http://bugs.python.org/issue21677 opened by vadmium #21678: Add operation "plus" for dictionaries http://bugs.python.org/issue21678 opened by Pix #21679: Prevent extraneous fstat during open() http://bugs.python.org/issue21679 opened by bkabrda #21680: asyncio: document event loops http://bugs.python.org/issue21680 opened by haypo Most recent 15 issues with no replies (15) ========================================== #21680: asyncio: document event loops 
http://bugs.python.org/issue21680 #21679: Prevent extraneous fstat during open() http://bugs.python.org/issue21679 #21677: Exception context set to string by BufferedWriter.close() http://bugs.python.org/issue21677 #21676: IDLE - Test Replace Dialog http://bugs.python.org/issue21676 #21675: Library - Introduction - paragraph 5 - wrong ordering http://bugs.python.org/issue21675 #21674: Idle: Add 'find all' in current file http://bugs.python.org/issue21674 #21673: Idle: hilite search terms in hits in Find in Files output wind http://bugs.python.org/issue21673 #21670: Add repr to shelve.Shelf http://bugs.python.org/issue21670 #21666: Argparse exceptions should include which argument has a proble http://bugs.python.org/issue21666 #21660: Substitute @TOKENS@ from sysconfig variables, for python-confi http://bugs.python.org/issue21660 #21657: pip.get_installed_distributions() Does not return packages in http://bugs.python.org/issue21657 #21656: Create test coverage for TurtleScreenBase in Turtle http://bugs.python.org/issue21656 #21655: Write Unit Test for Vec2 class in the Turtle Module http://bugs.python.org/issue21655 #21652: Python 2.7.7 regression in mimetypes module on Windows http://bugs.python.org/issue21652 #21649: Mention "Recommendations for Secure Use of TLS and DTLS" http://bugs.python.org/issue21649 Most recent 15 issues waiting for review (15) ============================================= #21679: Prevent extraneous fstat during open() http://bugs.python.org/issue21679 #21676: IDLE - Test Replace Dialog http://bugs.python.org/issue21676 #21670: Add repr to shelve.Shelf http://bugs.python.org/issue21670 #21669: Custom error messages when print & exec are used as statements http://bugs.python.org/issue21669 #21668: The select and time modules uses libm functions without linkin http://bugs.python.org/issue21668 #21660: Substitute @TOKENS@ from sysconfig variables, for python-confi http://bugs.python.org/issue21660 #21650: add json.tool option to avoid 
alphabetic sort of fields http://bugs.python.org/issue21650 #21648: urllib urlopener leaves open sockets for FTP connection http://bugs.python.org/issue21648 #21627: Concurrently closing files and iterating over the open files d http://bugs.python.org/issue21627 #21626: Add options width and compact to pickle cli http://bugs.python.org/issue21626 #21610: load_module not closing opened files http://bugs.python.org/issue21610 #21600: mock.patch.stopall doesn't work with patch.dict to sys.modules http://bugs.python.org/issue21600 #21599: Argument transport in attach and detach method in Server class http://bugs.python.org/issue21599 #21596: asyncio.wait fails when futures list is empty http://bugs.python.org/issue21596 #21595: asyncio: Creating many subprocess generates lots of internal B http://bugs.python.org/issue21595 Top 10 most discussed issues (10) ================================= #21667: Clarify status of O(1) indexing semantics of str objects http://bugs.python.org/issue21667 15 msgs #21427: installer not working http://bugs.python.org/issue21427 12 msgs #21476: Inconsitent behaviour between BytesParser.parse and Parser.par http://bugs.python.org/issue21476 11 msgs #21592: Make statistics.median run in linear time http://bugs.python.org/issue21592 11 msgs #21573: Clean up turtle.py code formatting http://bugs.python.org/issue21573 9 msgs #21623: build ssl failed use vs2010 express http://bugs.python.org/issue21623 9 msgs #15590: --libs is inconsistent for python-config --libs and pkgconfig http://bugs.python.org/issue15590 8 msgs #21665: 2.7.7 ttk widgets not themed http://bugs.python.org/issue21665 8 msgs #10740: sqlite3 module breaks transactions and potentially corrupts da http://bugs.python.org/issue10740 7 msgs #21671: CVE-2014-0224: OpenSSL upgrade to 1.0.1h on Windows required http://bugs.python.org/issue21671 7 msgs Issues closed (51) ================== #6181: Tkinter.Listbox several minor issues http://bugs.python.org/issue6181 closed by 
serhiy.storchaka #11387: Tkinter, callback functions http://bugs.python.org/issue11387 closed by terry.reedy #13630: IDLE: Find(ed) text is not highlighted while dialog box is ope http://bugs.python.org/issue13630 closed by terry.reedy #17095: Modules/Setup *shared* support broken http://bugs.python.org/issue17095 closed by ned.deily #18292: Idle: test AutoExpand.py http://bugs.python.org/issue18292 closed by terry.reedy #18409: Idle: test AutoComplete.py http://bugs.python.org/issue18409 closed by terry.reedy #18492: Allow all resources if not running under regrtest.py http://bugs.python.org/issue18492 closed by zach.ware #18910: IDle: test textView.py http://bugs.python.org/issue18910 closed by terry.reedy #19656: Add Py3k warning for non-ascii bytes literals http://bugs.python.org/issue19656 closed by serhiy.storchaka #20336: test_asyncio: relax timings even more http://bugs.python.org/issue20336 closed by skrah #20383: Add a keyword-only spec argument to types.ModuleType http://bugs.python.org/issue20383 closed by brett.cannon #20475: pystone.py in 3.4 still uses time.clock(), even though it's ma http://bugs.python.org/issue20475 closed by gvanrossum #21119: asyncio create_connection resource warning http://bugs.python.org/issue21119 closed by haypo #21180: Efficiently create empty array.array, consistent with bytearra http://bugs.python.org/issue21180 closed by gvanrossum #21233: Add *Calloc functions to CPython memory allocation API http://bugs.python.org/issue21233 closed by haypo #21252: Lib/asyncio/events.py has tons of docstrings which are just "X http://bugs.python.org/issue21252 closed by haypo #21304: PEP 466: Backport hashlib.pbkdf2_hmac to Python 2.7 http://bugs.python.org/issue21304 closed by python-dev #21344: save scores or ratios in difflib get_close_matches http://bugs.python.org/issue21344 closed by zach.ware #21462: PEP 466: upgrade OpenSSL in the Python 2.7 Windows builds http://bugs.python.org/issue21462 closed by benjamin.peterson #21477: 
Idle: improve idle_test.htest http://bugs.python.org/issue21477 closed by terry.reedy #21504: can the subprocess module war using os.wait4 and so return usa http://bugs.python.org/issue21504 closed by r.david.murray #21533: built-in types dict docs - construct dict from iterable, not i http://bugs.python.org/issue21533 closed by terry.reedy #21552: String length overflow in Tkinter http://bugs.python.org/issue21552 closed by serhiy.storchaka #21572: Use generic license web page rather than requiring release-spe http://bugs.python.org/issue21572 closed by ned.deily #21576: Overwritten (custom) uuid inside dictionary http://bugs.python.org/issue21576 closed by r.david.murray #21583: use support.captured_stderr context manager - test_logging http://bugs.python.org/issue21583 closed by python-dev #21593: Clarify re.search documentation first match http://bugs.python.org/issue21593 closed by terry.reedy #21594: asyncio.create_subprocess_exec raises OSError http://bugs.python.org/issue21594 closed by haypo #21601: Cancel method for Asyncio Task is not documented http://bugs.python.org/issue21601 closed by haypo #21604: Misleading 2to3 fixer name in documentation: standard_error http://bugs.python.org/issue21604 closed by python-dev #21605: Add tests for Tkinter images http://bugs.python.org/issue21605 closed by serhiy.storchaka #21612: IDLE should not open multiple instances of one file http://bugs.python.org/issue21612 closed by terry.reedy #21618: POpen does not close fds when fds have been inherited from a p http://bugs.python.org/issue21618 closed by gregory.p.smith #21620: OrderedDict KeysView set operations not supported http://bugs.python.org/issue21620 closed by serhiy.storchaka #21628: 2to3 does not fix zip in some cases http://bugs.python.org/issue21628 closed by berker.peksag #21630: List Dict bug? 
http://bugs.python.org/issue21630 closed by Robert.w #21631: List/Dict Combination Bug http://bugs.python.org/issue21631 closed by rhettinger #21634: Pystone uses floats http://bugs.python.org/issue21634 closed by haypo #21636: test_logging fails on Windows for Unix tests http://bugs.python.org/issue21636 closed by haypo #21637: Add a warning section exaplaining that tempfiles are opened in http://bugs.python.org/issue21637 closed by r.david.murray #21638: Seeking to EOF is too inefficient! http://bugs.python.org/issue21638 closed by yanlinlin82 #21639: tracemalloc crashes with floating point exception when using S http://bugs.python.org/issue21639 closed by haypo #21640: References to other Python version in sidebar of documentation http://bugs.python.org/issue21640 closed by orsenthil #21641: smtplib leaves open sockets around if SMTPResponseException is http://bugs.python.org/issue21641 closed by orsenthil #21643: "File exists" error during venv --upgrade http://bugs.python.org/issue21643 closed by python-dev #21651: asyncio tests ResourceWarning http://bugs.python.org/issue21651 closed by haypo #21653: Row.keys() in sqlite3 returns a list, not a tuple http://bugs.python.org/issue21653 closed by r.david.murray #21654: IDLE call tips emitting future warnings about ElementTree obje http://bugs.python.org/issue21654 closed by rhettinger #21661: setuptools documentation: typo http://bugs.python.org/issue21661 closed by python-dev #21662: datamodel documentation: fix typo and phrasing http://bugs.python.org/issue21662 closed by r.david.murray #21663: venv upgrade fails on Windows when copying TCL files http://bugs.python.org/issue21663 closed by python-dev From barry at python.org Fri Jun 6 18:10:37 2014 From: barry at python.org (Barry Warsaw) Date: Fri, 6 Jun 2014 12:10:37 -0400 Subject: [Python-Dev] asyncio/Tulip: use CPython as the new upstream In-Reply-To: <5391E28C.1000406@mrabarnett.plus.com> References: <5391E28C.1000406@mrabarnett.plus.com> Message-ID: 
<20140606121037.7b95fc3f@anarchist.wooz.org> On Jun 06, 2014, at 04:47 PM, MRAB wrote: >Isn't this a little like when bool, True and False were added to >Python 2.2.1, a bugfix release, an act that is, I believe, now regarded >as a mistake not to be repeated? Yes, that was a mistake, but the case under discussion is different. With True/False, it was a runtime-wide change that affected every Python program, and there was no such "special dispensation". -Barry From hrvoje.niksic at avl.com Fri Jun 6 18:11:56 2014 From: hrvoje.niksic at avl.com (Hrvoje Niksic) Date: Fri, 6 Jun 2014 18:11:56 +0200 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: References: <20140604011718.GD10355@ando> <20140604174930.3a5af45f@x34f> <02b9a61658c04b11a21317da5b78bad6@BLUPR03MB389.namprd03.prod.outlook.com> <5391819E.3060300@avl.com> Message-ID: <5391E84C.1060100@avl.com> On 06/06/2014 05:59 PM, Terry Reedy wrote: > The other problem is that a small slice view of a large object keeps the > large object alive, so a view user needs to think carefully about > whether to make a copy or create a view, and later to copy views to > delete the base object. This is not for beginners. And this was important enough that Java 7 actually removed the long-standing feature of String.substring creating a string that shares the character array with the original. 
http://java-performance.info/changes-to-string-java-1-7-0_06/ From p.f.moore at gmail.com Fri Jun 6 18:16:07 2014 From: p.f.moore at gmail.com (Paul Moore) Date: Fri, 6 Jun 2014 17:16:07 +0100 Subject: [Python-Dev] Moving Python 3.5 on Windows to a new compiler In-Reply-To: <529cffa5961d4b5bb57d554affe9643c@BLUPR03MB389.namprd03.prod.outlook.com> References: <529cffa5961d4b5bb57d554affe9643c@BLUPR03MB389.namprd03.prod.outlook.com> Message-ID: On 6 June 2014 16:41, Steve Dower wrote: > Basically, what I am offering to do is: > > * Update the files in PCBuild to work with Visual Studio "14" > * Make any code changes necessary to build with VC14 > * Regularly test the latest Python source with the latest MSVC builds and report issues/suggestions to the MSVC team > * Keep all changes in a separate (public) repo until early next year when we're getting close to the final VS "14" release > > What I am asking anyone else to do is: > > * Nothing +1 from me. Paul From zachary.ware+pydev at gmail.com Fri Jun 6 18:22:47 2014 From: zachary.ware+pydev at gmail.com (Zachary Ware) Date: Fri, 6 Jun 2014 11:22:47 -0500 Subject: [Python-Dev] Moving Python 3.5 on Windows to a new compiler In-Reply-To: <529cffa5961d4b5bb57d554affe9643c@BLUPR03MB389.namprd03.prod.outlook.com> References: <529cffa5961d4b5bb57d554affe9643c@BLUPR03MB389.namprd03.prod.outlook.com> Message-ID: On Fri, Jun 6, 2014 at 10:41 AM, Steve Dower wrote: > Thoughts/comments/concerns? My only concern is support for elderly versions of Windows, in particular: XP. I seem to recall the last "let's update our MSVC version" discussion dying off because of XP support. Even though MS has abandoned it, I'm not sure whether we can yet. If that's a non-issue, or if we can actually drop XP support, I'm all for it. 
-- Zach From pmiscml at gmail.com Fri Jun 6 18:25:03 2014 From: pmiscml at gmail.com (Paul Sokolovsky) Date: Fri, 6 Jun 2014 19:25:03 +0300 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: References: <20140604011718.GD10355@ando> <20140604174930.3a5af45f@x34f> <02b9a61658c04b11a21317da5b78bad6@BLUPR03MB389.namprd03.prod.outlook.com> <5391819E.3060300@avl.com> Message-ID: <20140606192503.6a22d236@x34f> Hello, On Fri, 06 Jun 2014 11:59:31 -0400 Terry Reedy wrote: [] > The other problem is that a small slice view of a large object keeps > the large object alive, so a view user needs to think carefully about > whether to make a copy or create a view, and later to copy views to > delete the base object. This is not for beginners. Yes, so it doesn't make sense to add such a feature to any of the existing APIs. However, as I pointed out in another mail, it would make a lot of sense to add an iterator-based string API (because if dict methods were *switched* to iterators, why can't strings have them *as an alternative*), and for their return values it would be ~natural to return "string views", especially if it's clearly and explicitly described that if a user wants to store them, they should be explicitly copied via str(view). One reason against this would be of course API bloat. But API bloat happens all the time, for example compare this modest proposal http://bugs.python.org/issue21180 with what's going to be actually implemented: http://legacy.python.org/dev/peps/pep-0467/#alternate-constructors .
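For illustration, such an iterator-plus-views API might be sketched like this (hypothetical names and signatures; nothing here is a real proposal's API):

```python
# Sketch of the "string view" idea: a lazy slice that keeps a
# reference to the base string until explicitly copied with str().
class StrView:
    __slots__ = ("_base", "_start", "_stop")

    def __init__(self, base, start, stop):
        self._base, self._start, self._stop = base, start, stop

    def __iter__(self):
        for i in range(self._start, self._stop):
            yield self._base[i]

    def __len__(self):
        return self._stop - self._start

    def __str__(self):                 # the explicit copy detaches the base
        return self._base[self._start:self._stop]

def iter_split(s, sep=" "):
    """A hypothetical iterator-based split() yielding views, not copies."""
    start = 0
    while True:
        i = s.find(sep, start)
        if i < 0:
            yield StrView(s, start, len(s))
            return
        yield StrView(s, start, i)
        start = i + len(sep)

words = [str(v) for v in iter_split("spam ham eggs")]
assert words == ["spam", "ham", "eggs"]
```

The str(view) call is the explicit-copy step mentioned above: once taken, the copy no longer keeps the (possibly huge) base string alive.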
-- Best regards, Paul mailto:pmiscml at gmail.com From dw+python-dev at hmmz.org Fri Jun 6 18:37:01 2014 From: dw+python-dev at hmmz.org (dw+python-dev at hmmz.org) Date: Fri, 6 Jun 2014 16:37:01 +0000 Subject: [Python-Dev] Moving Python 3.5 on Windows to a new compiler In-Reply-To: <529cffa5961d4b5bb57d554affe9643c@BLUPR03MB389.namprd03.prod.outlook.com> References: <529cffa5961d4b5bb57d554affe9643c@BLUPR03MB389.namprd03.prod.outlook.com> Message-ID: <20140606163701.GA10004@k2> On Fri, Jun 06, 2014 at 03:41:22PM +0000, Steve Dower wrote: > [snip] Speaking as a third party who aims to provide binary distributions for recent Python releases on Windows, every new compiler introduces a licensing and configuration headache. So I guess the questions are: * Does the ABI stability address some historical real world problem with Python binary builds? (I guess possibly) * Is the existing solution of third parties building under e.g. Mingw as an option of last resort causing real world issues? It seems to work for a lot of people, although I personally avoid it. * Have other compiler vendors indicated they will change their ABI environment to match VS under this new stability guarantee? If not, then as yet there is no real world benefit here. * Has Python ever hit a showstopper release issue as a result of a bug in MSVC? (I guess probably not). * Will VS 14 be golden prior to Python 3.5's release? It would suck to rely on a beta compiler.. :) Sorry for pouring cold water on this, but I've recently spent a ton of time getting a Microsoft build environment running, and it seems possible a new compiler may not yet justify more effort if there is little tangible benefit.
David From bcannon at gmail.com Fri Jun 6 18:37:30 2014 From: bcannon at gmail.com (Brett Cannon) Date: Fri, 06 Jun 2014 16:37:30 +0000 Subject: [Python-Dev] Division of tool labour in porting Python 2 code to 2/3 Message-ID: After Glyph and Alex's email about their asks for assisting in writing Python 2/3 code, it got me thinking about where in the toolchain various warnings and such should go in order to help direct energy toward developing whatever future toolchain will assist in porting. There seem to be three places where issues are/can be caught once a project has embarked down the road of 2/3 source compatibility:

1. -3 warnings
2. Some linter tool
3. Failing tests

-3 warnings are things that we know are flat-out wrong and do not cause massive compatibility issues in the stdlib. For instance, the warning that buffer() is not in Python 3 is already a py3k warning -- Glyph made a mistake when he asked for it as a new warning -- and it is a perfect example of something that isn't excessively noisy and won't cause issues when people run with it. But what about warning about classic classes? The stdlib is full of them and they were purposefully left alone for compatibility reasons. But there is a subtle semantic difference between classic and new-style classes, and so 2/3 code should consider switching (this is when people chime in saying "this is why we want a 2.8 release!", but that still isn't happening). If this were made a py3k warning in 2.7 then the stdlib itself would spew out warnings which we can't change due to compatibility, so that makes it not useful (http://bugs.python.org/issue21231). But as part of a lint tool specific to Python 2.7 that kind of warning would not be an issue and is easily managed and integrated into CI setups to make sure there are no regressions. Lastly, there are things like string/unicode comparisons. http://bugs.python.org/issue21401 has a patch from Victor which warns when comparing strings and unicode in Python 2.7.
Much like the classic classes example, the stdlib becomes rather noisy due to APIs that handle either/or, etc. But unlike the classic classes example, you just can't systematically verify that two variables are always going to be str vs. unicode in Python 2.7 if they aren't literals. If people want to implement type constraint graphs for 2.7 code to help find them then that's great, but I personally don't have that kind of time. In this instance it would seem like relying on a project's unit tests to find this sort of problem is the best option. With those three levels in mind, where do we draw the line between these levels? Take for instance the print statement. Right now there is no warning with -3. Do we add one and then update the 2.7 stdlib to prevent warnings being generated by the stdlib? Or do we add it to some linter tool to pick up when people accidentally leave one in their code? The reason I ask is that once this is clear, I'm willing to spearhead the tooling work we talked about at the language summit to make sure there's a clear path for people wanting to port which is as easy as (reasonably) possible, but I don't want to start on it until I have a clear indication of what people are going to be okay with. -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefan at bytereef.org Fri Jun 6 18:57:43 2014 From: stefan at bytereef.org (Stefan Krah) Date: Fri, 6 Jun 2014 18:57:43 +0200 Subject: [Python-Dev] Moving Python 3.5 on Windows to a new compiler In-Reply-To: <20140606163701.GA10004@k2> References: <529cffa5961d4b5bb57d554affe9643c@BLUPR03MB389.namprd03.prod.outlook.com> <20140606163701.GA10004@k2> Message-ID: <20140606165743.GA11669@sleipnir.bytereef.org> dw+python-dev at hmmz.org wrote: > * Has Python ever hit a showstopper release issue as a result of a bug > in MSVC? (I guess probably not).
Yes, a PGO issue: http://bugs.python.org/issue15993 To be fair, in that issue I did not look if there's some undefined behavior in longobject.c. > * Will VS 14 be golden prior to Python 3.5's release? It would suck to > rely on a beta compiler.. :) This is my only concern, too. Otherwise, +1 for the switch. Stefan Krah From rdmurray at bitdance.com Fri Jun 6 19:05:25 2014 From: rdmurray at bitdance.com (R. David Murray) Date: Fri, 06 Jun 2014 13:05:25 -0400 Subject: [Python-Dev] Moving Python 3.5 on Windows to a new compiler In-Reply-To: <20140606163701.GA10004@k2> References: <529cffa5961d4b5bb57d554affe9643c@BLUPR03MB389.namprd03.prod.outlook.com> <20140606163701.GA10004@k2> Message-ID: <20140606170526.6B4AA250DCD@webabinitio.net> On Fri, 06 Jun 2014 16:37:01 -0000, dw+python-dev at hmmz.org wrote: > On Fri, Jun 06, 2014 at 03:41:22PM +0000, Steve Dower wrote: > > > [snip] > > Speaking as a third party who aims to provide binary distributions for > recent Python releases on Windows, every new compiler introduces a > licensing and configuration headache. So I guess the questions are: > > * Does the ABI stability address some historical real world problem with > Python binary builds? (I guess possibly) > > * Is the existing solution of third parties building under e.g. Mingw as > an option of last resort causing real world issues? It seems to work > for a lot of people, although I personally avoid it. > > * Have other compiler vendors indicated they will change their ABI > environment to match VS under this new stability guarantee? If not, > then as yet there is no real world benefit here. > > * Has Python ever hit a showstopper release issue as a result of a bug > in MSVC? (I guess probably not). > > * Will VS 14 be golden prior to Python 3.5's release? It would suck to > rely on a beta compiler.. 
:) > > > Sorry for dunking water on this, but I've recently spent a ton of time > getting a Microsoft build environment running, and it seems possible a > new compiler may not yet justify more effort if there is little tangible > benefit. If I understand correctly (but I may not, as I'm not a Windows dev), we're going to want to switch VS versions for 3.5 anyway, so switching to the cutting-edge one, where Steve can be and is willing to be in a tight feedback loop with the developers, sounds like a win to me. --David From njs at pobox.com Fri Jun 6 18:55:04 2014 From: njs at pobox.com (Nathaniel Smith) Date: Fri, 6 Jun 2014 17:55:04 +0100 Subject: [Python-Dev] [numpy wishlist] Interpreter support for temporary elision in third-party classes In-Reply-To: <1842263445423761298.493568sturla.molden-gmail.com@news.gmane.org> References: <5391754D.8000607@googlemail.com> <1842263445423761298.493568sturla.molden-gmail.com@news.gmane.org> Message-ID: On 6 Jun 2014 17:07, "Sturla Molden" wrote: > We would in total need two mutexes, one condition variable, a pthread, and > a heap. The proposal in my initial email requires zero pthreads, and is substantially more effective. (Your proposal reduces only the alloc overhead for large arrays; mine reduces both alloc and memory access overhead for both large and small arrays.) -n -------------- next part -------------- An HTML attachment was scrubbed...
URL: From jtaylor.debian at googlemail.com Fri Jun 6 19:21:40 2014 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Fri, 06 Jun 2014 19:21:40 +0200 Subject: [Python-Dev] [numpy wishlist] Interpreter support for temporary elision in third-party classes In-Reply-To: <539126DB.8010306@canterbury.ac.nz> References: <539126DB.8010306@canterbury.ac.nz> Message-ID: <5391F8A4.70401@googlemail.com> On 06.06.2014 04:26, Greg Ewing wrote: > Nathaniel Smith wrote: > >> I'd be a little nervous about whether anyone has implemented, say, an >> iadd with side effects such that you can tell whether a copy was made, >> even if the object being copied is immediately destroyed. > > I can think of at least one plausible scenario where > this could occur: the operand is a view object that > wraps another object, and its __iadd__ method updates > that other object. > > In fact, now that I think about it, exactly this > kind of thing happens in numpy when you slice an > array! > > So the opt-in indicator would need to be dynamic, on > a per-object basis, rather than a type flag. > Yes, an opt-in indicator would need to receive both operand objects, so it would need to be a slot in the object or number type object. Would the addition of a tp_can_elide slot to the object types be acceptable for this rather specialized case? tp_can_elide receives two objects and returns one of three values:

* can work inplace, operation is associative
* can work inplace but not associative
* cannot work inplace

Implementation could e.g.
look something like this:

    TARGET(BINARY_SUBTRACT) {
        fl = left->obj_type->tp_can_elide;
        fr = right->obj_type->tp_can_elide;
        elide = 0;
        if (unlikely(fl)) {
            elide = fl(left, right);
        }
        else if (unlikely(fr)) {
            elide = fr(left, right);
        }
        if (unlikely(elide == YES) && left->refcnt == 1) {
            PyNumber_InPlaceSubtract(left, right);
        }
        else if (unlikely(elide == SWAPPABLE) && right->refcnt == 1) {
            PyNumber_InPlaceSubtract(right, left);
        }
        else {
            PyNumber_Subtract(left, right);
        }
    }

From stefan at bytereef.org Fri Jun 6 19:24:40 2014 From: stefan at bytereef.org (Stefan Krah) Date: Fri, 6 Jun 2014 19:24:40 +0200 Subject: [Python-Dev] Moving Python 3.5 on Windows to a new compiler In-Reply-To: <20140606165743.GA11669@sleipnir.bytereef.org> References: <529cffa5961d4b5bb57d554affe9643c@BLUPR03MB389.namprd03.prod.outlook.com> <20140606163701.GA10004@k2> <20140606165743.GA11669@sleipnir.bytereef.org> Message-ID: <20140606172440.GA11927@sleipnir.bytereef.org> Stefan Krah wrote: > > * Will VS 14 be golden prior to Python 3.5's release? It would suck to > > rely on a beta compiler.. :) > > This is my only concern, too. Otherwise, +1 for the switch. One more thing: Will the SDK 64-bit tools be available for the Express Versions? Stefan Krah From brian at python.org Fri Jun 6 19:31:53 2014 From: brian at python.org (Brian Curtin) Date: Fri, 6 Jun 2014 21:31:53 +0400 Subject: [Python-Dev] Moving Python 3.5 on Windows to a new compiler In-Reply-To: References: <529cffa5961d4b5bb57d554affe9643c@BLUPR03MB389.namprd03.prod.outlook.com> Message-ID: On Fri, Jun 6, 2014 at 8:22 PM, Zachary Ware wrote: > On Fri, Jun 6, 2014 at 10:41 AM, Steve Dower wrote: >> Thoughts/comments/concerns? > > My only concern is support for elderly versions of Windows, in > particular: XP. I seem to recall the last "let's update our MSVC > version" discussion dying off because of XP support. Even though MS > has abandoned it, I'm not sure whether we can yet.
> > If that's a non-issue, or if we can actually drop XP support, I'm all for it. Extended support ended in April of this year, so I think we should put XP as unsupported for 3.5 in PEP 11 - http://legacy.python.org/dev/peps/pep-0011/ I seem to remember that we were waiting for this anyway. From Steve.Dower at microsoft.com Fri Jun 6 19:40:03 2014 From: Steve.Dower at microsoft.com (Steve Dower) Date: Fri, 6 Jun 2014 17:40:03 +0000 Subject: [Python-Dev] Moving Python 3.5 on Windows to a new compiler In-Reply-To: <20140606163701.GA10004@k2> References: <529cffa5961d4b5bb57d554affe9643c@BLUPR03MB389.namprd03.prod.outlook.com> <20140606163701.GA10004@k2> Message-ID: dw+python-dev at hmmz.org wrote: > Speaking as a third party who aims to provide binary distributions for recent > Python releases on Windows, every new compiler introduces a licensing and > configuration headache. So I guess the questions are: > > * Does the ABI stability address some historical real world problem with > Python binary builds? (I guess possibly) Yes. It's very hard to explain to users that even though they've gone out and paid for Visual Studio 2013 Ultimate, they don't really have a C compiler that works with Python. This stability will eventually get us to a place where it doesn't matter what version of the compiler you have, though for a while people will obviously need the latest. (Another thing I'm working on is making sure that it's really easy to get the latest... lots of pieces to this puzzle.) > * Is the existing solution of third parties building under e.g. Mingw as > an option of last resort causing real world issues? It seems to work > for a lot of people, although I personally avoid it. I think it actually tends to solve more issues than it causes :( I want to fix that by making MSVC better for Python, rather than switching away to another toolset. > * Have other compiler vendors indicated they will change their ABI > environment to match VS under this new stability guarantee? 
If not, > then as yet there is no real world benefit here. I have no idea, but I hope they do (eventually they almost certainly will). I've already mentioned to our team that they should reach out to the other projects and try to help them move it along, though I have no idea if they have the time or contacts to manage that. FWIW, the stability guarantee was only announced this week, so there's a good chance that the gcc/clang/etc. teams aren't even aware of it yet. > * Has Python ever hit a showstopper release issue as a result of a bug > in MSVC? (I guess probably not). Not to my knowledge, and I'm certainly hoping to avoid it by keeping the builds coming regularly. I can't do an official buildbot for it (and probably can't even reuse the infrastructure) since I'm going to work against the latest internal version as much as I can and we get new builds almost daily. More likely, building Python will reveal showstopper issues that actually get fixed (and it has done in the past, though that was never publicised :) ) > * Will VS 14 be golden prior to Python 3.5's release? It would suck to > rely on a beta compiler.. :) I sure hope so. The current planning looks like it will (I'm assuming that Python 3.5 is going to be late next year, but I couldn't find a good reference). If things slip here, I'm going to be surrounded by very stressed people, which is not much fun. So I hope it'll be done! At worst, VS 14 RC (or whatever label it gets) will probably be released under a "go live" licence. If anything is dramatically broken at that point, we'll know and it should be fixed, or we know that it's going to be around for a while regardless and we can make the decision to either stick with VC10 or work around the issues. > Sorry for dunking water on this, but I've recently spent a ton of time getting a > Microsoft build environment running, and it seems possible a new compiler may > not yet justify more effort if there is little tangible benefit. Not at all. 
I've spent far more time than I wanted to getting a build environment running for producing the Python 2.7 installers, and I spent just as long getting an environment for default going too. I'm personally a big fan of automating things like this, so you can also expect scripts (probably PowerShell) that will configure as much as possible. Cheers, Steve > David From Steve.Dower at microsoft.com Fri Jun 6 19:42:45 2014 From: Steve.Dower at microsoft.com (Steve Dower) Date: Fri, 6 Jun 2014 17:42:45 +0000 Subject: [Python-Dev] Moving Python 3.5 on Windows to a new compiler In-Reply-To: <20140606172440.GA11927@sleipnir.bytereef.org> References: <529cffa5961d4b5bb57d554affe9643c@BLUPR03MB389.namprd03.prod.outlook.com> <20140606163701.GA10004@k2> <20140606165743.GA11669@sleipnir.bytereef.org> <20140606172440.GA11927@sleipnir.bytereef.org> Message-ID: Stefan Krah wrote: >Stefan Krah wrote: >> > * Will VS 14 be golden prior to Python 3.5's release? It would suck to >> > rely on a beta compiler.. :) >> >> This is my only concern, too. Otherwise, +1 for the switch. > >One more thing: Will the SDK 64-bit tools be available for the Express Versions? They should be. If they're not, I'll certainly be making a noise about it (unless there's another, easier way to get the tools by then...) From Steve.Dower at microsoft.com Fri Jun 6 20:12:04 2014 From: Steve.Dower at microsoft.com (Steve Dower) Date: Fri, 6 Jun 2014 18:12:04 +0000 Subject: [Python-Dev] Moving Python 3.5 on Windows to a new compiler In-Reply-To: References: <529cffa5961d4b5bb57d554affe9643c@BLUPR03MB389.namprd03.prod.outlook.com> Message-ID: <4bad156ff9f145b792191327736e672d@BLUPR03MB389.namprd03.prod.outlook.com> Chris Angelico wrote: > On Sat, Jun 7, 2014 at 1:41 AM, Steve Dower wrote: >> What this means for Python is that C extensions for Python 3.5 and later can be built using any version of MSVC from 14.0 and later. > > Oh, if only this had been available for 2.7!! Actually... 
this means that 14.0 would be a good target for a compiler change for 2.7.x, if such a change is ever acceptable. Maybe, but I doubt it will ever be acceptable :) > To what extent is this compatibility going to be maintained? Is there a guarantee that there'll be X versions (or X years) of cross-compilation support? There are a few breaking changes in this version that are designed to standardize on a function-call based ABI, which should effectively be a life-long guarantee. The only promise I can make is this: when cross-compilation support is eventually broken, it will be due to something that nobody has been able to predict up until now. (Hopefully that's better than promising that it will be broken in the very next release.) > ChrisA From rosuav at gmail.com Fri Jun 6 20:19:32 2014 From: rosuav at gmail.com (Chris Angelico) Date: Sat, 7 Jun 2014 04:19:32 +1000 Subject: [Python-Dev] Moving Python 3.5 on Windows to a new compiler In-Reply-To: <4bad156ff9f145b792191327736e672d@BLUPR03MB389.namprd03.prod.outlook.com> References: <529cffa5961d4b5bb57d554affe9643c@BLUPR03MB389.namprd03.prod.outlook.com> <4bad156ff9f145b792191327736e672d@BLUPR03MB389.namprd03.prod.outlook.com> Message-ID: On Sat, Jun 7, 2014 at 4:12 AM, Steve Dower wrote: > Chris Angelico wrote: >> On Sat, Jun 7, 2014 at 1:41 AM, Steve Dower wrote: >>> What this means for Python is that C extensions for Python 3.5 and later can be built using any version of MSVC from 14.0 and later. >> >> Oh, if only this had been available for 2.7!! Actually... this means that 14.0 would be a good target for a compiler change for 2.7.x, if such a change is ever acceptable. > > Maybe, but I doubt it will ever be acceptable :) Well, there were discussions. Since Python 2.7's support is far exceeding the Microsoft promise of support for the compiler it was built on, there's going to be a problem, one way or the other. I don't know how that's going to end up being resolved. 
ChrisA From brian at python.org Fri Jun 6 20:25:14 2014 From: brian at python.org (Brian Curtin) Date: Fri, 6 Jun 2014 22:25:14 +0400 Subject: [Python-Dev] Moving Python 3.5 on Windows to a new compiler In-Reply-To: References: <529cffa5961d4b5bb57d554affe9643c@BLUPR03MB389.namprd03.prod.outlook.com> <4bad156ff9f145b792191327736e672d@BLUPR03MB389.namprd03.prod.outlook.com> Message-ID: On Fri, Jun 6, 2014 at 10:19 PM, Chris Angelico wrote: > On Sat, Jun 7, 2014 at 4:12 AM, Steve Dower wrote: >> Chris Angelico wrote: >>> On Sat, Jun 7, 2014 at 1:41 AM, Steve Dower wrote: >>>> What this means for Python is that C extensions for Python 3.5 and later can be built using any version of MSVC from 14.0 and later. >>> >>> Oh, if only this had been available for 2.7!! Actually... this means that 14.0 would be a good target for a compiler change for 2.7.x, if such a change is ever acceptable. >> >> Maybe, but I doubt it will ever be acceptable :) > > Well, there were discussions. Since Python 2.7's support is far > exceeding the Microsoft promise of support for the compiler it was > built on, there's going to be a problem, one way or the other. I don't > know how that's going to end up being resolved. We're going to have to change it at some point, otherwise we're going to have people in 2018 scrambling to find VS2008, which will be 35 versions too old by then. No matter what we do here, we're going to have a tough PR situation, but we have to make something workable. I'd rather cause a hassle than outright kill extensions. I would probably prefer we aim for VS 14 for 3.5, and then explore making the same change for the 2.7.x release that comes after 3.5.0 comes out. Lessons learned and all that. 
From tjreedy at udel.edu Fri Jun 6 20:28:29 2014 From: tjreedy at udel.edu (Terry Reedy) Date: Fri, 06 Jun 2014 14:28:29 -0400 Subject: [Python-Dev] Division of tool labour in porting Python 2 code to 2/3 In-Reply-To: References: Message-ID: On 6/6/2014 12:37 PM, Brett Cannon wrote: > After Glyph and Alex's email about their asks for assisting in writing > Python 2/3 code, it got me thinking about where in the toolchain various > warnings and such should go in order to help direct energy to help > develop whatever future toolchain to assist in porting. > > There seems to be three places where issues are/can be caught once a > project has embarked down the road of 2/3 source compatibility: > > 1. -3 warnings > 2. Some linter tool > 3. Failing tests > > -3 warnings are things that we know are flat-out wrong and do not cause > massive compatibility issues in the stdlib. For instance, warning that > buffer() is not in Python 3 is a py3k warning -- Glyph made a mistake > when he asked for it as a new warning -- is a perfect example of > something that isn't excessively noisy and won't cause issues when > people run with it. > > But what about warning about classic classes? The stdlib is full of them > and they were purposefully left alone for compatibility reasons. But > there is a subtle semantic difference between classic and new-style > classes, A non-subtle difference is that old-style classes do not have .__new__. I just ran into this when backporting an Idle test to 2.7. (I rewrote the test to avoid diverging the code.) In retrospect, perhaps we should have added a global 'new-class future' -C switch, like -Q, and made sure that the stdlib worked either way. People running 2and3 code could then run 2.x with the switch. Is it possible to add this now? > and so 2/3 code should consider switching I do not understand what you mean without a switch or future statement available to switch from old to new in 2.7.
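To make the .__new__ difference concrete (my own example, not from the thread), the same file behaves differently under the two interpreters: an old-style class under Python 2 has no __new__ at all, so any idiom relying on it (instance caching, pickling support) silently does nothing:

```python
import sys

class Classic:            # no explicit base: old-style under Python 2
    pass

class NewStyle(object):   # derives from object: new-style everywhere
    pass

# New-style classes always expose __new__.
assert hasattr(NewStyle, "__new__")

if sys.version_info[0] == 2:
    # Old-style classes lack __new__ entirely under Python 2.
    assert not hasattr(Classic, "__new__")
else:
    # In Python 3 the distinction is gone; every class is new-style.
    assert hasattr(Classic, "__new__")
```

The snippet passes on both 2 and 3, which is why a static check (a linter) rather than a runtime warning is the natural place to flag classic classes.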
> (this is when people > chime in saying "this is why we want a 2.8 release!", but that still > isn't happening). If this were made a py3k warning in 2.7 then the > stdlib itself would spew out warnings which we can't change due to > compatibility, so that makes it not useful > (http://bugs.python.org/issue21231). Don't issue the warning if the class is in the stdlib. If the warning is issued *after* creating class C:

    import sys
    # C.__module__ is a module *name*, so look the module object up first
    f = sys.modules[C.__module__].__file__
    if classic(C) and ('lib' not in f or 'site_packages' in f):
        warn(...)

On Windows, the directory is 'Lib'; I presume it is lowercased everywhere. If not, adjust. > But as part of a lint tool specific > to Python 2.7 that kind of warning would not be an issue and is easily > managed and integrated into CI setups to make sure there are no regressions. > > Lastly, there are things like string/unicode comparisons. > http://bugs.python.org/issue21401 has a patch from VIctor which warns > when comparing strings and unicode in Python 2.7. Much like the classic > classes example, the stdlib becomes rather noisy due to APIs that handle > either/or, etc. But unlike the classic classes example, you just can't > systematically verify that two variables are always going to be str vs. > unicode in Python 2.7 if they aren't literals. If people want to > implement type constraint graphs for 2.7 code to help find them then > that's great, but I personally don't have that kind of time. In this > instance it would seem like relying on a project's unit tests to find > this sort of problem is the best option. > > With those three levels in mind, where do we draw the line between these > levels? Take for instance the print statement. Right now there is no > warning with -3. Do we add one and then update the 2.7 stdlib to prevent > warnings being generated by the stdlib? Make conditional as with class.
We *could* change 'print s' to the exactly equivalent 'print(s)' (perhaps half the cases); 'print r, s' to "print('%s %s' % (r, s))"; "print 'xxxx', y" to "print('xxxx %s' % y)"; and so on. However, 'print >>self.stdout, x', etc., does not translate to a pseudo-call. It would need translation to "self.stdout.write(x + '\n')". Grepping 2.7.6 lib/*.py for print gives 1341 hits, with at least 1000 being actual print statements. > Or do we add it to some linter > tool to pick up when people accidentally leave one in their code? > The reason I ask is since this is clear I'm willing to spearhead the > tooling work we talked about at the language summit to make sure there's > a clear path for people wanting to port which is as easy as (reasonably) > possible, but I don't want to start on it until I have a clear > indication of what people are going to be okay with. -- Terry Jan Reedy From mal at egenix.com Fri Jun 6 20:41:16 2014 From: mal at egenix.com (M.-A. Lemburg) Date: Fri, 06 Jun 2014 20:41:16 +0200 Subject: [Python-Dev] Moving Python 3.5 on Windows to a new compiler In-Reply-To: References: <529cffa5961d4b5bb57d554affe9643c@BLUPR03MB389.namprd03.prod.outlook.com> <4bad156ff9f145b792191327736e672d@BLUPR03MB389.namprd03.prod.outlook.com> Message-ID: <53920B4C.8020700@egenix.com> On 06.06.2014 20:25, Brian Curtin wrote: > On Fri, Jun 6, 2014 at 10:19 PM, Chris Angelico wrote: >> On Sat, Jun 7, 2014 at 4:12 AM, Steve Dower wrote: >>> Chris Angelico wrote: >>>> On Sat, Jun 7, 2014 at 1:41 AM, Steve Dower wrote: >>>>> What this means for Python is that C extensions for Python 3.5 and later can be built using any version of MSVC from 14.0 and later. >>>> >>>> Oh, if only this had been available for 2.7!! Actually... this means that 14.0 would be a good target for a compiler change for 2.7.x, if such a change is ever acceptable. >>> >>> Maybe, but I doubt it will ever be acceptable :) >> >> Well, there were discussions.
Since Python 2.7's support is far >> exceeding the Microsoft promise of support for the compiler it was >> built on, there's going to be a problem, one way or the other. I don't >> know how that's going to end up being resolved. > > We're going to have to change it at some point, otherwise we're going > to have people in 2018 scrambling to find VS2008, which will be 35 > versions too old by then. No matter what we do here, we're going to > have a tough PR situation, but we have to make something workable. I'd > rather cause a hassle than outright kill extensions. > > I would probably prefer we aim for VS 14 for 3.5, and then explore > making the same change for the 2.7.x release that comes after 3.5.0 > comes out. Lessons learned and all that. Are you sure that's an option ? Changing the compiler the stock Python from python.org is built with will most likely render existing Python extensions built for 2.7.x with x < (release that comes after 3.5.0) broken, so users and installation tools will end up having to pay close attention to the patch level version of Python they are using... which is something we wanted to avoid after we ran into this situation with 1.5.1 and 1.5.2 a few years ago. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Jun 06 2014) >>> Python Projects, Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope/Plone.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ 2014-05-28: Released mxODBC.Connect 2.1.0 ... http://egenix.com/go56 2014-07-02: Python Meeting Duesseldorf ... 26 days to go ::::: Try our mxODBC.Connect Python Database Interface for free ! :::::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. 
Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From brian at python.org Fri Jun 6 20:49:24 2014 From: brian at python.org (Brian Curtin) Date: Fri, 6 Jun 2014 22:49:24 +0400 Subject: [Python-Dev] Moving Python 3.5 on Windows to a new compiler In-Reply-To: <53920B4C.8020700@egenix.com> References: <529cffa5961d4b5bb57d554affe9643c@BLUPR03MB389.namprd03.prod.outlook.com> <4bad156ff9f145b792191327736e672d@BLUPR03MB389.namprd03.prod.outlook.com> <53920B4C.8020700@egenix.com> Message-ID: On Fri, Jun 6, 2014 at 10:41 PM, M.-A. Lemburg wrote: > On 06.06.2014 20:25, Brian Curtin wrote: >> On Fri, Jun 6, 2014 at 10:19 PM, Chris Angelico wrote: >>> On Sat, Jun 7, 2014 at 4:12 AM, Steve Dower wrote: >>>> Chris Angelico wrote: >>>>> On Sat, Jun 7, 2014 at 1:41 AM, Steve Dower wrote: >>>>>> What this means for Python is that C extensions for Python 3.5 and later can be built using any version of MSVC from 14.0 and later. >>>>> >>>>> Oh, if only this had been available for 2.7!! Actually... this means that 14.0 would be a good target for a compiler change for 2.7.x, if such a change is ever acceptable. >>>> >>>> Maybe, but I doubt it will ever be acceptable :) >>> >>> Well, there were discussions. Since Python 2.7's support is far >>> exceeding the Microsoft promise of support for the compiler it was >>> built on, there's going to be a problem, one way or the other. I don't >>> know how that's going to end up being resolved. >> >> We're going to have to change it at some point, otherwise we're going >> to have people in 2018 scrambling to find VS2008, which will be 35 >> versions too old by then. No matter what we do here, we're going to >> have a tough PR situation, but we have to make something workable. I'd >> rather cause a hassle than outright kill extensions. >> >> I would probably prefer we aim for VS 14 for 3.5, and then explore >> making the same change for the 2.7.x release that comes after 3.5.0 >> comes out. 
Lessons learned and all that. > > Are you sure that's an option ? Changing the compiler the stock > Python from python.org is built with will most likely render > existing Python extensions built for 2.7.x with x < (release that comes > after 3.5.0) broken, so users and installation tools will end up > having to pay close attention to the patch level version of Python > they are using... which is something we wanted to avoid after > we ran into this situation with 1.5.1 and 1.5.2 a few years ago. None of the options are particularly good, but yes, I think that's an option we have to consider. We're supporting 2.7.x for 6 more years on a compiler that is already 6 years old. Something less than awesome for everyone involved is going to have to happen to make that possible. From mal at egenix.com Fri Jun 6 20:52:49 2014 From: mal at egenix.com (M.-A. Lemburg) Date: Fri, 06 Jun 2014 20:52:49 +0200 Subject: [Python-Dev] Moving Python 3.5 on Windows to a new compiler In-Reply-To: References: <529cffa5961d4b5bb57d554affe9643c@BLUPR03MB389.namprd03.prod.outlook.com> <4bad156ff9f145b792191327736e672d@BLUPR03MB389.namprd03.prod.outlook.com> <53920B4C.8020700@egenix.com> Message-ID: <53920E01.6080300@egenix.com> On 06.06.2014 20:49, Brian Curtin wrote: > On Fri, Jun 6, 2014 at 10:41 PM, M.-A. Lemburg wrote: >> On 06.06.2014 20:25, Brian Curtin wrote: >>> On Fri, Jun 6, 2014 at 10:19 PM, Chris Angelico wrote: >>>> On Sat, Jun 7, 2014 at 4:12 AM, Steve Dower wrote: >>>>> Chris Angelico wrote: >>>>>> On Sat, Jun 7, 2014 at 1:41 AM, Steve Dower wrote: >>>>>>> What this means for Python is that C extensions for Python 3.5 and later can be built using any version of MSVC from 14.0 and later. >>>>>> >>>>>> Oh, if only this had been available for 2.7!! Actually... this means that 14.0 would be a good target for a compiler change for 2.7.x, if such a change is ever acceptable. >>>>> >>>>> Maybe, but I doubt it will ever be acceptable :) >>>> >>>> Well, there were discussions. 
Since Python 2.7's support is far >>>> exceeding the Microsoft promise of support for the compiler it was >>>> built on, there's going to be a problem, one way or the other. I don't >>>> know how that's going to end up being resolved. >>> >>> We're going to have to change it at some point, otherwise we're going >>> to have people in 2018 scrambling to find VS2008, which will be 35 >>> versions too old by then. No matter what we do here, we're going to >>> have a tough PR situation, but we have to make something workable. I'd >>> rather cause a hassle than outright kill extensions. >>> >>> I would probably prefer we aim for VS 14 for 3.5, and then explore >>> making the same change for the 2.7.x release that comes after 3.5.0 >>> comes out. Lessons learned and all that. >> >> Are you sure that's an option ? Changing the compiler the stock >> Python from python.org is built with will most likely render >> existing Python extensions built for 2.7.x with x < (release that comes >> after 3.5.0) broken, so users and installation tools will end up >> having to pay close attention to the patch level version of Python >> they are using... which is something we wanted to avoid after >> we ran into this situation with 1.5.1 and 1.5.2 a few years ago. > > None of the options are particularly good, but yes, I think that's an > option we have to consider. We're supporting 2.7.x for 6 more years on > a compiler that is already 6 years old. Something less than awesome > for everyone involved is going to have to happen to make that > possible. Perhaps we could combine this with the breakage that a Python 2.7.10 would introduce due to the two digit patch level release version ;-) -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Jun 06 2014) >>> Python Projects, Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope/Plone.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... 
http://python.egenix.com/ ________________________________________________________________________ 2014-05-28: Released mxODBC.Connect 2.1.0 ... http://egenix.com/go56 2014-07-02: Python Meeting Duesseldorf ... 26 days to go ::::: Try our mxODBC.Connect Python Database Interface for free ! :::::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From dw+python-dev at hmmz.org Fri Jun 6 20:56:31 2014 From: dw+python-dev at hmmz.org (dw+python-dev at hmmz.org) Date: Fri, 6 Jun 2014 18:56:31 +0000 Subject: [Python-Dev] Moving Python 3.5 on Windows to a new compiler In-Reply-To: References: <529cffa5961d4b5bb57d554affe9643c@BLUPR03MB389.namprd03.prod.outlook.com> <4bad156ff9f145b792191327736e672d@BLUPR03MB389.namprd03.prod.outlook.com> <53920B4C.8020700@egenix.com> Message-ID: <20140606185631.GA11094@k2> On Fri, Jun 06, 2014 at 10:49:24PM +0400, Brian Curtin wrote: > None of the options are particularly good, but yes, I think that's an > option we have to consider. We're supporting 2.7.x for 6 more years on > a compiler that is already 6 years old. Surely that is infinitely less desirable than simply bumping the minor version? David From bcannon at gmail.com Fri Jun 6 21:03:58 2014 From: bcannon at gmail.com (Brett Cannon) Date: Fri, 06 Jun 2014 19:03:58 +0000 Subject: [Python-Dev] Division of tool labour in porting Python 2 code to 2/3 References: Message-ID: On Fri Jun 06 2014 at 2:29:13 PM, Terry Reedy wrote: > On 6/6/2014 12:37 PM, Brett Cannon wrote: > > After Glyph and Alex's email about their asks for assisting in writing > > Python 2/3 code, it got me thinking about where in the toolchain various > > warnings and such should go in order to help direct energy to help > > develop whatever future toolchain to assist in porting. 
> > > > There seems to be three places where issues are/can be caught once a > > project has embarked down the road of 2/3 source compatibility: > > > > 1. -3 warnings > > 2. Some linter tool > > 3. Failing tests > > > > -3 warnings are things that we know are flat-out wrong and do not cause > > massive compatibility issues in the stdlib. For instance, warning that > > buffer() is not in Python 3 is a py3k warning -- Glyph made a mistake > > when he asked for it as a new warning -- is a perfect example of > > something that isn't excessively noisy and won't cause issues when > > people run with it. > > > > But what about warning about classic classes? The stdlib is full of them > > and they were purposefully left alone for compatibility reasons. But > > there is a subtle semantic difference between classic and new-style > > classes, > > A non-subtle difference is that old style classes do not have .__new__. > I just ran into this when backporting an Idle test to 2.7. (I rewrote > the test to avoid diverging the code). In retrospect, perhaps we should > have added a global 'new-class future' -C switch, like -Q, and made sure > that stdlib worked either way. People running 2and3 code could then run > 2.x with the switch. Is it possible to add this now? > I consider changing the CLI out of bounds in a bugfix release as it's part of the API of Python. > > > and so 2/3 code should consider switching > > I do not understand what you mean without a switch or future > statement available to switch from old to new in 2.7. > Run a 2to3 fixer that changes all of their classes to new-style. > > > (this is when people > > chime in saying "this is why we want a 2.8 release!", but that still > > isn't happening). If this were made a py3k warning in 2.7 then the > > stdlib itself would spew out warnings which we can't change due to > > compatibility, so that makes it not useful > > (http://bugs.python.org/issue21231).
> If the warning is issued *after* creating class C: > > f = C.__module__.__file__ > if classic(C) and (not 'lib' in f or 'site_packages' in f): > warn(...) > > On Windows, the directory is 'Lib'; I presume it is lowercased > everywhere. If not, adjust. > That's just asking for trouble. I don't want to be import-dependent like that in the stdlib. > > > But as part of a lint tool specific > > to Python 2.7 that kind of warning would not be an issue and is easily > > managed and integrated into CI setups to make sure there are no > regressions. > > > > Lastly, there are things like string/unicode comparisons. > > http://bugs.python.org/issue21401 has a patch from VIctor which warns > > when comparing strings and unicode in Python 2.7. Much like the classic > > classes example, the stdlib becomes rather noisy due to APIs that handle > > either/or, etc. But unlike the classic classes example, you just can't > > systematically verify that two variables are always going to be str vs. > > unicode in Python 2.7 if they aren't literals. If people want to > > implement type constraint graphs for 2.7 code to help find them then > > that's great, but I personally don't have that kind of time. In this > > instance it would seem like relying on a project's unit tests to find > > this sort of problem is the best option. > > > > With those three levels in mind, where do we draw the line between these > > levels? Take for instance the print statement. Right now there is no > > warning with -3. Do we add one and then update the 2.7 stdlib to prevent > > warnings being generated by the stdlib? > > Make conditional as with class. > > We *could* change 'print s' to the exactly equivalent 'print(s)' > (perhaps half the cases); 'print r, s' to "print('%s %s' % (r,s)), > 'print 'xxxx', y' to "print('xxxx %s' % y), and so on. However, 'print > >>self.stdout, x', etc, does not translate to a pseudo-call. It would > need transltion to "self.stdout.write(x+'\n')". 
Grepping 2.7.6 lib/*.py > for print gives 1341 hits, with at least 1000 being actual print > statements. > Yep, which is why I don't want to do a 2to3 run on the stdlib to get rid of them. I also want to minimize conditional checks as it leads to potential issues of people thinking it's okay not to change things when there actually are differences (e.g. I don't want to promote classic classes or native strings if it can be helped for the vast majority of users). -Brett > > > Or do we add it to some linter > tool to pick up when people accidentally leave one in their code? > > The reason I ask is since this is clear I'm willing to spearhead the > tooling work we talked about at the language summit to make sure there's > a clear path for people wanting to port which is as easy as (reasonably) > possible, but I don't want to start on it until I have a clear > indication of what people are going to be okay with. > > -- > Terry Jan Reedy > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ > brett%40python.org > -------------- next part -------------- An HTML attachment was scrubbed... URL: From brian at python.org Fri Jun 6 21:04:24 2014 From: brian at python.org (Brian Curtin) Date: Fri, 6 Jun 2014 23:04:24 +0400 Subject: [Python-Dev] Moving Python 3.5 on Windows to a new compiler In-Reply-To: <20140606185631.GA11094@k2> References: <529cffa5961d4b5bb57d554affe9643c@BLUPR03MB389.namprd03.prod.outlook.com> <4bad156ff9f145b792191327736e672d@BLUPR03MB389.namprd03.prod.outlook.com> <53920B4C.8020700@egenix.com> <20140606185631.GA11094@k2> Message-ID: On Fri, Jun 6, 2014 at 10:56 PM, wrote: > On Fri, Jun 06, 2014 at 10:49:24PM +0400, Brian Curtin wrote: > >> None of the options are particularly good, but yes, I think that's an >> option we have to consider.
We're supporting 2.7.x for 6 more years on >> a compiler that is already 6 years old. > > Surely that is infinitely less desirable than simply bumping the minor > version? It's definitely not desirable, but "simply" bumping the minor version is not A Thing. From bcannon at gmail.com Fri Jun 6 21:05:05 2014 From: bcannon at gmail.com (Brett Cannon) Date: Fri, 06 Jun 2014 19:05:05 +0000 Subject: [Python-Dev] Moving Python 3.5 on Windows to a new compiler References: <529cffa5961d4b5bb57d554affe9643c@BLUPR03MB389.namprd03.prod.outlook.com> <4bad156ff9f145b792191327736e672d@BLUPR03MB389.namprd03.prod.outlook.com> <53920B4C.8020700@egenix.com> <20140606185631.GA11094@k2> Message-ID: On Fri Jun 06 2014 at 2:59:24 PM, wrote: > On Fri, Jun 06, 2014 at 10:49:24PM +0400, Brian Curtin wrote: > > > None of the options are particularly good, but yes, I think that's an > > option we have to consider. We're supporting 2.7.x for 6 more years on > > a compiler that is already 6 years old. > > Surely that is infinitely less desirable than simply bumping the minor > version? > Nope. A new minor release of Python is a massive undertaking which is why we have saved ourselves the hassle of doing a Python 2.8 or not giving a clear signal as to when Python 2.x will end as a language. -Brett -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From donald at stufft.io Fri Jun 6 21:08:19 2014 From: donald at stufft.io (Donald Stufft) Date: Fri, 6 Jun 2014 15:08:19 -0400 Subject: [Python-Dev] Moving Python 3.5 on Windows to a new compiler In-Reply-To: References: <529cffa5961d4b5bb57d554affe9643c@BLUPR03MB389.namprd03.prod.outlook.com> <4bad156ff9f145b792191327736e672d@BLUPR03MB389.namprd03.prod.outlook.com> <53920B4C.8020700@egenix.com> <20140606185631.GA11094@k2> Message-ID: <99A4C614-FAC9-4201-859B-B698744A5DB9@stufft.io> On Jun 6, 2014, at 3:04 PM, Brian Curtin wrote: > On Fri, Jun 6, 2014 at 10:56 PM, wrote: >> On Fri, Jun 06, 2014 at 10:49:24PM +0400, Brian Curtin wrote: >> >>> None of the options are particularly good, but yes, I think that's an >>> option we have to consider. We're supporting 2.7.x for 6 more years on >>> a compiler that is already 6 years old. >> >> Surely that is infinitely less desirable than simply bumping the minor >> version? > > It's definitely not desirable, but "simply" bumping the minor version > is not A Thing. Why? I mean even if it's the same thing as 2.7 just with an updated compiler that seems like a better answer than having to deal with 2.7.whatever suddenly breaking all C exts. ----------------- Donald Stufft PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA -------------- next part -------------- A non-text attachment was scrubbed...
Name: signature.asc Type: application/pgp-signature Size: 801 bytes Desc: Message signed with OpenPGP using GPGMail URL: From brian at python.org Fri Jun 6 21:09:56 2014 From: brian at python.org (Brian Curtin) Date: Fri, 6 Jun 2014 23:09:56 +0400 Subject: [Python-Dev] Moving Python 3.5 on Windows to a new compiler In-Reply-To: <99A4C614-FAC9-4201-859B-B698744A5DB9@stufft.io> References: <529cffa5961d4b5bb57d554affe9643c@BLUPR03MB389.namprd03.prod.outlook.com> <4bad156ff9f145b792191327736e672d@BLUPR03MB389.namprd03.prod.outlook.com> <53920B4C.8020700@egenix.com> <20140606185631.GA11094@k2> <99A4C614-FAC9-4201-859B-B698744A5DB9@stufft.io> Message-ID: On Fri, Jun 6, 2014 at 11:08 PM, Donald Stufft wrote: > > On Jun 6, 2014, at 3:04 PM, Brian Curtin wrote: > >> On Fri, Jun 6, 2014 at 10:56 PM, wrote: >>> On Fri, Jun 06, 2014 at 10:49:24PM +0400, Brian Curtin wrote: >>> >>>> None of the options are particularly good, but yes, I think that's an >>>> option we have to consider. We're supporting 2.7.x for 6 more years on >>>> a compiler that is already 6 years old. >>> >>> Surely that is infinitely less desirable than simply bumping the minor >>> version? >> >> It's definitely not desirable, but "simply" bumping the minor version >> is not A Thing. > > Why? I mean even if it's the same thing as 2.7 just with an updated > compiler that seems like a better answer than having to deal with > 2.7.whatever suddenly breaking all C exts. Because then we have to maintain 2.8 at a time when no one even wants to maintain 2.7?
From donald at stufft.io Fri Jun 6 21:11:46 2014 From: donald at stufft.io (Donald Stufft) Date: Fri, 6 Jun 2014 15:11:46 -0400 Subject: [Python-Dev] Moving Python 3.5 on Windows to a new compiler In-Reply-To: References: <529cffa5961d4b5bb57d554affe9643c@BLUPR03MB389.namprd03.prod.outlook.com> <4bad156ff9f145b792191327736e672d@BLUPR03MB389.namprd03.prod.outlook.com> <53920B4C.8020700@egenix.com> <20140606185631.GA11094@k2> <99A4C614-FAC9-4201-859B-B698744A5DB9@stufft.io> Message-ID: On Jun 6, 2014, at 3:09 PM, Brian Curtin wrote: > On Fri, Jun 6, 2014 at 11:08 PM, Donald Stufft wrote: >> >> On Jun 6, 2014, at 3:04 PM, Brian Curtin wrote: >> >>> On Fri, Jun 6, 2014 at 10:56 PM, wrote: >>>> On Fri, Jun 06, 2014 at 10:49:24PM +0400, Brian Curtin wrote: >>>> >>>>> None of the options are particularly good, but yes, I think that's an >>>>> option we have to consider. We're supporting 2.7.x for 6 more years on >>>>> a compiler that is already 6 years old. >>>> >>>> Surely that is infinitely less desirable than simply bumping the minor >>>> version? >>> >>> It's definitely not desirable, but "simply" bumping the minor version >>> is not A Thing. >> >> Why? I mean even if it's the same thing as 2.7 just with an updated >> compiler that seems like a better answer than having to deal with >> 2.7.whatever suddenly breaking all C exts. > > Because then we have to maintain 2.8 at a time when no one even wants > to maintain 2.7? Is it really any difference in maintenance if you just stop applying updates to 2.7 and switch to 2.8? If 2.8 is really just 2.7 with a new compiler then there should be no functional difference between doing that and doing a 2.7.whatever except all of the tooling that relies on the compiler not to change in micro releases won't suddenly break and freak out. ----------------- Donald Stufft PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA -------------- next part -------------- A non-text attachment was scrubbed...
Name: signature.asc Type: application/pgp-signature Size: 801 bytes Desc: Message signed with OpenPGP using GPGMail URL: From martin at v.loewis.de Fri Jun 6 21:20:04 2014 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Fri, 06 Jun 2014 21:20:04 +0200 Subject: [Python-Dev] Moving Python 3.5 on Windows to a new compiler In-Reply-To: <529cffa5961d4b5bb57d554affe9643c@BLUPR03MB389.namprd03.prod.outlook.com> References: <529cffa5961d4b5bb57d554affe9643c@BLUPR03MB389.namprd03.prod.outlook.com> Message-ID: <53921464.7030400@v.loewis.de> Am 06.06.14 17:41, schrieb Steve Dower: > Hi all > > I would like to propose moving Python 3.5 to use Visual C++ 14.0 as > the main compiler. This is fine with me, but I'm worried about the precise timing of doing so. I assume that you would plan to do this moving before VC++ 14 is actually released. This worries me for three reasons: 1. what is the availability of the compiler during the testing phase, and what will it be immediately after the testing ends (where traditionally people would have to buy licenses, or wait for VS Express to be released)? 2. what is the risk of installing a beta compiler on what might otherwise be a "production" developer system? In particular, could it interfere with other VS installations, and could it require a complete system reinstall when the final release of VC 14 is available? 3. what is the chance of the final release being delayed beyond the planned release date of Python 3.5? Microsoft has a bad track record of meeting release dates (or the tradition of not announcing any for that reason); the blog says that it will be available "sometime in 2015". Now, Python 3.5 might appear November 2015, so what do we do if VS 2015 is not released by the time 3.5b1 is planned? 
Regards, Martin From martin at v.loewis.de Fri Jun 6 21:22:11 2014 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Fri, 06 Jun 2014 21:22:11 +0200 Subject: [Python-Dev] Moving Python 3.5 on Windows to a new compiler In-Reply-To: References: <529cffa5961d4b5bb57d554affe9643c@BLUPR03MB389.namprd03.prod.outlook.com> Message-ID: <539214E3.5010308@v.loewis.de> Am 06.06.14 19:31, schrieb Brian Curtin: >> If that's a non-issue, or if we can actually drop XP support, I'm all for it. > > Extended support ended in April of this year, so I think we should put > XP as unsupported for 3.5 in PEP 11 - > http://legacy.python.org/dev/peps/pep-0011/ > > I seem to remember that we were waiting for this anyway. We don't actually need to explicitly put XP there, as PEP 11 ties our support to the Microsoft product life cycle. XP is not supported by Python anymore. Regards, Martin From rosuav at gmail.com Fri Jun 6 21:33:45 2014 From: rosuav at gmail.com (Chris Angelico) Date: Sat, 7 Jun 2014 05:33:45 +1000 Subject: [Python-Dev] Moving Python 3.5 on Windows to a new compiler In-Reply-To: References: <529cffa5961d4b5bb57d554affe9643c@BLUPR03MB389.namprd03.prod.outlook.com> <4bad156ff9f145b792191327736e672d@BLUPR03MB389.namprd03.prod.outlook.com> <53920B4C.8020700@egenix.com> <20140606185631.GA11094@k2> <99A4C614-FAC9-4201-859B-B698744A5DB9@stufft.io> Message-ID: On Sat, Jun 7, 2014 at 5:11 AM, Donald Stufft wrote: > Is it really any difference in maintenance if you just stop applying updates to > 2.7 and switch to 2.8? If 2.8 is really just 2.7 with a new compiler then there > should be no functional difference between doing that and doing a 2.7.whatever > except all of the tooling that relies on the compiler not to change in micro > releases won't suddenly break and freak out. If the only difference between 2.7 and 2.8 is the compiler used on Windows, what happens on Linux and other platforms?
A Python 2.8 would have to be materially different from Python 2.7, not just binarily incompatible on one platform. ChrisA From donald at stufft.io Fri Jun 6 21:36:59 2014 From: donald at stufft.io (Donald Stufft) Date: Fri, 6 Jun 2014 15:36:59 -0400 Subject: [Python-Dev] Moving Python 3.5 on Windows to a new compiler In-Reply-To: References: <529cffa5961d4b5bb57d554affe9643c@BLUPR03MB389.namprd03.prod.outlook.com> <4bad156ff9f145b792191327736e672d@BLUPR03MB389.namprd03.prod.outlook.com> <53920B4C.8020700@egenix.com> <20140606185631.GA11094@k2> <99A4C614-FAC9-4201-859B-B698744A5DB9@stufft.io> Message-ID: On Jun 6, 2014, at 3:33 PM, Chris Angelico wrote: > On Sat, Jun 7, 2014 at 5:11 AM, Donald Stufft wrote: >> Is it really any difference in maintenance if you just stop applying updates to >> 2.7 and switch to 2.8? If 2.8 is really just 2.7 with a new compiler then there >> should be no functional difference between doing that and doing a 2.7.whatever >> except all of the tooling that relies on the compiler not to change in micro >> releases won't suddenly break and freak out. > > If the only difference between 2.7 and 2.8 is the compiler used on > Windows, what happens on Linux and other platforms? A Python 2.8 would > have to be materially different from Python 2.7, not just binarily > incompatible on one platform. > > ChrisA > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/donald%40stufft.io Well it'd contain bug fixes and whatever other sorts of things you'd put into a 2.7.whatever release. So they'd still want to upgrade to 2.8 since that'll have bug fixes. ----------------- Donald Stufft PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA -------------- next part -------------- A non-text attachment was scrubbed...
Name: signature.asc Type: application/pgp-signature Size: 801 bytes Desc: Message signed with OpenPGP using GPGMail URL: From martin at v.loewis.de Fri Jun 6 21:37:55 2014 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Fri, 06 Jun 2014 21:37:55 +0200 Subject: [Python-Dev] Moving Python 3.5 on Windows to a new compiler In-Reply-To: References: <529cffa5961d4b5bb57d554affe9643c@BLUPR03MB389.namprd03.prod.outlook.com> <4bad156ff9f145b792191327736e672d@BLUPR03MB389.namprd03.prod.outlook.com> Message-ID: <53921893.4080200@v.loewis.de> Am 06.06.14 20:25, schrieb Brian Curtin: > We're going to have to change it at some point, otherwise we're going > to have people in 2018 scrambling to find VS2008, which will be 35 > versions too old by then. Not sure whether you picked 2018 deliberately: extended support for VS2008 Professional ends on April 10, 2018. In any case, the extension problem will occur regardless of what you do: - if you switch compilers within 2.7, applications may crash - if you switch compilers and declare it 2.8, your extensions might not be available precompiled for some time (in particular if the developers of some package have abandoned 2.7) - if you don't switch compilers, availability of the tool chain will be terrible. Regards, Martin From rosuav at gmail.com Fri Jun 6 21:46:48 2014 From: rosuav at gmail.com (Chris Angelico) Date: Sat, 7 Jun 2014 05:46:48 +1000 Subject: [Python-Dev] Moving Python 3.5 on Windows to a new compiler In-Reply-To: References: <529cffa5961d4b5bb57d554affe9643c@BLUPR03MB389.namprd03.prod.outlook.com> <4bad156ff9f145b792191327736e672d@BLUPR03MB389.namprd03.prod.outlook.com> <53920B4C.8020700@egenix.com> <20140606185631.GA11094@k2> <99A4C614-FAC9-4201-859B-B698744A5DB9@stufft.io> Message-ID: On Sat, Jun 7, 2014 at 5:36 AM, Donald Stufft wrote: > Well it'd contain bug fixes and whatever other sorts of things you'd put > into a 2.7.whatever release.
So they'd still want to upgrade to 2.8 since > that'll have bug fixes. But it's not a potentially-breaking change. For example, on Debian Wheezy, there are a huge number of packages that depend on "python (<< 2.8)", because they expect Python 2.7 and *not* Python 2.8. A newer version 2.7 will satisfy that; a version 2.8 won't, because it's entirely possible that 2.8 will have something that's significantly different. That's what version numbers mean; Python follows the standard three-part convention, where you upgrade automatically only within the last part of the number. ChrisA From guido at python.org Fri Jun 6 21:46:50 2014 From: guido at python.org (Guido van Rossum) Date: Fri, 6 Jun 2014 12:46:50 -0700 Subject: [Python-Dev] Moving Python 3.5 on Windows to a new compiler In-Reply-To: <53921893.4080200@v.loewis.de> References: <529cffa5961d4b5bb57d554affe9643c@BLUPR03MB389.namprd03.prod.outlook.com> <4bad156ff9f145b792191327736e672d@BLUPR03MB389.namprd03.prod.outlook.com> <53921893.4080200@v.loewis.de> Message-ID: A reminder: https://lh5.googleusercontent.com/-d4rF0qJPskQ/U0qpNjP5GoI/AAAAAAAAPW0/4RF_7zy3esY/w1118-h629-no/Python28.jpg -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From p.f.moore at gmail.com Fri Jun 6 22:13:30 2014 From: p.f.moore at gmail.com (Paul Moore) Date: Fri, 6 Jun 2014 21:13:30 +0100 Subject: [Python-Dev] Moving Python 3.5 on Windows to a new compiler In-Reply-To: <53921464.7030400@v.loewis.de> References: <529cffa5961d4b5bb57d554affe9643c@BLUPR03MB389.namprd03.prod.outlook.com> <53921464.7030400@v.loewis.de> Message-ID: On 6 June 2014 20:20, "Martin v. Löwis" wrote: > 2. what is the risk of installing a beta compiler on what might > otherwise be a "production" developer system? In particular, could > it interfere with other VS installations, and could it require a > complete system reinstall when the final release of VC 14 is > available?
From http://www.visualstudio.com/en-us/downloads/visual-studio-14-ctp-vs """ Currently, Visual Studio "14" CTPs have known compatibility issues with previous releases of Visual Studio and should not be installed side-by-side on the same computer. """ It also states that installing the CTP on a PC puts that PC into "Unsupported" state. Paul From martin at v.loewis.de Fri Jun 6 22:12:34 2014 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Fri, 06 Jun 2014 22:12:34 +0200 Subject: [Python-Dev] Moving Python 3.5 on Windows to a new compiler In-Reply-To: <53921464.7030400@v.loewis.de> References: <529cffa5961d4b5bb57d554affe9643c@BLUPR03MB389.namprd03.prod.outlook.com> <53921464.7030400@v.loewis.de> Message-ID: <539220B2.4030901@v.loewis.de> Am 06.06.14 21:20, schrieb "Martin v. Löwis": > 2. what is the risk of installing a beta compiler on what might > otherwise be a "production" developer system? In particular, could > it interfere with other VS installations, and could it require a > complete system reinstall when the final release of VC 14 is > available? I found an official answer here: http://www.visualstudio.com/en-us/downloads/visual-studio-14-ctp-vs "Installing a CTP release will place a computer in an unsupported state. For that reason, we recommend only installing CTP releases in a virtual machine, or on a computer that is available for reformatting." So there is no promise that you will not need to reformat the system during the evolution of the compiler.
Regards, Martin From dw+python-dev at hmmz.org Fri Jun 6 21:42:52 2014 From: dw+python-dev at hmmz.org (dw+python-dev at hmmz.org) Date: Fri, 6 Jun 2014 19:42:52 +0000 Subject: [Python-Dev] Moving Python 3.5 on Windows to a new compiler In-Reply-To: References: <53920B4C.8020700@egenix.com> <20140606185631.GA11094@k2> <99A4C614-FAC9-4201-859B-B698744A5DB9@stufft.io> Message-ID: <20140606194252.GA11482@k2> On Sat, Jun 07, 2014 at 05:33:45AM +1000, Chris Angelico wrote: > > Is it really any difference in maintenance if you just stop applying > > updates to 2.7 and switch to 2.8? If 2.8 is really just 2.7 with a > > new compiler then there should be no functional difference between > > doing that and doing a 2.7.whatever except all of the tooling that > > relies on the compiler not to change in micro releases won't > > suddenly break and freak out. > If the only difference between 2.7 and 2.8 is the compiler used on > Windows, what happens on Linux and other platforms? A Python 2.8 would > have to be materially different from Python 2.7, not just binarily > incompatible on one platform. Grrmph, that's fair. Perhaps a final alternative is simply continuing the 2.7 series with a stale compiler, as a kind of carrot on a stick to encourage users to upgrade? Gating 2.7 life on the natural decline of its supported compiler/related ecosystem seems somehow quite a gradual and natural demise..
:) David From martin at v.loewis.de Fri Jun 6 22:23:06 2014 From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=) Date: Fri, 06 Jun 2014 22:23:06 +0200 Subject: [Python-Dev] Moving Python 3.5 on Windows to a new compiler In-Reply-To: References: <529cffa5961d4b5bb57d554affe9643c@BLUPR03MB389.namprd03.prod.outlook.com> <53921464.7030400@v.loewis.de> Message-ID: <5392232A.2000102@v.loewis.de> Am 06.06.14 22:13, schrieb Paul Moore: > From http://www.visualstudio.com/en-us/downloads/visual-studio-14-ctp-vs > > """ > Currently, Visual Studio "14" CTPs have known compatibility issues > with previous releases of Visual Studio and should not be installed > side-by-side on the same computer. > """ I also found http://support.microsoft.com/kb/2967191 which is more specific about this issue: '''There are known issues when you install Visual Studio "14" CTP 14.0.21730.1 DP on the same computer as Visual Studio 2013. While we expect that an uninstallation of Visual Studio "14" and then a repair of Visual Studio 2013 should fix these issues, our safest recommendation is to install Visual Studio "14" in a VM, a VHD, a fresh computer, or another non-production test-only computer that does not have Visual Studio 2013 on it. All of these Visual Studio side-by-side issues are expected to be fixed soon. There is an installation block in this Visual Studio "14" CTP that will prevent installation on a computer where an earlier version of Visual Studio is already installed. To disable the block that will put the computer in an un-recommended state, add the value "BlockerOverride" to the registry: HKLM\SOFTWARE\Microsoft\DevDiv\vs\Servicing''' So it seems to me that switching to VS 14 at this point in time is not possible. Of course, Steve could certainly maintain a Mercurial clone in his hg.python.org sandbox that has all the necessary changes done, so people won't have to redo the porting over and over. 
Regards, Martin From rosuav at gmail.com Fri Jun 6 22:23:40 2014 From: rosuav at gmail.com (Chris Angelico) Date: Sat, 7 Jun 2014 06:23:40 +1000 Subject: [Python-Dev] Moving Python 3.5 on Windows to a new compiler In-Reply-To: <20140606194252.GA11482@k2> References: <53920B4C.8020700@egenix.com> <20140606185631.GA11094@k2> <99A4C614-FAC9-4201-859B-B698744A5DB9@stufft.io> <20140606194252.GA11482@k2> Message-ID: On Sat, Jun 7, 2014 at 5:42 AM, wrote: > Perhaps a final alternative is simply continuing > the 2.7 series with a stale compiler, as a kind of carrot on a stick to > encourage users to upgrade? More likely, what would happen is that there'd be an alternate distribution of Python 2.7 (eg ActiveState), which would be language-compatible with python.org 2.7, but built with a different compiler, and therefore unable to use extensions built for python.org's 2.7. One way or another, pain will happen. ChrisA From brian at python.org Fri Jun 6 22:28:10 2014 From: brian at python.org (Brian Curtin) Date: Sat, 7 Jun 2014 00:28:10 +0400 Subject: [Python-Dev] Moving Python 3.5 on Windows to a new compiler In-Reply-To: <20140606194252.GA11482@k2> References: <53920B4C.8020700@egenix.com> <20140606185631.GA11094@k2> <99A4C614-FAC9-4201-859B-B698744A5DB9@stufft.io> <20140606194252.GA11482@k2> Message-ID: On Fri, Jun 6, 2014 at 11:42 PM, wrote: > On Sat, Jun 07, 2014 at 05:33:45AM +1000, Chris Angelico wrote: > >> > Is it really any difference in maintenance if you just stop applying >> > updates to 2.7 and switch to 2.8? If 2.8 is really just 2.7 with a >> > new compiler then there should be no functional difference between >> > doing that and doing a 2.7.whatever except all of the tooling that >> > relies on the compiler not to change in micro releases won't >> > suddenly break and freak out. > >> If the only difference between 2.7 and 2.8 is the compiler used on >> Windows, what happens on Linux and other platforms?
A Python 2.8 would >> have to be materially different from Python 2.7, not just binarily >> incompatible on one platform. > > Grrmph, that's fair. Perhaps a final alternative is simply continuing > the 2.7 series with a stale compiler, as a kind of carrot on a stick to > encourage users to upgrade? Gating 2.7 life on the natural decline of > its supported compiler/related ecosystem seems somehow quite a gradual > and natural demise.. :) Adding features into 3.x is already not enough of a carrot on the stick for many users. Intentionally leaving 2.7 on a dead compiler is like beating them with the stick. From jurko.gospodnetic at pke.hr Fri Jun 6 22:28:01 2014 From: jurko.gospodnetic at pke.hr (=?UTF-8?B?SnVya28gR29zcG9kbmV0acSH?=) Date: Fri, 06 Jun 2014 22:28:01 +0200 Subject: [Python-Dev] Moving Python 3.5 on Windows to a new compiler In-Reply-To: References: <529cffa5961d4b5bb57d554affe9643c@BLUPR03MB389.namprd03.prod.outlook.com> <4bad156ff9f145b792191327736e672d@BLUPR03MB389.namprd03.prod.outlook.com> <53921893.4080200@v.loewis.de> Message-ID: Hi. On 6.6.2014. 21:46, Guido van Rossum wrote: > A reminder: > https://lh5.googleusercontent.com/-d4rF0qJPskQ/U0qpNjP5GoI/AAAAAAAAPW0/4RF_7zy3esY/w1118-h629-no/Python28.jpg *ROFL* Subtle, ain't he? *gdr* Best regards, Jurko Gospodnetić
From timothy.c.delaney at gmail.com Fri Jun 6 23:35:59 2014 From: timothy.c.delaney at gmail.com (Tim Delaney) Date: Sat, 7 Jun 2014 07:35:59 +1000 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: <20140606175217.766b781c@x34f> References: <20140604011718.GD10355@ando> <20140604183831.7226448c@x34f> <20140604200520.1d432329@x34f> <538FB4F5.9070500@canterbury.ac.nz> <20140605041913.14886264@x34f> <87oay73i70.fsf@uwakimon.sk.tsukuba.ac.jp> <20140605142528.39e0e5fc@x34f> <20140605150121.286032df@x34f> <20140606121306.06783df6@x34f> <8761ke2syo.fsf@uwakimon.sk.tsukuba.ac.jp> <20140606143401.79a7b0ee@x34f> <20140606175217.766b781c@x34f> Message-ID: On 7 June 2014 00:52, Paul Sokolovsky wrote: > > At heart, this is exactly what the Python 3 "str" type is. The > > universal convention is "code points". > > Yes. Except for one small detail - Python3 specifies these code points > to be Unicode code points. And Unicode is a very bloated thing. > > But if we drop that "Unicode" stipulation, then it's also exactly what > MicroPython implements. Its "str" type consists of codepoints, we don't > have pet names for them yet, like Unicode does, but their numeric > values are 0-255. Note that it in no way limits encodings, characters, > or scripts which can be used with MicroPython, because just like > Unicode, it support concept of "surrogate pairs" (but we don't call it > like that) - specifically, smaller code points may comprise bigger > groupings. But unlike Unicode, we don't stipulate format, value or > other constraints on how these "surrogate pairs"-alikes are formed, > leaving that to users. I think you've missed my point. There is absolutely nothing conceptually bloaty about what a Python 3 string is. It's just like a 7-bit ASCII string, except each entry can be from a larger table. When you index into a Python 3 string, you get back exactly *one valid entry* from the Unicode code point table. 
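[The point Tim is making can be seen directly at an interpreter prompt. A hedged sketch, assuming a CPython 3.3+ build with PEP 393 flexible string storage; pre-3.3 "narrow" builds behaved differently for astral characters:]

```python
# Indexing a Python 3 str yields exactly one code point, even for a
# character outside the Basic Multilingual Plane.
s = "a\U0001F40D"  # 'a' followed by U+1F40D

assert len(s) == 2             # two code points, independent of any encoding
assert s[1] == "\U0001F40D"    # one whole code point, never half a surrogate
assert ord(s[1]) == 0x1F40D

# Code *units*, by contrast, are a property of a particular encoding,
# not of str itself:
assert len(s.encode("utf-8")) == 5      # 1 byte + 4 bytes
assert len(s.encode("utf-16-le")) == 6  # 2 bytes + a 4-byte surrogate pair
```

[The encoded lengths differ per encoding while len(s) does not, which is exactly the code point/code unit distinction under discussion.]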
That plus the length of the string, plus the guarantee of immutability gives everything needed to layer the rest of the string functionality on top. There are no surrogate pairs - each code point is standalone (unlike code *units*). It is conceptually very simple. The implementation may be difficult (if you're trying to do better than 4 bytes per code point) but the concept is dead simple. If the MicroPython string type requires people *using* it to deal with surrogates (i.e. indexing could return a value that is not a valid Unicode code point) then it will have broken the conceptual simplicity of the Python 3 string type (and most certainly can't be considered in any way compatible). Tim Delaney -------------- next part -------------- An HTML attachment was scrubbed... URL: From sturla.molden at gmail.com Sat Jun 7 00:33:29 2014 From: sturla.molden at gmail.com (Sturla Molden) Date: Fri, 6 Jun 2014 22:33:29 +0000 (UTC) Subject: [Python-Dev] [numpy wishlist] Interpreter support for temporary elision in third-party classes References: <5391754D.8000607@googlemail.com> <1842263445423761298.493568sturla.molden-gmail.com@news.gmane.org> Message-ID: <1064279801423785627.049284sturla.molden-gmail.com@news.gmane.org> Nathaniel Smith wrote: > The proposal in my initial email requires zero pthreads, and is > substantially more effective. (Your proposal reduces only the alloc > overhead for large arrays; mine reduces both alloc and memory access > overhead for both large and small arrays.) My suggestion prevents the kernel from zeroing pages in the middle of a computation, which is an important part. It would also be an optimization the Python interpreter could benefit from independently of NumPy, by allowing reuse of allocated memory pages within CPU-bound portions of the Python code. And no, the method I suggested does not only work for large arrays. If we really want to take out the memory access overhead, we need to consider lazy evaluation. E.g.
a context manager that collects a symbolic expression and triggers evaluation on exit:

with numpy.accelerate:
    x =
    y =
    z =
# evaluation of x,y,z happens here

Sturla From sturla.molden at gmail.com Sat Jun 7 00:43:34 2014 From: sturla.molden at gmail.com (Sturla Molden) Date: Fri, 6 Jun 2014 22:43:34 +0000 (UTC) Subject: [Python-Dev] Moving Python 3.5 on Windows to a new compiler References: <529cffa5961d4b5bb57d554affe9643c@BLUPR03MB389.namprd03.prod.outlook.com> <4bad156ff9f145b792191327736e672d@BLUPR03MB389.namprd03.prod.outlook.com> <53920B4C.8020700@egenix.com> <20140606185631.GA11094@k2> Message-ID: <896772508423787267.391031sturla.molden-gmail.com@news.gmane.org> Brett Cannon wrote: > Nope. A new minor release of Python is a massive undertaking which is why > we have saved ourselves the hassle of doing a Python 2.8 or not giving a > clear signal as to when Python 2.x will end as a language. Why not just define Python 2.8 as Python 2.7 except with a newer compiler? I cannot see why that would be a massive undertaking, if changing compiler for 2.7 is necessary anyway. Sturla From sturla.molden at gmail.com Sat Jun 7 01:01:31 2014 From: sturla.molden at gmail.com (Sturla Molden) Date: Fri, 6 Jun 2014 23:01:31 +0000 (UTC) Subject: [Python-Dev] Moving Python 3.5 on Windows to a new compiler References: <53920B4C.8020700@egenix.com> <20140606185631.GA11094@k2> <99A4C614-FAC9-4201-859B-B698744A5DB9@stufft.io> <20140606194252.GA11482@k2> Message-ID: <830540067423787719.509957sturla.molden-gmail.com@news.gmane.org> Brian Curtin wrote: > Adding features into 3.x is already not enough of a carrot on the > stick for many users. Intentionally leaving 2.7 on a dead compiler is > like beating them with the stick. Those who want to build extensions on Windows will just use MinGW (currently GCC 4.8.2) instead. NumPy and SciPy are planning a switch to a GCC based toolchain with static linkage of the MinGW runtime on Windows.
It is carefully configured to be binary compatible with VS2008 on Python 2.7. The major reason for this is to use gfortran also on Windows. But the result will be a GCC based toolchain that anyone can use to build extensions on Windows. Sturla From Steve.Dower at microsoft.com Sat Jun 7 01:01:53 2014 From: Steve.Dower at microsoft.com (Steve Dower) Date: Fri, 6 Jun 2014 23:01:53 +0000 Subject: [Python-Dev] Moving Python 3.5 on Windows to a new compiler In-Reply-To: <5392232A.2000102@v.loewis.de> References: <529cffa5961d4b5bb57d554affe9643c@BLUPR03MB389.namprd03.prod.outlook.com> <53921464.7030400@v.loewis.de> <5392232A.2000102@v.loewis.de> Message-ID: <438e8a27e8e643f4841a22b24447b956@BLUPR03MB389.namprd03.prod.outlook.com> Martin v. Löwis wrote: > On 06.06.14 at 22:13, Paul Moore wrote: >> From >> http://www.visualstudio.com/en-us/downloads/visual-studio-14-ctp-vs >> >> """ >> Currently, Visual Studio "14" CTPs have known compatibility issues >> with previous releases of Visual Studio and should not be installed >> side-by-side on the same computer. >> """ > > I also found > > http://support.microsoft.com/kb/2967191 > > which is more specific about this issue: > > '''There are known issues when you install Visual Studio "14" CTP > 14.0.21730.1 DP on the same computer as Visual Studio 2013. While we expect that > an uninstallation of Visual Studio "14" and then a repair of Visual Studio 2013 > should fix these issues, our safest recommendation is to install Visual Studio > "14" in a VM, a VHD, a fresh computer, or another non-production test-only > computer that does not have Visual Studio 2013 on it. All of these Visual Studio > side-by-side issues are expected to be fixed soon. Somebody ran a test to see how well the install/uninstall/repair scenario works, and it isn't that great. There are a lot of teams who contribute to Visual Studio, and not all of them have updated their installers yet (my team included...).
Unfortunately, it all happened too close to the release to fix it for this version, hence the recommendation. Eventually, VS 14 will be safe to install side-by-side with earlier versions. Chances are it is safe enough with VS 2010 or VS 2012 - it's the one-version-prior that's causing the most trouble. > There is an installation block in this Visual Studio "14" CTP that will prevent > installation on a computer where an earlier version of Visual Studio is already > installed. To disable the block that will put the computer in an un-recommended > state, add the value "BlockerOverride" to the registry: > HKLM\SOFTWARE\Microsoft\DevDiv\vs\Servicing''' > > So it seems to me that switching to VS 14 at this point in time is not possible. > > Of course, Steve could certainly maintain a Mercurial clone in his hg.python.org > sandbox that has all the necessary changes done, so people won't have to redo > the porting over and over. That's what I had in mind. [Earlier post] > 1. what is the availability of the compiler during the testing phase, > and what will it be immediately after the testing ends (where > traditionally people would have to buy licenses, or wait for VS > Express to be released)? It's freely available now as part of Visual Studio, and all the pre-release releases will include everything. The last release (RC or whatever they decide to call it this time) should have a go-live license, though it will also be time bombed. I believe Express will be released at the same time as the paid versions. > 2. what is the risk of installing a beta compiler on what might > otherwise be a "production" developer system? In particular, could > it interfere with other VS installations, and could it require a > complete system reinstall when the final release of VC 14 is > available? Answered above. It's as risky as it always is, though as I mentioned, VC 14 may well be fine against VC 10. 
Build-to-build upgrades may not be supported between pre-release versions, but typically RC to RTM upgrades are supported. > 3. what is the chance of the final release being delayed beyond > the planned release date of Python 3.5? Microsoft has a bad > track record of meeting release dates (or the tradition of not > announcing any for that reason); the blog says that it will > be available "sometime in 2015". Now, Python 3.5 might appear > November 2015, so what do we do if VS 2015 is not released > by the time 3.5b1 is planned? We keep the VS 2010 files around and make sure they keep working. This is the biggest risk of the whole plan, but I believe that there's enough of a gap between when VS 14 is planned to release (which I know, but can't share) and when Python 3.5 is planned (which I don't know, but have a semi-informed guess). Is Python 3.5b1 being built with VS 14 RC (hypothetically) a blocking issue? Do we need to resolve that now or can it wait until it happens? > Regards, > Martin Cheers, Steve From brian at python.org Sat Jun 7 01:05:52 2014 From: brian at python.org (Brian Curtin) Date: Sat, 7 Jun 2014 03:05:52 +0400 Subject: [Python-Dev] Moving Python 3.5 on Windows to a new compiler In-Reply-To: <830540067423787719.509957sturla.molden-gmail.com@news.gmane.org> References: <53920B4C.8020700@egenix.com> <20140606185631.GA11094@k2> <99A4C614-FAC9-4201-859B-B698744A5DB9@stufft.io> <20140606194252.GA11482@k2> <830540067423787719.509957sturla.molden-gmail.com@news.gmane.org> Message-ID: On Jun 6, 2014 6:01 PM, "Sturla Molden" wrote: > > Brian Curtin wrote: > > > Adding features into 3.x is already not enough of a carrot on the > > stick for many users. Intentionally leaving 2.7 on a dead compiler is > > like beating them with the stick. > > Those who want to build extensions on Windows will just use MinGW > (currently GCC 2.8.2) instead. Well we're certainly not going to assume such a thing. I know people do that, but many don't (I never have). 
-------------- next part -------------- An HTML attachment was scrubbed... URL: From sturla.molden at gmail.com Sat Jun 7 01:32:50 2014 From: sturla.molden at gmail.com (Sturla Molden) Date: Fri, 6 Jun 2014 23:32:50 +0000 (UTC) Subject: [Python-Dev] Moving Python 3.5 on Windows to a new compiler References: <99A4C614-FAC9-4201-859B-B698744A5DB9@stufft.io> <20140606194252.GA11482@k2> <830540067423787719.509957sturla.molden-gmail.com@news.gmane.org> Message-ID: <1817978993423789902.560333sturla.molden-gmail.com@news.gmane.org> Brian Curtin wrote: > Well we're certainly not going to assume such a thing. I know people do > that, but many don't (I never have). If Python 2.7 users are left with a dead compiler on Windows, they will find a solution. For example, Enthought is already bundling their Python distribution with gcc 4.8.1 on Windows. Sturla From njs at pobox.com Sat Jun 7 00:47:49 2014 From: njs at pobox.com (Nathaniel Smith) Date: Fri, 6 Jun 2014 23:47:49 +0100 Subject: [Python-Dev] Moving Python 3.5 on Windows to a new compiler In-Reply-To: <896772508423787267.391031sturla.molden-gmail.com@news.gmane.org> References: <529cffa5961d4b5bb57d554affe9643c@BLUPR03MB389.namprd03.prod.outlook.com> <4bad156ff9f145b792191327736e672d@BLUPR03MB389.namprd03.prod.outlook.com> <53920B4C.8020700@egenix.com> <20140606185631.GA11094@k2> <896772508423787267.391031sturla.molden-gmail.com@news.gmane.org> Message-ID: On Fri, Jun 6, 2014 at 11:43 PM, Sturla Molden wrote: > Brett Cannon wrote: > >> Nope. A new minor release of Python is a massive undertaking which is why >> we have saved ourselves the hassle of doing a Python 2.8 or not giving a >> clear signal as to when Python 2.x will end as a language. > > Why not just define Python 2.8 as Python 2.7 except with a newer compiler? > I cannot see why that would be a massive undertaking, if changing compiler > for 2.7 is necessary anyway. This would require recompiling all packages on OS X and Linux, even though nothing had changed.
-- Nathaniel J. Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org From njs at pobox.com Sat Jun 7 00:53:25 2014 From: njs at pobox.com (Nathaniel Smith) Date: Fri, 6 Jun 2014 23:53:25 +0100 Subject: [Python-Dev] [numpy wishlist] Interpreter support for temporary elision in third-party classes In-Reply-To: <1064279801423785627.049284sturla.molden-gmail.com@news.gmane.org> References: <5391754D.8000607@googlemail.com> <1842263445423761298.493568sturla.molden-gmail.com@news.gmane.org> <1064279801423785627.049284sturla.molden-gmail.com@news.gmane.org> Message-ID: On Fri, Jun 6, 2014 at 11:33 PM, Sturla Molden wrote: > Nathaniel Smith wrote: > >> The proposal in my initial email requires zero pthreads, and is >> substantially more effective. (Your proposal reduces only the alloc >> overhead for large arrays; mine reduces both alloc and memory access >> overhead for boyh large and small arrays.) > > My suggestion prevents the kernel from zeroing pages in the middle of a > computation, which is an important part. It would also be an optimiation > the Python interpreter could benefit from indepently of NumPy, by allowing > reuse of allocated memory pages within CPU bound portions of the Python > code. And no, the method I suggested does not only work for large arrays. Small allocations are already recycled within process and don't touch the kernel, so your method doesn't affect them at all. My guess is that PyMalloc is unlikely to start spawning background threads any time soon, but if you'd like to propose it maybe you should start a new thread for that? > If we really want to take out the memory access overhead, we need to > consider lazy evaluation. E.g. 
a context manager that collects a symbolic > expression and triggers evaluation on exit: > > with numpy.accelerate: > x = > y = > z = > # evaluation of x,y,z happens here Using an alternative evaluation engine is indeed another way to optimize execution, which is why projects like numexpr, numba, theano, etc. exist. But this is basically switching to a different language in a different VM. I think people will keep running plain-old-CPython code for some time yet, and the point of this thread is that there's some low-hanging fruit for making all *that* code faster. -n -- Nathaniel J. Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org From eliben at gmail.com Sat Jun 7 01:45:06 2014 From: eliben at gmail.com (Eli Bendersky) Date: Fri, 6 Jun 2014 16:45:06 -0700 Subject: [Python-Dev] Moving Python 3.5 on Windows to a new compiler In-Reply-To: <1817978993423789902.560333sturla.molden-gmail.com@news.gmane.org> References: <99A4C614-FAC9-4201-859B-B698744A5DB9@stufft.io> <20140606194252.GA11482@k2> <830540067423787719.509957sturla.molden-gmail.com@news.gmane.org> <1817978993423789902.560333sturla.molden-gmail.com@news.gmane.org> Message-ID: On Fri, Jun 6, 2014 at 4:32 PM, Sturla Molden wrote: > Brian Curtin wrote: > > > Well we're certainly not going to assume such a thing. I know people do > > that, but many don't (I never have). > > If Python 2.7 users are left with a dead compiler on Windows, they will > find a solution. For example, Enthought is already bundling their Python > distribution with gcc 4.8.1 on Windows. > While we're at it, Clang is nearing a stage where it can compile C and C++ on Windows *with ABI-compatibility to MSVC* (yes, even C++) -- see http://clang.llvm.org/docs/MSVCCompatibility.html for more details. Could this help? Eli -------------- next part -------------- An HTML attachment was scrubbed...
URL: From sturla.molden at gmail.com Sat Jun 7 02:05:44 2014 From: sturla.molden at gmail.com (Sturla Molden) Date: Sat, 7 Jun 2014 00:05:44 +0000 (UTC) Subject: [Python-Dev] Moving Python 3.5 on Windows to a new compiler References: <20140606194252.GA11482@k2> <830540067423787719.509957sturla.molden-gmail.com@news.gmane.org> <1817978993423789902.560333sturla.molden-gmail.com@news.gmane.org> Message-ID: <468120726423791445.579744sturla.molden-gmail.com@news.gmane.org> Eli Bendersky wrote: > While we're at it, Clang is nearing a stage where it can compile C and C++ > on Windows *with ABI-compatibility to MSVC* (yes, even C++) -- see > http://clang.llvm.org/docs/MSVCCompatibility.html > for more details. Could > this help? Possibly. "cl-clang" is exciting and I hope distutils will support it one day. Clang is not as well known among Windows users as it is among users of "Unix" (Apple, Linux, FreeBSD, et al.) It would be even better if Python were bundled with Clang on Windows. The MinGW-based "SciPy toolchain" has ABI compatibility with MSVC only for C (and Fortran), not C++. Differences from vanilla MinGW are mainly static linkage of the MinGW runtime, different stack alignment (4 bytes instead of 16), and it links with msvcr90.dll instead of msvcrt.dll.
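For what it's worth, distutils can already be pointed at such a toolchain once it is on PATH; a setup.cfg fragment using the standard distutils option (nothing toolchain-specific assumed):

```ini
# setup.cfg -- make distutils build extensions with the MinGW compiler
[build]
compiler = mingw32
```

or per invocation: python setup.py build_ext --compiler=mingw32. The ABI caveats above (runtime DLL, stack alignment) still apply; the option only selects the compiler class.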
Sturla From greg.ewing at canterbury.ac.nz Sat Jun 7 02:37:14 2014 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 07 Jun 2014 12:37:14 +1200 Subject: [Python-Dev] [numpy wishlist] Interpreter support for temporary elision in third-party classes In-Reply-To: <5391F8A4.70401@googlemail.com> References: <539126DB.8010306@canterbury.ac.nz> <5391F8A4.70401@googlemail.com> Message-ID: <53925EBA.4050306@canterbury.ac.nz> Julian Taylor wrote: > tp_can_elide receives two objects and returns one of three values: > * can work inplace, operation is associative > * can work inplace but not associative > * cannot work inplace Does it really need to be that complicated? Isn't it sufficient just to ask the object potentially being overwritten whether it's okay to overwrite it? I.e. a parameterless method returning a boolean. -- Greg From ncoghlan at gmail.com Sat Jun 7 02:37:28 2014 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 7 Jun 2014 10:37:28 +1000 Subject: [Python-Dev] Internal representation of strings and Micropython In-Reply-To: <20140606175217.766b781c@x34f> References: <20140604011718.GD10355@ando> <20140604183831.7226448c@x34f> <20140604200520.1d432329@x34f> <538FB4F5.9070500@canterbury.ac.nz> <20140605041913.14886264@x34f> <87oay73i70.fsf@uwakimon.sk.tsukuba.ac.jp> <20140605142528.39e0e5fc@x34f> <20140605150121.286032df@x34f> <20140606121306.06783df6@x34f> <8761ke2syo.fsf@uwakimon.sk.tsukuba.ac.jp> <20140606143401.79a7b0ee@x34f> <20140606175217.766b781c@x34f> Message-ID: On 7 Jun 2014 00:53, "Paul Sokolovsky" wrote: > > Yes. Except for one small detail - Python3 specifies these code points > to be Unicode code points. And Unicode is a very bloated thing. I rather suspect users of East Asian & African scripts might have a different notion of what constitutes "bloated" vs "can actually represent this language properly, unlike 8-bit code spaces". > But if we drop that "Unicode" stipulation, then it's also exactly what > MicroPython implements. 
Its "str" type consists of codepoints, we don't > have pet names for them yet, like Unicode does, but their numeric > values are 0-255. Note that it in no way limits encodings, characters, > or scripts which can be used with MicroPython, because just like > Unicode, it support concept of "surrogate pairs" (but we don't call it > like that) - specifically, smaller code points may comprise bigger > groupings. But unlike Unicode, we don't stipulate format, value or > other constraints on how these "surrogate pairs"-alikes are formed, > leaving that to users. This is effectively what the Python 2 str type does, and it's a recipe for data driven latent defects. You inevitably end up concatenating strings using different code spaces, or else splitting strings between surrogate pairs rather than on the proper boundaries, etc. The abstraction presented to users by the str type *must* be the full range of Unicode code points as atomic units. Storing those internally as UTF-8 rather than as fixed width code points as CPython does is an experiment worth trying, since you don't have the same C level backwards compatibility constraints we do. But limiting the str type to a single code page per process is not an acceptable constraint in a Python 3 implementation. Regards, Nick. > > > -- > Best regards, > Paul mailto:pmiscml at gmail.com > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ncoghlan%40gmail.com -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From tjreedy at udel.edu Sat Jun 7 03:05:53 2014 From: tjreedy at udel.edu (Terry Reedy) Date: Fri, 06 Jun 2014 21:05:53 -0400 Subject: [Python-Dev] Moving Python 3.5 on Windows to a new compiler In-Reply-To: References: <529cffa5961d4b5bb57d554affe9643c@BLUPR03MB389.namprd03.prod.outlook.com> <4bad156ff9f145b792191327736e672d@BLUPR03MB389.namprd03.prod.outlook.com> <53920B4C.8020700@egenix.com> <20140606185631.GA11094@k2> <896772508423787267.391031sturla.molden-gmail.com@news.gmane.org> Message-ID: On 6/6/2014 6:47 PM, Nathaniel Smith wrote: > On Fri, Jun 6, 2014 at 11:43 PM, Sturla Molden wrote: >> Brett Cannon wrote: >> >>> Nope. A new minor release of Python is a massive undertaking which is why >>> we have saved ourselves the hassle of doing a Python 2.8 or not giving a >>> clear signal as to when Python 2.x will end as a language. >> >> Why not just define Python 2.8 as Python 2.7 except with a newer compiler? >> I cannot see why that would be a massive undertaking, if changing compiler >> for 2.7 is necessary anyway. > > This would require recompiling all packages on OS X and Linux, even > though nothing had changed. If you are suggesting that a Windows compiler change should be invisible to non-Windows users, I agree. Let us assume that /pcbuild remains for those who have vc2008 and that /pcbuild14 is added (and everything else remains as is). Then the only other thing that would change is the Windows installer released on Python.org. Call that 2.7.9W or whatever on the download site and in the interactive startup message to signal that something is different.
-- Terry Jan Reedy From brian at python.org Sat Jun 7 03:09:36 2014 From: brian at python.org (Brian Curtin) Date: Sat, 7 Jun 2014 05:09:36 +0400 Subject: [Python-Dev] Moving Python 3.5 on Windows to a new compiler In-Reply-To: <1817978993423789902.560333sturla.molden-gmail.com@news.gmane.org> References: <99A4C614-FAC9-4201-859B-B698744A5DB9@stufft.io> <20140606194252.GA11482@k2> <830540067423787719.509957sturla.molden-gmail.com@news.gmane.org> <1817978993423789902.560333sturla.molden-gmail.com@news.gmane.org> Message-ID: On Jun 6, 2014 6:33 PM, "Sturla Molden" wrote: > > Brian Curtin wrote: > > > Well we're certainly not going to assume such a thing. I know people do > > that, but many don't (I never have). > > If Python 2.7 users are left with a dead compiler on Windows, they will > find a solution. For example, Enthought is already bundling their Python > distribution with gcc 2.8.1 on Windows. Again, not something I think we should depend on. A lot of people use python.org installers. -------------- next part -------------- An HTML attachment was scrubbed... URL: From donald at stufft.io Sat Jun 7 03:13:32 2014 From: donald at stufft.io (Donald Stufft) Date: Fri, 6 Jun 2014 21:13:32 -0400 Subject: [Python-Dev] Moving Python 3.5 on Windows to a new compiler In-Reply-To: References: <529cffa5961d4b5bb57d554affe9643c@BLUPR03MB389.namprd03.prod.outlook.com> <4bad156ff9f145b792191327736e672d@BLUPR03MB389.namprd03.prod.outlook.com> <53920B4C.8020700@egenix.com> <20140606185631.GA11094@k2> <896772508423787267.391031sturla.molden-gmail.com@news.gmane.org> Message-ID: <4A7F1D64-2F36-428E-9682-9861B05DEFAD@stufft.io> On Jun 6, 2014, at 9:05 PM, Terry Reedy wrote: > On 6/6/2014 6:47 PM, Nathaniel Smith wrote: >> On Fri, Jun 6, 2014 at 11:43 PM, Sturla Molden wrote: >>> Brett Cannon wrote: >>> >>>> Nope. 
A new minor release of Python is a massive undertaking which is why >>>> we have saved ourselves the hassle of doing a Python 2.8 or not giving a >>>> clear signal as to when Python 2.x will end as a language. >>> >>> Why not just define Python 2.8 as Python 2.7 except with a newer compiler? >>> I cannot see why that would be a massive undertaking, if changing compiler >>> for 2.7 is necessary anyway. >> >> This would require recompiling all packages on OS X and Linux, even >> though nothing had changed. > > If you are suggesting that a Windows compiler change should be invisible to non-Windows users, I agree. > > Let us assume that /pcbuild remains for those who have vc2008 and that /pcbuild14 is added (and everything else remains as is). Then the only other thing that would change is the Windows installer released on Python.org. Call that 2.7.9W or whatever on the download site and in the interactive startup message to signal that something is different. > > -- > Terry Jan Reedy > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/donald%40stufft.io How are packaging tools supposed to cope with this? AFAIK there is nothing in most of them to deal with an X.Y.Z release suddenly dealing with a different compiler. ----------------- Donald Stufft PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA -------------- next part -------------- A non-text attachment was scrubbed...
Name: signature.asc Type: application/pgp-signature Size: 801 bytes Desc: Message signed with OpenPGP using GPGMail URL: From sturla.molden at gmail.com Sat Jun 7 03:35:40 2014 From: sturla.molden at gmail.com (Sturla Molden) Date: Sat, 7 Jun 2014 01:35:40 +0000 (UTC) Subject: [Python-Dev] Moving Python 3.5 on Windows to a new compiler References: <20140606194252.GA11482@k2> <830540067423787719.509957sturla.molden-gmail.com@news.gmane.org> <1817978993423789902.560333sturla.molden-gmail.com@news.gmane.org> Message-ID: <495727245423796617.857480sturla.molden-gmail.com@news.gmane.org> Brian Curtin wrote: >> If Python 2.7 users are left with a dead compiler on Windows, they will >> find a solution. For example, Enthought is already bundling their Python >> distribution with gcc 2.8.1 on Windows. > > Again, not something I think we should depend on. A lot of people use > python.org installers. I am not talking about changing the python.org installers. Let it remain on VS2008 for Python 2.7. I am only suggesting we make it easier to find a free C compiler compatible with the python.org installers. The NumPy/SciPy dev team have taken the burden to build a MinGW toolchain that is configured to be 100 % ABI compatible with the python.org installer. I am only suggesting a link to it or something like that, perhaps even host it as a separate download. (It is GPL, so anyone can do that.) That way it would be easy to find a compatible C compiler. We have to consider that VS2008 will be unobtainable abandonware long before the promised Python 2.7 support expires. When that happens, users of Python 2.7 will need to find another compiler to build C extensions. If Python.org makes this easier it would hurt less to have Python 2.7 remain on VS2008 forever. 
Sturla From sturla.molden at gmail.com Sat Jun 7 03:40:34 2014 From: sturla.molden at gmail.com (Sturla Molden) Date: Sat, 7 Jun 2014 01:40:34 +0000 (UTC) Subject: [Python-Dev] [numpy wishlist] Interpreter support for temporary elision in third-party classes References: <539126DB.8010306@canterbury.ac.nz> <5391F8A4.70401@googlemail.com> <53925EBA.4050306@canterbury.ac.nz> Message-ID: <224483517423797963.412727sturla.molden-gmail.com@news.gmane.org> Greg Ewing wrote: > Julian Taylor wrote: >> tp_can_elide receives two objects and returns one of three values: >> * can work inplace, operation is associative >> * can work inplace but not associative >> * cannot work inplace > > Does it really need to be that complicated? Isn't it > sufficient just to ask the object potentially being > overwritten whether it's okay to overwrite it? How can it know this without help from the interpreter? Sturla From sturla.molden at gmail.com Sat Jun 7 04:18:35 2014 From: sturla.molden at gmail.com (Sturla Molden) Date: Sat, 7 Jun 2014 02:18:35 +0000 (UTC) Subject: [Python-Dev] [numpy wishlist] Interpreter support for temporary elision in third-party classes References: <5391754D.8000607@googlemail.com> <1842263445423761298.493568sturla.molden-gmail.com@news.gmane.org> <1064279801423785627.049284sturla.molden-gmail.com@news.gmane.org> Message-ID: <1245574759423799281.221143sturla.molden-gmail.com@news.gmane.org> Nathaniel Smith wrote: >> with numpy.accelerate: >> x = >> y = >> z = >> # evaluation of x,y,z happens here > > Using an alternative evaluation engine is indeed another way to > optimize execution, which is why projects like numexpr, numba, theano, > etc. exist. But this is basically switching to a different language in > a different VM. I was not thinking that complicated. Let us focus on what an unmodified CPython can do. A compound expression with arrays can also be seen as a pipeline. 
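That pipeline view can be sketched in plain NumPy without any interpreter support. A rough illustration only; `evaluate_chunked` and its `blocksize` are hypothetical names for this sketch, not a real NumPy API:

```python
import numpy as np

def evaluate_chunked(expr, out, *arrays, blocksize=4096):
    # Evaluate expr block by block so every temporary array stays small
    # enough to fit in cache, instead of materializing full-size temporaries.
    n = len(out)
    for start in range(0, n, blocksize):
        sl = slice(start, min(start + blocksize, n))
        out[sl] = expr(*(a[sl] for a in arrays))
    return out

y = np.linspace(0.0, 1.0, 10000)
z = np.linspace(1.0, 2.0, 10000)
x = evaluate_chunked(lambda y, z: 2.0 * y + z * z, np.empty_like(y), y, z)
assert np.allclose(x, 2.0 * y + z * z)
```

Coroutine-returning operators would amount to doing this blocking implicitly, streaming one cache-sized block at a time through the whole expression.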
Imagine what would happen if in "NumPy 2.0" arithmetic operators returned coroutines instead of temporary arrays. That way an expression could be evaluated chunkwise, and the chunks would be small enough to fit in cache. Sturla From tjreedy at udel.edu Sat Jun 7 04:23:41 2014 From: tjreedy at udel.edu (Terry Reedy) Date: Fri, 06 Jun 2014 22:23:41 -0400 Subject: [Python-Dev] Moving Python 3.5 on Windows to a new compiler In-Reply-To: <4A7F1D64-2F36-428E-9682-9861B05DEFAD@stufft.io> References: <529cffa5961d4b5bb57d554affe9643c@BLUPR03MB389.namprd03.prod.outlook.com> <4bad156ff9f145b792191327736e672d@BLUPR03MB389.namprd03.prod.outlook.com> <53920B4C.8020700@egenix.com> <20140606185631.GA11094@k2> <896772508423787267.391031sturla.molden-gmail.com@news.gmane.org> <4A7F1D64-2F36-428E-9682-9861B05DEFAD@stufft.io> Message-ID: On 6/6/2014 9:13 PM, Donald Stufft wrote: > > On Jun 6, 2014, at 9:05 PM, Terry Reedy wrote: >> If you are suggesting that a Windows compiler change should be >> invisible to non-Windows users, I agree. >> >> Let us assume that /pcbuild remains for those who have vc2008 and >> that /pcbuild14 is added (and everything else remains as is). Then >> the only other thing that would change is the Windows installer >> released on Python.org. Call that 2.7.9W or whatever on the >> download site and in the interactive startup message to signal that >> something is different. > How are packaging tools supposed to cope with this? AFAIK there is > nothing in most of them to deal with an X.Y.Z release suddenly dealing > with a different compiler. For this option, packaging tools on Windows would have to gain a special rule to cope with a special, hopefully unique, not-to-be-repeated series of releases. If VC2008 ceases to be available to those who do not already have it, and whose machines do not break or get replaced, dealing with a different easily available compiler would be easier than dealing with having no compiler.
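The gap Donald describes is visible in the PEP 427 wheel filename itself: the tags record interpreter, ABI and platform, but nothing about the compiler or C runtime. A simplified parser (example filename only; it ignores optional build tags and assumes no extra dashes in name or version):

```python
def parse_wheel_name(filename):
    # PEP 427: {name}-{version}-{python tag}-{abi tag}-{platform tag}.whl
    # Simplified: assumes no optional build tag and no extra dashes.
    name, version, python_tag, abi_tag, platform_tag = \
        filename[:-len(".whl")].split("-")
    return {"name": name, "version": version, "python": python_tag,
            "abi": abi_tag, "platform": platform_tag}

tags = parse_wheel_name("numpy-1.8.1-cp27-none-win32.whl")
assert tags["python"] == "cp27" and tags["platform"] == "win32"
# Nothing here distinguishes a VS2008 build from a VS2014 build of the
# same cp27/win32 wheel -- which is the hole the tools would have to fill.
```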
-- Terry Jan Reedy From chris.barker at noaa.gov Sat Jun 7 06:01:58 2014 From: chris.barker at noaa.gov (Chris Barker) Date: Fri, 6 Jun 2014 21:01:58 -0700 Subject: [Python-Dev] Moving Python 3.5 on Windows to a new compiler In-Reply-To: <896772508423787267.391031sturla.molden-gmail.com@news.gmane.org> References: <529cffa5961d4b5bb57d554affe9643c@BLUPR03MB389.namprd03.prod.outlook.com> <4bad156ff9f145b792191327736e672d@BLUPR03MB389.namprd03.prod.outlook.com> <53920B4C.8020700@egenix.com> <20140606185631.GA11094@k2> <896772508423787267.391031sturla.molden-gmail.com@news.gmane.org> Message-ID: > > > Why not just define Python 2.8 as Python 2.7 except with a newer compiler? > I cannot see why that would be a massive undertaking, if changing compiler > for 2.7 is necessary anyway. > A reminder that this was brought up a few months ago, as a proposal by the stackless team, as they wanted to use a newer compiler for binaries. IIRC, there was a pretty resounding "don't do that" from this list. Makes sense to me -- we have how many different binaries of 2.7 on how many platforms, with how many compilers? Sure, python.org has been nicely consistent about what compiler (run time, really) they use to distribute Windows binaries, but the python version has NOTHING to do with what compiler is used. (for that matter there is 32 bit and 64 bit 2.7 on Windows ...) I think, at the time, it was thought that pip, wheel, and the metadata standards should be extended to allow multiple binaries of the same version with different compilers to be in the wild. Those projects have had bigger fish to fry, but maybe it's time to get ahead of the game with that, so we can accommodate this change. It's already getting hard to find VS2008 Express, and building 64 bit extensions is a serious pain. -Chris -- Christopher Barker, Ph.D.
Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From ncoghlan at gmail.com Sat Jun 7 06:41:34 2014 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 7 Jun 2014 14:41:34 +1000 Subject: [Python-Dev] Moving Python 3.5 on Windows to a new compiler In-Reply-To: <896772508423787267.391031sturla.molden-gmail.com@news.gmane.org> References: <529cffa5961d4b5bb57d554affe9643c@BLUPR03MB389.namprd03.prod.outlook.com> <4bad156ff9f145b792191327736e672d@BLUPR03MB389.namprd03.prod.outlook.com> <53920B4C.8020700@egenix.com> <20140606185631.GA11094@k2> <896772508423787267.391031sturla.molden-gmail.com@news.gmane.org> Message-ID: On 7 June 2014 08:43, Sturla Molden wrote: > Brett Cannon wrote: > >> Nope. A new minor release of Python is a massive undertaking which is why >> we have saved ourselves the hassle of doing a Python 2.8 or not giving a >> clear signal as to when Python 2.x will end as a language. > > Why not just define Python 2.8 as Python 2.7 except with a newer compiler? > I cannot see why that would be a massive undertaking, if changing compiler > for 2.7 is necessary anyway. It's honestly astonishing the number of people that tell us doing a new minor release of Python 2 is easy, and then refuse to believe us when we tell them it isn't. It's 2014 and Python *2.7*, which was released in *2010*, is STILL BEING ROLLED OUT. One part of the rollout that is near & dear to my own heart is the fact that Red Hat Enterprise Linux 7 and CentOS 7 are still in their respective release candidate phases, and it is the 6 -> 7 transition that finally upgrades their system Pythons from 2.6 to 2.7.
Maya 2014 & MotionBuilder 2014 are also the first versions Autodesk are shipping that use 2.7 rather than 2.6 as the scripting engine (although my understanding is that Autodesk don't guarantee compatibility with Python C extensions that aren't built specifically for use with their products, so they already use a newer C runtime on Windows than we do). And once those two dominoes fall, then there'll be some additional follow on upgrade work in some parts of the developer community as the *users* that receive their Python through those channels rather than directly from upstream switch from 2.6 to 2.7 and stumble over the small compatibility breaks between those two releases. Words like "just", or "simple", or "easy" really have no place being applied to a task where the time required to fully execute it with *no significant problems* is still measured in years. That said, there are definitely problems with toolchain availability on Windows for Python 2, and it isn't clear yet how that will be addressed in the long run. Steve is working on ensuring the official toolchain and C runtime binaries are more readily available from MS. Other folks are independently looking into ensuring that open source toolchains (like mingw) can be used effectively to at least build Python C extensions for Windows (and ironing out some of the glitches with that approach that others have mentioned). The Python Packaging Authority are continuing to work on the wheel based infrastructure to help avoid end users having to compile anything in the first place, and redistributors like ActiveState, Enthought & Continuum Analytics also make it possible for many end users to just ignore these upstream concerns. 
An extension compatibility break would be an absolute last resort, pursued only if all other attempts at resolving the challenges have demonstrably failed - even at the best of times it can take months for C extension authors to start publishing compatible binaries for a new minor release, so we'd have to assume that time would be even longer for a Python 2.7 maintenance release, if they published updated binaries at all. Regards, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From donald at stufft.io Sat Jun 7 06:47:38 2014 From: donald at stufft.io (Donald Stufft) Date: Sat, 7 Jun 2014 00:47:38 -0400 Subject: [Python-Dev] Moving Python 3.5 on Windows to a new compiler In-Reply-To: References: <529cffa5961d4b5bb57d554affe9643c@BLUPR03MB389.namprd03.prod.outlook.com> <4bad156ff9f145b792191327736e672d@BLUPR03MB389.namprd03.prod.outlook.com> <53920B4C.8020700@egenix.com> <20140606185631.GA11094@k2> <896772508423787267.391031sturla.molden-gmail.com@news.gmane.org> Message-ID: On Jun 7, 2014, at 12:41 AM, Nick Coghlan wrote: > On 7 June 2014 08:43, Sturla Molden wrote: >> Brett Cannon wrote: >> >>> Nope. A new minor release of Python is a massive undertaking which is why >>> we have saved ourselves the hassle of doing a Python 2.8 or not giving a >>> clear signal as to when Python 2.x will end as a language. >> >> Why not just define Python 2.8 as Python 2.7 except with a newer compiler? >> I cannot see why that would be a massive undertaking, if changing compiler >> for 2.7 is necessary anyway. > > It's honestly astonishing the number of people that tell us doing a > new minor release of Python 2 is easy, and then refuse to believe us > when we tell them it isn't. > > It's 2014 and Python *2.7*, which was released in *2010*, is STILL > BEING ROLLED OUT.
One part of the rollout that is near & dear to my > own heart is the fact that Red Hat Enterprise Linux 7 and CentOS 7 are > still in their respective release candidate phases, and it is the 6 -> > 7 transition that finally upgrades their system Pythons from 2.6 to > 2.7. Maya 2014 & MotionBuilder 2014 are also the first versions > Autodesk are shipping that use 2.7 rather than 2.6 as the scripting > engine (although my understanding is that Autodesk don't guarantee > compatibility with Python C extensions that aren't built specifically > for use with their products, so they already use a newer C runtime on > Windows than we do). > > And once those two dominoes fall, then there'll be some additional > follow on upgrade work in some parts of the developer community as the > *users* that receive their Python through those channels rather than > directly from upstream switch from 2.6 to 2.7 and stumble over the > small compatibility breaks between those two releases. > > Words like "just", or "simple", or "easy" really have no place being > applied to a task where the time required to fully execute it with *no > significant problems* is still measured in years. How much of that time exists because there were actual significant changes from 2.6 to 2.7, and how much of it would not need to exist if 2.8 was literally 2.7.Z with a new compiler on Windows? IOW is it the *version* number that causes the slow upgrade, or is it the fact that there are enough changes that it can't be safely applied automatically?
> Other folks are independently looking into ensuring that open source > toolchains (like mingw) can be used effectively to at least build > Python C extensions for Windows (and ironing out some of the glitches > with that approach that others have mentioned). The Python Packaging > Authority are continuing to work on the wheel based infrastructure to > help avoid end users having to compile anything in the first place, > and redistributors like ActiveState, Enthought & Continuum Analytics > also make it possible for many end users to just ignore these upstream > concerns. An extension compatibility break would be an absolute last > resort, pursued only if all other attempts at resolving the challenges > have demonstrably failed - even at the best of times it can take > months for C extension authors to start publishing compatible binaries > for a new minor release, so we'd have to assume that time would be > even longer for a Python 2.7 maintenance release, if they published > updated binaries at all. > > Regards, > Nick. > > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/donald%40stufft.io ----------------- Donald Stufft PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
From ncoghlan at gmail.com Sat Jun 7 06:49:24 2014 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 7 Jun 2014 14:49:24 +1000 Subject: [Python-Dev] Moving Python 3.5 on Windows to a new compiler In-Reply-To: References: <529cffa5961d4b5bb57d554affe9643c@BLUPR03MB389.namprd03.prod.outlook.com> <4bad156ff9f145b792191327736e672d@BLUPR03MB389.namprd03.prod.outlook.com> <53920B4C.8020700@egenix.com> <20140606185631.GA11094@k2> <896772508423787267.391031sturla.molden-gmail.com@news.gmane.org> Message-ID: On 7 June 2014 14:01, Chris Barker wrote: >> >> Why not just define Python 2.8 as Python 2.7 except with a newer compiler? >> I cannot see why that would be a massive undertaking, if changing compiler >> for 2.7 is necessary anyway. > > > A reminder that this was brought up a few months ago, as a proposal by the > Stackless team, as they wanted to use a newer compiler for binaries. IIRC, > there was a pretty resounding "don't do that" from this list. Makes sense to > me -- we have how many different binaries of 2.7 on how many platforms, with > how many compilers? Sure, python.org has been nicely consistent about what > compiler (run time, really) they use to distribute Windows binaries, but the > python version has NOTHING to do with what compiler is used. (for that matter > there is 32 bit and 64 bit 2.7 on Windows ...) Supported by python-dev? We have two: 32-bit and 64-bit, both depending on the Microsoft C runtime, and both published as binary installers on python.org. That's it. > I think, at the time, it was thought that pip, wheel, and the metadata > standards should be extended to allow multiple binaries of the same version > with different compilers to be in the wild. Those projects have had bigger > fish to fry, but maybe it's time to get ahead of the game with that, so we > can accommodate this change.
It's already getting hard to find VS2008 > Express, and building 64 bit extensions is a serious pain. That was a largely independent discussion, noting that if we come up with a mechanism for dealing with Linux distro variances, it may also be useful for dealing with Windows C runtime variances. Regards, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Sat Jun 7 06:58:18 2014 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 7 Jun 2014 14:58:18 +1000 Subject: [Python-Dev] Moving Python 3.5 on Windows to a new compiler In-Reply-To: References: <529cffa5961d4b5bb57d554affe9643c@BLUPR03MB389.namprd03.prod.outlook.com> <4bad156ff9f145b792191327736e672d@BLUPR03MB389.namprd03.prod.outlook.com> <53920B4C.8020700@egenix.com> <20140606185631.GA11094@k2> <896772508423787267.391031sturla.molden-gmail.com@news.gmane.org> Message-ID: On 7 June 2014 14:47, Donald Stufft wrote: > On Jun 7, 2014, at 12:41 AM, Nick Coghlan wrote: >> >> Words like "just", or "simple", or "easy" really have no place being >> applied to a task where the time required to fully execute it with *no >> significant problems* is still measured in years. > > How much of that time exists because there were actual significant > changes from 2.6 to 2.7 and how much of it would not need to exist > if 2.8 was literally 2.7.Z with a new compiler on Windows? IOW is it > the *version* number that causes the slow upgrade, or is it the fact > that there are enough changes that it can't be safely applied > automatically? It's the version number change itself. Python 2.7 was covered by the language moratorium, so it consists almost entirely of standard library changes, and the porting notes are minimal: https://docs.python.org/2/whatsnew/2.7.html#porting-to-python-2-7 We didn't even switch compilers on Windows (both 2.6 and 2.7 use VS 2008).
I can't think of a better demonstration than the slow pace of the Python 2.7 rollout that the challenges with doing a new minor release of Python really aren't technical ones at the language level - they're technical and administrative challenges in the way the language version number interacts with the broader Python ecosystem, especially the various redistribution channels. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From donald at stufft.io Sat Jun 7 07:05:19 2014 From: donald at stufft.io (Donald Stufft) Date: Sat, 7 Jun 2014 01:05:19 -0400 Subject: [Python-Dev] Moving Python 3.5 on Windows to a new compiler In-Reply-To: References: <529cffa5961d4b5bb57d554affe9643c@BLUPR03MB389.namprd03.prod.outlook.com> <4bad156ff9f145b792191327736e672d@BLUPR03MB389.namprd03.prod.outlook.com> <53920B4C.8020700@egenix.com> <20140606185631.GA11094@k2> <896772508423787267.391031sturla.molden-gmail.com@news.gmane.org> Message-ID: <2FCC7CC7-8D23-45BF-8157-1C92B9566A16@stufft.io> On Jun 7, 2014, at 12:58 AM, Nick Coghlan wrote: > On 7 June 2014 14:47, Donald Stufft wrote: >> On Jun 7, 2014, at 12:41 AM, Nick Coghlan wrote: >>> >>> Words like "just", or "simple", or "easy" really have no place being >>> applied to a task where the time required to fully execute it with *no >>> significant problems* is still measured in years. >> >> How much of that time exists because there were actual significant >> changes from 2.6 to 2.7 and how much of it would not need to exist >> if 2.8 was literally 2.7.Z with a new compiler on Windows? IOW is it >> the *version* number that causes the slow upgrade, or is it the fact >> that there are enough changes that it can't be safely applied >> automatically? > > It's the version number change itself.
Python 2.7 was covered by the > language moratorium, so it consists almost entirely of standard > library changes, and the porting notes are minimal: > https://docs.python.org/2/whatsnew/2.7.html#porting-to-python-2-7 I'm not sure I agree; the porting docs only show a subset of changes, you also have a lot of new stuff like OrderedDict, dict comprehensions, set literals, argparse, dict views, memory views, etc. AFAIK stable releases don't jump versions because all of these new features are risks, not because a number didn't change. I don't particularly care too much though, I just think that bumping the compiler in a 2.7.Z release is a really bad idea and that either of the other two options is massively better. > > We didn't even switch compilers on Windows (both 2.6 and 2.7 use VS 2008). > > I can't think of a better demonstration than the slow pace of the > Python 2.7 rollout that the challenges with doing a new minor release > of Python really aren't technical ones at the language level - they're > technical and administrative challenges in the way the language > version number interacts with the broader Python ecosystem, especially > the various redistribution channels. > > Cheers, > Nick. > > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia ----------------- Donald Stufft PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
From ncoghlan at gmail.com Sat Jun 7 07:18:49 2014 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 7 Jun 2014 15:18:49 +1000 Subject: [Python-Dev] Moving Python 3.5 on Windows to a new compiler In-Reply-To: <2FCC7CC7-8D23-45BF-8157-1C92B9566A16@stufft.io> References: <529cffa5961d4b5bb57d554affe9643c@BLUPR03MB389.namprd03.prod.outlook.com> <4bad156ff9f145b792191327736e672d@BLUPR03MB389.namprd03.prod.outlook.com> <53920B4C.8020700@egenix.com> <20140606185631.GA11094@k2> <896772508423787267.391031sturla.molden-gmail.com@news.gmane.org> <2FCC7CC7-8D23-45BF-8157-1C92B9566A16@stufft.io> Message-ID: On 7 June 2014 15:05, Donald Stufft wrote: > I don't particularly care too much though, I just think that bumping > the compiler in a 2.7.Z release is a really bad idea and that either > of the other two options is massively better. It is *incredibly* unlikely that backwards compatibility with binary extensions will be broken within the Python 2.7 series - there's a reason we said "No" when the Stackless folks were asking about it a while back. Instead, the toolchain availability problem is currently being tackled by trying to make suitable build toolchains more readily available (both the official VS 2008 toolchain and alternative open source toolchains), and by reducing the reliance on building from source for end users. Both of those courses of action are likely to bear fruit. It's only in the case where those approaches *don't* solve the problem that we'll need to come back and revisit the question of a compatibility break for binary extensions - it is, as you say, a really bad idea, and hence not something we would pursue when there are better options available (I think a Python 2.8 release would be an *even worse* idea in terms of souring our relationships with redistributors, but fortunately, those aren't our only two choices). Regards, Nick.
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Sat Jun 7 07:28:47 2014 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 7 Jun 2014 15:28:47 +1000 Subject: [Python-Dev] Moving Python 3.5 on Windows to a new compiler In-Reply-To: <529cffa5961d4b5bb57d554affe9643c@BLUPR03MB389.namprd03.prod.outlook.com> References: <529cffa5961d4b5bb57d554affe9643c@BLUPR03MB389.namprd03.prod.outlook.com> Message-ID: On 7 June 2014 01:41, Steve Dower wrote: > > What this means for Python is that C extensions for Python 3.5 and later can be built using any version of MSVC from 14.0 and later. Those who are aware of the current state of affairs where you need to use a matching compiler will hopefully see how big an improvement this will be. It is also likely that other compilers will have an easier time providing compatibility with this new CRT, making it simpler and more reliable to build extensions with LLVM or GCC against an MSVC CPython. \o/ That's great news. (I'm assuming that change in policy includes figuring out a solution to the file descriptor problem, since we determined during the Stackless 2.8 discussion that file descriptor mismatches were actually our biggest stumbling block when it came to mixing and matching different CRT versions in one process) > Basically, what I am offering to do is: > > * Update the files in PCBuild to work with Visual Studio "14" > * Make any code changes necessary to build with VC14 > * Regularly test the latest Python source with the latest MSVC builds and report issues/suggestions to the MSVC team > * Keep all changes in a separate (public) repo until early next year when we're getting close to the final VS "14" release > > What I am asking anyone else to do is: > > * Nothing > > Thoughts/comments/concerns? As long as we're also keeping the VS10 files up to date as a fallback option, which we will be, since the VS14 work will be in a separate repo, this sounds like a fine idea to me. Cheers, Nick. 
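For anyone wanting to see where the matching-compiler rule comes from today, the interpreter records the compiler it was built with in `sys.version`. The snippet below is an illustrative probe rather than an official API: the MSC version numbers are real (v.1500 is VS 2008, v.1600 is VS 2010, and the VS "14" toolchain reports v.1900 or later), but the policy logic is only a sketch of the rule Steve's change would relax.

```python
import re
import sys

def msc_version():
    # On MSVC builds sys.version looks like
    # '3.4.1 ... [MSC v.1600 64 bit (AMD64)]'; builds made with GCC,
    # Clang, etc. name those compilers instead, and this returns None.
    match = re.search(r"MSC v\.(\d+)", sys.version)
    return int(match.group(1)) if match else None

msc = msc_version()
if msc is None:
    print("not an MSVC build; the matching-compiler rule does not apply")
elif msc >= 1900:
    print("VS '14' era: any MSVC from 14.0 on shares the universal CRT")
else:
    print("classic rule: extensions must match MSC v.%d exactly" % msc)
```

The same probe is roughly what distutils relies on when it insists on a matching Visual Studio version before building an extension.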
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From stephen at xemacs.org Sat Jun 7 08:09:08 2014 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Sat, 07 Jun 2014 15:09:08 +0900 Subject: [Python-Dev] Moving Python 2.7 [was: 3.5] on Windows to a new compiler In-Reply-To: References: <53920B4C.8020700@egenix.com> <20140606185631.GA11094@k2> <99A4C614-FAC9-4201-859B-B698744A5DB9@stufft.io> <20140606194252.GA11482@k2> Message-ID: <87zjhp1caj.fsf@uwakimon.sk.tsukuba.ac.jp> Brian Curtin writes: > Adding features into 3.x is already not enough of a carrot on the > stick for many users. Intentionally leaving 2.7 on a dead compiler is > like beating them with the stick. No, it's like a New Year's resolution to stop self-flagellating, and handing the whip to the users to use on themselves, or not, as they choose. Remember, the users *chose* to remain locked-in to 2.7, hoping that we would continue to provide support, maybe 2.8. They had alternatives: contributing resources (in full-time developer support units!) to the PSF earmarked for Python 2, porting their dependencies to Python 3, etc. All expensive, yes, but eventually they need to pay the price of support or switching. Staying with Python 2 was always a bet that switching would be cheaper in the future, or that they'd have more resources in the future, or both. Who knows about the private resources, but not only does Python 3 acquire more features steadily, but efforts in core by folks like Ethan, distutils, and Nick (just to name those I've followed personally), along with steadily expanding ports of 3rd party libraries, are quickly making switching cheaper. Cheap *enough*? That's for the users themselves to decide. So I'm not arguing against support; this kind of support (*and* the people who argue that it's worth doing, and then *do* it!) is one reason I have *no* hesitation in recommending Python (3!) vs.
any comparable language.[1] But whatever is decided here, we're doing it for pride or for our own use, not because we owe the users anything. Footnotes: [1] I don't know enough about languages like Ruby or Perl to say Python provides strictly better support. I just can't imagine that it gets better than this! From breamoreboy at yahoo.co.uk Sat Jun 7 09:57:59 2014 From: breamoreboy at yahoo.co.uk (Mark Lawrence) Date: Sat, 07 Jun 2014 08:57:59 +0100 Subject: [Python-Dev] Moving Python 3.5 on Windows to a new compiler In-Reply-To: <4A7F1D64-2F36-428E-9682-9861B05DEFAD@stufft.io> References: <529cffa5961d4b5bb57d554affe9643c@BLUPR03MB389.namprd03.prod.outlook.com> <4bad156ff9f145b792191327736e672d@BLUPR03MB389.namprd03.prod.outlook.com> <53920B4C.8020700@egenix.com> <20140606185631.GA11094@k2> <896772508423787267.391031sturla.molden-gmail.com@news.gmane.org> <4A7F1D64-2F36-428E-9682-9861B05DEFAD@stufft.io> Message-ID: On 07/06/2014 02:13, Donald Stufft wrote: > > On Jun 6, 2014, at 9:05 PM, Terry Reedy wrote: > >> On 6/6/2014 6:47 PM, Nathaniel Smith wrote: >>> On Fri, Jun 6, 2014 at 11:43 PM, Sturla Molden wrote: >>>> Brett Cannon wrote: >>>> >>>>> Nope. A new minor release of Python is a massive undertaking which is why >>>>> we have saved ourselves the hassle of doing a Python 2.8 or not giving a >>>>> clear signal as to when Python 2.x will end as a language. >>>> >>>> Why not just define Python 2.8 as Python 2.7 except with a newer compiler? >>>> I cannot see why that would be a massive undertaking, if changing compiler >>>> for 2.7 is necessary anyway. >>> >>> This would require recompiling all packages on OS X and Linux, even >>> though nothing had changed. >> >> If you are suggesting that a Windows compiler change should be invisible to non-Windows users, I agree. >> >> Let us assume that /pcbuild remains for those who have vc2008 and that /pcbuild14 is added (and everything else remains as is).
Then the only other thing that would change is the Windows installer released on Python.org. Call that 2.7.9W or whatever on the download site and interactive startup message to signal that something is different. >> >> -- >> Terry Jan Reedy >> >> _______________________________________________ >> Python-Dev mailing list >> Python-Dev at python.org >> https://mail.python.org/mailman/listinfo/python-dev >> Unsubscribe: https://mail.python.org/mailman/options/python-dev/donald%40stufft.io > > How are packaging tools supposed to cope with this? AFAIK there is nothing in most of them to deal with a X.Y.Z release suddenly dealing with a different compiler. > > ----------------- > Donald Stufft > PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA > Potentially completely stupid suggestion to get people thinking (or die laughing :) , but would it be possible to use hex digits, such that 2.7.A was the first release on Windows with the different compiler? -- My fellow Pythonistas, ask not what our language can do for you, ask what you can do for our language. Mark Lawrence
From g.rodola at gmail.com Sat Jun 7 11:41:31 2014 From: g.rodola at gmail.com (Giampaolo Rodola') Date: Sat, 7 Jun 2014 11:41:31 +0200 Subject: [Python-Dev] Moving Python 3.5 on Windows to a new compiler In-Reply-To: <2FCC7CC7-8D23-45BF-8157-1C92B9566A16@stufft.io> References: <529cffa5961d4b5bb57d554affe9643c@BLUPR03MB389.namprd03.prod.outlook.com> <4bad156ff9f145b792191327736e672d@BLUPR03MB389.namprd03.prod.outlook.com> <53920B4C.8020700@egenix.com> <20140606185631.GA11094@k2> <896772508423787267.391031sturla.molden-gmail.com@news.gmane.org> <2FCC7CC7-8D23-45BF-8157-1C92B9566A16@stufft.io> Message-ID: On Sat, Jun 7, 2014 at 7:05 AM, Donald Stufft wrote: > > I don't particularly care too much though, I just think that bumping > the compiler in a 2.7.Z release is a really bad idea and that either > of the other two options is massively better. +1 -- Giampaolo - http://grodola.blogspot.com From ncoghlan at gmail.com Sat Jun 7 11:50:50 2014 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 7 Jun 2014 19:50:50 +1000 Subject: [Python-Dev] [Python-ideas] Expose `itertools.count.start` and implement `itertools.count.__eq__` based on it, like `range`. In-Reply-To: References: <082cd87a-aeb5-49bf-9f79-d99a6d18e402@googlegroups.com> Message-ID: On 7 June 2014 19:36, Ram Rachum wrote: > My need is to have an infinite immutable sequence. I did this for myself by > creating a simple `count`-like stateless class, but it would be nice if that > behavior was part of `range`. Handling esoteric use cases like the one yours sounds to be is *why* user-defined classes exist. It does not follow that "I had to write a custom class to solve my problem" should lead to a standard library or builtin changing unless you can make a compelling case for: * the change being a solution to a common problem that a lot of other people also have.
"I think it might be nice" and "it would have been useful to me to help solve this weird problem I had that one time" isn't enough. * the change fitting in *conceptually* with the existing language and tools. In this case, "infinite sequence" is a fundamentally incoherent concept in Python - len() certainly won't work, and negative indexing behaviour is hence not defined. By contrast, since iterables and iterators aren't required to support len() the way sequences are, infinite iterable and infinite iterator are both perfectly well defined. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From chris at simplistix.co.uk Fri Jun 6 20:50:57 2014 From: chris at simplistix.co.uk (Chris Withers) Date: Fri, 06 Jun 2014 19:50:57 +0100 Subject: [Python-Dev] namedtuple implementation grumble Message-ID: <53920D91.3060207@simplistix.co.uk> Hi All, I've been trying to add support for explicit comparison of namedtuples into testfixtures and hit a problem which led me to read the source and be sad. Rather than the mixin and class assembly in the function I expected to find, I'm greeted by an exec of a string. Curious as to what led to that implementation approach? What does it buy that couldn't have been obtained by a mixin providing the functionality? In my case, that's somewhat irrelevant, I'm looking to store a comparer in a registry that would get used for all namedtuples, but I have nothing to key that off, there are no shared bases other than object and tuple. I guess I could duck-type it based on the _fields attribute but that feels implicit and fragile. What do you guys suggest? cheers, Chris -- Simplistix - Content Management, Batch Processing & Python Consulting - http://www.simplistix.co.uk From rdmurray at bitdance.com Sat Jun 7 15:25:24 2014 From: rdmurray at bitdance.com (R.
David Murray) Date: Sat, 07 Jun 2014 09:25:24 -0400 Subject: [Python-Dev] namedtuple implementation grumble In-Reply-To: <53920D91.3060207@simplistix.co.uk> References: <53920D91.3060207@simplistix.co.uk> Message-ID: <20140607132525.2A2F9250D5C@webabinitio.net> On Fri, 06 Jun 2014 19:50:57 +0100, Chris Withers wrote: > I've been trying to add support for explicit comparison of namedtuples > into testfixtures and hit a problem which led me to read the source and > be sad. > > Rather than the mixin and class assembly in the function I expected to > find, I'm greeted by an exec of a string. > > Curious as to what led to that implementation approach? What does it > buy that couldn't have been obtained by a mixin providing the functionality? > > In my case, that's somewhat irrelevant, I'm looking to store a comparer > in a registry that would get used for all namedtuples, but I have > nothing to key that off, there are no shared bases other than object and > tuple. > > I guess I could duck-type it based on the _fields attribute but that > feels implicit and fragile. > > What do you guys suggest? I seem to remember a previous discussion that concluded that duck typing based on _fields was the way to go. (It's a public API, despite the _, due to namedtuple's attribute namespacing issues.) --David From steve at pearwood.info Sat Jun 7 16:29:55 2014 From: steve at pearwood.info (Steven D'Aprano) Date: Sun, 8 Jun 2014 00:29:55 +1000 Subject: [Python-Dev] namedtuple implementation grumble In-Reply-To: <53920D91.3060207@simplistix.co.uk> References: <53920D91.3060207@simplistix.co.uk> Message-ID: <20140607142955.GQ10355@ando> On Fri, Jun 06, 2014 at 07:50:57PM +0100, Chris Withers wrote: > Hi All, > > I've been trying to add support for explicit comparison of namedtuples > into testfixtures and hit a problem which led me to read the source and > be sad. > > Rather than the mixin and class assembly in the function I expected to > find, I'm greeted by an exec of a string.
> > Curious as to what led to that implementation approach? What does it > buy that couldn't have been obtained by a mixin providing the functionality? namedtuple started off as a recipe on ActiveState by Raymond Hettinger. Start here: http://code.activestate.com/recipes/500261-named-tuples/?in=user-178123 -- Steven From ncoghlan at gmail.com Sat Jun 7 16:46:47 2014 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 8 Jun 2014 00:46:47 +1000 Subject: [Python-Dev] namedtuple implementation grumble In-Reply-To: <53920D91.3060207@simplistix.co.uk> References: <53920D91.3060207@simplistix.co.uk> Message-ID: On 7 June 2014 04:50, Chris Withers wrote: > Curious as to what led to that implementation approach? What does it buy > that couldn't have been obtained by a mixin providing the functionality? In principle, you could get the equivalent of collections.namedtuple through dynamically constructed classes. In practice, that's actually easier said than done, so the fact the current implementation works fine for almost all purposes acts as a powerful disincentive to rewriting it. The current implementation is also *really* easy to understand, while writing out the dynamic type creation explicitly would likely require much deeper knowledge of the type machinery to follow. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From antoine at python.org Sat Jun 7 16:50:16 2014 From: antoine at python.org (Antoine Pitrou) Date: Sat, 07 Jun 2014 10:50:16 -0400 Subject: [Python-Dev] namedtuple implementation grumble In-Reply-To: <20140607132525.2A2F9250D5C@webabinitio.net> References: <53920D91.3060207@simplistix.co.uk> <20140607132525.2A2F9250D5C@webabinitio.net> Message-ID: Le 07/06/2014 09:25, R. David Murray a écrit : > On Fri, 06 Jun 2014 19:50:57 +0100, Chris Withers wrote: >> I've been trying to add support for explicit comparison of namedtuples >> into testfixtures and hit a problem which led me to read the source and >> be sad.
>> >> Rather than the mixin and class assembly in the function I expected to >> find, I'm greeted by an exec of a string. >> >> Curious as to what lead to that implementation approach? What does it >> buy that couldn't have been obtained by a mixin providing the functionality? >> >> In my case, that's somewhat irrelevant, I'm looking to store a comparer >> in a registry that would get used for all namedtuples, but I have >> nothing to key that off, there are no shared bases other than object and >> tuple. >> >> I guess I could duck-type it based on the _fields attribute but that >> feels implicit and fragile. >> >> What do you guys suggest? > > I seem to remember a previous discussion that concluded that duck typing > based on _fields was the way to go. (It's a public API, despite the _, > due to name-tuple's attribute namespacing issues.) There could be many third-party classes with a _fields member, so that sounds rather fragile. There doesn't seem to be any technical reason barring the addition of a common base class for namedtuples. Regards Antoine. From njs at pobox.com Sat Jun 7 09:23:46 2014 From: njs at pobox.com (Nathaniel Smith) Date: Sat, 7 Jun 2014 08:23:46 +0100 Subject: [Python-Dev] Moving Python 3.5 on Windows to a new compiler In-Reply-To: References: <529cffa5961d4b5bb57d554affe9643c@BLUPR03MB389.namprd03.prod.outlook.com> <4bad156ff9f145b792191327736e672d@BLUPR03MB389.namprd03.prod.outlook.com> <53920B4C.8020700@egenix.com> <20140606185631.GA11094@k2> <896772508423787267.391031sturla.molden-gmail.com@news.gmane.org> <2FCC7CC7-8D23-45BF-8157-1C92B9566A16@stufft.io> Message-ID: On 7 Jun 2014 06:19, "Nick Coghlan" wrote: > > On 7 June 2014 15:05, Donald Stufft wrote: > > I don't particularly care too much though, I just think that bumping > > the compiler in a 2.7.Z release is a really bad idea and that either > > of the other two options are massively better.
> > It is *incredibly* unlikely that backwards compatibility with binary > extensions will be broken within the Python 2.7 series - there's a > reason we said "No" when the Stackless folks were asking about it a > while back. Instead, the toolchain availability problem is currently > being tackled by trying to make suitable build toolchains more readily > available (both the official VS 2008 toolchain and alternative open > source toolchains), and by reducing the reliance on building from > source for end users. A third piece of the puzzle could potentially be the availability of automated wheel-building services. (Personally I still haven't successfully managed to build windows wheels for my own packages, and envy my R-using colleagues whose PyPi equivalent does the building for them.) -n -------------- next part -------------- An HTML attachment was scrubbed... URL: From pcmanticore at gmail.com Sat Jun 7 15:11:54 2014 From: pcmanticore at gmail.com (Claudiu Popa) Date: Sat, 7 Jun 2014 16:11:54 +0300 Subject: [Python-Dev] Division of tool labour in porting Python 2 code to 2/3 In-Reply-To: References: Message-ID: On Fri, Jun 6, 2014 at 7:37 PM, Brett Cannon wrote: > After Glyph and Alex's email about their asks for assisting in writing > Python 2/3 code, it got me thinking about where in the toolchain various > warnings and such should go in order to help direct energy to help develop > whatever future toolchain to assist in porting. > > There seems to be three places where issues are/can be caught once a project > has embarked down the road of 2/3 source compatibility: > > -3 warnings > Some linter tool Pylint could help here. 
We already have a couple of checks which addresses the issue of porting between Python 2 and 3, checks like: raising-string old-style-class slots-on-old-class super-on-old-class old-raise-syntax old-ne-operator lowercase-l-suffix backtick unpacking-in-except indexing-exception property-on-old-class There was an idea on Pylint's bugtracker to implement a plugin for Python 2, with warnings dedicated to porting and this solution seems easier than the alternatives. From bcannon at gmail.com Sat Jun 7 17:37:47 2014 From: bcannon at gmail.com (Brett Cannon) Date: Sat, 07 Jun 2014 15:37:47 +0000 Subject: [Python-Dev] Division of tool labour in porting Python 2 code to 2/3 References: Message-ID: On Sat Jun 07 2014 at 9:11:54 AM, Claudiu Popa wrote: > On Fri, Jun 6, 2014 at 7:37 PM, Brett Cannon wrote: > > After Glyph and Alex's email about their asks for assisting in writing > > Python 2/3 code, it got me thinking about where in the toolchain various > > warnings and such should go in order to help direct energy to help > develop > > whatever future toolchain to assist in porting. > > > > There seems to be three places where issues are/can be caught once a > project > > has embarked down the road of 2/3 source compatibility: > > > > -3 warnings > > Some linter tool > > > Pylint could help here. We already have a couple of checks which > addresses the issue of porting between Python 2 and 3, checks like: > > raising-string > old-style-class > slots-on-old-class > super-on-old-class > old-raise-syntax > old-ne-operator > lowercase-l-suffix > backtick > unpacking-in-except > indexing-exception > property-on-old-class > > There was an idea on Pylint's bugtracker to implement a plugin for > Python 2, with warnings dedicated to porting and this solution seems > easier than the alternatives. > Yes, pylint is definitely an option. 
I have not looked at how hard it would be to write the rules, though, and how easy it would be to run with just those rules (if I remember correctly pylint can take a config, but I have not run it manually in a while). Having something which walked the 2.7 CST or AST wouldn't be difficult to write either, so it's just a matter of balance of work required. -------------- next part -------------- An HTML attachment was scrubbed... URL: From Steve.Dower at microsoft.com Sat Jun 7 17:38:41 2014 From: Steve.Dower at microsoft.com (Steve Dower) Date: Sat, 7 Jun 2014 15:38:41 +0000 Subject: [Python-Dev] Moving Python 3.5 on Windows to a new compiler In-Reply-To: References: <529cffa5961d4b5bb57d554affe9643c@BLUPR03MB389.namprd03.prod.outlook.com> <4bad156ff9f145b792191327736e672d@BLUPR03MB389.namprd03.prod.outlook.com> <53920B4C.8020700@egenix.com> <20140606185631.GA11094@k2> <896772508423787267.391031sturla.molden-gmail.com@news.gmane.org> <2FCC7CC7-8D23-45BF-8157-1C92B9566A16@stufft.io> , Message-ID: <1402155524095.94474@microsoft.com> One more possible concern that I just thought of is the availability of the build tools on Windows Vista and Windows 7 RTM (that is, without SP1). I'd have to check, but I don't believe anything after VS 2012 is supported on Vista and it's entirely possible that installation is blocked. This may be a non-issue. VC14 still has the "XP mode" that avoids using new APIs, so compiled Python will run fine, but it may be the case that the compiler doesn't (if we manage to get a separate, compiler-only package, that is. VS itself is definitely unusable). I assume gcc/clang will continue to support earlier OSs, so hopefully by the time 3.5 is getting early releases there will be an option for building extensions. I doubt anyone on this list is stuck on Vista or in a position where they can't keep Win7 updated, but do we know of any environments where this may be a problem? -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From njs at pobox.com Sat Jun 7 18:56:16 2014 From: njs at pobox.com (Nathaniel Smith) Date: Sat, 7 Jun 2014 17:56:16 +0100 Subject: [Python-Dev] [numpy wishlist] Interpreter support for temporary elision in third-party classes In-Reply-To: <53925EBA.4050306@canterbury.ac.nz> References: <539126DB.8010306@canterbury.ac.nz> <5391F8A4.70401@googlemail.com> <53925EBA.4050306@canterbury.ac.nz> Message-ID: On Sat, Jun 7, 2014 at 1:37 AM, Greg Ewing wrote: > Julian Taylor wrote: >> >> tp_can_elide receives two objects and returns one of three values: >> * can work inplace, operation is associative >> * can work inplace but not associative >> * cannot work inplace > > > Does it really need to be that complicated? Isn't it > sufficient just to ask the object potentially being > overwritten whether it's okay to overwrite it? > I.e. a parameterless method returning a boolean. For the numpy case, we really need to see all the operands, *and* know what the operation in question is. Consider tmp1 = np.ones((3, 1)) tmp2 = np.ones((1, 3)) tmp1 + tmp2 which returns an array with shape (3, 3). Both input arrays are temporaries, but neither of them can be stolen to use for the output array. Or suppose 'a' is an array of integers and 'b' is an array of floats, then 'a + b' and 'a += b' have very different results (the former upcasts 'a' to float, the latter has to either downcast 'b' to int or raise an error). But the casting rules depend on the particular input types and the particular operation -- operations like & and << want to cast to int, < and > return bools, etc. So one really needs to know all the details of the operation before one can determine whether temporary elision is possible. -n -- Nathaniel J. 
Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org From mistersheik at gmail.com Sat Jun 7 19:57:05 2014 From: mistersheik at gmail.com (Neil Girdhar) Date: Sat, 7 Jun 2014 13:57:05 -0400 Subject: [Python-Dev] [Python-ideas] Expose `itertools.count.start` and implement `itertools.count.__eq__` based on it, like `range`. In-Reply-To: References: <082cd87a-aeb5-49bf-9f79-d99a6d18e402@googlegroups.com> Message-ID: On Sat, Jun 7, 2014 at 5:50 AM, Nick Coghlan wrote: > On 7 June 2014 19:36, Ram Rachum wrote: > > My need is to have an infinite immutable sequence. I did this for myself > by > > creating a simple `count`-like stateless class, but it would be nice if > that > > behavior was part of `range`. > > Handling esoteric use cases like it sounds yours was is *why* user > defined classes exist. It does not follow that "I had to write a > custom class to solve my problem" should lead to a standard library or > builtin changing unless you can make a compelling case for: > > * the change being a solution to a common problem that a lot of other > people also have. "I think it might be nice" and "it would have been > useful to me to help solve this weird problem I had that one time" > isn't enough. > * the change fitting in *conceptually* with the existing language and > tools. In this case, "infinite sequence" is a fundamentally incoherent > concept in Python - len() certainly won't work, and negative indexing > behaviour is hence not defined. By contrast, since iterables and > iterators aren't required to support len() the way sequences are, > infinite iterable and infinite iterator are both perfectly well > defined. > With all due respect, '"infinite sequence" is a fundamentally incoherent concept in Python' is a bit hyperbolic. It would be perfectly reasonable to have them, but they're not defined (yet). > > Cheers, > Nick.
> > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > -------------- next part -------------- An HTML attachment was scrubbed... URL: From v+python at g.nevcal.com Sat Jun 7 21:42:32 2014 From: v+python at g.nevcal.com (Glenn Linderman) Date: Sat, 07 Jun 2014 12:42:32 -0700 Subject: [Python-Dev] namedtuple implementation grumble In-Reply-To: References: <53920D91.3060207@simplistix.co.uk> Message-ID: <53936B28.1080605@g.nevcal.com> On 6/7/2014 7:50 AM, Antoine Pitrou wrote: > Le 07/06/2014 09:25, R. David Murray a écrit : >> On Fri, 06 Jun 2014 19:50:57 +0100, Chris Withers >> wrote: >>> I guess I could duck-type it based on the _fields attribute but that >>> feels implicit and fragile. >>> >>> What do you guys suggest? >> >> I seem to remember a previous discussion that concluded that duck typing >> based on _fields was the way to go. (It's a public API, despite the _, >> due to name-tuple's attribute namespacing issues.) > > There could be many third-party classes with a _fields member, so that > sounds rather fragile. > There doesn't seem to be any technical reason barring the addition of > a common base class for namedtuples. > > Regards > > Antoine. A common base class sounds like a good idea, to me, at a minimum, to help identify all the namedtuple derivatives. On 6/7/2014 7:46 AM, Nick Coghlan wrote: > On 7 June 2014 04:50, Chris Withers wrote: >> Curious as to what lead to that implementation approach? What does it buy >> that couldn't have been obtained by a mixin providing the functionality? > In principle, you could get the equivalent of collections.namedtuple > through dynamically constructed classes. In practice, that's actually > easier said than done, so the fact the current implementation works > fine for almost all purposes acts as a powerful disincentive to > rewriting it.
The current implementation is also *really* easy to > understand, while writing out the dynamic type creation explicitly > would likely require much deeper knowledge of the type machinery to > follow. I wonder if the dynamically constructed classes approach could lead to the same space and time efficiencies... seems like I recall there being a discussion of efficiency, I think primarily space efficiency, as a justification for the present implementation. namedtuple predates of the improvements in metaclasses, also, which may be a justification for the present implementation. I bumped into namedtuple when I first started coding in Python, I was looking for _some way_, _any way_ to achieve an unmutable class with named members, and came across Raymond's recipe, which others have linked to... and learned, at the time, that he was putting it into Python stdlib. I found it far from "*really* easy to understand", although at that point in my Python knowledge, I highly doubt a metaclass implementation would have been easier to understand... but learning metaclasses earlier than I did might have been good for my general understanding of Python, and more useful in the toolbox than an implementation like namedtuple. I did, however, find and suggest a fix for a bug in the namedtuple implementation that Raymond was rather surprised that he had missed, although I would have to pick through the email archives to remember now what it was, or any other details about it... but it was in time to get fixed before the first release of Python that included namedtuple, happily. I wouldn't be opposed to someone rewriting namedtuple using metaclasses, to compare the implementations from an understandability and from an efficiency standpoint... but I don't think my metaclass skills are presently sufficient to make the attempt myself. 
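[As a point of comparison for the "dynamically constructed classes" approach wondered about above, here is one way a namedtuple-like factory can be written without exec. This is an illustrative sketch only -- the name `make_namedtuple` is made up, and it omits `_make`, `_replace`, `_asdict`, `_source` and the careful argument checking of the real `collections.namedtuple`:]

```python
from operator import itemgetter

def make_namedtuple(typename, field_names):
    # Hypothetical non-exec sketch of a namedtuple factory, for comparison only.
    fields = tuple(field_names.split())

    def __new__(cls, *args, **kwargs):
        # Positional arguments first, then remaining fields filled by keyword.
        values = args + tuple(kwargs[name] for name in fields[len(args):])
        if len(values) != len(fields):
            raise TypeError('%s() expects %d arguments' % (typename, len(fields)))
        return tuple.__new__(cls, values)

    def __repr__(self):
        body = ', '.join('%s=%r' % (n, v) for n, v in zip(fields, self))
        return '%s(%s)' % (typename, body)

    namespace = {'__slots__': (), '_fields': fields,
                 '__new__': __new__, '__repr__': __repr__}
    # Each field name becomes a read-only property over tuple indexing.
    for index, name in enumerate(fields):
        namespace[name] = property(itemgetter(index))
    return type(typename, (tuple,), namespace)
```

[Whether a reader finds this easier to follow than the exec'd template is exactly the judgment call Nick describes earlier in the thread.]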
I also seem to recall that somewhere in the (lengthy) Enum discussions, that Enum uses a technique similar to namedtuple, again for an efficiency reason, even though it also uses metaclasses in its implementation. I wonder if, if the reasons were well understood by someone that understand Python internals far better than I do, if they point out some capability that is missing from metaclasses that lead to these decisions to use string parsing and manipulation as a basis for implementing classes with metaclass-like behaviors, yet not use the metaclass feature set to achieve those behaviors. Glenn -------------- next part -------------- An HTML attachment was scrubbed... URL: From pmiscml at gmail.com Sat Jun 7 22:00:15 2014 From: pmiscml at gmail.com (Paul Sokolovsky) Date: Sat, 7 Jun 2014 23:00:15 +0300 Subject: [Python-Dev] namedtuple implementation grumble In-Reply-To: <53936B28.1080605@g.nevcal.com> References: <53920D91.3060207@simplistix.co.uk> <53936B28.1080605@g.nevcal.com> Message-ID: <20140607230015.71fbc213@x34f> Hello, On Sat, 07 Jun 2014 12:42:32 -0700 Glenn Linderman wrote: > On 6/7/2014 7:50 AM, Antoine Pitrou wrote: > > Le 07/06/2014 09:25, R. David Murray a écrit : > >> On Fri, 06 Jun 2014 19:50:57 +0100, Chris Withers > >> wrote: > >>> I guess I could duck-type it based on the _fields attribute but > >>> that feels implicit and fragile. > >>> > >>> What do you guys suggest? > >> > >> I seem to remember a previous discussion that concluded that duck > >> typing based on _fields was the way to go. (It's a public API, > >> despite the _, due to name-tuple's attribute namespacing issues.)
> > A common base class sounds like a good idea, to me, at a minimum, to > help identify all the namedtuple derivatives. I'm perplexed - isn't "tuple" such common base class? And checking for both "tuple" base class and "_fields" member will identify it with ~same probability as a check for special base type (because it's fair to say that if someone *both* subclassed a builtin type and add _fields member, then they wanted it to be treated as namedtuple). [] -- Best regards, Paul mailto:pmiscml at gmail.com From lpanl09 at gmail.com Sat Jun 7 23:50:02 2014 From: lpanl09 at gmail.com (Le Pa) Date: Sat, 7 Jun 2014 21:50:02 +0000 (UTC) Subject: [Python-Dev] cpython and python debugger documentation Message-ID: Hi, I am interested in learning how the cpython interpreter is designed and implemented, and also how the python debugger works internally. My ultimate purpose is to modify them for my distributed computing needs. Are there any documentations on these please? I have done some goggling but failed to find anything useful. Thanks you very much for your help! -Le From greg.ewing at canterbury.ac.nz Sun Jun 8 01:02:51 2014 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sun, 08 Jun 2014 11:02:51 +1200 Subject: [Python-Dev] [numpy wishlist] Interpreter support for temporary elision in third-party classes In-Reply-To: References: <539126DB.8010306@canterbury.ac.nz> <5391F8A4.70401@googlemail.com> <53925EBA.4050306@canterbury.ac.nz> Message-ID: <53939A1B.9060705@canterbury.ac.nz> Nathaniel Smith wrote: > For the numpy case, we really need to see all the operands, *and* know > what the operation in question is... Okay, I see what you mean now. Given all that, it might be simpler just to have the method perform the operation itself if it can. It has all the information necessary to do so, after all. This would also make it possible for the inplace operators to have different semantics from temp-elided non-inplace ones if desired. 
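[Greg's "let the object decide and perform the operation itself" idea is often illustrated with CPython reference counts: inside `__add__`, a true temporary is reachable only through the interpreter stack, so its refcount is tiny. The toy sketch below assumes CPython; the `Vec` class and the `<= 3` threshold are illustrative assumptions, not NumPy's actual mechanism:]

```python
import sys

class Vec:
    """Toy vector that recycles its own storage when it looks like a temporary."""
    def __init__(self, data):
        self.data = list(data)

    def __add__(self, other):
        # A temporary left operand is typically seen here holding only the
        # caller's stack reference, this frame's `self` binding, and
        # getrefcount's own argument reference -- hence the small threshold.
        if sys.getrefcount(self) <= 3:
            self.data = [a + b for a, b in zip(self.data, other.data)]
            return self  # elide: reuse the temporary in place
        # Named (still-referenced) operand: allocate a fresh result.
        return Vec(a + b for a, b in zip(self.data, other.data))
```

[In an expression like `(u + v) + w` the intermediate result can take the in-place branch, while named variables keep the allocating branch -- and, as Nathaniel notes, a real implementation would additionally have to consider dtypes, broadcasting, and the particular operation.]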
-- Greg From ncoghlan at gmail.com Sun Jun 8 01:35:58 2014 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 8 Jun 2014 09:35:58 +1000 Subject: [Python-Dev] namedtuple implementation grumble In-Reply-To: <53936B28.1080605@g.nevcal.com> References: <53920D91.3060207@simplistix.co.uk> <53936B28.1080605@g.nevcal.com> Message-ID: On 8 Jun 2014 05:44, "Glenn Linderman" wrote: > > I wonder if the dynamically constructed classes approach could lead to the same space and time efficiencies... seems like I recall there being a discussion of efficiency, I think primarily space efficiency, as a justification for the present implementation. namedtuple predates of the improvements in metaclasses, also, which may be a justification for the present implementation. As far as I am aware, there's nothing magical in the classes namedtuple creates that would require a custom metaclass - it's just that what it does would likely be even harder to read if written out explicitly rather than letting the compiler & eval loop deal with it. However, we've drifted off topic for python-dev at this point. If anyone wanted to experiment with alternative implementations, python-ideas would be the place to discuss that. Cheers, Nick. -------------- next part -------------- An HTML attachment was scrubbed... URL: From eric at trueblade.com Sun Jun 8 21:13:55 2014 From: eric at trueblade.com (Eric V. Smith) Date: Sun, 08 Jun 2014 15:13:55 -0400 Subject: [Python-Dev] namedtuple implementation grumble In-Reply-To: References: <53920D91.3060207@simplistix.co.uk> Message-ID: <5394B5F3.6050403@trueblade.com> On 6/7/2014 10:46 AM, Nick Coghlan wrote: > On 7 June 2014 04:50, Chris Withers wrote: >> Curious as to what lead to that implementation approach? What does it buy >> that couldn't have been obtained by a mixin providing the functionality? > > In principle, you could get the equivalent of collections.namedtuple > through dynamically constructed classes. 
In practice, that's actually > easier said than done, so the fact the current implementation works > fine for almost all purposes acts as a powerful disincentive to > rewriting it. The current implementation is also *really* easy to > understand, while writing out the dynamic type creation explicitly > would likely require much deeper knowledge of the type machinery to > follow. As proof that it's harder to understand, here's an example of that dynamically creating functions and types: https://pypi.python.org/pypi/namedlist https://bitbucket.org/ericvsmith/namedlist/src/163d0d05e94f9cc0af8e269015b9ac3bf9a83826/namedlist.py?at=default#cl-155 It uses the ast module to build an __init__ (or __new__) function dynamically, without exec. Then it creates a type using that function to initialize the new type. namedlist.namedtuple passes all collections.namedtuple tests, except for those using the _source attribute (of course). namedlist.namedlist and namedlist.namedtuple both support a clunky interface to specify default values for member fields. The reasons I didn't use the collections.namedtuple exec-based approach are: - specify default values to __init__ or __new__ became very complex - 2.x and 3.x support is harder with exec Eric. From dw+python-dev at hmmz.org Sun Jun 8 21:37:46 2014 From: dw+python-dev at hmmz.org (dw+python-dev at hmmz.org) Date: Sun, 8 Jun 2014 19:37:46 +0000 Subject: [Python-Dev] namedtuple implementation grumble In-Reply-To: <5394B5F3.6050403@trueblade.com> References: <53920D91.3060207@simplistix.co.uk> <5394B5F3.6050403@trueblade.com> Message-ID: <20140608193746.GA1687@k2> On Sun, Jun 08, 2014 at 03:13:55PM -0400, Eric V. Smith wrote: > > The current implementation is also *really* easy to understand, > > while writing out the dynamic type creation explicitly would likely > > require much deeper knowledge of the type machinery to follow. 
> As proof that it's harder to understand, here's an example of that > dynamically creating functions and types: Probably I'm missing something, but there's a much simpler non-exec approach, something like: class _NamedTuple(...): ... def namedtuple(name, fields): cls = tuple(name, (_NamedTuple,), { '_fields': fields.split() }) for i, field_name in enumerate(cls._fields): prop = property(functools.partial(_NamedTuple.__getitem__, i) functools.partial(_NamedTuple.__setitem__, i)) setattr(cls, field_name, prop) return cls David From dw+python-dev at hmmz.org Sun Jun 8 21:38:47 2014 From: dw+python-dev at hmmz.org (dw+python-dev at hmmz.org) Date: Sun, 8 Jun 2014 19:38:47 +0000 Subject: [Python-Dev] namedtuple implementation grumble In-Reply-To: <20140608193746.GA1687@k2> References: <53920D91.3060207@simplistix.co.uk> <5394B5F3.6050403@trueblade.com> <20140608193746.GA1687@k2> Message-ID: <20140608193847.GB1687@k2> On Sun, Jun 08, 2014 at 07:37:46PM +0000, dw+python-dev at hmmz.org wrote: > cls = tuple(name, (_NamedTuple,), { Ugh, this should of course have been type(). David From eric at trueblade.com Sun Jun 8 23:27:41 2014 From: eric at trueblade.com (Eric V. Smith) Date: Sun, 08 Jun 2014 17:27:41 -0400 Subject: [Python-Dev] namedtuple implementation grumble In-Reply-To: <20140608193746.GA1687@k2> References: <53920D91.3060207@simplistix.co.uk> <5394B5F3.6050403@trueblade.com> <20140608193746.GA1687@k2> Message-ID: <5394D54D.9020507@trueblade.com> On 6/8/2014 3:37 PM, dw+python-dev at hmmz.org wrote: > On Sun, Jun 08, 2014 at 03:13:55PM -0400, Eric V. Smith wrote: > >>> The current implementation is also *really* easy to understand, >>> while writing out the dynamic type creation explicitly would likely >>> require much deeper knowledge of the type machinery to follow. 
> >> As proof that it's harder to understand, here's an example of that >> dynamically creating functions and types: > > Probably I'm missing something, but there's a much simpler non-exec > approach, something like: > > class _NamedTuple(...): > ... > > def namedtuple(name, fields): > cls = tuple(name, (_NamedTuple,), { > '_fields': fields.split() > }) > for i, field_name in enumerate(cls._fields): > prop = property(functools.partial(_NamedTuple.__getitem__, i) > functools.partial(_NamedTuple.__setitem__, i)) > setattr(cls, field_name, prop) > return cls How would you write _Namedtuple.__new__? From dw+python-dev at hmmz.org Sun Jun 8 23:51:35 2014 From: dw+python-dev at hmmz.org (dw+python-dev at hmmz.org) Date: Sun, 8 Jun 2014 21:51:35 +0000 Subject: [Python-Dev] namedtuple implementation grumble In-Reply-To: <5394D54D.9020507@trueblade.com> References: <53920D91.3060207@simplistix.co.uk> <5394B5F3.6050403@trueblade.com> <20140608193746.GA1687@k2> <5394D54D.9020507@trueblade.com> Message-ID: <20140608215135.GA2970@k2> On Sun, Jun 08, 2014 at 05:27:41PM -0400, Eric V. Smith wrote: > How would you write _Namedtuple.__new__? Knew something must be missing :) Obviously it's possible, but not nearly as efficiently as reusing the argument parsing machinery as in the original implementation. I guess especially the kwargs implementation below would suck.. _undef = object() class _NamedTuple(...): def __new__(cls, *a, **kw): if kw: a = list(a) + ([_undef] * (len(self._fields)-len(a))) for k, v in kw.iteritems(): i = cls._name_id_map[k] if a[i] is not _undef: raise TypeError(...) a[i] = v if _undef not in a: return tuple.__new__(cls, a) raise TypeError(...) else: if len(a) == len(self._fields): return tuple.__new__(cls, a) raise TypeError(...) 
def namedtuple(name, fields): fields = fields.split() cls = type(name, (_NamedTuple,), { '_fields': fields, '_name_id_map': {k: i for i, k in enumerate(fields)} }) for i, field_name in enumerate(fields): getter = functools.partial(_NamedTuple.__getitem__, i) setattr(cls, field_name, property(getter)) return cls David From rdmurray at bitdance.com Mon Jun 9 00:44:02 2014 From: rdmurray at bitdance.com (R. David Murray) Date: Sun, 08 Jun 2014 18:44:02 -0400 Subject: [Python-Dev] namedtuple implementation grumble In-Reply-To: References: <53920D91.3060207@simplistix.co.uk> <20140607132525.2A2F9250D5C@webabinitio.net> Message-ID: <20140608224402.834E5250D4E@webabinitio.net> On Sat, 07 Jun 2014 10:50:16 -0400, Antoine Pitrou wrote: > Le 07/06/2014 09:25, R. David Murray a écrit : > > On Fri, 06 Jun 2014 19:50:57 +0100, Chris Withers wrote: > >> I've been trying to add support for explicit comparison of namedtuples > >> into testfixtures and hit a problem which lead me to read the source and > >> be sad. > >> > >> Rather than the mixin and class assembly in the function I expected to > >> find, I'm greeted by an exec of a string. > >> > >> Curious as to what lead to that implementation approach? What does it > >> buy that couldn't have been obtained by a mixin providing the functionality? > >> > >> In my case, that's somewhat irrelevant, I'm looking to store a comparer > >> in a registry that would get used for all namedtuples, but I have > >> nothing to key that off, there are no shared bases other than object and > >> tuple. > >> > >> I guess I could duck-type it based on the _fields attribute but that > >> feels implicit and fragile. > >> > >> What do you guys suggest? > > > > I seem to remember a previous discussion that concluded that duck typing > > based on _fields was the way to go. (It's a public API, despite the _, > > due to name-tuple's attribute namespacing issues.)
> > There could be many third-party classes with a _fields member, so that > sounds rather fragile. > There doesn't seem to be any technical reason barring the addition of a > common base class for namedtuples. For what it is worth, I found the discussion I was remembering: http://bugs.python.org/issue7796 And as someone pointed out down thread, the actual check is "inherits from tuple and has a _fields attribute". That gets you a duck type, which is generally what you want in Python. --David From antoine at python.org Mon Jun 9 01:32:11 2014 From: antoine at python.org (Antoine Pitrou) Date: Sun, 08 Jun 2014 19:32:11 -0400 Subject: [Python-Dev] namedtuple implementation grumble In-Reply-To: <20140608224402.834E5250D4E@webabinitio.net> References: <53920D91.3060207@simplistix.co.uk> <20140607132525.2A2F9250D5C@webabinitio.net> <20140608224402.834E5250D4E@webabinitio.net> Message-ID: Le 08/06/2014 18:44, R. David Murray a écrit : > > For what it is worth, I found the discussion I was remembering: > > http://bugs.python.org/issue7796 > > And as someone pointed out down thread, the actual check is "inherits > from tuple and has a _fields attribute". > > That gets you a duck type, which is generally what you want in Python. I think it's a bit complicated (and not obviously discoverable) as far as duck-typing goes. Regards Antoine. From steve at pearwood.info Mon Jun 9 01:31:17 2014 From: steve at pearwood.info (Steven D'Aprano) Date: Mon, 9 Jun 2014 09:31:17 +1000 Subject: [Python-Dev] namedtuple implementation grumble In-Reply-To: <5394B5F3.6050403@trueblade.com> References: <53920D91.3060207@simplistix.co.uk> <5394B5F3.6050403@trueblade.com> Message-ID: <20140608233117.GS10355@ando> On Sun, Jun 08, 2014 at 03:13:55PM -0400, Eric V. Smith wrote: > On 6/7/2014 10:46 AM, Nick Coghlan wrote: > > On 7 June 2014 04:50, Chris Withers wrote: > >> Curious as to what lead to that implementation approach?
What does it buy > >> that couldn't have been obtained by a mixin providing the functionality? > > > > In principle, you could get the equivalent of collections.namedtuple > > through dynamically constructed classes. In practice, that's actually > > easier said than done, so the fact the current implementation works > > fine for almost all purposes acts as a powerful disincentive to > > rewriting it. The current implementation is also *really* easy to > > understand, while writing out the dynamic type creation explicitly > > would likely require much deeper knowledge of the type machinery to > > follow. > > As proof that it's harder to understand, here's an example of that > dynamically creating functions and types: [...] I wonder how a hybrid approach would work? Use a dynamically-created class, but then construct the __new__ method using exec and inject it into the new class. As far as I can see, it's only __new__ that benefits from the exec approach. Anyone tried this yet? Is it worth an experiment? -- Steven From raymond.hettinger at gmail.com Mon Jun 9 02:03:11 2014 From: raymond.hettinger at gmail.com (Raymond Hettinger) Date: Sun, 8 Jun 2014 17:03:11 -0700 Subject: [Python-Dev] namedtuple implementation grumble In-Reply-To: <20140607132525.2A2F9250D5C@webabinitio.net> References: <53920D91.3060207@simplistix.co.uk> <20140607132525.2A2F9250D5C@webabinitio.net> Message-ID: On Jun 7, 2014, at 6:25 AM, R. David Murray wrote: >> I guess I could duck-type it based on the _fields attribute but that >> feels implicit and fragile. >> >> What do you guys suggest? > > I seem to remember a previous discussion that concluded that duck typing > based on _fields was the way to go. (It's a public API, despite the _, > due to name-tuple's attribute namespacing issues.) Yes. That is the recommended approach. 
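[The recommended duck-type check can be wrapped in a tiny helper. The function name below is made up; the test itself is exactly the thread's "inherits from tuple and has a _fields attribute":]

```python
from collections import namedtuple

def is_named_tuple(obj):
    # Duck-type check recommended in this thread: a tuple subclass
    # that exposes the public _fields attribute.
    return isinstance(obj, tuple) and hasattr(obj, '_fields')

Point = namedtuple('Point', 'x y')
print(is_named_tuple(Point(1, 2)))  # True
print(is_named_tuple((1, 2)))       # False
```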
IIRC that was Guido's suggestion rather than creating an abstract base class for a named tuple (any tuple-like class with indexable elements that are also accessible using named attributes). Raymond -------------- next part -------------- An HTML attachment was scrubbed... URL: From eric at trueblade.com Mon Jun 9 03:21:42 2014 From: eric at trueblade.com (Eric V. Smith) Date: Sun, 08 Jun 2014 21:21:42 -0400 Subject: [Python-Dev] namedtuple implementation grumble In-Reply-To: <20140608233117.GS10355@ando> References: <53920D91.3060207@simplistix.co.uk> <5394B5F3.6050403@trueblade.com> <20140608233117.GS10355@ando> Message-ID: <53950C26.10006@trueblade.com> On 6/8/2014 7:31 PM, Steven D'Aprano wrote: > On Sun, Jun 08, 2014 at 03:13:55PM -0400, Eric V. Smith wrote: >> On 6/7/2014 10:46 AM, Nick Coghlan wrote: >>> On 7 June 2014 04:50, Chris Withers wrote: >>>> Curious as to what lead to that implementation approach? What does it buy >>>> that couldn't have been obtained by a mixin providing the functionality? >>> >>> In principle, you could get the equivalent of collections.namedtuple >>> through dynamically constructed classes. In practice, that's actually >>> easier said than done, so the fact the current implementation works >>> fine for almost all purposes acts as a powerful disincentive to >>> rewriting it. The current implementation is also *really* easy to >>> understand, while writing out the dynamic type creation explicitly >>> would likely require much deeper knowledge of the type machinery to >>> follow. >> >> As proof that it's harder to understand, here's an example of that >> dynamically creating functions and types: > [...] > > > I wonder how a hybrid approach would work? Use a dynamically-created > class, but then construct the __new__ method using exec and inject it > into the new class. As far as I can see, it's only __new__ that benefits > from the exec approach. > > Anyone tried this yet? Is it worth an experiment? 
I'm not sure what the benefit would be. Other than the ast manipulations for __new__, the rest of the non-exec code is easy to understand. Eric. From ncoghlan at gmail.com Mon Jun 9 03:42:59 2014 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 9 Jun 2014 11:42:59 +1000 Subject: [Python-Dev] namedtuple implementation grumble In-Reply-To: References: <53920D91.3060207@simplistix.co.uk> <20140607132525.2A2F9250D5C@webabinitio.net> Message-ID: On 9 Jun 2014 10:04, "Raymond Hettinger" wrote: > > > On Jun 7, 2014, at 6:25 AM, R. David Murray wrote: > >>> I guess I could duck-type it based on the _fields attribute but that >>> feels implicit and fragile. >>> >>> What do you guys suggest? >> >> >> I seem to remember a previous discussion that concluded that duck typing >> based on _fields was the way to go. (It's a public API, despite the _, >> due to name-tuple's attribute namespacing issues.) > > > Yes. That is the recommended approach. > > IIRC that was Guido's suggestion rather than creating an abstract > base class for a named tuple (any tuple-like class with indexable > elements that are also accessible using named attributes). Given the somewhat periodic recurrence of the question, might it be worth making an ABC after all, with "subclass of tuple with a _fields attribute" as its default check? "isinstance(obj, collections.NamedTupleABC)" is quite a bit more self-documenting than "isinstance(obj, tuple) and hasattr(obj, '_fields')" Cheers, Nick. > > > Raymond > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ncoghlan%40gmail.com > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From raymond.hettinger at gmail.com Mon Jun 9 06:05:17 2014 From: raymond.hettinger at gmail.com (Raymond Hettinger) Date: Sun, 8 Jun 2014 21:05:17 -0700 Subject: [Python-Dev] namedtuple implementation grumble In-Reply-To: References: <53920D91.3060207@simplistix.co.uk> <20140607132525.2A2F9250D5C@webabinitio.net> Message-ID: <97F551CA-E992-43F9-B988-B040DF1B7D76@gmail.com> On Jun 8, 2014, at 6:42 PM, Nick Coghlan wrote: > >> I seem to remember a previous discussion that concluded that duck typing > >> based on _fields was the way to go. (It's a public API, despite the _, > >> due to name-tuple's attribute namespacing issues.) > > > > > > Yes. That is the recommended approach. > > > > IIRC that was Guido's suggestion rather than creating an abstract > > base class for a named tuple (any tuple-like class with indexable > > elements that are also accessible using named attributes). > > Given the somewhat periodic recurrence of the question, might it be worth making an ABC after all, with "subclass of tuple with a _fields attribute" as its default check? > > "isinstance(obj, collections.NamedTupleABC)" is quite a bit more self-documenting than "isinstance(obj, tuple) and hasattr(obj, '_fields')" > The "isinstance(obj, tuple)" part isn't a requirement. The concept of a named tuple is meant to include structseq objects or user defined classes that are "tuple-like with indexable elements that are also accessible using named attributes" (see the definition in the glossary). I could add a note to the docs saying that hasattr(obj, '_fields') is the preferred way to check for named tuples produced by the namedtuple() factory function, but it would be a waste to introduce an ABC for this. (Consider the failure of the Callable() abc leading to us deciding to reintroduce the callable() builtin function, and consider the general unwillingness to test for iterability using the Iterable abc). Another issue is that a straight abc wouldn't be sufficient.
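For concreteness, the straight ABC being debated, with "subclass of tuple with a _fields attribute" as its default check, might be sketched via a __subclasshook__ (the class name is hypothetical, not a stdlib API):

```python
from abc import ABCMeta
from collections import namedtuple

class NamedTupleABC(metaclass=ABCMeta):
    """Hypothetical ABC: a tuple subclass that carries _fields."""

    @classmethod
    def __subclasshook__(cls, C):
        if cls is NamedTupleABC:
            # Note what this does NOT verify: that every name listed in
            # _fields is actually defined, or that structseq-style
            # non-tuple classes qualify.
            if issubclass(C, tuple) and hasattr(C, "_fields"):
                return True
        return NotImplemented

Point = namedtuple("Point", "x y")
```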
What we would really want to check for is: 1) the presence of a _fields tuple (an abc can do this) 2) that all of the attribute names specified in _fields are defined (ABCMeta doesn't do this) 3) and that the type is a Sequence (ABCMeta can do this). A tricked-out ABC extension might be worth it if it provided some non-trivial mixin capabilities for implementing homegrown named tuples (not created by the factory function), but I don't think we want to go there. The problem isn't important enough to warrant throwing this much code and a new API at it (duck-typing the attributes and checking for _fields is a practical solution that works even on older pythons). Raymond -------------- next part -------------- An HTML attachment was scrubbed... URL: From sstewartgallus00 at mylangara.bc.ca Sun Jun 8 23:22:19 2014 From: sstewartgallus00 at mylangara.bc.ca (Steven Stewart-Gallus) Date: Sun, 08 Jun 2014 21:22:19 +0000 (GMT) Subject: [Python-Dev] Help with the build system and my first patch Message-ID: Hello, I would like some help understanding the build system. I am currently working on an issue (http://bugs.python.org/issue21627) and plan to create some common functionality in Python/setcloexec.c and Include/setcloexec.h that is conditionally compiled in on POSIX systems and not on Windows systems. I need to extract this functionality out from _Py_set_inheritable because it needs to run in the dangerous context of right after a fork and I don't believe it can throw exceptions. How can I conditionally compile some library code for certain platforms only? Thank you, Steven Stewart-Gallus From berker.peksag at gmail.com Mon Jun 9 11:31:14 2014 From: berker.peksag at gmail.com (=?UTF-8?Q?Berker_Peksa=C4=9F?=) Date: Mon, 9 Jun 2014 12:31:14 +0300 Subject: [Python-Dev] [Python-checkins] cpython: Closes #21256: Printout of keyword args in deterministic order in mock calls.
In-Reply-To: <3gn6jt4bdPz7LjP@mail.python.org> References: <3gn6jt4bdPz7LjP@mail.python.org> Message-ID: On Mon, Jun 9, 2014 at 11:16 AM, kushal.das wrote: > http://hg.python.org/cpython/rev/8e05e15901a8 > changeset: 91102:8e05e15901a8 > user: Kushal Das > date: Mon Jun 09 13:45:56 2014 +0530 > summary: > Closes #21256: Printout of keyword args in deterministic order in mock calls. > > Printout of keyword args should be in deterministic order in > a mock function call. This will help to write better doctests. > > files: > Lib/unittest/mock.py | 2 +- > Lib/unittest/test/testmock/testmock.py | 6 ++++++ > Misc/NEWS | 3 +++ > 3 files changed, 10 insertions(+), 1 deletions(-) > > > diff --git a/Lib/unittest/mock.py b/Lib/unittest/mock.py > --- a/Lib/unittest/mock.py > +++ b/Lib/unittest/mock.py > @@ -1894,7 +1894,7 @@ > formatted_args = '' > args_string = ', '.join([repr(arg) for arg in args]) > kwargs_string = ', '.join([ > - '%s=%r' % (key, value) for key, value in kwargs.items() > + '%s=%r' % (key, value) for key, value in sorted(kwargs.items()) > ]) > if args_string: > formatted_args = args_string > diff --git a/Lib/unittest/test/testmock/testmock.py b/Lib/unittest/test/testmock/testmock.py > --- a/Lib/unittest/test/testmock/testmock.py > +++ b/Lib/unittest/test/testmock/testmock.py > @@ -1206,6 +1206,12 @@ > with self.assertRaises(AssertionError): > m.hello.assert_not_called() > > + #Issue21256 printout of keyword args should be in deterministic order > + def test_sorted_call_signature(self): > + m = Mock() > + m.hello(name='hello', daddy='hero') > + text = "call(daddy='hero', name='hello')" > + self.assertEquals(repr(m.hello.call_args), text) Should this be assertEqual instead? --Berker > > def test_mock_add_spec(self): > class _One(object): > diff --git a/Misc/NEWS b/Misc/NEWS > --- a/Misc/NEWS > +++ b/Misc/NEWS > @@ -92,6 +92,9 @@ > Library > ------- > > +- Issue #21256: Printout of keyword args should be in deterministic order in > + a mock function call. 
This will help to write better doctests. > + > - Issue #21677: Fixed chaining nonnormalized exceptions in io close() methods. > > - Issue #11709: Fix the pydoc.help function to not fail when sys.stdin is not a > > -- > Repository URL: http://hg.python.org/cpython > > _______________________________________________ > Python-checkins mailing list > Python-checkins at python.org > https://mail.python.org/mailman/listinfo/python-checkins > From antoine at python.org Mon Jun 9 13:40:41 2014 From: antoine at python.org (Antoine Pitrou) Date: Mon, 09 Jun 2014 07:40:41 -0400 Subject: [Python-Dev] namedtuple implementation grumble In-Reply-To: <97F551CA-E992-43F9-B988-B040DF1B7D76@gmail.com> References: <53920D91.3060207@simplistix.co.uk> <20140607132525.2A2F9250D5C@webabinitio.net> <97F551CA-E992-43F9-B988-B040DF1B7D76@gmail.com> Message-ID: Le 09/06/2014 00:05, Raymond Hettinger a ?crit : > > Another issue is that a straight abc wouldn't be sufficient. What we > would really want is to check for is: > 1) the presence of a _fields tuple (an abc can do this) > 2) to check that all of the attribute names specified in _fields are > defined (ABCMeta doesn't do this) > 3) and that the type is a Sequence (ABCMeta can do this). > > An tricked-out ABC extension might be worth it if it provided some > non-trivial mixin capabilities for implementing homegrown named tuples > (not created by the factory function), but I don't think we want to go > there. Instead of an ABC, why not a simple is_namedtuple() function? Regards Antoine. From bcannon at gmail.com Mon Jun 9 16:01:18 2014 From: bcannon at gmail.com (Brett Cannon) Date: Mon, 09 Jun 2014 14:01:18 +0000 Subject: [Python-Dev] cpython and python debugger documentation References: Message-ID: On Sat Jun 07 2014 at 5:55:29 PM, Le Pa wrote: > Hi, > > I am interested in learning how the cpython interpreter is designed and > implemented, > and also how the python debugger works internally. 
My ultimate purpose is > to > modify > them for my distributed computing needs. Are there any documentations > on these please? I have done some goggling but failed to find anything > useful. > > Thanks you very much for your help! > The only documentation we have is (roughly) how the parser and compiler work, not the interpreter. As for pdb, it's written in Python so you can look at the source to see how that works without much issue. -------------- next part -------------- An HTML attachment was scrubbed... URL: From bcannon at gmail.com Mon Jun 9 16:03:01 2014 From: bcannon at gmail.com (Brett Cannon) Date: Mon, 09 Jun 2014 14:03:01 +0000 Subject: [Python-Dev] Help with the build system and my first patch References: Message-ID: On Mon Jun 09 2014 at 2:07:22 AM, Steven Stewart-Gallus < sstewartgallus00 at mylangara.bc.ca> wrote: > Hello, > > I would like some help understanding the build system. I am currently > working on an issue (http://bugs.python.org/issue21627) and plan to > create some common functionality in Python/setcloexec.c and > Include/setcloexec.h that is conditionally compiled in on POSIX > systems and not on Windows systems. I need to extract this > functionality out from _Py_set_inheritable because it needs to run in > the dangerous context of right after a fork and I don't believe it can > throw exceptions. How can I conditionally compile some library code > for certain platforms only? > Do you mean other than potentially detecting something in the configure script and using an #ifdef guard? -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From pmiscml at gmail.com Mon Jun 9 16:50:02 2014 From: pmiscml at gmail.com (Paul Sokolovsky) Date: Mon, 9 Jun 2014 17:50:02 +0300 Subject: [Python-Dev] cpython and python debugger documentation In-Reply-To: References: Message-ID: <20140609175002.6ad27c91@x34f> Hello, On Mon, 09 Jun 2014 14:01:18 +0000 Brett Cannon wrote: > On Sat Jun 07 2014 at 5:55:29 PM, Le Pa wrote: > > > Hi, > > > > I am interested in learning how the cpython interpreter is designed > > and implemented, > > and also how the python debugger works internally. My ultimate > > purpose is to > > modify > > them for my distributed computing needs. Are there any > > documentations on these please? I have done some goggling but > > failed to find anything useful. > > > > Thanks you very much for your help! > > > > The only documentation we have is (roughly) how the parser and > compiler work, not the interpreter. As for pdb, it's written in > Python so you can look at the source to see how that works without > much issue. But doing attentive googling will turn out a lot of 3rd-party blog posts which discuss various implementation aspects of CPython (and even alternative implementations). Some random links: http://tech.blog.aknin.name/category/my-projects/pythons-innards/ http://eli.thegreenplace.net/2010/09/18/python-internals-symbol-tables-part-1/ http://nedbatchelder.com/blog/200804/the_structure_of_pyc_files.html One should keep in mind that implementation evolves all the time, and any info in older docs may be obsolete. So, the ultimate reference is the source itself, but posts like above can be a good help to understand it more easily and effectively. 
-- Best regards, Paul mailto:pmiscml at gmail.com From eliben at gmail.com Mon Jun 9 18:26:25 2014 From: eliben at gmail.com (Eli Bendersky) Date: Mon, 9 Jun 2014 09:26:25 -0700 Subject: [Python-Dev] cpython and python debugger documentation In-Reply-To: <20140609175002.6ad27c91@x34f> References: <20140609175002.6ad27c91@x34f> Message-ID: On Mon, Jun 9, 2014 at 7:50 AM, Paul Sokolovsky wrote: > Hello, > > On Mon, 09 Jun 2014 14:01:18 +0000 > Brett Cannon wrote: > > > On Sat Jun 07 2014 at 5:55:29 PM, Le Pa wrote: > > > > > Hi, > > > > > > I am interested in learning how the cpython interpreter is designed > > > and implemented, > > > and also how the python debugger works internally. My ultimate > > > purpose is to > > > modify > > > them for my distributed computing needs. Are there any > > > documentations on these please? I have done some goggling but > > > failed to find anything useful. > > > > > > Thanks you very much for your help! > > > > > > > The only documentation we have is (roughly) how the parser and > > compiler work, not the interpreter. As for pdb, it's written in > > Python so you can look at the source to see how that works without > > much issue. > > But doing attentive googling will turn out a lot of 3rd-party blog > posts which discuss various implementation aspects of CPython (and even > alternative implementations). Some random links: > > http://tech.blog.aknin.name/category/my-projects/pythons-innards/ > > http://eli.thegreenplace.net/2010/09/18/python-internals-symbol-tables-part-1/ > FWIW I have a bunch of those, and the symbol table one is probably not the best for beginners. The whole category is here: http://eli.thegreenplace.net/category/programming/python/python-internals/ Eli -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From raymond.hettinger at gmail.com Mon Jun 9 18:34:31 2014 From: raymond.hettinger at gmail.com (Raymond Hettinger) Date: Mon, 9 Jun 2014 09:34:31 -0700 Subject: [Python-Dev] namedtuple implementation grumble In-Reply-To: References: <53920D91.3060207@simplistix.co.uk> <20140607132525.2A2F9250D5C@webabinitio.net> <97F551CA-E992-43F9-B988-B040DF1B7D76@gmail.com> Message-ID: On Jun 9, 2014, at 4:40 AM, Antoine Pitrou wrote: > Instead of an ABC, why not a simple is_namedtuple() function? That would work. Raymond -------------- next part -------------- An HTML attachment was scrubbed... URL: From sstewartgallus00 at mylangara.bc.ca Mon Jun 9 19:48:27 2014 From: sstewartgallus00 at mylangara.bc.ca (Steven Stewart-Gallus) Date: Mon, 09 Jun 2014 17:48:27 +0000 (GMT) Subject: [Python-Dev] Help with the build system and my first patch In-Reply-To: References: Message-ID: > Do you mean other than potentially detecting something in the > configurescript and using an #ifdef guard? Yes, that works on a static function inside a file level but I need to conditionally include a whole file into the build. From tjreedy at udel.edu Mon Jun 9 20:44:03 2014 From: tjreedy at udel.edu (Terry Reedy) Date: Mon, 09 Jun 2014 14:44:03 -0400 Subject: [Python-Dev] cpython and python debugger documentation In-Reply-To: References: <20140609175002.6ad27c91@x34f> Message-ID: On 6/9/2014 12:26 PM, Eli Bendersky wrote: > > > > On Mon, Jun 9, 2014 at 7:50 AM, Paul Sokolovsky > wrote: > > Hello, > > On Mon, 09 Jun 2014 14:01:18 +0000 > Brett Cannon > wrote: > > > On Sat Jun 07 2014 at 5:55:29 PM, Le Pa > wrote: > > > > > Hi, > > > > > > I am interested in learning how the cpython interpreter is designed > > > and implemented, > > > and also how the python debugger works internally. My ultimate > > > purpose is to > > > modify > > > them for my distributed computing needs. Are there any > > > documentations on these please? I have done some goggling but > > > failed to find anything useful. 
> > > > > > Thanks you very much for your help! > > > > > > > The only documentation we have is (roughly) how the parser and > > compiler work, not the interpreter. As for pdb, it's written in > > Python so you can look at the source to see how that works without > > much issue. > > But doing attentive googling will turn out a lot of 3rd-party blog > posts which discuss various implementation aspects of CPython (and even > alternative implementations). Some random links: > > http://tech.blog.aknin.name/category/my-projects/pythons-innards/ > http://eli.thegreenplace.net/2010/09/18/python-internals-symbol-tables-part-1/ > > > FWIW I have a bunch of those, and the symbol table one is probably not > the best for beginners. The whole category is here: > http://eli.thegreenplace.net/category/programming/python/python-internals/ Perhaps someone could make a wiki entry such as PythonInternals with links such as these. -- Terry Jan Reedy From bcannon at gmail.com Mon Jun 9 20:45:46 2014 From: bcannon at gmail.com (Brett Cannon) Date: Mon, 09 Jun 2014 18:45:46 +0000 Subject: [Python-Dev] Help with the build system and my first patch References: Message-ID: On Mon Jun 09 2014 at 1:48:27 PM, Steven Stewart-Gallus < sstewartgallus00 at mylangara.bc.ca> wrote: > > Do you mean other than potentially detecting something in the > > configure script and using an #ifdef guard? > > Yes, that works on a static function inside a file level but I need to > conditionally include a whole file into the build. > Why specifically does the file itself need to be conditional? Typically you unconditionally include the whole file and then put the entire contents of it in an #ifdef guard. -------------- next part -------------- An HTML attachment was scrubbed...
URL: From pmiscml at gmail.com Tue Jun 10 04:23:12 2014 From: pmiscml at gmail.com (Paul Sokolovsky) Date: Tue, 10 Jun 2014 05:23:12 +0300 Subject: [Python-Dev] Criticism of execfile() removal in Python3 Message-ID: <20140610052312.280e49c9@x34f> Hello, I was pleasantly surprised with the response to recent post about MicroPython implementation details (https://mail.python.org/pipermail/python-dev/2014-June/134718.html). I hope that discussion means that posts about alternative implementations are not unwelcome here, so I would like to bring up another (of many) issues we faced while implementing MicroPython. execfile() builtin function was removed in 3.0. This brings few problems: 1. It hampers interactive mode - instead of short and easy to type execfile("file.py") one needs to use exec(open("file.py").read()). I'm sure that's not going to bother a lot of people - after all, the easiest way to execute a Python file is to drop back to shell and restart python with file name, using all wonders of tab completion. But now imagine that Python interpreter runs on bare hardware, and its REPL is the only shell. That's exactly what we have with MicroPython's Cortex-M port. But it's not really MicroPython-specific, there's CPython port to baremetal either - http://www.pycorn.org/ . 2. Ok, assuming that exec(open().read()) idiom is still a way to go, there's a problem - it requires to load entire file to memory. But there can be not enough memory. Consider 1Mb file with 900Kb comments (autogenerated, for example). execfile() could easily parse it, using small buffer. But exec() requires to slurp entire file into memory, and 1Mb is much more than heap sizes that we target. Comments, suggestions? Just to set a productive direction, please kindly don't consider the problems above as MicroPython's. 
I very much liked how last discussion went: I was pointed that https://docs.python.org/3/reference/index.html is not really a CPython reference, it's a *Python* reference, and there was even a motion to clarify in it some points which came out of the MicroPython discussion. So, what about https://docs.python.org/3/library/index.html - is it the CPython, or the Python, standard library specification? Assuming the latter, what we have is that, by the removal of a previously available feature, *Python* became less friendly for interactive usage and less scalable. Thanks, Paul mailto:pmiscml at gmail.com From steve at pearwood.info Tue Jun 10 05:03:03 2014 From: steve at pearwood.info (Steven D'Aprano) Date: Tue, 10 Jun 2014 13:03:03 +1000 Subject: [Python-Dev] Criticism of execfile() removal in Python3 In-Reply-To: <20140610052312.280e49c9@x34f> References: <20140610052312.280e49c9@x34f> Message-ID: <20140610030303.GU10355@ando> On Tue, Jun 10, 2014 at 05:23:12AM +0300, Paul Sokolovsky wrote: > execfile() builtin function was removed in 3.0. This brings few > problems: > > 1. It hampers interactive mode - instead of short and easy to type > execfile("file.py") one needs to use exec(open("file.py").read()). If the amount of typing is the problem, that's easy to solve: # do this once def execfile(name): exec(open(name).read()) Another possibility is: os.system("python file.py") > 2. Ok, assuming that exec(open().read()) idiom is still a way to go, > there's a problem - it requires to load entire file to memory. But > there can be not enough memory. Consider 1Mb file with 900Kb comments > (autogenerated, for example). execfile() could easily parse it, using > small buffer. But exec() requires to slurp entire file into memory, and > 1Mb is much more than heap sizes that we target. There's nothing stopping alternative implementations having their own implementation-specific standard library modules.
steve at orac:/home/s$ jython Jython 2.5.1+ (Release_2_5_1, Aug 4 2010, 07:18:19) [OpenJDK Server VM (Sun Microsystems Inc.)] on java1.6.0_27 Type "help", "copyright", "credits" or "license" for more information. >>> import java >>> So you could do this: from upy import execfile execfile("file.py") So long as you make it clear that this is a platform specific module, and don't advertise it as a language feature, I see no reason why you cannot do that. -- Steven From tjreedy at udel.edu Tue Jun 10 05:56:09 2014 From: tjreedy at udel.edu (Terry Reedy) Date: Mon, 09 Jun 2014 23:56:09 -0400 Subject: [Python-Dev] Criticism of execfile() removal in Python3 In-Reply-To: <20140610030303.GU10355@ando> References: <20140610052312.280e49c9@x34f> <20140610030303.GU10355@ando> Message-ID: On 6/9/2014 11:03 PM, Steven D'Aprano wrote: > On Tue, Jun 10, 2014 at 05:23:12AM +0300, Paul Sokolovsky wrote: > >> execfile() builtin function was removed in 3.0. Because it was hardly ever used. For short bits of code, it is usually inferior to exec with a string in the file. For substantial bits of code, it is generally inferior to 'from file import *' and does not have the option of other forms of import. For startup code that you want every session, it is inferior to PYTHONSTARTUP or custom site module. >> This brings few problems: >> 1. It hampers interactive mode - instead of short and easy to type >> execfile("file.py") one needs to use exec(open("file.py").read()) > If the amount of typing is the problem, that's easy to solve: > > # do this once > def execfile(name): > exec(open(name).read()) > > Another possibility is: > > os.system("python file.py") > > >> 2. Ok, assuming that exec(open().read()) idiom is still a way to go, >> there's a problem - it requires to load entire file to memory. But >> there can be not enough memory. Consider 1Mb file with 900Kb comments >> (autogenerated, for example). execfile() could easily parse it, using >> small buffer.
But exec() requires to slurp entire file into memory, and >> 1Mb is much more than heap sizes that we target. Execfile could slurp the whole file into memory too. Next parse the entire file. Then execute the entire bytecode. Finally toss the bytecode so that the file has to be reparsed next time it is used. > There's nothing stopping alternative implementations having their own > implementation-specific standard library modules. ... > So you could do this: > > from upy import execfile > execfile("file.py") > > So long as you make it clear that this is a platform specific module, > and don't advertise it as a language feature, I see no reason why you > cannot do that. If you want execfile as a substitute for 'python -i file' on the unavailable command console, you should have the option to restore globals to initial condition. Something like (untested) # startup entries in globals in CPython 3.4.1 startnames = {'__spec__', '__name__', '__builtins__', '__doc__', '__loader__', '__package__'} def execfile(file, encoding='utf-8', restart=True): glodict = globals() code = open(file, 'r', encoding=encoding) # don't restart if the file does not open if restart: for name in list(glodict): if name not in startnames: del glodict[name] for statement in statements(code): # statements is a statement iterator exec(statement, glodict, glodict) # exec takes globals/locals positionally -- Terry Jan Reedy From benhoyt at gmail.com Tue Jun 10 06:02:14 2014 From: benhoyt at gmail.com (Ben Hoyt) Date: Tue, 10 Jun 2014 00:02:14 -0400 Subject: [Python-Dev] Returning Windows file attribute information via os.stat() Message-ID: Hi folks, As pointed out to me recently in an issue report [1] on my scandir module, Python's os.stat() simply discards most of the file attribute information fetched via the Win32 system calls. On Windows, os.stat() calls CreateFile to open the file and get the dwFileAttributes value, but it throws it all away except the FILE_ATTRIBUTE_DIRECTORY and FILE_ATTRIBUTE_READONLY bits.
See CPython source at [2]. Given that os.stat() returns extended, platform-specific file attributes on Linux and OS X platforms (see [3] -- for example, st_blocks, st_rsize, etc), it seems that Windows is something of a second-class citizen here. There are several questions on StackOverflow about how to get this information on Windows, and one has to resort to ctypes. For example, [4]. To solve this problem, what do people think about adding an "st_winattrs" attribute to the object returned by os.stat() on Windows? Then, similarly to existing code like hasattr(st, 'st_blocks') on Linux, you could write a cross-platform function to determine if a file was hidden, something like so: FILE_ATTRIBUTE_HIDDEN = 2 # constant defined in Windows.h def is_hidden(path): if os.path.basename(path).startswith('.'): return True st = os.stat(path) if hasattr(st, 'st_winattrs') and st.st_winattrs & FILE_ATTRIBUTE_HIDDEN: return True return False I'd be interested to hear people's thoughts on this. Thanks, Ben. [1]: https://github.com/benhoyt/scandir/issues/22 [2]: https://github.com/python/cpython/blob/master/Modules/posixmodule.c#L1462 [3]: https://docs.python.org/3.4/library/os.html#os.stat [4]: http://stackoverflow.com/a/6365265 From jim.baker at zyasoft.com Tue Jun 10 06:41:19 2014 From: jim.baker at zyasoft.com (Jim Baker) Date: Mon, 9 Jun 2014 22:41:19 -0600 Subject: [Python-Dev] Criticism of execfile() removal in Python3 In-Reply-To: <20140610030303.GU10355@ando> References: <20140610052312.280e49c9@x34f> <20140610030303.GU10355@ando> Message-ID: On Mon, Jun 9, 2014 at 9:03 PM, Steven D'Aprano wrote: > ... > There's nothing stopping alternative implementations having their own > implementation-specific standard library modules. > > steve at orac:/home/s$ jython > Jython 2.5.1+ (Release_2_5_1, Aug 4 2010, 07:18:19) > [OpenJDK Server VM (Sun Microsystems Inc.)] on java1.6.0_27 > Type "help", "copyright", "credits" or "license" for more information.
> >>> import java > >>> > > Small nit: Jython does implement a number of implementation-specific modules in its version of the standard library; jarray comes to mind, which is mostly but not completely superseded by the standard array module. However, the java package namespace is not part of the standard library, it's part of the standard Java ecosystem and it's due to a builtin import hook: Jython 2.7b3+ (default:6cee6fef06f0, Jun 9 2014, 22:29:14) [Java HotSpot(TM) 64-Bit Server VM (Oracle Corporation)] on java1.7.0_60 Type "help", "copyright", "credits" or "license" for more information. >>> import sys >>> sys.path ['', '/home/jbaker/jythondev/jython27/dist/Lib', '__classpath__', '__pyclasspath__/', '/home/jbaker/.local/lib/jython2.7/site-packages', '/home/jbaker/jythondev/jython27/dist/Lib/site-packages'] The entry __classpath__ means search CLASSPATH for Java packages; this includes the Java runtime, rt.jar, from which you get package namespaces as java.*, javax.*, sun.*, etc. Another behavior that you get for free in Jython is being able to also import the org.python.* namespace, which is Jython's own runtime. Some of the implementations of standard library modules, such as threading, take advantage of this support. - Jim -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Tue Jun 10 09:36:02 2014 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 10 Jun 2014 17:36:02 +1000 Subject: [Python-Dev] Criticism of execfile() removal in Python3 In-Reply-To: <20140610052312.280e49c9@x34f> References: <20140610052312.280e49c9@x34f> Message-ID: On 10 June 2014 12:23, Paul Sokolovsky wrote: > 1. It hampers interactive mode - instead of short and easy to type > execfile("file.py") one needs to use exec(open("file.py").read()). 
I'm > sure that's not going to bother a lot of people - after all, the > easiest way to execute a Python file is to drop back to shell and > restart python with file name, using all wonders of tab completion. But > now imagine that Python interpreter runs on bare hardware, and its REPL > is the only shell. That's exactly what we have with MicroPython's > Cortex-M port. But it's not really MicroPython-specific, there's > CPython port to baremetal either - http://www.pycorn.org/ . https://docs.python.org/3/library/runpy.html#runpy.run_path import runpy file_globals = runpy.run_path("file.py") The standard implementation of run_path reads the whole file into memory, but MicroPython would be free to optimise that and do statement by statement execution instead (while that will pose some challenges in terms of handling encoding cookies, future imports, etc correctly, it's certainly feasible). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From p.f.moore at gmail.com Tue Jun 10 10:37:12 2014 From: p.f.moore at gmail.com (Paul Moore) Date: Tue, 10 Jun 2014 09:37:12 +0100 Subject: [Python-Dev] Returning Windows file attribute information via os.stat() In-Reply-To: References: Message-ID: On 10 June 2014 05:02, Ben Hoyt wrote: > To solve this problem, what do people think about adding an > "st_winattrs" attribute to the object returned by os.stat() on > Windows? +1. Given the precedent of Linux- and OS X-specific attributes, this seems like a no-brainer to me. 
Paul From p.f.moore at gmail.com Tue Jun 10 10:41:16 2014 From: p.f.moore at gmail.com (Paul Moore) Date: Tue, 10 Jun 2014 09:41:16 +0100 Subject: [Python-Dev] Criticism of execfile() removal in Python3 In-Reply-To: References: <20140610052312.280e49c9@x34f> Message-ID: On 10 June 2014 08:36, Nick Coghlan wrote: > The standard implementation of run_path reads the whole file into > memory, but MicroPython would be free to optimise that and do > statement by statement execution instead (while that will pose some > challenges in terms of handling encoding cookies, future imports, etc > correctly, it's certainly feasible). ... and if they did optimise that way, I would imagine that the patch would be a useful contribution back to the core Python stdlib, rather than remaining a MicroPython-specific optimisation. Paul From ncoghlan at gmail.com Tue Jun 10 11:07:40 2014 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 10 Jun 2014 19:07:40 +1000 Subject: [Python-Dev] Criticism of execfile() removal in Python3 In-Reply-To: References: <20140610052312.280e49c9@x34f> Message-ID: On 10 Jun 2014 18:41, "Paul Moore" wrote: > > On 10 June 2014 08:36, Nick Coghlan wrote: > > The standard implementation of run_path reads the whole file into > > memory, but MicroPython would be free to optimise that and do > > statement by statement execution instead (while that will pose some > > challenges in terms of handling encoding cookies, future imports, etc > > correctly, it's certainly feasible). > > ... and if they did optimise that way, I would imagine that the patch > would be a useful contribution back to the core Python stdlib, rather > than remaining a MicroPython-specific optimisation. I believe it's a space/speed trade-off, so I'd be surprised if it made sense for CPython in general. There are also some behavioural differences when it comes to handling syntax errors. 
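A statement-at-a-time executor of the kind being discussed can be sketched as below. This is a hypothetical illustration only: it still parses the whole file up front, so it does not by itself fix the memory problem, but it makes the execution semantics concrete, in particular that statements before a failing one have already run.

```python
import ast

def run_path_by_statement(path, init_globals=None):
    # Hypothetical helper, not a runpy API: execute a file one
    # top-level statement at a time, sharing one globals dict.
    globs = {"__name__": "<run_path_by_statement>"}
    if init_globals:
        globs.update(init_globals)
    with open(path, encoding="utf-8") as f:
        tree = ast.parse(f.read(), filename=path)
    for node in tree.body:
        # Wrap each top-level statement in its own module and run it.
        module = ast.Module(body=[node], type_ignores=[])
        exec(compile(module, path, "exec"), globs)
    return globs
```

A truly incremental reader would replace the single ast.parse() call with a buffered tokenizer, and would also need the special handling for encoding cookies and __future__ imports mentioned above.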
Now that I think about the idea a bit more, if the MicroPython folks can get a low memory usage incremental file execution model working, the semantic differences mean it would likely make the most sense as a separate API in runpy, rather than as an implicit change to run_path. Cheers, Nick. > > Paul -------------- next part -------------- An HTML attachment was scrubbed... URL: From victor.stinner at gmail.com Tue Jun 10 11:34:57 2014 From: victor.stinner at gmail.com (Victor Stinner) Date: Tue, 10 Jun 2014 11:34:57 +0200 Subject: [Python-Dev] Returning Windows file attribute information via os.stat() In-Reply-To: References: Message-ID: 2014-06-10 6:02 GMT+02:00 Ben Hoyt : > To solve this problem, what do people think about adding an > "st_winattrs" attribute to the object returned by os.stat() on > Windows? > (...) > FILE_ATTRIBUTE_HIDDEN = 2 # constant defined in Windows.h > > if hasattr(st, 'st_winattrs') and st.st_winattrs & FILE_ATTRIBUTE_HIDDEN: I don't like such API, it requires to import constants, use masks, etc. I would prefer something like: if st.win_hidden: ... Or maybe: if st.winattrs.hidden: ... 
Victor From python at mrabarnett.plus.com Tue Jun 10 14:03:14 2014 From: python at mrabarnett.plus.com (MRAB) Date: Tue, 10 Jun 2014 13:03:14 +0100 Subject: [Python-Dev] Returning Windows file attribute information via os.stat() In-Reply-To: References: Message-ID: <5396F402.3030309@mrabarnett.plus.com> On 2014-06-10 05:02, Ben Hoyt wrote: [snip] > > FILE_ATTRIBUTE_HIDDEN = 2 # constant defined in Windows.h > > def is_hidden(path): > if os.path.basename(path).startswith('.'): > return True > st = os.stat(path) > if hasattr(st, 'st_winattrs') and st.st_winattrs & FILE_ATTRIBUTE_HIDDEN: That could be written more succinctly as: if getattr(st, 'st_winattrs', 0) & FILE_ATTRIBUTE_HIDDEN: > return True > return False > From benhoyt at gmail.com Tue Jun 10 14:19:54 2014 From: benhoyt at gmail.com (Ben Hoyt) Date: Tue, 10 Jun 2014 08:19:54 -0400 Subject: [Python-Dev] Returning Windows file attribute information via os.stat() In-Reply-To: References: Message-ID: > > FILE_ATTRIBUTE_HIDDEN = 2 # constant defined in Windows.h > > > > if hasattr(st, 'st_winattrs') and st.st_winattrs & FILE_ATTRIBUTE_HIDDEN: > > I don't like such API, it requires to import constants, use masks, etc. > > I would prefer something like: > > if st.win_hidden: ... > > Or maybe: > > if st.winattrs.hidden: ... Yes, fair call. However, it looks like the precedent for the attributes in os.stat()'s return value has long since been set -- this is OS-specific stuff. For example, what's in "st_flags"? It's not documented, but comes straight from the OS. Same with st_rdev, st_type, etc -- the documentation doesn't define them, and it looks like they're OS-specific values. I don't think the st.win_hidden approach gains us much, because the next person is going to ask for the FILE_ATTRIBUTE_ENCRYPTED or FILE_ATTRIBUTE_COMPRESSED flag. So we really need all the bits or nothing. I don't mind the st.st_winattrs.hidden approach, except that we'd need 17 sub-attributes, and they'd all have to be documented.
And if Windows added another attribute, Python wouldn't have it, etc. So I think the OS-defined constant is the way to go. Because these are fixed-forever constants, I suspect in library code and the like people would just KISS and use an integer literal and a comment, avoiding the import/constant thing: if getattr(st, 'st_winattrs', 0) & 2: # FILE_ATTRIBUTE_HIDDEN ... -Ben From benhoyt at gmail.com Tue Jun 10 14:20:56 2014 From: benhoyt at gmail.com (Ben Hoyt) Date: Tue, 10 Jun 2014 08:20:56 -0400 Subject: [Python-Dev] Returning Windows file attribute information via os.stat() In-Reply-To: <5396F402.3030309@mrabarnett.plus.com> References: <5396F402.3030309@mrabarnett.plus.com> Message-ID: >> if hasattr(st, 'st_winattrs') and st.st_winattrs & >> FILE_ATTRIBUTE_HIDDEN: > > That could be written more succinctly as: > > if getattr(st, 'st_winattrs', 0) & FILE_ATTRIBUTE_HIDDEN: > >> return True >> return False Yes, good call. Or one further: return getattr(st, 'st_winattrs', 0) & FILE_ATTRIBUTE_HIDDEN != 0 -Ben From p.f.moore at gmail.com Tue Jun 10 14:44:43 2014 From: p.f.moore at gmail.com (Paul Moore) Date: Tue, 10 Jun 2014 13:44:43 +0100 Subject: [Python-Dev] Returning Windows file attribute information via os.stat() In-Reply-To: References: Message-ID: On 10 June 2014 13:19, Ben Hoyt wrote: > Because these are fixed-forever constants, I suspect in library code > and the like people would just KISS and use an integer literal and a > comment, avoiding the import/constant thing: The stat module exposes a load of constants - why not add the (currently known) ones there? Finding the values of Windows constants if you don't have access to the C headers can be a pain, so having them defined *somewhere* as named values is useful. 
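[Editor's note: pulling the thread's suggestions together, a hedged sketch of the proposed usage. Both st_winattrs and the attribute value are the proposal's hypothetical names taken from the discussion, not an existing API; the getattr() default makes the check a harmless no-op on platforms without the field.]

```python
import os

# Proposed constant; value taken from Windows.h per the thread.
FILE_ATTRIBUTE_HIDDEN = 2

def is_hidden(path):
    # Unix convention: dotfiles are hidden.
    if os.path.basename(path).startswith('.'):
        return True
    st = os.stat(path)
    # On platforms without the proposed st_winattrs field, getattr()
    # falls back to 0 and the bitmask test is simply skipped.
    return getattr(st, 'st_winattrs', 0) & FILE_ATTRIBUTE_HIDDEN != 0
```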
Paul From benhoyt at gmail.com Tue Jun 10 14:58:19 2014 From: benhoyt at gmail.com (Ben Hoyt) Date: Tue, 10 Jun 2014 08:58:19 -0400 Subject: [Python-Dev] Returning Windows file attribute information via os.stat() In-Reply-To: References: Message-ID: > The stat module exposes a load of constants - why not add the > (currently known) ones there? Finding the values of Windows constants > if you don't have access to the C headers can be a pain, so having > them defined *somewhere* as named values is useful. So stat.FILE_ATTRIBUTES_HIDDEN and the like? Alternatively they could go in ctypes.wintypes, but I think stat makes more sense in this case. -Ben From rdmurray at bitdance.com Tue Jun 10 15:05:55 2014 From: rdmurray at bitdance.com (R. David Murray) Date: Tue, 10 Jun 2014 09:05:55 -0400 Subject: [Python-Dev] Criticism of execfile() removal in Python3 In-Reply-To: References: <20140610052312.280e49c9@x34f> Message-ID: <20140610130555.7B71A250D5E@webabinitio.net> On Tue, 10 Jun 2014 19:07:40 +1000, Nick Coghlan wrote: > On 10 Jun 2014 18:41, "Paul Moore" wrote: > > > > On 10 June 2014 08:36, Nick Coghlan wrote: > > > The standard implementation of run_path reads the whole file into > > > memory, but MicroPython would be free to optimise that and do > > > statement by statement execution instead (while that will pose some > > > challenges in terms of handling encoding cookies, future imports, etc > > > correctly, it's certainly feasible). > > > > ... and if they did optimise that way, I would imagine that the patch > > would be a useful contribution back to the core Python stdlib, rather > > than remaining a MicroPython-specific optimisation. > > I believe it's a space/speed trade-off, so I'd be surprised if it made > sense for CPython in general. There are also some behavioural differences > when it comes to handling syntax errors. 
> > Now that I think about the idea a bit more, if the MicroPython folks can > get a low memory usage incremental file execution model working, the > semantic differences mean it would likely make the most sense as a separate > API in runpy, rather than as an implicit change to run_path. If it is a separate API, it seems like there's no reason it couldn't be contributed back to CPython. There might be other contexts in which low memory would be the right tradeoff. Although, if key bits end up working at the C level, "contributing back" might require writing separate C for CPython, so that might not happen. --David From ncoghlan at gmail.com Tue Jun 10 15:11:18 2014 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 10 Jun 2014 23:11:18 +1000 Subject: [Python-Dev] Criticism of execfile() removal in Python3 In-Reply-To: <20140610130555.7B71A250D5E@webabinitio.net> References: <20140610052312.280e49c9@x34f> <20140610130555.7B71A250D5E@webabinitio.net> Message-ID: On 10 June 2014 23:05, R. David Murray wrote: > On Tue, 10 Jun 2014 19:07:40 +1000, Nick Coghlan wrote: >> I believe it's a space/speed trade-off, so I'd be surprised if it made >> sense for CPython in general. There are also some behavioural differences >> when it comes to handling syntax errors. >> >> Now that I think about the idea a bit more, if the MicroPython folks can >> get a low memory usage incremental file execution model working, the >> semantic differences mean it would likely make the most sense as a separate >> API in runpy, rather than as an implicit change to run_path. > > If it is a separate API, it seems like there's no reason it couldn't be > contributed back to CPython. There might be other contexts in which > low memory would be the right tradeoff. Although, if key bits end > up working at the C level, "contributing back" might require writing > separate C for CPython, so that might not happen. 
Yeah, as a separate API it could make sense in CPython - I just didn't go back and revise the first paragraph after writing the second one :) Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From p.f.moore at gmail.com Tue Jun 10 15:22:04 2014 From: p.f.moore at gmail.com (Paul Moore) Date: Tue, 10 Jun 2014 14:22:04 +0100 Subject: [Python-Dev] Returning Windows file attribute information via os.stat() In-Reply-To: References: Message-ID: On 10 June 2014 13:58, Ben Hoyt wrote: > So stat.FILE_ATTRIBUTES_HIDDEN and the like? Yep. (Maybe WIN_FILE_ATTRIBUTES_HIDDEN, but the Unix ones don't have an OA name prefix, so I'd go with your original). Paul From Steve.Dower at microsoft.com Tue Jun 10 18:30:24 2014 From: Steve.Dower at microsoft.com (Steve Dower) Date: Tue, 10 Jun 2014 16:30:24 +0000 Subject: [Python-Dev] Python 3.5 on VC14 - update Message-ID: For anyone who is interested in more details on the CRT changes, there's a blog post from my colleague who worked on most of them at http://blogs.msdn.com/b/vcblog/archive/2014/06/10/the-great-crt-refactoring.aspx I wanted to call out one section and add some details: In order to unify these different CRTs [desktop, phone, etc], we have split the CRT into three pieces: 1. VCRuntime (vcruntime140.dll): This DLL contains all of the runtime functionality required for things like process startup and exception handling, and functionality that is coupled to the compiler for one reason or another. We may need to make breaking changes to this library in the future. 2. AppCRT (appcrt140.dll): This DLL contains all of the functionality that is usable on all platforms. This includes the heap, the math library, the stdio and locale libraries, most of the string manipulation functions, the time library, and a handful of other functions. We will maintain backwards compatibility for this part of the CRT. 3. 
DesktopCRT (desktopcrt140.dll): This DLL contains all of the functionality that is usable only by desktop apps. Notably, this includes the functions for working with multibyte strings, the exec and spawn process management functions, and the direct-to-console I/O functions. We will maintain backwards compatibility for this part of the CRT. The builds of Python I've already made are indeed linked against these three DLLs, though it happens transparently. Most of the APIs are from the AppCRT, which is a good sign as it will simplify portability to other Windows-based platforms (though the direct references to the Win32 API will arise again to complicate this). Very few functions are imported from VCRuntime, which is the only part that *may* have breaking changes in the future (that's the current promise, and I'd expect it to be strengthened one way or the other by release). Apart from the standard memcpy/strcpy type functions (which may be moved in later builds), these other imports are compiler helpers: * void terminate(void) (currently exported as a decorated C++ function, but that's going to be fixed) * __vcrt_TerminateProcess * __vcrt_UnhandledException * __vcrt_cleanup_type_info_names * _except_handler4_common * _local_unwind4 I've checked with our CRT dev and he says that these don't keep any state (and won't cause problems like we've seen in the past with FILE*), and are only there to deal with potential C++ exceptions - they are included at a point where it is impossible to tell whether C++ is involved, and so can't be removed. My builds pass almost all of regrtest.py and the only issues are with Tcl/tk and OpenSSL, which need to update their compiler version detection. I've built them with changes, though as usual Tcl/tk is a real pain.
I ran a quick test with profile-guided optimization (PGO, pronounced "pogo"), which has supposedly been improved since VC9, and saw a very unscientific 20% speed improvement on pybench.py and 10% size reduction in python35.dll. I'm not sure what we used to get from VC9, but it certainly seems worth enabling provided it doesn't break anything. (Interestingly, PGO decided that only 1% of functions needed to be compiled for speed. Not sure if I can find out which ones those are but if anyone's interested I can give it a shot?) Cheers, Steve From ethan at stoneleaf.us Tue Jun 10 19:17:28 2014 From: ethan at stoneleaf.us (Ethan Furman) Date: Tue, 10 Jun 2014 10:17:28 -0700 Subject: [Python-Dev] Returning Windows file attribute information via os.stat() In-Reply-To: References: Message-ID: <53973DA8.1090602@stoneleaf.us> On 06/09/2014 09:02 PM, Ben Hoyt wrote: > > To solve this problem, what do people think about adding an > "st_winattrs" attribute to the object returned by os.stat() on > Windows? +1 to the idea, whatever the exact implementation. -- ~Ethan~ From zachary.ware+pydev at gmail.com Tue Jun 10 20:02:51 2014 From: zachary.ware+pydev at gmail.com (Zachary Ware) Date: Tue, 10 Jun 2014 13:02:51 -0500 Subject: [Python-Dev] Returning Windows file attribute information via os.stat() In-Reply-To: <53973DA8.1090602@stoneleaf.us> References: <53973DA8.1090602@stoneleaf.us> Message-ID: On Tue, Jun 10, 2014 at 12:17 PM, Ethan Furman wrote: > On 06/09/2014 09:02 PM, Ben Hoyt wrote: >> To solve this problem, what do people think about adding an >> "st_winattrs" attribute to the object returned by os.stat() on >> Windows? > > > +1 to the idea, whatever the exact implementation. Agreed. 
-- Zach From antoine at python.org Tue Jun 10 20:26:33 2014 From: antoine at python.org (Antoine Pitrou) Date: Tue, 10 Jun 2014 14:26:33 -0400 Subject: [Python-Dev] Python 3.5 on VC14 - update In-Reply-To: References: Message-ID: Le 10/06/2014 12:30, Steve Dower a ?crit : > > I ran a quick test with profile-guided optimization (PGO, pronounced "pogo"), which has supposedly been improved since VC9, and saw a very unscientific 20% speed improvement on pybench.py and 10% size reduction in python35.dll. I'm not sure what we used to get from VC9, but it certainly seems worth enabling provided it doesn't break anything. (Interestingly, PGO decided that only 1% of functions needed to be compiled for speed. Not sure if I can find out which ones those are but if anyone's interested I can give it a shot?) I would recommend using the non-trivial suite of benchmarks at http://hg.python.org/benchmarks (both for the profiling and the benchmarking, though you may want to use additional workloads for profiling too) Regards Antoine. From eric at trueblade.com Tue Jun 10 20:33:27 2014 From: eric at trueblade.com (Eric V. Smith) Date: Tue, 10 Jun 2014 14:33:27 -0400 Subject: [Python-Dev] namedtuple implementation grumble In-Reply-To: <53950C26.10006@trueblade.com> References: <53920D91.3060207@simplistix.co.uk> <5394B5F3.6050403@trueblade.com> <20140608233117.GS10355@ando> <53950C26.10006@trueblade.com> Message-ID: <53974F77.8000302@trueblade.com> >> I wonder how a hybrid approach would work? Use a dynamically-created >> class, but then construct the __new__ method using exec and inject it >> into the new class. As far as I can see, it's only __new__ that benefits >> from the exec approach. >> >> Anyone tried this yet? Is it worth an experiment? > > I'm not sure what the benefit would be. Other than the ast manipulations > for __new__, the rest of the non-exec code is easy to understand. I misread this, sorry. 
This might work for collections.namedtuple, but is probably not worth the hassle or churn of changing it. The main reason I switched to ast for namedlist is because generating the text version of __new__ or __init__ with default parameter values was extremely difficult, so an approach of exec-ing that one function wouldn't work for me. Eric. From hasan.diwan at gmail.com Tue Jun 10 20:51:16 2014 From: hasan.diwan at gmail.com (Hasan Diwan) Date: Tue, 10 Jun 2014 11:51:16 -0700 Subject: [Python-Dev] Documentation Oversight Message-ID: From the csv module pydoc: "The optional "dialect" parameter is discussed below" The discussion is actually above the method. Present in 2.7.6. -- H -- Sent from my mobile device Envoyé de mon portable -------------- next part -------------- An HTML attachment was scrubbed... URL: From Steve.Dower at microsoft.com Tue Jun 10 20:37:10 2014 From: Steve.Dower at microsoft.com (Steve Dower) Date: Tue, 10 Jun 2014 18:37:10 +0000 Subject: [Python-Dev] Python 3.5 on VC14 - update In-Reply-To: References: Message-ID: > Antoine Pitrou wrote: > Le 10/06/2014 12:30, Steve Dower a écrit : >> >> I ran a quick test with profile-guided optimization (PGO, pronounced >> "pogo"), which has supposedly been improved since VC9, and saw a very >> unscientific 20% speed improvement on pybench.py and 10% size reduction in >> python35.dll. I'm not sure what we used to get from VC9, but it certainly seems >> worth enabling provided it doesn't break anything. >> (Interestingly, PGO decided that only 1% of functions needed to be compiled for >> speed. Not sure if I can find out which ones those are but if anyone's >> interested I can give it a shot?) > > I would recommend using the non-trivial suite of benchmarks at > http://hg.python.org/benchmarks (both for the profiling and the benchmarking, > though you may want to use additional workloads for profiling too) > > Regards > > Antoine. > Thanks.
I knew there was a proper set somewhere, but didn't manage to track it down in the minute or so I spent looking :) Cheers, Steve From benhoyt at gmail.com Tue Jun 10 21:04:12 2014 From: benhoyt at gmail.com (Ben Hoyt) Date: Tue, 10 Jun 2014 15:04:12 -0400 Subject: [Python-Dev] Returning Windows file attribute information via os.stat() In-Reply-To: <53973DA8.1090602@stoneleaf.us> References: <53973DA8.1090602@stoneleaf.us> Message-ID: >> To solve this problem, what do people think about adding an >> "st_winattrs" attribute to the object returned by os.stat() on >> Windows? > > +1 to the idea, whatever the exact implementation. Cool. I think we should add a st_winattrs integer attribute (on Windows) and then also add the FILE_ATTRIBUTES_* constants to stat.py per Paul Moore. What would be the next steps to get this to happen? Open an issue on bugs.python.org and submit a patch with tests? -Ben From zachary.ware+pydev at gmail.com Tue Jun 10 21:08:26 2014 From: zachary.ware+pydev at gmail.com (Zachary Ware) Date: Tue, 10 Jun 2014 14:08:26 -0500 Subject: [Python-Dev] Returning Windows file attribute information via os.stat() In-Reply-To: References: <53973DA8.1090602@stoneleaf.us> Message-ID: On Tue, Jun 10, 2014 at 2:04 PM, Ben Hoyt wrote: >>> To solve this problem, what do people think about adding an >>> "st_winattrs" attribute to the object returned by os.stat() on >>> Windows? >> >> +1 to the idea, whatever the exact implementation. > > Cool. > > I think we should add a st_winattrs integer attribute (on Windows) and > then also add the FILE_ATTRIBUTES_* constants to stat.py per Paul > Moore. Add to _stat.c rather than stat.py. > What would be the next steps to get this to happen? Open an issue on > bugs.python.org and submit a patch with tests? Yep! 
-- Zach From tjreedy at udel.edu Tue Jun 10 21:49:26 2014 From: tjreedy at udel.edu (Terry Reedy) Date: Tue, 10 Jun 2014 15:49:26 -0400 Subject: [Python-Dev] Documentation Oversight In-Reply-To: References: Message-ID: <53976146.6040007@udel.edu> On 6/10/2014 2:51 PM, Hasan Diwan wrote: > From the csv module pydoc: > "The optional "dialect" parameter is discussed below" > > The discussion is actually above the method. Present in 2.7.6. Bug reports should be posted on the tracker rather than sent here. Short doc reports like this can be sent to docs at python.org. Also, the docs are continuously updated. Reports should be based on the current version at docs.python.org. As it turns out, this sentence is not in the current Doc/library/csv.rst or the online version at https://docs.python.org/3/library/csv.html#module-csv If this is what you meant, something has been changed. -- Terry Jan Reedy From victor.stinner at gmail.com Tue Jun 10 22:29:07 2014 From: victor.stinner at gmail.com (Victor Stinner) Date: Tue, 10 Jun 2014 22:29:07 +0200 Subject: [Python-Dev] Python 3.5 on VC14 - update In-Reply-To: References: Message-ID: 2014-06-10 18:30 GMT+02:00 Steve Dower : > I ran a quick test with profile-guided optimization (PGO, pronounced "pogo"), which has supposedly been improved since VC9, and saw a very unscientific 20% speed improvement on pybench.py and 10% size reduction in python35.dll. I'm not sure what we used to get from VC9, but it certainly seems worth enabling provided it doesn't break anything. (Interestingly, PGO decided that only 1% of functions needed to be compiled for speed. Not sure if I can find out which ones those are but if anyone's interested I can give it a shot?) If we upgrade the compiler on Windows, some optimizer options can maybe be enabled again. Previous Visual Studio (2010?)
bugs: * http://bugs.python.org/issue15993 * http://bugs.python.org/issue8847#msg166935 Victor From martin at v.loewis.de Wed Jun 11 00:05:42 2014 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 11 Jun 2014 00:05:42 +0200 Subject: [Python-Dev] Moving Python 3.5 on Windows to a new compiler In-Reply-To: <438e8a27e8e643f4841a22b24447b956@BLUPR03MB389.namprd03.prod.outlook.com> References: <529cffa5961d4b5bb57d554affe9643c@BLUPR03MB389.namprd03.prod.outlook.com> <53921464.7030400@v.loewis.de> <5392232A.2000102@v.loewis.de> <438e8a27e8e643f4841a22b24447b956@BLUPR03MB389.namprd03.prod.outlook.com> Message-ID: <53978136.4000307@v.loewis.de> Am 07.06.14 01:01, schrieb Steve Dower: > We keep the VS 2010 files around and make sure they keep working. > This is the biggest risk of the whole plan, but I believe that > there's enough of a gap between when VS 14 is planned to release > (which I know, but can't share) and when Python 3.5 is planned (which > I don't know, but have a semi-informed guess). By "keep around", I'd be fine with "in a subdirectory of PC". PCbuild should either switch for sure, or not switch at all. People had proposed to come up with a "PCbuildN" directory (N=10, N=14, or whatever) to maintain two build environments simultaneously; I'd be -1 on such a plan. There needs to be one official toolset to build Python X.Y with, and it needs to be either VS 2010 or VS 2014, but not both. > Is Python 3.5b1 being built with VS 14 RC (hypothetically) a blocking > issue? Do we need to resolve that now or can it wait until it > happens? It's up to the release manager, but I'd personally see it as a blocking issue: we shouldn't use a beta compiler for the final release, and we shouldn't switch compilers (back) after b1. The RM *could* opt to bet on VS 14 RTM appearing before 3.5rc1 is released (or otherwise blocking rc1 until VS 14 is released); I would consider this risky, but possibly worth it.
We certainly don't need to resolve this now. We should discuss it again when the release schedule for 3.5 is proposed. Regards, Martin From martin at v.loewis.de Wed Jun 11 00:15:15 2014 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 11 Jun 2014 00:15:15 +0200 Subject: [Python-Dev] Moving Python 3.5 on Windows to a new compiler In-Reply-To: <1402155524095.94474@microsoft.com> References: <529cffa5961d4b5bb57d554affe9643c@BLUPR03MB389.namprd03.prod.outlook.com> <4bad156ff9f145b792191327736e672d@BLUPR03MB389.namprd03.prod.outlook.com> <53920B4C.8020700@egenix.com> <20140606185631.GA11094@k2> <896772508423787267.391031sturla.molden-gmail.com@news.gmane.org> <2FCC7CC7-8D23-45BF-8157-1C92B9566A16@stufft.io> , <1402155524095.94474@microsoft.com> Message-ID: <53978373.5050701@v.loewis.de> Am 07.06.14 17:38, schrieb Steve Dower: > One more possible concern that I just thought of is the availability of > the build tools on Windows Vista and Windows 7 RTM (that is, without > SP1). I'd have to check, but I don't believe anything after VS 2012 is > supported on Vista and it's entirely possible that installation is blocked. I wouldn't worry about that. People can be asked to update their build machines (within reason), as long as the resulting binaries should work on older systems still. There are testing issues, of course, but they show up even in other cases, like testing whether a 32-bit installer actually runs on a 32-bit system when the build system is a 64-bit system; such issues will always exist. 
Regards, Martin From martin at v.loewis.de Wed Jun 11 00:24:48 2014 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 11 Jun 2014 00:24:48 +0200 Subject: [Python-Dev] Python 3.5 on VC14 - update In-Reply-To: References: Message-ID: <539785B0.8030909@v.loewis.de> Am 10.06.14 18:30, schrieb Steve Dower: > I ran a quick test with profile-guided optimization (PGO, pronounced > "pogo"), which has supposedly been improved since VC9, and saw a very > unscientific 20% speed improvement on pybench.py and 10% size > reduction in python35.dll. I'm not sure what we used to get from VC9, > but it certainly seems worth enabling provided it doesn't break > anything. (Interestingly, PGO decided that only 1% of functions > needed to be compiled for speed. Not sure if I can find out which > ones those are but if anyone's interested I can give it a shot?) You probably ran too little Python code. See PCbuild/build_pgo.bat for what used to be part of the release process. It takes quite some time, but it rebuilt more than 1% (IIRC). FWIW, I stopped using PGO for the official releases when it was demonstrated to generate bad code. In my experience, a compiler that generates bad code has lost trust "forever", so it will be hard to justify re-enabling PGO (like "but it really works this time"). I wasn't sad when I found a justification to skip the profiling, since it significantly held up the release process. Regards, Martin From Steve.Dower at microsoft.com Wed Jun 11 00:48:21 2014 From: Steve.Dower at microsoft.com (Steve Dower) Date: Tue, 10 Jun 2014 22:48:21 +0000 Subject: [Python-Dev] Python 3.5 on VC14 - update In-Reply-To: <539785B0.8030909@v.loewis.de> References: <539785B0.8030909@v.loewis.de> Message-ID: Martin v. 
L?wis wrote: > Am 10.06.14 18:30, schrieb Steve Dower: >> I ran a quick test with profile-guided optimization (PGO, pronounced >> "pogo"), which has supposedly been improved since VC9, and saw a very >> unscientific 20% speed improvement on pybench.py and 10% size >> reduction in python35.dll. I'm not sure what we used to get from VC9, >> but it certainly seems worth enabling provided it doesn't break >> anything. (Interestingly, PGO decided that only 1% of functions needed >> to be compiled for speed. Not sure if I can find out which ones those >> are but if anyone's interested I can give it a shot?) > > You probably ran too little Python code. See PCbuild/build_pgo.bat for what used > to be part of the release process. It takes quite some time, but it rebuilt more > than 1% (IIRC). That's almost certainly the case. I didn't run anywhere near enough to call it good, though I'd only really expect the size to get worse and the speed to get better. > FWIW, I stopped using PGO for the official releases when it was demonstrated to > generate bad code. In my experience, a compiler that generates bad code has lost > trust "forever", so it will be hard to justify re-enabling PGO (like "but it > really works this time"). I wasn't sad when I found a justification to skip the > profiling, since it significantly held up the release process. Yeah, and it seems the bad code is still there. I suspect it's actually due to optimizing for space rather than speed, and not due to PGO directly, but either way I'll be trying to get it fixed. [EARLIER EMAIL] > By "keep around", I'd be fine with "in a subdirectory of PC". PCbuild should > either switch for sure, or not switch at all. People had proposed to come up > with a "PCbuildN" directory (N=10, N=14, or whatever) to maintain two build > environments simultaneously; I'd be -1 on such a plan. There needs to be one > official toolset to build Python X.Y with, and it needs to be either VS 2010 or > VS 2014, but not both. 
That's what I have planned. Right now it's in my sandbox and I've just replaced the existing PCbuild contents (rather wholesale - I took the opportunity to simplify the files, which is important to me as I spend most of my time editing them by hand rather than through VS). When/if I merge, the version in PC\VS10.0 will be exactly what was there at merge time. > Regards, > Martin And thanks, I appreciate the context and suggestions. Cheers, Steve From thomas at python.org Wed Jun 11 03:10:43 2014 From: thomas at python.org (Thomas Wouters) Date: Tue, 10 Jun 2014 18:10:43 -0700 Subject: [Python-Dev] Python 3.5 on VC14 - update In-Reply-To: References: Message-ID: On Tue, Jun 10, 2014 at 9:30 AM, Steve Dower wrote: > > I ran a quick test with profile-guided optimization (PGO, pronounced > "pogo"), which has supposedly been improved since VC9, and saw a very > unscientific 20% speed improvement on pybench.py and 10% size reduction in > python35.dll. I'm not sure what we used to get from VC9, but it certainly > seems worth enabling provided it doesn't break anything. (Interestingly, > PGO decided that only 1% of functions needed to be compiled for speed. Not > sure if I can find out which ones those are but if anyone's interested I > can give it a shot?) > For what it's worth, we build Google's internal Python interpreters with gcc's flavour of PGO and are seeing somewhat more than 20% performance increase for Python 2.7. (We train using most of the testsuite, not pybench, and I believe the Debian/Ubuntu packages also do this.) I believe almost all of that is from speedups to the main eval loop, which is a huge switch in a bigger loop with complicated jump logic. It wouldn't surprise me if VS's PGO only decided to optimize that eval loop :) -- Thomas Wouters Hi! I'm an email virus! Think twice before sending your email to help me spread! -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From Nikolaus at rath.org Wed Jun 11 03:30:49 2014 From: Nikolaus at rath.org (Nikolaus Rath) Date: Tue, 10 Jun 2014 18:30:49 -0700 Subject: [Python-Dev] Why does IOBase.__del__ call .close? Message-ID: <87d2egnsfq.fsf@vostro.rath.org> Hello, I recently noticed (after some rather protacted debugging) that the io.IOBase class comes with a destructor that calls self.close(): [0] nikratio at vostro:~/tmp$ cat test.py import io class Foo(io.IOBase): def close(self): print('close called') r = Foo() del r [0] nikratio at vostro:~/tmp$ python3 test.py close called To me, this came as quite a surprise, and the best "documentation" of this feature seems to be the following note (from the io library reference): "The abstract base classes also provide default implementations of some methods in order to help implementation of concrete stream classes. For example, BufferedIOBase provides unoptimized implementations of readinto() and readline()." For me, having __del__ call close() does not qualify as a reasonable default implementation unless close() is required to be idempotent (which one could deduce from the documentation if one tries to, but it's far from clear). Is this behavior an accident, or was that a deliberate decision? Best, -Nikolaus -- GPG encrypted emails preferred. Key id: 0xD113FCAC3C4E599F Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F ?Time flies like an arrow, fruit flies like a Banana.? From python at mrabarnett.plus.com Wed Jun 11 03:51:43 2014 From: python at mrabarnett.plus.com (MRAB) Date: Wed, 11 Jun 2014 02:51:43 +0100 Subject: [Python-Dev] Why does IOBase.__del__ call .close? 
In-Reply-To: <87d2egnsfq.fsf@vostro.rath.org> References: <87d2egnsfq.fsf@vostro.rath.org> Message-ID: <5397B62F.80004@mrabarnett.plus.com> On 2014-06-11 02:30, Nikolaus Rath wrote: > Hello, > > I recently noticed (after some rather protacted debugging) that the > io.IOBase class comes with a destructor that calls self.close(): > > [0] nikratio at vostro:~/tmp$ cat test.py > import io > class Foo(io.IOBase): > def close(self): > print('close called') > r = Foo() > del r > [0] nikratio at vostro:~/tmp$ python3 test.py > close called > > To me, this came as quite a surprise, and the best "documentation" of > this feature seems to be the following note (from the io library > reference): > > "The abstract base classes also provide default implementations of some > methods in order to help implementation of concrete stream classes. For > example, BufferedIOBase provides unoptimized implementations of > readinto() and readline()." > > For me, having __del__ call close() does not qualify as a reasonable > default implementation unless close() is required to be idempotent > (which one could deduce from the documentation if one tries to, but it's > far from clear). > > Is this behavior an accident, or was that a deliberate decision? > To me, it makes sense. You want to make sure that it's closed, releasing any resources it might be holding, even if you haven't done so explicitly. From antoine at python.org Wed Jun 11 04:28:17 2014 From: antoine at python.org (Antoine Pitrou) Date: Tue, 10 Jun 2014 22:28:17 -0400 Subject: [Python-Dev] Why does IOBase.__del__ call .close? In-Reply-To: <87d2egnsfq.fsf@vostro.rath.org> References: <87d2egnsfq.fsf@vostro.rath.org> Message-ID: Le 10/06/2014 21:30, Nikolaus Rath a ?crit : > > For me, having __del__ call close() does not qualify as a reasonable > default implementation unless close() is required to be idempotent > (which one could deduce from the documentation if one tries to, but it's > far from clear). 
close() should indeed be idempotent on all bundled IO class implementations (otherwise it's a bug), and so should it preferably on third-party IO class implementations. If you want to improve the documentation on this, you're welcome to provide a patch! Regards Antoine. From ncoghlan at gmail.com Wed Jun 11 14:38:13 2014 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 11 Jun 2014 22:38:13 +1000 Subject: [Python-Dev] Why does IOBase.__del__ call .close? In-Reply-To: References: <87d2egnsfq.fsf@vostro.rath.org> Message-ID: On 11 Jun 2014 12:31, "Antoine Pitrou" wrote: > > Le 10/06/2014 21:30, Nikolaus Rath a ?crit : > >> >> For me, having __del__ call close() does not qualify as a reasonable >> default implementation unless close() is required to be idempotent >> (which one could deduce from the documentation if one tries to, but it's >> far from clear). > > > close() should indeed be idempotent on all bundled IO class implementations (otherwise it's a bug), and so should it preferably on third-party IO class implementations. > > If you want to improve the documentation on this, you're welcome to provide a patch! We certainly assume idempotent close() behaviour in various places, so if that expectation isn't currently clear in the docs, suggestions for improved wording would definitely be appreciated! Cheers, Nick. -------------- next part -------------- An HTML attachment was scrubbed... URL: From benhoyt at gmail.com Wed Jun 11 15:27:25 2014 From: benhoyt at gmail.com (Ben Hoyt) Date: Wed, 11 Jun 2014 09:27:25 -0400 Subject: [Python-Dev] Returning Windows file attribute information via os.stat() In-Reply-To: References: <53973DA8.1090602@stoneleaf.us> Message-ID: >> What would be the next steps to get this to happen? Open an issue on >> bugs.python.org and submit a patch with tests? > > Yep! 
Okay, I've done step one (opened an issue on bugs.python.org), and hope to provide a patch in the next few weeks if no-one else does (I've never compiled CPython on Windows before): http://bugs.python.org/issue21719 -Ben From victor.stinner at gmail.com Wed Jun 11 16:28:53 2014 From: victor.stinner at gmail.com (Victor Stinner) Date: Wed, 11 Jun 2014 16:28:53 +0200 Subject: [Python-Dev] Issue #21205: add __qualname__ to generators Message-ID: Hi, I'm working on asyncio and it's difficult to debug code because @asyncio.coroutine decorator removes the name of the function if the function is not a generator (if it doesn't use yield from). I propose to add new gi_name and gi_qualname fields to the C structure PyGenObject, add a new __qualname__ (= gi_qualname) attribute to the Python API of generator, and change how the default value of __name__ (= gi_name) of generators. Instead of getting the name from the code object, I propose to get the name from the function (if the generator was created from a function). So if the function name was modified, you get the new name instead of getting the name from the code object (as done in Python 3.4). I also propose to display the qualified name in repr(generator) instead of the name. All these changes should make my life easier to debug asyncio, but it should help any project using generators. Issues describing the problem, I attached a patch implementing my ideas: http://bugs.python.org/issue21205 Would you be ok with these (minor) incompatible changes? By the way, it looks like generator attributes were never documented :-( My patch also adds a basic documentation (at least, it lists all attributes in the documentation of the inspect module). 
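Victor's proposal can be sketched concretely. The output shown is from an interpreter where the change has landed; on 3.4 and earlier, a generator's `__name__` always came from the code object and generators had no `__qualname__` at all:

```python
def ticker():
    yield 1

# A decorator such as asyncio.coroutine may rename the function it wraps:
ticker.__name__ = "wrapped_ticker"
ticker.__qualname__ = "wrapped_ticker"

g = ticker()
print(g.gi_code.co_name)             # -> ticker (compile-time name, always available)
print(g.__name__)                    # -> wrapped_ticker (taken from the function at call time)
print(g.__qualname__)                # -> wrapped_ticker
print("wrapped_ticker" in repr(g))   # -> True (repr() shows the qualified name)
```

Debuggers and tracebacks then report the name the user actually sees on the function, while `gen.gi_code.co_name` still preserves the original for anyone who needs it.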
Victor From antoine at python.org Wed Jun 11 18:17:40 2014 From: antoine at python.org (Antoine Pitrou) Date: Wed, 11 Jun 2014 12:17:40 -0400 Subject: [Python-Dev] Issue #21205: add __qualname__ to generators In-Reply-To: References: Message-ID: Le 11/06/2014 10:28, Victor Stinner a ?crit : > Hi, > > I'm working on asyncio and it's difficult to debug code because > @asyncio.coroutine decorator removes the name of the function if the > function is not a generator (if it doesn't use yield from). > > I propose to add new gi_name and gi_qualname fields to the C structure > PyGenObject, add a new __qualname__ (= gi_qualname) attribute to the > Python API of generator, and change how the default value of __name__ > (= gi_name) of generators. > > Instead of getting the name from the code object, I propose to get the > name from the function (if the generator was created from a function). > So if the function name was modified, you get the new name instead of > getting the name from the code object (as done in Python 3.4). > > I also propose to display the qualified name in repr(generator) > instead of the name. > > All these changes should make my life easier to debug asyncio, but it > should help any project using generators. > > Issues describing the problem, I attached a patch implementing my ideas: > http://bugs.python.org/issue21205 > > Would you be ok with these (minor) incompatible changes? +1 from me. Regards Antoine. From tjreedy at udel.edu Wed Jun 11 18:24:35 2014 From: tjreedy at udel.edu (Terry Reedy) Date: Wed, 11 Jun 2014 12:24:35 -0400 Subject: [Python-Dev] Returning Windows file attribute information via os.stat() In-Reply-To: References: <53973DA8.1090602@stoneleaf.us> Message-ID: On 6/11/2014 9:27 AM, Ben Hoyt wrote: >>> What would be the next steps to get this to happen? Open an issue on >>> bugs.python.org and submit a patch with tests? >> >> Yep! 
> > Okay, I've done step one (opened an issue on bugs.python.org), and > hope to provide a patch in the next few weeks if no-one else does > (I've never compiled CPython on Windows before): > > http://bugs.python.org/issue21719 If you have problems compiling, the core-mentorship list is one place to ask. For 3.4+, I believe the devguide instructions are correct. If not, say something. -- Terry Jan Reedy From techtonik at gmail.com Wed Jun 11 22:26:26 2014 From: techtonik at gmail.com (anatoly techtonik) Date: Wed, 11 Jun 2014 23:26:26 +0300 Subject: [Python-Dev] subprocess shell=True on Windows doesn't escape ^ character Message-ID: I am banned from tracker, so I post the bug here: Normal Windows behavior: >hg status --rev ".^1" M mercurial\commands.py ? pysptest.py >hg status --rev .^1 abort: unknown revision '.1'! So, ^ is an escape character. See http://www.tomshardware.co.uk/forum/35565-45-when-special-command-line But subprocess doesn't escape it, making cross-platform command fail on Windows. ---[cut pysptest.py]-- import subprocess as sp # this fails with # abort: unknown revision '.1'! cmd = ['hg', 'status', '--rev', '.^1'] # this works #cmd = 'hg status --rev ".^1"' # this works too #cmd = ['hg', 'status', '--rev', '.^^1'] try: print sp.check_output(cmd, stderr=sp.STDOUT, shell=True) except Exception as e: print e.output ------------------------------ -- anatoly t. -------------- next part -------------- An HTML attachment was scrubbed... URL: From rymg19 at gmail.com Wed Jun 11 23:58:30 2014 From: rymg19 at gmail.com (Ryan) Date: Wed, 11 Jun 2014 16:58:30 -0500 Subject: [Python-Dev] subprocess shell=True on Windows doesn't escape ^ character In-Reply-To: References: Message-ID: Of course! And, why not escape everything else, too? abc -> ^a^b^c echo %PATH% -> ^e^c^h^o^ ^%^P^A^T^H^% In all seriousness, to me this is obvious. When you pass a command to the shell, naturally, certain details are shell-specific. -10000. Bad idea. Very bad idea. 
If you want the ^ to be escaped, do it yourself. Or better yet, don't pass shell=True. anatoly techtonik wrote: >I am banned from tracker, so I post the bug here: > >Normal Windows behavior: > > >hg status --rev ".^1" > M mercurial\commands.py > ? pysptest.py > > >hg status --rev .^1 > abort: unknown revision '.1'! > >So, ^ is an escape character. See >http://www.tomshardware.co.uk/forum/35565-45-when-special-command-line > > >But subprocess doesn't escape it, making cross-platform command fail on >Windows. > >---[cut pysptest.py]-- >import subprocess as sp > ># this fails with ># abort: unknown revision '.1'! >cmd = ['hg', 'status', '--rev', '.^1'] ># this works >#cmd = 'hg status --rev ".^1"' ># this works too >#cmd = ['hg', 'status', '--rev', '.^^1'] > >try: > print sp.check_output(cmd, stderr=sp.STDOUT, shell=True) >except Exception as e: > print e.output >------------------------------ > >-- >anatoly t. > > >------------------------------------------------------------------------ > >_______________________________________________ >Python-Dev mailing list >Python-Dev at python.org >https://mail.python.org/mailman/listinfo/python-dev >Unsubscribe: >https://mail.python.org/mailman/options/python-dev/rymg19%40gmail.com -- Sent from my Android phone with K-9 Mail. Please excuse my brevity. -------------- next part -------------- An HTML attachment was scrubbed... URL: From rosuav at gmail.com Thu Jun 12 00:30:30 2014 From: rosuav at gmail.com (Chris Angelico) Date: Thu, 12 Jun 2014 08:30:30 +1000 Subject: [Python-Dev] subprocess shell=True on Windows doesn't escape ^ character In-Reply-To: References: Message-ID: On Thu, Jun 12, 2014 at 7:58 AM, Ryan wrote: > In all seriousness, to me this is obvious. When you pass a command to the > shell, naturally, certain details are shell-specific. > > -10000. Bad idea. Very bad idea. If you want the ^ to be escaped, do it > yourself. Or better yet, don't pass shell=True. Definitely the latter. 
Why pass shell=True when executing a single command? I don't get it. ChrisA From benjamin at python.org Thu Jun 12 00:34:51 2014 From: benjamin at python.org (Benjamin Peterson) Date: Wed, 11 Jun 2014 15:34:51 -0700 Subject: [Python-Dev] subprocess shell=True on Windows doesn't escape ^ character In-Reply-To: References: Message-ID: <1402526091.15771.127814637.2B184603@webmail.messagingengine.com> On Wed, Jun 11, 2014, at 13:26, anatoly techtonik wrote: > I am banned from tracker, so I post the bug here: Being banned from the tracker is not an invitation to use python-dev@ as one. From rdmurray at bitdance.com Thu Jun 12 01:00:29 2014 From: rdmurray at bitdance.com (R. David Murray) Date: Wed, 11 Jun 2014 19:00:29 -0400 Subject: [Python-Dev] subprocess shell=True on Windows doesn't escape ^ character In-Reply-To: References: Message-ID: <20140611230030.6F56F250DC4@webabinitio.net> Also notice that using a list with shell=True is using the API incorrectly. It wouldn't even work on Linux, so that torpedoes the cross-platform concern already :) This kind of confusion is why I opened http://bugs.python.org/issue7839. On Wed, 11 Jun 2014 16:58:30 -0500, Ryan wrote: > Of course! And, why not escape everything else, too? > > abc -> ^a^b^c > > echo %PATH% -> ^e^c^h^o^ ^%^P^A^T^H^% > > In all seriousness, to me this is obvious. When you pass a command to the shell, naturally, certain details are shell-specific. > > -10000. Bad idea. Very bad idea. If you want the ^ to be escaped, do it yourself. Or better yet, don't pass shell=True. > > anatoly techtonik wrote: > >I am banned from tracker, so I post the bug here: > > > >Normal Windows behavior: > > > > >hg status --rev ".^1" > > M mercurial\commands.py > > ? pysptest.py > > > > >hg status --rev .^1 > > abort: unknown revision '.1'! > > > >So, ^ is an escape character. 
See > >http://www.tomshardware.co.uk/forum/35565-45-when-special-command-line > > > > > >But subprocess doesn't escape it, making cross-platform command fail on > >Windows. > > > >---[cut pysptest.py]-- > >import subprocess as sp > > > ># this fails with > ># abort: unknown revision '.1'! > >cmd = ['hg', 'status', '--rev', '.^1'] > ># this works > >#cmd = 'hg status --rev ".^1"' > ># this works too > >#cmd = ['hg', 'status', '--rev', '.^^1'] > > > >try: > > print sp.check_output(cmd, stderr=sp.STDOUT, shell=True) > >except Exception as e: > > print e.output > >------------------------------ > > > >-- > >anatoly t. > > > > > >------------------------------------------------------------------------ > > > >_______________________________________________ > >Python-Dev mailing list > >Python-Dev at python.org > >https://mail.python.org/mailman/listinfo/python-dev > >Unsubscribe: > >https://mail.python.org/mailman/options/python-dev/rymg19%40gmail.com > > -- > Sent from my Android phone with K-9 Mail. Please excuse my brevity. > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/rdmurray%40bitdance.com From techtonik at gmail.com Thu Jun 12 00:53:20 2014 From: techtonik at gmail.com (anatoly techtonik) Date: Thu, 12 Jun 2014 01:53:20 +0300 Subject: [Python-Dev] subprocess shell=True on Windows doesn't escape ^ character In-Reply-To: References: Message-ID: On Thu, Jun 12, 2014 at 1:30 AM, Chris Angelico wrote: > On Thu, Jun 12, 2014 at 7:58 AM, Ryan wrote: > > In all seriousness, to me this is obvious. When you pass a command to the > > shell, naturally, certain details are shell-specific. > On Windows cmd.exe is used by default: http://hg.python.org/cpython/file/38a325c84564/Lib/subprocess.py#l1108 so it makes sense to make default behavior cross-platform. > > -10000. Bad idea. Very bad idea. 
If you want the ^ to be escaped, do it > > yourself. Or better yet, don't pass shell=True. > > Definitely the latter. Why pass shell=True when executing a single > command? I don't get it. > This is a complete use case using Rietveld upload script: http://techtonik.rainforce.org/2013/07/code-review-with-rietveld-and-mercurial.html I am interested to know how to modify upload script without kludges: https://code.google.com/p/rietveld/source/browse/upload.py#1056 I expect many people are facing with the same problem trying to wrap Git and HG with Python scripts. -- anatoly t. -------------- next part -------------- An HTML attachment was scrubbed... URL: From techtonik at gmail.com Thu Jun 12 01:00:55 2014 From: techtonik at gmail.com (anatoly techtonik) Date: Thu, 12 Jun 2014 02:00:55 +0300 Subject: [Python-Dev] subprocess shell=True on Windows doesn't escape ^ character In-Reply-To: References: Message-ID: On Thu, Jun 12, 2014 at 1:30 AM, Chris Angelico wrote: > Why pass shell=True when executing a single > command? I don't get it. > I don't know about Linux, but on Windows programs are not directly available as /usr/bin/python, so you need to find command in PATH directories. Passing shell=True makes this lookup done by shell and not manually. -------------- next part -------------- An HTML attachment was scrubbed... URL: From Nikolaus at rath.org Thu Jun 12 02:11:53 2014 From: Nikolaus at rath.org (Nikolaus Rath) Date: Wed, 11 Jun 2014 17:11:53 -0700 Subject: [Python-Dev] Why does IOBase.__del__ call .close? 
In-Reply-To: <5397B62F.80004@mrabarnett.plus.com> (MRAB's message of "Wed, 11 Jun 2014 02:51:43 +0100") References: <87d2egnsfq.fsf@vostro.rath.org> <5397B62F.80004@mrabarnett.plus.com> Message-ID: <87a99jnfzq.fsf@vostro.rath.org> MRAB writes: > On 2014-06-11 02:30, Nikolaus Rath wrote: >> Hello, >> >> I recently noticed (after some rather protacted debugging) that the >> io.IOBase class comes with a destructor that calls self.close(): >> >> [0] nikratio at vostro:~/tmp$ cat test.py >> import io >> class Foo(io.IOBase): >> def close(self): >> print('close called') >> r = Foo() >> del r >> [0] nikratio at vostro:~/tmp$ python3 test.py >> close called >> >> To me, this came as quite a surprise, and the best "documentation" of >> this feature seems to be the following note (from the io library >> reference): >> >> "The abstract base classes also provide default implementations of some >> methods in order to help implementation of concrete stream classes. For >> example, BufferedIOBase provides unoptimized implementations of >> readinto() and readline()." >> >> For me, having __del__ call close() does not qualify as a reasonable >> default implementation unless close() is required to be idempotent >> (which one could deduce from the documentation if one tries to, but it's >> far from clear). >> >> Is this behavior an accident, or was that a deliberate decision? >> > To me, it makes sense. You want to make sure that it's closed, releasing > any resources it might be holding, even if you haven't done so > explicitly. I agree with your intentions, but I come to the opposite conclusion: automatically calling close() in the destructor will hide that there's a problem in the code. Without that automatic cleanup, there's at least a good chance that a ResourceWarning will be emitted so the problem gets noticed. "Silently work around bugs in caller's code" doesn't seem like a very useful default to me... Best, -Nikolaus -- GPG encrypted emails preferred. 
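The idempotency requirement Antoine states can be made concrete with a toy subclass; the class and list names here are invented for illustration, and the immediate finalization on `del` assumes CPython's reference counting:

```python
import io

events = []

class Stream(io.RawIOBase):
    """Toy stream whose close() is idempotent, as the io layer expects."""
    def close(self):
        if not self.closed:          # guard makes repeated calls harmless
            events.append("closed")
        super().close()              # IOBase.close() marks the stream closed

s = Stream()
s.close()
s.close()                            # no-op: the guard sees self.closed is True
print(events)                        # -> ['closed']

s2 = Stream()
del s2                               # IOBase.__del__ calls close() for us
print(events)                        # -> ['closed', 'closed']
```

Written this way, the destructor's automatic close() is safe whether or not the caller already closed the stream explicitly.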
Key id: 0xD113FCAC3C4E599F Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F ?Time flies like an arrow, fruit flies like a Banana.? From techtonik at gmail.com Thu Jun 12 02:00:42 2014 From: techtonik at gmail.com (anatoly techtonik) Date: Thu, 12 Jun 2014 03:00:42 +0300 Subject: [Python-Dev] subprocess shell=True on Windows doesn't escape ^ character In-Reply-To: <20140611230030.6F56F250DC4@webabinitio.net> References: <20140611230030.6F56F250DC4@webabinitio.net> Message-ID: On Thu, Jun 12, 2014 at 2:00 AM, R. David Murray wrote: > Also notice that using a list with shell=True is using the API > incorrectly. It wouldn't even work on Linux, so that torpedoes > the cross-platform concern already :) > > This kind of confusion is why I opened http://bugs.python.org/issue7839. I thought exactly about that. Usually separate arguments are used to avoid problems with escaping of quotes and other stuff. I'd deprecate subprocess and split it into separate modules. One is about shell execution and another one is for secure process control. shell execution module then could build on top of process control and be insecure by design. -------------- next part -------------- An HTML attachment was scrubbed... URL: From benjamin at python.org Thu Jun 12 02:54:53 2014 From: benjamin at python.org (Benjamin Peterson) Date: Wed, 11 Jun 2014 17:54:53 -0700 Subject: [Python-Dev] Why does IOBase.__del__ call .close? 
In-Reply-To: <87a99jnfzq.fsf@vostro.rath.org> References: <87d2egnsfq.fsf@vostro.rath.org> <5397B62F.80004@mrabarnett.plus.com> <87a99jnfzq.fsf@vostro.rath.org> Message-ID: <1402534493.31346.127850065.34AEEDD2@webmail.messagingengine.com> On Wed, Jun 11, 2014, at 17:11, Nikolaus Rath wrote: > MRAB writes: > > On 2014-06-11 02:30, Nikolaus Rath wrote: > >> Hello, > >> > >> I recently noticed (after some rather protacted debugging) that the > >> io.IOBase class comes with a destructor that calls self.close(): > >> > >> [0] nikratio at vostro:~/tmp$ cat test.py > >> import io > >> class Foo(io.IOBase): > >> def close(self): > >> print('close called') > >> r = Foo() > >> del r > >> [0] nikratio at vostro:~/tmp$ python3 test.py > >> close called > >> > >> To me, this came as quite a surprise, and the best "documentation" of > >> this feature seems to be the following note (from the io library > >> reference): > >> > >> "The abstract base classes also provide default implementations of some > >> methods in order to help implementation of concrete stream classes. For > >> example, BufferedIOBase provides unoptimized implementations of > >> readinto() and readline()." > >> > >> For me, having __del__ call close() does not qualify as a reasonable > >> default implementation unless close() is required to be idempotent > >> (which one could deduce from the documentation if one tries to, but it's > >> far from clear). > >> > >> Is this behavior an accident, or was that a deliberate decision? > >> > > To me, it makes sense. You want to make sure that it's closed, releasing > > any resources it might be holding, even if you haven't done so > > explicitly. > > I agree with your intentions, but I come to the opposite conclusion: > automatically calling close() in the destructor will hide that there's a > problem in the code. Without that automatic cleanup, there's at least a > good chance that a ResourceWarning will be emitted so the problem gets > noticed. 
"Silently work around bugs in caller's code" doesn't seem like > a very useful default to me... Things which actually hold system resources (like FileIO) give ResourceWarning if they close in __del__, so I don't understand your point. From rosuav at gmail.com Thu Jun 12 04:07:19 2014 From: rosuav at gmail.com (Chris Angelico) Date: Thu, 12 Jun 2014 12:07:19 +1000 Subject: [Python-Dev] subprocess shell=True on Windows doesn't escape ^ character In-Reply-To: References: <20140611230030.6F56F250DC4@webabinitio.net> Message-ID: On Thu, Jun 12, 2014 at 10:00 AM, anatoly techtonik wrote: > I thought exactly about that. Usually separate arguments are used to avoid > problems with escaping of quotes and other stuff. > > I'd deprecate subprocess and split it into separate modules. One is about > shell execution and another one is for secure process control. ISTM what you want is not shell=True, but a separate function that follows the system policy for translating a command name into a path-to-binary. That's something that, AFAIK, doesn't currently exist in the Python 2 stdlib, but Python 3 has shutil.which(). If there's a PyPI backport of that for Py2, you should be able to use that to figure out the command name, and then avoid shell=False. ChrisA From rosuav at gmail.com Thu Jun 12 04:12:48 2014 From: rosuav at gmail.com (Chris Angelico) Date: Thu, 12 Jun 2014 12:12:48 +1000 Subject: [Python-Dev] subprocess shell=True on Windows doesn't escape ^ character In-Reply-To: References: <20140611230030.6F56F250DC4@webabinitio.net> Message-ID: On Thu, Jun 12, 2014 at 12:07 PM, Chris Angelico wrote: > ISTM what you want is not shell=True, but a separate function that > follows the system policy for translating a command name into a > path-to-binary. That's something that, AFAIK, doesn't currently exist > in the Python 2 stdlib, but Python 3 has shutil.which(). 
If there's a > PyPI backport of that for Py2, you should be able to use that to > figure out the command name, and then avoid shell=False. Huh. Next time, Chris, search the web before you post. Via a StackOverflow post, learned about distutils.spawn.find_executable(). Python 2.7.4 (default, Apr 6 2013, 19:54:46) [MSC v.1500 32 bit (Intel)] on win32 Type "help", "copyright", "credits" or "license" for more information. >>> import distutils.spawn >>> distutils.spawn.find_executable("python") 'C:\\Program Files\\LilyPond\\usr\\bin\\python.exe' So that would be the way to go. Render the short-form into an executable name, then skip the shell. ChrisA From ethan at stoneleaf.us Thu Jun 12 04:43:49 2014 From: ethan at stoneleaf.us (Ethan Furman) Date: Wed, 11 Jun 2014 19:43:49 -0700 Subject: [Python-Dev] subprocess shell=True on Windows doesn't escape ^ character In-Reply-To: References: <20140611230030.6F56F250DC4@webabinitio.net> Message-ID: <539913E5.3050007@stoneleaf.us> On 06/11/2014 07:12 PM, Chris Angelico wrote: > On Thu, Jun 12, 2014 at 12:07 PM, Chris Angelico wrote: >> ISTM what you want is not shell=True, but a separate function that >> follows the system policy for translating a command name into a >> path-to-binary. That's something that, AFAIK, doesn't currently exist >> in the Python 2 stdlib, but Python 3 has shutil.which(). If there's a >> PyPI backport of that for Py2, you should be able to use that to >> figure out the command name, and then avoid shell=False. > > Huh. Next time, Chris, search the web before you post. Via a > StackOverflow post, learned about distutils.spawn.find_executable(). 
--> import sys --> sys.executable '/usr/bin/python' From brian at python.org Thu Jun 12 06:27:23 2014 From: brian at python.org (Brian Curtin) Date: Wed, 11 Jun 2014 23:27:23 -0500 Subject: [Python-Dev] subprocess shell=True on Windows doesn't escape ^ character In-Reply-To: <539913E5.3050007@stoneleaf.us> References: <20140611230030.6F56F250DC4@webabinitio.net> <539913E5.3050007@stoneleaf.us> Message-ID: On Wed, Jun 11, 2014 at 9:43 PM, Ethan Furman wrote: > On 06/11/2014 07:12 PM, Chris Angelico wrote: >> >> On Thu, Jun 12, 2014 at 12:07 PM, Chris Angelico wrote: >>> >>> ISTM what you want is not shell=True, but a separate function that >>> follows the system policy for translating a command name into a >>> path-to-binary. That's something that, AFAIK, doesn't currently exist >>> in the Python 2 stdlib, but Python 3 has shutil.which(). If there's a >>> PyPI backport of that for Py2, you should be able to use that to >>> figure out the command name, and then avoid shell=False. >> >> >> Huh. Next time, Chris, search the web before you post. Via a >> StackOverflow post, learned about distutils.spawn.find_executable(). > > > --> import sys > --> sys.executable > '/usr/bin/python' For finding the Python executable, yes, but the discussion and example are about a 2.x version of shutil.which From me at the-compiler.org Thu Jun 12 06:34:59 2014 From: me at the-compiler.org (Florian Bruhin) Date: Thu, 12 Jun 2014 06:34:59 +0200 Subject: [Python-Dev] subprocess shell=True on Windows doesn't escape ^ character In-Reply-To: References: Message-ID: <20140612043459.GA19485@lupin> * anatoly techtonik [2014-06-12 02:00:55 +0300]: > On Thu, Jun 12, 2014 at 1:30 AM, Chris Angelico wrote: > > > Why pass shell=True when executing a single > > command? I don't get it. > > > > I don't know about Linux, but on Windows programs are not directly > available as /usr/bin/python, so you need to find command in PATH > directories. 
Passing shell=True makes this lookup done by shell and not > manually. As it's been said, the whole *point* of shell=True is to be able to use shell features, so ^ being escaped automatically just would be... broken. How would I escape > then, for example ;) You basically have two options: - Do the lookup in PATH yourself, it's not like that's rocket science. I haven't checked if there's a ready function for it in the stdlib, but even when not: Get os.environ['PATH'], split it by os.pathsep, then for every directory check if your binary is in there. There's also some environment variable on Windows which contains the possible extensions for a binary in PATH, add that, and that's all. - Use shell=True and a cross-platform shell escape function. I've wrote one for a project of mine: [1] I've written some tests[2] but I haven't checked all corner-cases, so I can't guarantee it'll always work, as the interpretation of special chars by cmd.exe *is* black magic, at least to me. Needless to say this is probably the worse choice of the two. [1] http://git.the-compiler.org/qutebrowser/tree/qutebrowser/utils/misc.py?id=dffec73db76c867d261ec3416de011becb209f13#n154 [2] http://git.the-compiler.org/qutebrowser/tree/qutebrowser/test/utils/test_misc.py?id=dffec73db76c867d261ec3416de011becb209f13#n195 Florian -- http://www.the-compiler.org | me at the-compiler.org (Mail/XMPP) GPG 0xFD55A072 | http://the-compiler.org/pubkey.asc I love long mails! | http://email.is-not-s.ms/ -------------- next part -------------- A non-text attachment was scrubbed... 
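Florian's first option — resolve the command on PATH yourself and then skip the shell entirely — is nearly a one-liner with `shutil.which()` (Python 3.3+, and PATHEXT-aware on Windows). The `hg` invocation in the comment mirrors Anatoly's example and assumes Mercurial is installed; the helper name is invented:

```python
import shutil
import subprocess
import sys

def run_without_shell(program, *args):
    """Look `program` up on PATH ourselves, then invoke it with shell=False,
    so no cmd.exe/sh quoting rules ever see the arguments."""
    exe = shutil.which(program)      # returns None if the program is not found
    if exe is None:
        raise FileNotFoundError(program)
    return subprocess.check_output([exe] + list(args))

# Arguments like '.^1' now pass through byte-for-byte, e.g.:
#   run_without_shell("hg", "status", "--rev", ".^1")
print(run_without_shell(sys.executable, "-c", "print('.^1')").strip())   # -> b'.^1'
```

With shell=False the argument list is handed to the child process as-is, so the `^` escaping question never arises.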
Name: not available Type: application/pgp-signature Size: 836 bytes Desc: not available URL: From p.f.moore at gmail.com Thu Jun 12 08:57:41 2014 From: p.f.moore at gmail.com (Paul Moore) Date: Thu, 12 Jun 2014 07:57:41 +0100 Subject: [Python-Dev] subprocess shell=True on Windows doesn't escape ^ character In-Reply-To: <20140612043459.GA19485@lupin> References: <20140612043459.GA19485@lupin> Message-ID: On 12 June 2014 05:34, Florian Bruhin wrote: > Do the lookup in PATH yourself, it's not like that's rocket science. Am I missing something here? I routinely do subprocess.check_call(['hg', 'update']) or whatever, and it finds the hg executable fine. Paul From victor.stinner at gmail.com Thu Jun 12 11:41:22 2014 From: victor.stinner at gmail.com (Victor Stinner) Date: Thu, 12 Jun 2014 11:41:22 +0200 Subject: [Python-Dev] Issue #21205: add __qualname__ to generators In-Reply-To: References: Message-ID: 2014-06-11 18:17 GMT+02:00 Antoine Pitrou : > Le 11/06/2014 10:28, Victor Stinner a ?crit : >> (...) >> Issues describing the problem, I attached a patch implementing my ideas: >> http://bugs.python.org/issue21205 >> >> Would you be ok with these (minor) incompatible changes? > > +1 from me. > > Regards > Antoine. I asked myself if this change can cause issues with serialization. The marshal and pickle modules cannot serialize a generator. Marshal only supports a few types. For pickle, I found this explanation: http://peadrop.com/blog/2009/12/29/why-you-cannot-pickle-generators/ So I consider that my change is safe. It changes the representation of a generator, but repr() is usually only checked in unit tests, tests can be fixed. It also changes the value of the __name__ attribute if the name of the function was changed, but I don't think that anyone relies on it. If you really want the original name of the code object, you can still get gen.gi_code.co_name. Another recent change in the Python API was the __wrapped__ attribute set by functools.wraps(). 
It is now chain wrapper functions, and I'm not aware of anyone complaining of such change. So I'm confident in my change :) Victor From storchaka at gmail.com Thu Jun 12 15:16:38 2014 From: storchaka at gmail.com (Serhiy Storchaka) Date: Thu, 12 Jun 2014 16:16:38 +0300 Subject: [Python-Dev] close() questions In-Reply-To: References: <87d2egnsfq.fsf@vostro.rath.org> Message-ID: 11.06.14 05:28, Antoine Pitrou ???????(??): > close() should indeed be idempotent on all bundled IO class > implementations (otherwise it's a bug), and so should it preferably on > third-party IO class implementations. There are some questions about close(). 1. If object owns several resources, should close() try to clean up all them if error is happened during cleaning up some resource. E.g. should BufferedRWPair.close() close reader if closing writer failed? 2. If close() raises an exception, should repeated call of close() raise an exception or do nothing? E.g. if GzipFile.close() fails during writing gzip tail (CRC and size), should repeated call of it try to write this tail again? 3. If close() raises an exception, should the closed attribute (if exists) be True or False? From yselivanov.ml at gmail.com Thu Jun 12 18:34:47 2014 From: yselivanov.ml at gmail.com (Yury Selivanov) Date: Thu, 12 Jun 2014 12:34:47 -0400 Subject: [Python-Dev] Issue #21205: add __qualname__ to generators In-Reply-To: References: Message-ID: <5399D6A7.8050609@gmail.com> Hello Victor, On 2014-06-11, 10:28 AM, Victor Stinner wrote: > Hi, > > I'm working on asyncio and it's difficult to debug code because > @asyncio.coroutine decorator removes the name of the function if the > function is not a generator (if it doesn't use yield from). > > I propose to add new gi_name and gi_qualname fields to the C structure > PyGenObject, add a new __qualname__ (= gi_qualname) attribute to the > Python API of generator, and change how the default value of __name__ > (= gi_name) of generators. 
> > Instead of getting the name from the code object, I propose to get the > name from the function (if the generator was created from a function). > So if the function name was modified, you get the new name instead of > getting the name from the code object (as done in Python 3.4). > > I also propose to display the qualified name in repr(generator) > instead of the name. > > All these changes should make my life easier to debug asyncio, but it > should help any project using generators. > > Issues describing the problem, I attached a patch implementing my ideas: > http://bugs.python.org/issue21205 > > Would you be ok with these (minor) incompatible changes? I'm +1 for your proposal. This change will indeed make debugging asyncio (and any generator-heavy code) easier. I wouldn't worry too much about compatibility, as the change is fairly minimal, and the feature will only land in 3.5, where people expect new things and are generally OK with slightly updated behaviors. Yury > > By the way, it looks like generator attributes were never documented > :-( My patch also adds a basic documentation (at least, it lists all > attributes in the documentation of the inspect module). > > Victor > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/yselivanov.ml%40gmail.com From donspauldingii at gmail.com Fri Jun 13 00:38:26 2014 From: donspauldingii at gmail.com (Don Spaulding) Date: Thu, 12 Jun 2014 17:38:26 -0500 Subject: [Python-Dev] Backwards Incompatibility in logging module in 3.4? Message-ID: Hi there, I just started testing a project of mine on Python 3.4.0b1. I ran into a change that broke compatibility with the logging module in 3.3. 
The basic test is: $ py34/bin/python -c 'import logging; print(logging.getLevelName("debug".upper()))' Level DEBUG $ py33/bin/python -c 'import logging; print(logging.getLevelName("debug".upper()))' 10 I quickly stumbled upon this webpage: http://aazza.github.io/2014/05/31/testing-on-multiple-versions-of-Python/ Which led me to this ticket regarding the change: http://bugs.python.org/issue18046 Is this a bug or an intentional break? If it's the latter, shouldn't this at least be mentioned in the "What's new in Python 3.4" document? If it's the former, should I file a bug? Thanks, Don -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Fri Jun 13 01:10:16 2014 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 13 Jun 2014 09:10:16 +1000 Subject: [Python-Dev] Backwards Incompatibility in logging module in 3.4? In-Reply-To: References: Message-ID: On 13 Jun 2014 08:59, "Don Spaulding" wrote: > > Hi there, > > I just started testing a project of mine on Python 3.4.0b1. I ran into a change that broke compatibility with the logging module in 3.3. > > The basic test is: > > $ py34/bin/python -c 'import logging; print(logging.getLevelName("debug".upper()))' > Level DEBUG > > $ py33/bin/python -c 'import logging; print(logging.getLevelName("debug".upper()))' > 10 > > I quickly stumbled upon this webpage: > > http://aazza.github.io/2014/05/31/testing-on-multiple-versions-of-Python/ > > Which led me to this ticket regarding the change: > > http://bugs.python.org/issue18046 > > Is this a bug or an intentional break? If it's the latter, shouldn't this at least be mentioned in the "What's new in Python 3.4" document? If it's the former, should I file a bug? Yes, it sounds like a bug to me - there's no indication of an intent to change behaviour with that cleanup patch. Cheers, Nick. 
> > Thanks, > Don > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ncoghlan%40gmail.com > -------------- next part -------------- An HTML attachment was scrubbed... URL: From victor.stinner at gmail.com Fri Jun 13 01:45:13 2014 From: victor.stinner at gmail.com (Victor Stinner) Date: Fri, 13 Jun 2014 01:45:13 +0200 Subject: [Python-Dev] Backwards Incompatibility in logging module in 3.4? In-Reply-To: References: Message-ID: Hi, 2014-06-13 0:38 GMT+02:00 Don Spaulding : > Is this a bug or an intentional break? If it's the latter, shouldn't this > at least be mentioned in the "What's new in Python 3.4" document? IMO the change is intentional. The previous behaviour was not really expected. The Python 3.3 documentation is explicit: the result is a string and the input parameter is an integer. logging.getLevelName("DEBUG") was more an implementation detail: https://docs.python.org/3.3/library/logging.html#logging.getLevelName "Returns the textual representation of logging level lvl. If the level is one of the predefined levels CRITICAL, ERROR, WARNING, INFO or DEBUG then you get the corresponding string. If you have associated levels with names using addLevelName() then the name you have associated with lvl is returned. If a numeric value corresponding to one of the defined levels is passed in, the corresponding string representation is returned. Otherwise, the string 'Level %s' % lvl is returned." If your code uses something like logger.setLevel(logging.getLevelName("DEBUG")), use logger.setLevel("DEBUG") directly.
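For code that must run on both 3.3 and 3.4, a small helper can paper over the difference (a sketch; `level_from_name` is a hypothetical name, not part of the logging API):

```python
import logging

def level_from_name(name):
    """Return the numeric level for a level name on 3.3 and 3.4 alike."""
    value = logging.getLevelName(name)
    if isinstance(value, int):
        # 3.3 behaviour: a known level name maps back to its number.
        return value
    # 3.4.0 behaviour: the name is echoed back as "Level <name>",
    # so fall back to the module-level constants instead.
    return getattr(logging, name)

logging.getLogger('demo').setLevel(level_from_name('DEBUG'))
```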
This issue was fixed in OpenStack with this change: https://review.openstack.org/#/c/94028/6/openstack/common/log.py,cm https://review.openstack.org/#/c/94028/6 Victor From rymg19 at gmail.com Fri Jun 13 01:55:08 2014 From: rymg19 at gmail.com (Ryan Gonzalez) Date: Thu, 12 Jun 2014 18:55:08 -0500 Subject: [Python-Dev] subprocess shell=True on Windows doesn't escape ^ character In-Reply-To: References: Message-ID: SHELLS ARE NOT CROSS-PLATFORM!!!! Seriously, there are going to be differences. If you really must:

escape = lambda s: s.replace('^', '^^') if os.name == 'nt' else s

Voilà. On Wed, Jun 11, 2014 at 5:53 PM, anatoly techtonik wrote: > On Thu, Jun 12, 2014 at 1:30 AM, Chris Angelico wrote: >> On Thu, Jun 12, 2014 at 7:58 AM, Ryan wrote: >> > In all seriousness, to me this is obvious. When you pass a command to >> the >> > shell, naturally, certain details are shell-specific. >> > > On Windows cmd.exe is used by default: > http://hg.python.org/cpython/file/38a325c84564/Lib/subprocess.py#l1108 > so it makes sense to make the default behavior cross-platform. > > >> > -10000. Bad idea. Very bad idea. If you want the ^ to be escaped, do it >> > yourself. Or better yet, don't pass shell=True. >> >> Definitely the latter. Why pass shell=True when executing a single >> command? I don't get it. >> > > This is a complete use case using the Rietveld upload script: > > http://techtonik.rainforce.org/2013/07/code-review-with-rietveld-and-mercurial.html > > I am interested to know how to modify the upload script without kludges: > https://code.google.com/p/rietveld/source/browse/upload.py#1056 > I expect many people are facing the same problem trying to wrap > Git and HG with Python scripts. > -- > anatoly t.
> > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/rymg19%40gmail.com > > -- Ryan If anybody ever asks me why I prefer C++ to C, my answer will be simple: "It's becauseslejfp23(@#Q*(E*EIdc-SEGFAULT. Wait, I don't think that was nul-terminated." -------------- next part -------------- An HTML attachment was scrubbed... URL: From Nikolaus at rath.org Fri Jun 13 03:06:20 2014 From: Nikolaus at rath.org (Nikolaus Rath) Date: Thu, 12 Jun 2014 18:06:20 -0700 Subject: [Python-Dev] Why does IOBase.__del__ call .close? In-Reply-To: <1402534493.31346.127850065.34AEEDD2@webmail.messagingengine.com> (Benjamin Peterson's message of "Wed, 11 Jun 2014 17:54:53 -0700") References: <87d2egnsfq.fsf@vostro.rath.org> <5397B62F.80004@mrabarnett.plus.com> <87a99jnfzq.fsf@vostro.rath.org> <1402534493.31346.127850065.34AEEDD2@webmail.messagingengine.com> Message-ID: <877g4lobxv.fsf@vostro.rath.org> Benjamin Peterson writes: > On Wed, Jun 11, 2014, at 17:11, Nikolaus Rath wrote: >> MRAB writes: >> > On 2014-06-11 02:30, Nikolaus Rath wrote: >> >> Hello, >> >> >> >> I recently noticed (after some rather protacted debugging) that the >> >> io.IOBase class comes with a destructor that calls self.close(): >> >> >> >> [0] nikratio at vostro:~/tmp$ cat test.py >> >> import io >> >> class Foo(io.IOBase): >> >> def close(self): >> >> print('close called') >> >> r = Foo() >> >> del r >> >> [0] nikratio at vostro:~/tmp$ python3 test.py >> >> close called >> >> >> >> To me, this came as quite a surprise, and the best "documentation" of >> >> this feature seems to be the following note (from the io library >> >> reference): >> >> >> >> "The abstract base classes also provide default implementations of some >> >> methods in order to help implementation of concrete stream classes. 
For >> >> example, BufferedIOBase provides unoptimized implementations of >> >> readinto() and readline()." >> >> >> >> For me, having __del__ call close() does not qualify as a reasonable >> >> default implementation unless close() is required to be idempotent >> >> (which one could deduce from the documentation if one tries to, but it's >> >> far from clear). >> >> >> >> Is this behavior an accident, or was that a deliberate decision? >> >> >> > To me, it makes sense. You want to make sure that it's closed, releasing >> > any resources it might be holding, even if you haven't done so >> > explicitly. >> >> I agree with your intentions, but I come to the opposite conclusion: >> automatically calling close() in the destructor will hide that there's a >> problem in the code. Without that automatic cleanup, there's at least a >> good chance that a ResourceWarning will be emitted so the problem gets >> noticed. "Silently work around bugs in caller's code" doesn't seem like >> a very useful default to me... > > Things which actually hold system resources (like FileIO) give > ResourceWarning if they close in __del__, so I don't understand your > point. Consider this simple example: $ cat test.py import io import warnings class StridedStream(io.IOBase): def __init__(self, name, stride=2): super().__init__() self.fh = open(name, 'rb') self.stride = stride def read(self, len_): return self.fh.read(self.stride*len_)[::self.stride] def close(self): self.fh.close() class FixedStridedStream(StridedStream): def __del__(self): # Prevent IOBase.__del__ frombeing called. pass warnings.resetwarnings() warnings.simplefilter('error') print('Creating & loosing StridedStream..') r = StridedStream('/dev/zero') del r print('Creating & loosing FixedStridedStream..') r = FixedStridedStream('/dev/zero') del r $ python3 test.py Creating & loosing StridedStream.. Creating & loosing FixedStridedStream.. 
Exception ignored in: <_io.FileIO name='/dev/zero' mode='rb'> ResourceWarning: unclosed file <_io.BufferedReader name='/dev/zero'> In the first case, the destructor inherited from IOBase actually prevents the ResourceWarning from being emitted. Best, -Nikolaus -- GPG encrypted emails preferred. Key id: 0xD113FCAC3C4E599F Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F "Time flies like an arrow, fruit flies like a Banana." From Nikolaus at rath.org Fri Jun 13 04:11:07 2014 From: Nikolaus at rath.org (Nikolaus Rath) Date: Thu, 12 Jun 2014 19:11:07 -0700 Subject: [Python-Dev] subprocess shell=True on Windows doesn't escape ^ character In-Reply-To: <20140611230030.6F56F250DC4@webabinitio.net> (R. David Murray's message of "Wed, 11 Jun 2014 19:00:29 -0400") References: <20140611230030.6F56F250DC4@webabinitio.net> Message-ID: <874mzpo8xw.fsf@vostro.rath.org> "R. David Murray" writes: > Also notice that using a list with shell=True is using the API > incorrectly. It wouldn't even work on Linux, so that torpedoes > the cross-platform concern already :) > > This kind of confusion is why I opened http://bugs.python.org/issue7839. Can someone describe a use case where shell=True actually makes sense at all? It seems to me that whenever you need a shell, the arguments that you pass to it will be shell-specific. So instead of e.g.

Popen('for i in `seq 42`; do echo $i; done', shell=True)

you almost certainly want to do

Popen(['/bin/sh', 'for i in `seq 42`; do echo $i; done'], shell=False)

because if your shell happens to be tcsh or cmd.exe, things are going to break. Best, -Nikolaus -- GPG encrypted emails preferred. Key id: 0xD113FCAC3C4E599F Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F "Time flies like an arrow, fruit flies like a Banana."
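One nit on the explicit-shell form quoted above: sh reads its script from a file unless given -c, so a runnable version of that second snippet looks like this (a POSIX-only sketch):

```python
import subprocess

# Naming the shell explicitly pins down which dialect runs the script;
# note the -c flag, without which sh would treat the script as a file name.
out = subprocess.check_output(
    ['/bin/sh', '-c', 'for i in 1 2 3; do echo $i; done'],
    universal_newlines=True)
print(out.split())  # ['1', '2', '3']
```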
From rosuav at gmail.com Fri Jun 13 04:25:36 2014 From: rosuav at gmail.com (Chris Angelico) Date: Fri, 13 Jun 2014 12:25:36 +1000 Subject: [Python-Dev] subprocess shell=True on Windows doesn't escape ^ character In-Reply-To: <874mzpo8xw.fsf@vostro.rath.org> References: <20140611230030.6F56F250DC4@webabinitio.net> <874mzpo8xw.fsf@vostro.rath.org> Message-ID: On Fri, Jun 13, 2014 at 12:11 PM, Nikolaus Rath wrote: > Can someone describe an use case where shell=True actually makes sense > at all? > > It seems to me that whenever you need a shell, the argument's that you > pass to it will be shell specific. So instead of e.g. > > Popen('for i in `seq 42`; do echo $i; done', shell=True) > > you almost certainly want to do > > Popen(['/bin/sh', 'for i in `seq 42`; do echo $i; done'], shell=False) > > because if your shell happens to be tcsh or cmd.exe, things are going to > break. Some features, while technically shell-specific, are supported across a lot of shells. You should be able to pipe output from one command into another in most shells, for instance. But yes, I generally don't use it. ChrisA From ncoghlan at gmail.com Fri Jun 13 04:43:56 2014 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 13 Jun 2014 12:43:56 +1000 Subject: [Python-Dev] subprocess shell=True on Windows doesn't escape ^ character In-Reply-To: <874mzpo8xw.fsf@vostro.rath.org> References: <20140611230030.6F56F250DC4@webabinitio.net> <874mzpo8xw.fsf@vostro.rath.org> Message-ID: On 13 Jun 2014 12:12, "Nikolaus Rath" wrote: > > "R. David Murray" writes: > > Also notice that using a list with shell=True is using the API > > incorrectly. It wouldn't even work on Linux, so that torpedoes > > the cross-platform concern already :) > > > > This kind of confusion is why I opened http://bugs.python.org/issue7839. > > Can someone describe an use case where shell=True actually makes sense > at all? When you're writing platform specific code, it's occasionally useful. It's generally best avoided, though. 
Cheers, Nick. -------------- next part -------------- An HTML attachment was scrubbed... URL: From me at the-compiler.org Fri Jun 13 06:18:52 2014 From: me at the-compiler.org (Florian Bruhin) Date: Fri, 13 Jun 2014 06:18:52 +0200 Subject: [Python-Dev] subprocess shell=True on Windows doesn't escape ^ character In-Reply-To: <874mzpo8xw.fsf@vostro.rath.org> References: <20140611230030.6F56F250DC4@webabinitio.net> <874mzpo8xw.fsf@vostro.rath.org> Message-ID: <20140613041852.GD19485@lupin> * Nikolaus Rath [2014-06-12 19:11:07 -0700]: > "R. David Murray" writes: > > Also notice that using a list with shell=True is using the API > > incorrectly. It wouldn't even work on Linux, so that torpedoes > > the cross-platform concern already :) > > > > This kind of confusion is why I opened http://bugs.python.org/issue7839. > > Can someone describe an use case where shell=True actually makes sense > at all? > > It seems to me that whenever you need a shell, the argument's that you > pass to it will be shell specific. So instead of e.g. > > Popen('for i in `seq 42`; do echo $i; done', shell=True) > > you almost certainly want to do > > Popen(['/bin/sh', 'for i in `seq 42`; do echo $i; done'], shell=False) > > because if your shell happens to be tcsh or cmd.exe, things are going to > break. My usecase is a spawn-command in a GUI application, which the user can use to spawn an executable. I want the user to be able to use the usual shell features from there. However, I also pass an argument to that command, and that should be escaped. Florian -- http://www.the-compiler.org | me at the-compiler.org (Mail/XMPP) GPG 0xFD55A072 | http://the-compiler.org/pubkey.asc I love long mails! | http://email.is-not-s.ms/ -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 819 bytes Desc: not available URL: From greg.ewing at canterbury.ac.nz Fri Jun 13 06:57:49 2014 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 13 Jun 2014 16:57:49 +1200 Subject: [Python-Dev] subprocess shell=True on Windows doesn't escape ^ character In-Reply-To: <874mzpo8xw.fsf@vostro.rath.org> References: <20140611230030.6F56F250DC4@webabinitio.net> <874mzpo8xw.fsf@vostro.rath.org> Message-ID: <539A84CD.90104@canterbury.ac.nz> Nikolaus Rath wrote: > you almost certainly want to do > > Popen(['/bin/sh', 'for i in `seq 42`; do echo $i; done'], shell=False) > > because if your shell happens to be tcsh or cmd.exe, things are going to > break. On Unix, the C library's system() and popen() functions always use /bin/sh, NOT the user's current login shell, for this very reason. I would hope that the Python versions of these, and also the new subprocess stuff, do the same. That still leaves differences between Unix and Windows, but explicitly naming the shell won't help with that. -- Greg From benjamin at python.org Fri Jun 13 07:27:49 2014 From: benjamin at python.org (Benjamin Peterson) Date: Thu, 12 Jun 2014 22:27:49 -0700 Subject: [Python-Dev] Why does IOBase.__del__ call .close? 
In-Reply-To: <877g4lobxv.fsf@vostro.rath.org> References: <87d2egnsfq.fsf@vostro.rath.org> <5397B62F.80004@mrabarnett.plus.com> <87a99jnfzq.fsf@vostro.rath.org> <1402534493.31346.127850065.34AEEDD2@webmail.messagingengine.com> <877g4lobxv.fsf@vostro.rath.org> Message-ID: <1402637269.29254.128319501.4C662871@webmail.messagingengine.com> On Thu, Jun 12, 2014, at 18:06, Nikolaus Rath wrote: > Consider this simple example:
>
> $ cat test.py
> import io
> import warnings
>
> class StridedStream(io.IOBase):
>     def __init__(self, name, stride=2):
>         super().__init__()
>         self.fh = open(name, 'rb')
>         self.stride = stride
>
>     def read(self, len_):
>         return self.fh.read(self.stride*len_)[::self.stride]
>
>     def close(self):
>         self.fh.close()
>
> class FixedStridedStream(StridedStream):
>     def __del__(self):
>         # Prevent IOBase.__del__ from being called.
>         pass
>
> warnings.resetwarnings()
> warnings.simplefilter('error')
>
> print('Creating & losing StridedStream..')
> r = StridedStream('/dev/zero')
> del r
>
> print('Creating & losing FixedStridedStream..')
> r = FixedStridedStream('/dev/zero')
> del r
>
> $ python3 test.py
> Creating & losing StridedStream..
> Creating & losing FixedStridedStream..
> Exception ignored in: <_io.FileIO name='/dev/zero' mode='rb'>
> ResourceWarning: unclosed file <_io.BufferedReader name='/dev/zero'>
>
> In the first case, the destructor inherited from IOBase actually > prevents the ResourceWarning from being emitted. Ah, I see. I don't see any good ways to fix it, though, besides setting some flag if close() is called from __del__.
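That flag idea could be prototyped in pure Python along these lines (a sketch only; `NoisyIOBase` and `_closed_from_del` are made-up names, not what the io module actually does):

```python
import io
import warnings

class NoisyIOBase(io.IOBase):
    """close() still runs from __del__, but no longer hides the leak."""

    def __del__(self):
        self._closed_from_del = True   # the flag consulted in close()
        self.close()

    def close(self):
        if getattr(self, '_closed_from_del', False) and not self.closed:
            warnings.warn('%r collected before being closed' % self,
                          ResourceWarning, stacklevel=2)
        super().close()

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter('always')
    r = NoisyIOBase()
    del r   # CPython collects immediately; __del__ emits the warning
print(any(issubclass(w.category, ResourceWarning) for w in caught))  # True
```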
From mail at timgolden.me.uk Fri Jun 13 09:35:41 2014 From: mail at timgolden.me.uk (Tim Golden) Date: Fri, 13 Jun 2014 08:35:41 +0100 Subject: [Python-Dev] subprocess shell=True on Windows doesn't escape ^ character In-Reply-To: <874mzpo8xw.fsf@vostro.rath.org> References: <20140611230030.6F56F250DC4@webabinitio.net> <874mzpo8xw.fsf@vostro.rath.org> Message-ID: <539AA9CD.5090306@timgolden.me.uk> On 13/06/2014 03:11, Nikolaus Rath wrote: > "R. David Murray" writes: >> Also notice that using a list with shell=True is using the API >> incorrectly. It wouldn't even work on Linux, so that torpedoes >> the cross-platform concern already :) >> >> This kind of confusion is why I opened http://bugs.python.org/issue7839. > > Can someone describe a use case where shell=True actually makes sense > at all? On Windows (where I think the OP is), Popen & friends ultimately invoke CreateProcess. In the case where shell=True, subprocess invokes the command interpreter explicitly under the covers and tweaks a few other things to avoid a Brief Flash of Unstyled Console. This is the relevant snippet from subprocess.py:

if shell:
    startupinfo.dwFlags |= _winapi.STARTF_USESHOWWINDOW
    startupinfo.wShowWindow = _winapi.SW_HIDE
    comspec = os.environ.get("COMSPEC", "cmd.exe")
    args = '{} /c "{}"'.format(comspec, args)

That's all. It's more or less equivalent to prefixing your commands with "cmd.exe /c". The only reasons you should need to do this are: * If you're using one of the few commands which are actually built in to cmd.exe. I can't quickly find an online source for these, but typical examples will be: "dir" or "copy". * In some situations -- and I've never been able to nail this -- if you're trying to run a .bat/.cmd file. I've certainly been able to run batch files without shell=True but other people have failed within what appears to be the same configuration unless invoking cmd.exe via shell=True.
I use hg.exe (from TortoiseHg) but ISTR that the base Mercurial install supplies a .bat/.cmd. If that's the OP's case then he might find it necessary to pass shell=True. TJG From taleinat at gmail.com Fri Jun 13 12:24:38 2014 From: taleinat at gmail.com (Tal Einat) Date: Fri, 13 Jun 2014 13:24:38 +0300 Subject: [Python-Dev] Raspberry Pi Buildbot Message-ID: Is there one? If not, would you like me to set one up? I've got one at home with Raspbian installed not doing anything, it could easily run a buildslave. Poking around on buildbot.python.org/all/builders, I can only see one ARM buildbot[1], and it's just called "ARM v7". I also found this python-dev thread[2] along with a blog.python.org blog post[3] from 2012, which mentioned that Trent Nelson would be receiving a Raspberry Pi and setting up a buildslave on it. But I can't find mention of it on buildbot.python.org. So I can't tell what the current state is. If anyone is interested, just let me know! .. [1]: http://buildbot.python.org/all/builders/ARMv7%203.x .. [2]: http://thread.gmane.org/gmane.comp.python.devel/136388 .. [3]: http://blog.python.org/2012/12/pandaboard-raspberry-pi-coming-to.html - Tal Einat From rdmurray at bitdance.com Fri Jun 13 14:40:17 2014 From: rdmurray at bitdance.com (R. David Murray) Date: Fri, 13 Jun 2014 08:40:17 -0400 Subject: [Python-Dev] subprocess shell=True on Windows doesn't escape ^ character In-Reply-To: <539A84CD.90104@canterbury.ac.nz> References: <20140611230030.6F56F250DC4@webabinitio.net> <874mzpo8xw.fsf@vostro.rath.org> <539A84CD.90104@canterbury.ac.nz> Message-ID: <20140613124017.9081A250D0C@webabinitio.net> On Fri, 13 Jun 2014 16:57:49 +1200, Greg Ewing wrote: > Nikolaus Rath wrote: > > you almost certainly want to do > > > > Popen(['/bin/sh', 'for i in `seq 42`; do echo $i; done'], shell=False) > > > > because if your shell happens to be tcsh or cmd.exe, things are going to > > break. 
> > On Unix, the C library's system() and popen() functions > always use /bin/sh, NOT the user's current login shell, > for this very reason. > > I would hope that the Python versions of these, and also > the new subprocess stuff, do the same. They do. > That still leaves differences between Unix and Windows, > but explicitly naming the shell won't help with that. There are some non-Windows platforms where /bin/sh doesn't work (notably Android, where it is /system/bin/sh). See http://bugs.python.org/issue16353 for a proposal to create a standard way to figure out what the system shell should be for Popen's use. (The conclusion for Windows was to hardcode cmd.exe, though that isn't what the most recent patch there implements.) --David From larry at hastings.org Fri Jun 13 14:55:30 2014 From: larry at hastings.org (Larry Hastings) Date: Fri, 13 Jun 2014 05:55:30 -0700 Subject: [Python-Dev] Moving Python 3.5 on Windows to a new compiler In-Reply-To: <53978136.4000307@v.loewis.de> References: <529cffa5961d4b5bb57d554affe9643c@BLUPR03MB389.namprd03.prod.outlook.com> <53921464.7030400@v.loewis.de> <5392232A.2000102@v.loewis.de> <438e8a27e8e643f4841a22b24447b956@BLUPR03MB389.namprd03.prod.outlook.com> <53978136.4000307@v.loewis.de> Message-ID: <539AF4C2.2090900@hastings.org> On 06/10/2014 03:05 PM, "Martin v. Löwis" wrote: > We certainly don't need to resolve this now. We should discuss it again > when the release schedule for 3.5 is proposed. I anticipate 3.5 should be released about 18 months after the release of 3.4, putting it mid-September 2015. //arry/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From saimadhavheblikar at gmail.com Fri Jun 13 16:41:29 2014 From: saimadhavheblikar at gmail.com (Saimadhav Heblikar) Date: Fri, 13 Jun 2014 20:11:29 +0530 Subject: [Python-Dev] [Idle-dev] KeyConfig, KeyBinding and other related issues.
In-Reply-To: References: <539805D6.8020201@udel.edu> <5399FB2D.1050208@udel.edu> <539A4C05.8060100@udel.edu> Message-ID: Hi, I would like the keyseq validator to be reviewed. The diff file: https://gist.github.com/sahutd/0a471db8138383fd73b2#file-test-keyseq-diff A sample test runner file: https://gist.github.com/sahutd/0a471db8138383fd73b2#file-test-keyseq-runner-py In its current form, it supports/has modifiers = ['Shift', 'Control', 'Alt', 'Meta'] alpha_uppercase = ['A'] alpha_lowercase = ['a'] direction = ['Up',] direction_key = ['Key-Up'] It supports validating combinations up to 4 in length.
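For reviewers who want to experiment without applying the patch, the general shape of such an RE-based validator is roughly this (a hypothetical, much smaller pattern than the one under review; unlike the real patch it does not reject repeated modifiers):

```python
import re

MODIFIER = r'(?:Shift|Control|Alt|Meta)'
KEY = r'(?:Key-[A-Za-z]|Key-Up|Up)'
# Up to four dash-separated modifiers followed by exactly one key,
# e.g. <Control-Shift-Key-A> or <Alt-Up>.
KEYSEQ_RE = re.compile(r'^<%s(?:-%s){0,3}-%s>$' % (MODIFIER, MODIFIER, KEY))

def is_valid_keyseq(seq):
    return KEYSEQ_RE.match(seq) is not None

print(is_valid_keyseq('<Control-Shift-Key-A>'))  # True
print(is_valid_keyseq('<Key-A-Control>'))        # False
```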
Regards On 13 June 2014 17:15, Saimadhav Heblikar wrote: > On 13 June 2014 16:58, Tal Einat wrote: >> On Fri, Jun 13, 2014 at 2:22 PM, Saimadhav Heblikar >> wrote: >>> Just a heads up to both: I am writing a keyseq validator method. >>> It currently works for over 800 permutations of ['Shift', 'Control', >>> 'Alt', 'Meta', 'Key-a', 'Key-A', 'Up', 'Key-Up', 'a', 'A']. It works >>> for permutations of length 2 and 3. Beyond that its not worth it IMO. >>> I am currently trying to integrate it with test_configuration.py and >>> catching permutations i missed out. >>> >>> I post this, so that we dont duplicate work. I hope it to be ready by >>> the end of the day.(UTC +5.5) >> >> What is the method you are using? > > Regex. It is not something elegant. The permutations are coded in.(Not > all 800+ obviously, but around 15-20 general ones.). The only > advantage is it can be used without creating a new Tk instance. > > >> >> What do you mean by "permutations"? If you mean what I think, then I'm >> not sure I agree with >3 not being worth it. I've used keyboard >> bindings with more than 2 modifiers before, and we should certainly >> support this properly. >> > I am sorry. I meant to write >3 modifier permutations. > (i.eControl-Shift-Alt-Meta+Key-X is not covered. But > Control-Shift-Alt-Key-X is.) > > > > > -- > Regards > Saimadhav Heblikar -- Regards Saimadhav Heblikar From helou.pedro at gmail.com Fri Jun 13 11:21:04 2014 From: helou.pedro at gmail.com (Pedro Helou) Date: Fri, 13 Jun 2014 11:21:04 +0200 Subject: [Python-Dev] python-dev for MAC OS X Message-ID: Hey, does anybody know how to install the python-dev headers and libraries for MAC OS X? -- Pedro Issa Helou Network Communication Engineering + 36 20 262 9274 -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From taleinat at gmail.com Fri Jun 13 17:32:48 2014 From: taleinat at gmail.com (Tal Einat) Date: Fri, 13 Jun 2014 18:32:48 +0300 Subject: [Python-Dev] python-dev for MAC OS X In-Reply-To: References: Message-ID: On Fri, Jun 13, 2014 at 12:21 PM, Pedro Helou wrote: > Hey, > > does anybody know how to install the python-dev headers and libraries for > MAC OS X? Hi, This list is for discussing the development *of* Python, not *with* Python. Please ask on the python list, python-list at python.org (more info here[1]) or on the #python channel on the Freenode IRC server. StackOverflow is also a good place to search for information and ask questions. But while we're on the subject, on OSX I recommend using a binary package manager such as Homebrew[2] or Macports[3] for this. I have had good experiences using Homebrew. Good luck, - Tal Einat .. [1]: https://mail.python.org/mailman/listinfo/python-list .. [2]: http://brew.sh/ .. [3]: http://www.macports.org/ From saimadhavheblikar at gmail.com Fri Jun 13 17:44:04 2014 From: saimadhavheblikar at gmail.com (Saimadhav Heblikar) Date: Fri, 13 Jun 2014 21:14:04 +0530 Subject: [Python-Dev] [Idle-dev] KeyConfig, KeyBinding and other related issues. In-Reply-To: References: <539805D6.8020201@udel.edu> <5399FB2D.1050208@udel.edu> <539A4C05.8060100@udel.edu> Message-ID: Apologies for the accidental cross post. I intended to send it to idle-dev. I am sorry again :( On 13 June 2014 20:11, Saimadhav Heblikar wrote: > Hi, > > I would like the keyseq validator to be reviewed. > > The diff file: https://gist.github.com/sahutd/0a471db8138383fd73b2#file-test-keyseq-diff > A sample test runner file: > https://gist.github.com/sahutd/0a471db8138383fd73b2#file-test-keyseq-runner-py > > In its current form, it supports/has > modifiers = ['Shift', 'Control', 'Alt', 'Meta'] > alpha_uppercase = ['A'] > alpha_lowercase = ['a'] > direction = ['Up',] > direction_key = ['Key-Up'] > > It supports validating combinations upto 4 in length. 
> > Please test for the above set only. (It will extended easily to fully > represent the respective complete sets. The reason it cant be done > *now* is the due to how RE optionals are coded differently in my > patch. See CLEANUP below). I will also add remaining keys like > Backspace, Slash etc tomorrow. > > # Cleanup: > If we decide to go ahead with RE validating keys as in the above patch, > > 0. I made the mistake of not coding RE optionals -> ((pat)|(pat)) same > for all sets. The result is that, extending the current key set is not > possible without making all RE optional patterns similar.(Read the > starting lines of is_valid_keyseq method). > > 1. There is a lot of places where refactoring can be done and > appropriate comment added. > > 2. I left the asserts as-is. They can be used in testing the validator > method itself. > > 3. The above patch still needs support for Backspace, slash etc to be > added. I decided to add, once I am sure we will use it. > > 4. I would like to know how it will affect Mac? What are system > specific differences? Please run the test-runner script on it and do > let me know. > > --- > My friend told that this thing can be done by "defining a grammar and > automata." I did read up about it, but found it hard to grasp > everything. Can you say whether it would be easier to solve it that > way than RE? > > Regards > > > > On 13 June 2014 17:15, Saimadhav Heblikar wrote: >> On 13 June 2014 16:58, Tal Einat wrote: >>> On Fri, Jun 13, 2014 at 2:22 PM, Saimadhav Heblikar >>> wrote: >>>> Just a heads up to both: I am writing a keyseq validator method. >>>> It currently works for over 800 permutations of ['Shift', 'Control', >>>> 'Alt', 'Meta', 'Key-a', 'Key-A', 'Up', 'Key-Up', 'a', 'A']. It works >>>> for permutations of length 2 and 3. Beyond that its not worth it IMO. >>>> I am currently trying to integrate it with test_configuration.py and >>>> catching permutations i missed out. 
>>>> >>>> I post this, so that we dont duplicate work. I hope it to be ready by >>>> the end of the day.(UTC +5.5) >>> >>> What is the method you are using? >> >> Regex. It is not something elegant. The permutations are coded in.(Not >> all 800+ obviously, but around 15-20 general ones.). The only >> advantage is it can be used without creating a new Tk instance. >> >> >>> >>> What do you mean by "permutations"? If you mean what I think, then I'm >>> not sure I agree with >3 not being worth it. I've used keyboard >>> bindings with more than 2 modifiers before, and we should certainly >>> support this properly. >>> >> I am sorry. I meant to write >3 modifier permutations. >> (i.eControl-Shift-Alt-Meta+Key-X is not covered. But >> Control-Shift-Alt-Key-X is.) >> >> >> >> >> -- >> Regards >> Saimadhav Heblikar > > > > -- > Regards > Saimadhav Heblikar -- Regards Saimadhav Heblikar From status at bugs.python.org Fri Jun 13 18:07:57 2014 From: status at bugs.python.org (Python tracker) Date: Fri, 13 Jun 2014 18:07:57 +0200 (CEST) Subject: [Python-Dev] Summary of Python tracker Issues Message-ID: <20140613160757.9965156A83@psf.upfronthosting.co.za> ACTIVITY SUMMARY (2014-06-06 - 2014-06-13) Python tracker at http://bugs.python.org/ To view or respond to any of the issues listed below, click on the issue. Do NOT respond to this message. 
Issues counts and deltas: open 4662 (+12) closed 28859 (+57) total 33521 (+69) Open issues with patches: 2150 Issues opened (52) ================== #15993: Windows: 3.3.0-rc2.msi: test_buffer fails http://bugs.python.org/issue15993 reopened by skrah #18910: IDle: test textView.py http://bugs.python.org/issue18910 reopened by ned.deily #20043: test_multiprocessing_main_handling fails --without-threads http://bugs.python.org/issue20043 reopened by berker.peksag #20578: BufferedIOBase.readinto1 is missing http://bugs.python.org/issue20578 reopened by benjamin.peterson #21684: inspect.signature bind doesn't include defaults or empty tuple http://bugs.python.org/issue21684 opened by rmccampbell7 #21686: IDLE - Test hyperparser http://bugs.python.org/issue21686 opened by sahutd #21687: Py_SetPath: Path components separated by colons http://bugs.python.org/issue21687 opened by fwalch #21690: re documentation: re.compile links to re.search / re.match ins http://bugs.python.org/issue21690 opened by jdg #21694: IDLE - Test ParenMatch http://bugs.python.org/issue21694 opened by sahutd #21696: Idle: test configuration files http://bugs.python.org/issue21696 opened by terry.reedy #21697: shutil.copytree() handles symbolic directory incorrectly http://bugs.python.org/issue21697 opened by shajunxing #21699: Windows Python 3.4.1 pyvenv doesn't work in directories with s http://bugs.python.org/issue21699 opened by Justin.Engel #21702: asyncio: remote_addr of create_datagram_endpoint() is not docu http://bugs.python.org/issue21702 opened by haypo #21703: IDLE: Test UndoDelegator http://bugs.python.org/issue21703 opened by sahutd #21704: _multiprocessing module builds incorrectly when POSIX semaphor http://bugs.python.org/issue21704 opened by Arfrever #21705: cgi.py: Multipart with more than one file is misparsed http://bugs.python.org/issue21705 opened by smurfix #21706: Add base for enumerations (Functional API) http://bugs.python.org/issue21706 opened by dkorchem #21707: 
modulefinder uses wrong CodeType signature in .replace_paths_i http://bugs.python.org/issue21707 opened by lemburg #21708: Deprecate nonstandard behavior of a dumbdbm database http://bugs.python.org/issue21708 opened by serhiy.storchaka #21710: --install-base option ignored? http://bugs.python.org/issue21710 opened by pitrou #21714: Path.with_name can construct invalid paths http://bugs.python.org/issue21714 opened by Antony.Lee #21715: Chaining exceptions at C level http://bugs.python.org/issue21715 opened by serhiy.storchaka #21716: 3.4.1 download page link for OpenPGP signatures has no sigs http://bugs.python.org/issue21716 opened by grossdm #21717: Exclusive mode for ZipFile and TarFile http://bugs.python.org/issue21717 opened by Antony.Lee #21718: sqlite3 cursor.description seems to rely on incomplete stateme http://bugs.python.org/issue21718 opened by zzzeek #21719: Returning Windows file attribute information via os.stat() http://bugs.python.org/issue21719 opened by benhoyt #21720: "TypeError: Item in ``from list'' not a string" message http://bugs.python.org/issue21720 opened by davidszotten #21721: socket.sendfile() should use TransmitFile on Windows http://bugs.python.org/issue21721 opened by giampaolo.rodola #21722: teach distutils "upload" to exit with code != 0 when error occ http://bugs.python.org/issue21722 opened by mdengler #21723: Float maxsize is treated as infinity in asyncio.Queue http://bugs.python.org/issue21723 opened by vajrasky #21724: resetwarnings doesn't reset warnings registry http://bugs.python.org/issue21724 opened by pitrou #21725: RFC 6531 (SMTPUTF8) support in smtpd http://bugs.python.org/issue21725 opened by r.david.murray #21726: Unnecessary line in documentation http://bugs.python.org/issue21726 opened by Reid.Price #21728: Confusing error message when initialising type inheriting obje http://bugs.python.org/issue21728 opened by Gerrit.Holl #21729: Use `with` statement in dbm.dumb http://bugs.python.org/issue21729 opened by 
Claudiu.Popa #21730: test_socket fails --without-threads http://bugs.python.org/issue21730 opened by berker.peksag #21731: Calendar Problem with Windows (XP) http://bugs.python.org/issue21731 opened by Juebo #21732: SubprocessTestsMixin.test_subprocess_terminate() hangs on "AMD http://bugs.python.org/issue21732 opened by haypo #21734: compilation of the _ctypes module fails on OpenIndiana: ffi_pr http://bugs.python.org/issue21734 opened by haypo #21735: test_threading.test_main_thread_after_fork_from_nonmain_thread http://bugs.python.org/issue21735 opened by haypo #21736: Add __file__ attribute to frozen modules http://bugs.python.org/issue21736 opened by lemburg #21737: runpy.run_path() fails with frozen __main__ modules http://bugs.python.org/issue21737 opened by lemburg #21738: Enum docs claim replacing __new__ is not possible http://bugs.python.org/issue21738 opened by ethan.furman #21739: Add hint about expression in list comprehensions (https://docs http://bugs.python.org/issue21739 opened by krichter #21740: doctest doesn't allow duck-typing callables http://bugs.python.org/issue21740 opened by pitrou #21741: Convert most of the test suite to using unittest.main() http://bugs.python.org/issue21741 opened by zach.ware #21742: WatchedFileHandler can fail due to race conditions or file ope http://bugs.python.org/issue21742 opened by vishvananda #21743: Create tests for RawTurtleScreen http://bugs.python.org/issue21743 opened by Lita.Cho #21744: itertools.islice() goes over all the pre-initial elements even http://bugs.python.org/issue21744 opened by jcea #21746: urlparse.BaseResult no longer exists http://bugs.python.org/issue21746 opened by mgilson #21748: glob.glob does not sort its results http://bugs.python.org/issue21748 opened by drj #21749: pkgutil ImpLoader does not support frozen modules http://bugs.python.org/issue21749 opened by lemburg Most recent 15 issues with no replies (15) ========================================== #21749: pkgutil ImpLoader 
does not support frozen modules http://bugs.python.org/issue21749 #21743: Create tests for RawTurtleScreen http://bugs.python.org/issue21743 #21740: doctest doesn't allow duck-typing callables http://bugs.python.org/issue21740 #21738: Enum docs claim replacing __new__ is not possible http://bugs.python.org/issue21738 #21737: runpy.run_path() fails with frozen __main__ modules http://bugs.python.org/issue21737 #21735: test_threading.test_main_thread_after_fork_from_nonmain_thread http://bugs.python.org/issue21735 #21734: compilation of the _ctypes module fails on OpenIndiana: ffi_pr http://bugs.python.org/issue21734 #21730: test_socket fails --without-threads http://bugs.python.org/issue21730 #21726: Unnecessary line in documentation http://bugs.python.org/issue21726 #21720: "TypeError: Item in ``from list'' not a string" message http://bugs.python.org/issue21720 #21717: Exclusive mode for ZipFile and TarFile http://bugs.python.org/issue21717 #21716: 3.4.1 download page link for OpenPGP signatures has no sigs http://bugs.python.org/issue21716 #21715: Chaining exceptions at C level http://bugs.python.org/issue21715 #21710: --install-base option ignored? 
http://bugs.python.org/issue21710 #21708: Deprecate nonstandard behavior of a dumbdbm database http://bugs.python.org/issue21708 Most recent 15 issues waiting for review (15) ============================================= #21749: pkgutil ImpLoader does not support frozen modules http://bugs.python.org/issue21749 #21746: urlparse.BaseResult no longer exists http://bugs.python.org/issue21746 #21742: WatchedFileHandler can fail due to race conditions or file ope http://bugs.python.org/issue21742 #21741: Convert most of the test suite to using unittest.main() http://bugs.python.org/issue21741 #21737: runpy.run_path() fails with frozen __main__ modules http://bugs.python.org/issue21737 #21736: Add __file__ attribute to frozen modules http://bugs.python.org/issue21736 #21730: test_socket fails --without-threads http://bugs.python.org/issue21730 #21729: Use `with` statement in dbm.dumb http://bugs.python.org/issue21729 #21725: RFC 6531 (SMTPUTF8) support in smtpd http://bugs.python.org/issue21725 #21723: Float maxsize is treated as infinity in asyncio.Queue http://bugs.python.org/issue21723 #21722: teach distutils "upload" to exit with code != 0 when error occ http://bugs.python.org/issue21722 #21719: Returning Windows file attribute information via os.stat() http://bugs.python.org/issue21719 #21715: Chaining exceptions at C level http://bugs.python.org/issue21715 #21708: Deprecate nonstandard behavior of a dumbdbm database http://bugs.python.org/issue21708 #21707: modulefinder uses wrong CodeType signature in .replace_paths_i http://bugs.python.org/issue21707 Top 10 most discussed issues (10) ================================= #18910: IDle: test textView.py http://bugs.python.org/issue18910 8 msgs #21722: teach distutils "upload" to exit with code != 0 when error occ http://bugs.python.org/issue21722 8 msgs #17822: Save on Close windows (IDLE) http://bugs.python.org/issue17822 7 msgs #20577: IDLE: Remove FormatParagraph's width setting from config dialo 
http://bugs.python.org/issue20577 7 msgs #21205: Add __qualname__ attribute to Python generators and change def http://bugs.python.org/issue21205 6 msgs #20578: BufferedIOBase.readinto1 is missing http://bugs.python.org/issue20578 5 msgs #21652: Python 2.7.7 regression in mimetypes module on Windows http://bugs.python.org/issue21652 5 msgs #21669: Custom error messages when print & exec are used as statements http://bugs.python.org/issue21669 5 msgs #21719: Returning Windows file attribute information via os.stat() http://bugs.python.org/issue21719 5 msgs #21725: RFC 6531 (SMTPUTF8) support in smtpd http://bugs.python.org/issue21725 5 msgs Issues closed (59) ================== #1253: IDLE - Percolator overhaul http://bugs.python.org/issue1253 closed by terry.reedy #3938: Clearing globals; interpreter -- IDLE difference http://bugs.python.org/issue3938 closed by terry.reedy #7424: NetBSD: segmentation fault in listextend during install http://bugs.python.org/issue7424 closed by ned.deily #8378: PYTHONSTARTUP is not run by default when Idle is started http://bugs.python.org/issue8378 closed by terry.reedy #10498: calendar.LocaleHTMLCalendar.formatyearpage() results in traceb http://bugs.python.org/issue10498 closed by r.david.murray #10503: os.getuid() documentation should be clear on what kind of uid http://bugs.python.org/issue10503 closed by python-dev #11709: help-method crashes if sys.stdin is None http://bugs.python.org/issue11709 closed by python-dev #12063: tokenize module appears to treat unterminated single and doubl http://bugs.python.org/issue12063 closed by python-dev #12561: Compiler workaround for wide string constants in Modules/getpa http://bugs.python.org/issue12561 closed by Jim.Jewett #13111: Error 2203 when installing Python/Perl? 
http://bugs.python.org/issue13111 closed by loewis #13223: pydoc removes 'self' in HTML for method docstrings with exampl http://bugs.python.org/issue13223 closed by python-dev #14758: SMTPServer of smptd does not support binding to an IPv6 addres http://bugs.python.org/issue14758 closed by r.david.murray #15780: IDLE (windows) with PYTHONPATH and multiple python versions http://bugs.python.org/issue15780 closed by terry.reedy #17457: Unittest discover fails with namespace packages and builtin mo http://bugs.python.org/issue17457 closed by berker.peksag #17552: Add a new socket.sendfile() method http://bugs.python.org/issue17552 closed by giampaolo.rodola #18039: dbm.open(..., flag="n") does not work and does not give a warn http://bugs.python.org/issue18039 closed by serhiy.storchaka #18141: tkinter.Image.__del__ can throw an exception if module globals http://bugs.python.org/issue18141 closed by JanKanis #19662: smtpd.py should not decode utf-8 http://bugs.python.org/issue19662 closed by r.david.murray #19840: shutil.move(): Add ability to use custom copy function to allo http://bugs.python.org/issue19840 closed by r.david.murray #20903: smtplib.SMTP raises socket.timeout http://bugs.python.org/issue20903 closed by r.david.murray #21230: imghdr does not accept adobe photoshop mime type http://bugs.python.org/issue21230 closed by r.david.murray #21256: Sort keyword arguments in mock _format_call_signature http://bugs.python.org/issue21256 closed by python-dev #21310: ResourceWarning when open() fails with io.UnsupportedOperation http://bugs.python.org/issue21310 closed by serhiy.storchaka #21372: multiprocessing.util.register_after_fork inconsistency http://bugs.python.org/issue21372 closed by sbt #21404: Document options used to control compression level in tarfile http://bugs.python.org/issue21404 closed by python-dev #21463: RuntimeError when URLopener.ftpcache is full http://bugs.python.org/issue21463 closed by python-dev #21515: Use Linux O_TMPFILE flag in 
tempfile.TemporaryFile? http://bugs.python.org/issue21515 closed by haypo #21569: PEP 466: Python 2.7 What's New preamble changes http://bugs.python.org/issue21569 closed by ncoghlan #21596: asyncio.wait fails when futures list is empty http://bugs.python.org/issue21596 closed by haypo #21629: clinic.py --converters fails http://bugs.python.org/issue21629 closed by larry #21642: "_ if 1else _" does not compile http://bugs.python.org/issue21642 closed by python-dev #21656: Create test coverage for TurtleScreenBase in Turtle http://bugs.python.org/issue21656 closed by Lita.Cho #21659: IDLE: One corner calltip case http://bugs.python.org/issue21659 closed by python-dev #21667: Clarify status of O(1) indexing semantics of str objects http://bugs.python.org/issue21667 closed by ncoghlan #21671: CVE-2014-0224: OpenSSL upgrade to 1.0.1h on Windows required http://bugs.python.org/issue21671 closed by zach.ware #21677: Exception context set to string by BufferedWriter.close() http://bugs.python.org/issue21677 closed by serhiy.storchaka #21678: Add operation "plus" for dictionaries http://bugs.python.org/issue21678 closed by terry.reedy #21681: version string printed on STDERR http://bugs.python.org/issue21681 closed by r.david.murray #21682: Refleak in idle_test test_autocomplete http://bugs.python.org/issue21682 closed by terry.reedy #21683: Add Tix to the Windows buildbot scripts http://bugs.python.org/issue21683 closed by zach.ware #21685: zipfile module doesn't properly compress odt documents http://bugs.python.org/issue21685 closed by r.david.murray #21688: Improved error msg for make.bat htmlhelp http://bugs.python.org/issue21688 closed by zach.ware #21689: Docs for "Using Python on a Macintosh" needs to be updated. 
http://bugs.python.org/issue21689 closed by ned.deily #21691: set() returns random output with Python 3.4.1, in non-interact http://bugs.python.org/issue21691 closed by benjamin.peterson #21692: Wrong order of expected/actual for assert_called_once_with http://bugs.python.org/issue21692 closed by michael.foord #21693: Broken link to Pylons in the HOWTO TurboGears documentation http://bugs.python.org/issue21693 closed by orsenthil #21695: Idle 3.4.1-: closing Find in Files while in progress closes Id http://bugs.python.org/issue21695 closed by terry.reedy #21698: Platform.win32_ver() shows different values than expected on W http://bugs.python.org/issue21698 closed by haypo #21700: Missing mention of DatagramProtocol having connection_made and http://bugs.python.org/issue21700 closed by haypo #21701: create_datagram_endpoint does not receive when both local_addr http://bugs.python.org/issue21701 closed by ariddell #21709: logging.__init__ assumes that __file__ is always set http://bugs.python.org/issue21709 closed by python-dev #21711: Remove site-python support http://bugs.python.org/issue21711 closed by pitrou #21712: fractions.gcd failure http://bugs.python.org/issue21712 closed by rhettinger #21713: a mistype comment in PC/pyconfig.h http://bugs.python.org/issue21713 closed by python-dev #21727: Ambiguous sentence explaining `cycle` in itertools documentati http://bugs.python.org/issue21727 closed by rhettinger #21733: "mmap(size=9223372036854779904) failed" message when running t http://bugs.python.org/issue21733 closed by ned.deily #21745: Devguide: mention requirement to install Visual Studio SP1 on http://bugs.python.org/issue21745 closed by zach.ware #21747: argvars: error while parsing under windows http://bugs.python.org/issue21747 closed by r.david.murray #1517993: IDLE: config-main.def contains windows-specific settings http://bugs.python.org/issue1517993 closed by terry.reedy From 4kir4.1i at gmail.com Fri Jun 13 18:18:45 2014 From: 4kir4.1i at 
gmail.com (Akira Li) Date: Fri, 13 Jun 2014 20:18:45 +0400 Subject: [Python-Dev] subprocess shell=True on Windows doesn't escape ^ character References: <20140611230030.6F56F250DC4@webabinitio.net> <874mzpo8xw.fsf@vostro.rath.org> <20140613041852.GD19485@lupin> Message-ID: <87zjhgkcka.fsf@gmail.com> Florian Bruhin writes: > * Nikolaus Rath [2014-06-12 19:11:07 -0700]: >> "R. David Murray" writes: >> > Also notice that using a list with shell=True is using the API >> > incorrectly. It wouldn't even work on Linux, so that torpedoes >> > the cross-platform concern already :) >> > >> > This kind of confusion is why I opened http://bugs.python.org/issue7839. >> >> Can someone describe an use case where shell=True actually makes sense >> at all? >> >> It seems to me that whenever you need a shell, the argument's that you >> pass to it will be shell specific. So instead of e.g. >> >> Popen('for i in `seq 42`; do echo $i; done', shell=True) >> >> you almost certainly want to do >> >> Popen(['/bin/sh', 'for i in `seq 42`; do echo $i; done'], shell=False) >> >> because if your shell happens to be tcsh or cmd.exe, things are going to >> break. > > My usecase is a spawn-command in a GUI application, which the user can > use to spawn an executable. I want the user to be able to use the > usual shell features from there. However, I also pass an argument to > that command, and that should be escaped. You should pass the command as a string and use cmd.exe quote rules [1] (note: they are different from the one provided by `subprocess.list2cmdline()` [2] that follows Microsoft C/C++ startup code rules [3] e.g., `^` is not special unlike in cmd.exe case). 
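[Editor's note: the distinction Akira draws — cmd.exe quoting rules versus the Microsoft C/C++ argv rules that `subprocess.list2cmdline()` implements — can be sketched in a few lines. This is an illustrative helper, not code from the thread; the metacharacter set follows the MSDN article cited as [1] in this message, and the name `cmd_escape` is made up.]

```python
import subprocess

# cmd.exe metacharacters, per the MSDN post cited as [1] in this message
CMD_META = set('()%!^"<>&|')

def cmd_escape(arg):
    """Escape a string for cmd.exe by prefixing each metacharacter with '^'."""
    return ''.join('^' + c if c in CMD_META else c for c in arg)

# cmd.exe rules double the caret...
print(cmd_escape('a^b'))                 # -> a^^b
# ...while the MSVC argv rules used by list2cmdline() leave it alone,
# which is why list2cmdline() output is unsafe to hand to cmd.exe
print(subprocess.list2cmdline(['a^b']))  # -> a^b
```

With shell=True on Windows the command would then be built as a single escaped string (e.g. `'echo ' + cmd_escape(arg)`), rather than as a list.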
[1]: http://blogs.msdn.com/b/twistylittlepassagesallalike/archive/2011/04/23/everyone-quotes-arguments-the-wrong-way.aspx [2]: https://docs.python.org/3.4/library/subprocess.html#converting-an-argument-sequence-to-a-string-on-windows [3]: http://msdn.microsoft.com/en-us/library/17w5ykft%28v=vs.85%29.aspx -- akira From donspauldingii at gmail.com Fri Jun 13 21:11:31 2014 From: donspauldingii at gmail.com (Don Spaulding) Date: Fri, 13 Jun 2014 14:11:31 -0500 Subject: [Python-Dev] Backwards Incompatibility in logging module in 3.4? In-Reply-To: References: Message-ID: On Thu, Jun 12, 2014 at 6:45 PM, Victor Stinner wrote: > Hi, > > 2014-06-13 0:38 GMT+02:00 Don Spaulding : > > Is this a bug or an intentional break? If it's the latter, shouldn't > this > > at least be mentioned in the "What's new in Python 3.4" document? > > IMO the change is intentional. The previous behaviour was not really > expected. > I agree that the change seems intentional. However, as Nick mentioned, the ticket doesn't really discuss the repercussions of changing the output of the function. As far as I can tell, this function has returned an int when given a string since it was introduced in Python 2.3. I think it's reasonable to call a function's behavior "expected" after 11 years in the wild. > > Python 3.3 documentation is explicit: the result is a string and the > input parameter is an integer. logging.getLevelName("DEBUG") was more > an implementation > > https://docs.python.org/3.3/library/logging.html#logging.getLevelName > "Returns the textual representation of logging level lvl. If the level > is one of the predefined levels CRITICAL, ERROR, WARNING, INFO or > DEBUG then you get the corresponding string. If you have associated > levels with names using addLevelName() then the name you have > associated with lvl is returned. If a numeric value corresponding to > one of the defined levels is passed in, the corresponding string > representation is returned.
Otherwise, the string 'Level %s' % lvl is > returned." > > If your code uses something like > logger.setLevel(logging.getLevelName("DEBUG")), use directly > logger.setLevel("DEBUG"). > > This issue was fixed in OpenStack with this change: > https://review.openstack.org/#/c/94028/6/openstack/common/log.py,cm > https://review.openstack.org/#/c/94028/6 > > Victor > I appreciate the pointer to the OpenStack fix. I've actually already worked around the issue in my project (although without much elegance, I'll readily admit). I opened up an issue on the tracker for this: http://bugs.python.org/issue21752 I apologize if that was out of turn. -------------- next part -------------- An HTML attachment was scrubbed... URL: From nad at acm.org Sat Jun 14 00:34:33 2014 From: nad at acm.org (Ned Deily) Date: Fri, 13 Jun 2014 15:34:33 -0700 Subject: [Python-Dev] python-dev for MAC OS X References: Message-ID: In article , Tal Einat wrote: > On Fri, Jun 13, 2014 at 12:21 PM, Pedro Helou wrote: > > does anybody know how to install the python-dev headers and libraries for > > MAC OS X? > This list is for discussing the development *of* Python, not *with* > Python. Please ask on the python list, python-list at python.org (more > info here[1]) or on the #python channel on the Freenode IRC server. > StackOverflow is also a good place to search for information and ask > questions. Like Tal said. But I'm guessing you are asking about the headers for the Apple-supplied System Pythons. On recent versions of OS X, they are not installed by default; you need to install the Command Line Tools component to install system headers, including those for Python. How you do that varies by OS X release. In OS X 10.9 Mavericks, you can run "xcode-select --install". For earlier releases, there may be an option in Xcode.app's Preferences. Or you may be able to download the right Command Line Tools package from the Apple Developer Connection site.
-- Ned Deily, nad at acm.org From ncoghlan at gmail.com Sat Jun 14 01:28:22 2014 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 14 Jun 2014 09:28:22 +1000 Subject: [Python-Dev] Backwards Incompatibility in logging module in 3.4? In-Reply-To: References: Message-ID: On 14 Jun 2014 06:18, "Don Spaulding" wrote: > > I opened up an issue on the tracker for this: http://bugs.python.org/issue21752 > > I apologize if that was out of turn. At the very least, there should be a note in the "porting to Python 3.4" section of the What's New and a versionchanged note on the API docs, so a docs bug report is appropriate to add those. Cheers, Nick. > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ncoghlan%40gmail.com > -------------- next part -------------- An HTML attachment was scrubbed... URL: From Nikolaus at rath.org Sat Jun 14 05:04:23 2014 From: Nikolaus at rath.org (Nikolaus Rath) Date: Fri, 13 Jun 2014 20:04:23 -0700 Subject: [Python-Dev] Why does IOBase.__del__ call .close? 
In-Reply-To: <1402637269.29254.128319501.4C662871@webmail.messagingengine.com> (Benjamin Peterson's message of "Thu, 12 Jun 2014 22:27:49 -0700") References: <87d2egnsfq.fsf@vostro.rath.org> <5397B62F.80004@mrabarnett.plus.com> <87a99jnfzq.fsf@vostro.rath.org> <1402534493.31346.127850065.34AEEDD2@webmail.messagingengine.com> <877g4lobxv.fsf@vostro.rath.org> <1402637269.29254.128319501.4C662871@webmail.messagingengine.com> Message-ID: <871tusnqdk.fsf@vostro.rath.org> Benjamin Peterson writes: > On Thu, Jun 12, 2014, at 18:06, Nikolaus Rath wrote: >> Consider this simple example: >> >> $ cat test.py >> import io >> import warnings >> >> class StridedStream(io.IOBase): >> def __init__(self, name, stride=2): >> super().__init__() >> self.fh = open(name, 'rb') >> self.stride = stride >> >> def read(self, len_): >> return self.fh.read(self.stride*len_)[::self.stride] >> >> def close(self): >> self.fh.close() >> >> class FixedStridedStream(StridedStream): >> def __del__(self): >> # Prevent IOBase.__del__ frombeing called. >> pass >> >> warnings.resetwarnings() >> warnings.simplefilter('error') >> >> print('Creating & loosing StridedStream..') >> r = StridedStream('/dev/zero') >> del r >> >> print('Creating & loosing FixedStridedStream..') >> r = FixedStridedStream('/dev/zero') >> del r >> >> $ python3 test.py >> Creating & loosing StridedStream.. >> Creating & loosing FixedStridedStream.. >> Exception ignored in: <_io.FileIO name='/dev/zero' mode='rb'> >> ResourceWarning: unclosed file <_io.BufferedReader name='/dev/zero'> >> >> In the first case, the destructor inherited from IOBase actually >> prevents the ResourceWarning from being emitted. > > Ah, I see. I don't see any good ways to fix it, though, besides setting > some flag if close() is called from __del__. How about not having IOBase.__del__ call self.close()? Any resources acquired by the derived class would still clean up after themselves when they are garbage collected. 
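[Editor's note: the "flag" approach Benjamin mentions above can be sketched outside the io module. The following is only an illustration of the idea under discussion — the class names are invented and this is not how CPython's `IOBase` is actually implemented: the base class records whether close() has run, so its destructor can still surface a ResourceWarning before cleaning up an object that was never closed explicitly.]

```python
import os
import warnings

class ClosingBase:
    _closed = False  # flag recording whether close() has run

    def close(self):
        self._closed = True

    def __del__(self):
        if not self._closed:
            # Never closed explicitly: surface a ResourceWarning
            # *before* cleaning up, so the leak stays visible even
            # though the resource does get released.
            warnings.warn('unclosed %r' % self, ResourceWarning)
            self.close()

class Stream(ClosingBase):
    def __init__(self):
        self.fh = open(os.devnull, 'rb')  # stand-in for a real resource

    def close(self):
        self.fh.close()
        super().close()

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter('always')
    Stream()          # created and dropped without close(): warns
    s = Stream()
    s.close()
    del s             # closed explicitly: no second warning
print(len(caught))    # -> 1
```

This keeps the derived class's resources from leaking while still reporting the missing close() call, which is the trade-off the thread is weighing against simply dropping the close() call from the destructor.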
Best, -Nikolaus -- GPG encrypted emails preferred. Key id: 0xD113FCAC3C4E599F Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F "Time flies like an arrow, fruit flies like a Banana." From benjamin at python.org Sat Jun 14 05:26:02 2014 From: benjamin at python.org (Benjamin Peterson) Date: Fri, 13 Jun 2014 20:26:02 -0700 Subject: [Python-Dev] Why does IOBase.__del__ call .close? In-Reply-To: <871tusnqdk.fsf@vostro.rath.org> References: <87d2egnsfq.fsf@vostro.rath.org> <5397B62F.80004@mrabarnett.plus.com> <87a99jnfzq.fsf@vostro.rath.org> <1402534493.31346.127850065.34AEEDD2@webmail.messagingengine.com> <877g4lobxv.fsf@vostro.rath.org> <1402637269.29254.128319501.4C662871@webmail.messagingengine.com> <871tusnqdk.fsf@vostro.rath.org> Message-ID: <1402716362.12324.128662021.0B640B3D@webmail.messagingengine.com> On Fri, Jun 13, 2014, at 20:04, Nikolaus Rath wrote: > Benjamin Peterson writes: > > On Thu, Jun 12, 2014, at 18:06, Nikolaus Rath wrote: > >> Consider this simple example: > >> > >> $ cat test.py > >> import io > >> import warnings > >> > >> class StridedStream(io.IOBase): > >> def __init__(self, name, stride=2): > >> super().__init__() > >> self.fh = open(name, 'rb') > >> self.stride = stride > >> > >> def read(self, len_): > >> return self.fh.read(self.stride*len_)[::self.stride] > >> > >> def close(self): > >> self.fh.close() > >> > >> class FixedStridedStream(StridedStream): > >> def __del__(self): > >> # Prevent IOBase.__del__ frombeing called. > >> pass > >> > >> warnings.resetwarnings() > >> warnings.simplefilter('error') > >> > >> print('Creating & loosing StridedStream..') > >> r = StridedStream('/dev/zero') > >> del r > >> > >> print('Creating & loosing FixedStridedStream..') > >> r = FixedStridedStream('/dev/zero') > >> del r > >> > >> $ python3 test.py > >> Creating & loosing StridedStream.. > >> Creating & loosing FixedStridedStream..
> >> Exception ignored in: <_io.FileIO name='/dev/zero' mode='rb'> > >> ResourceWarning: unclosed file <_io.BufferedReader name='/dev/zero'> > >> > >> In the first case, the destructor inherited from IOBase actually > >> prevents the ResourceWarning from being emitted. > > > > Ah, I see. I don't see any good ways to fix it, though, besides setting > > some flag if close() is called from __del__. > > How about not having IOBase.__del__ call self.close()? Any resources > acquired by the derived class would still clean up after themselves when > they are garbage collected. Well, yes, but that's probably a backwards compat problem. From greg at krypto.org Sat Jun 14 05:38:04 2014 From: greg at krypto.org (Gregory P. Smith) Date: Fri, 13 Jun 2014 20:38:04 -0700 Subject: [Python-Dev] Raspberry Pi Buildbot In-Reply-To: References: Message-ID: On Fri, Jun 13, 2014 at 3:24 AM, Tal Einat wrote: > Is there one? If not, would you like me to set one up? I've got one at > home with Raspbian installed not doing anything, it could easily run a > buildslave. > > Poking around on buildbot.python.org/all/builders, I can only see one > ARM buildbot[1], and it's just called "ARM v7". > The ARM v7 buildbot is mine. It's a Samsung chromebook with a dual core exynos5 cpu and usb3 SSD. ie: It's *at least* 10x faster than a raspberry pi. I don't think a pi buildbot would add much value but if you want to run one, feel free. It should live in the unstable pool. -gps -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From pmiscml at gmail.com Sat Jun 14 22:11:44 2014 From: pmiscml at gmail.com (Paul Sokolovsky) Date: Sat, 14 Jun 2014 23:11:44 +0300 Subject: [Python-Dev] Criticism of execfile() removal in Python3 In-Reply-To: <20140610030303.GU10355@ando> References: <20140610052312.280e49c9@x34f> <20140610030303.GU10355@ando> Message-ID: <20140614231144.639bf852@x34f> Hello, On Tue, 10 Jun 2014 13:03:03 +1000 Steven D'Aprano wrote: > On Tue, Jun 10, 2014 at 05:23:12AM +0300, Paul Sokolovsky wrote: > > > execfile() builtin function was removed in 3.0. This brings few > > problems: > > > > 1. It hampers interactive mode - instead of short and easy to type > > execfile("file.py") one needs to use exec(open("file.py").read()). > > If the amount of typing is the problem, that's easy to solve: > > # do this once > def execfile(name): > exec(open("file.py").read()) So, you here propose to workaround removal of core language feature either a) on end user side, or b) on "system integrator" side. But such solution is based on big number of assumptions, like: user wants to workaround that at all (hint: they don't, they just want to use it); you say "do this once", but actually it's "do it in each interactive session again and again", and user may not have knowledge to "do it once" instead; that if system integrator does that, the the function is called "execfile": if system integrator didn't have enough Python experience, and read only Python3 spec, they might call it something else, and yet users with a bit of Python experience will expect it be called exactly "execfile" and not anything else. > > Another possibility is: > > os.system("python file.py") > > > > 2. Ok, assuming that exec(open().read()) idiom is still a way to go, > > there's a problem - it requires to load entire file to memory. But > > there can be not enough memory. Consider 1Mb file with 900Kb > > comments (autogenerated, for example). execfile() could easily > > parse it, using small buffer. 
But exec() requires to slurp entire > > file into memory, and 1Mb is much more than heap sizes that we > > target. > > There's nothing stopping alternative implementations having their own > implementation-specific standard library modules. And here you propose to workaround it on particular implementation's level. But in my original mail, in excerpt that you removed, I kindly asked to skip obvious suggestions (like that particular implementation can do anything it wants). I don't see how working around the issue on user, particular distribution, or particular implementation level help *Python* language in general, and *Python community* in general. So, any bright ideas how to workaround the issue of execfile() removal on *language level*? [] > So you could do this: > > from upy import execfile > execfile("file.py") > > So long as you make it clear that this is a platform specific module, > and don't advertise it as a language feature, I see no reason why you > cannot do that. The case we discuss is clearly different. It's not about "platform specific module", it's about functionality which was in Python all the time, and was suddenly removed in Python3, for not fully clear, or alternatively, not severe enough, reasons. If some implementation is to re-add it, the description like above seems the most truthful way to represent that function. -- Best regards, Paul mailto:pmiscml at gmail.com From techtonik at gmail.com Sat Jun 14 21:54:01 2014 From: techtonik at gmail.com (anatoly techtonik) Date: Sat, 14 Jun 2014 22:54:01 +0300 Subject: [Python-Dev] subprocess shell=True on Windows doesn't escape ^ character In-Reply-To: References: Message-ID: On Fri, Jun 13, 2014 at 2:55 AM, Ryan Gonzalez wrote: > SHELLS ARE NOT CROSS-PLATFORM!!!! Seriously, there are going to be > differences. 
If you really must: > > escape = lambda s: s.replace('^', '^^') if os.name == 'nt' else s > It is not about generic shell problem, it is about specific behavior that on Windows Python already uses cmd.exe shell hardcoded in its sources. So for crossplatform behavior on Windows, it should escape symbols on command passed to cmd.exe that are special to this shell to avoid breaking Python scripts. What you propose is a bad workaround, because it assumes that all Python users who use subprocess to execute hg or git should possess apriori knowledge about default subprocess behaviour with default shell on Windows and implement workaround for that. -- anatoly t. -------------- next part -------------- An HTML attachment was scrubbed... URL: From techtonik at gmail.com Sat Jun 14 22:04:23 2014 From: techtonik at gmail.com (anatoly techtonik) Date: Sat, 14 Jun 2014 23:04:23 +0300 Subject: [Python-Dev] subprocess shell=True on Windows doesn't escape ^ character In-Reply-To: <874mzpo8xw.fsf@vostro.rath.org> References: <20140611230030.6F56F250DC4@webabinitio.net> <874mzpo8xw.fsf@vostro.rath.org> Message-ID: On Fri, Jun 13, 2014 at 5:11 AM, Nikolaus Rath wrote: > "R. David Murray" writes: > > Also notice that using a list with shell=True is using the API > > incorrectly. It wouldn't even work on Linux, so that torpedoes > > the cross-platform concern already :) > > > > This kind of confusion is why I opened http://bugs.python.org/issue7839. > > Can someone describe an use case where shell=True actually makes sense > at all? > You need to write a wrapper script to automate several user commands. It is quite common to use shell pipe redirection for joining many utils and calls together than to rewrite data pipes in Python. -------------- next part -------------- An HTML attachment was scrubbed... 
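As a hypothetical illustration of that trade-off, here is the same two-stage pipeline spelled both ways on a POSIX system (the commands are arbitrary examples; on Windows the string form goes through cmd.exe, which is exactly where the caret-escaping problem above comes from):

```python
import subprocess

# shell=True: the shell parses the string and wires up the pipe itself
out = subprocess.check_output("echo hello | tr a-z A-Z", shell=True)
print(out.decode().strip())  # HELLO

# shell=False equivalent: build the pipe explicitly in Python
p1 = subprocess.Popen(["echo", "hello"], stdout=subprocess.PIPE)
p2 = subprocess.Popen(["tr", "a-z", "A-Z"], stdin=p1.stdout,
                      stdout=subprocess.PIPE)
p1.stdout.close()  # allow p1 to receive SIGPIPE if p2 exits first
result = p2.communicate()[0]
print(result.decode().strip())  # HELLO
```

The explicit form is more typing, but no shell quoting rules apply to the arguments at all.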
URL: From techtonik at gmail.com Sat Jun 14 22:07:27 2014 From: techtonik at gmail.com (anatoly techtonik) Date: Sat, 14 Jun 2014 23:07:27 +0300 Subject: [Python-Dev] subprocess shell=True on Windows doesn't escape ^ character In-Reply-To: References: <20140611230030.6F56F250DC4@webabinitio.net> Message-ID: On Thu, Jun 12, 2014 at 5:12 AM, Chris Angelico wrote: > On Thu, Jun 12, 2014 at 12:07 PM, Chris Angelico wrote: > > ISTM what you want is not shell=True, but a separate function that > > follows the system policy for translating a command name into a > > path-to-binary. That's something that, AFAIK, doesn't currently exist > > in the Python 2 stdlib, but Python 3 has shutil.which(). If there's a > > PyPI backport of that for Py2, you should be able to use that to > > figure out the command name, and then avoid shell=False. > > Huh. Next time, Chris, search the web before you post. Via a > StackOverflow post, learned about distutils.spawn.find_executable(). > I remember I even wrote a patch for it, but I forgot about it already. Still feels like a hack that is difficult to find and understand that you need really it. In Rietveld case it won't work, because upload.py script allows user to specify arbitrary diff command to send change for review. -------------- next part -------------- An HTML attachment was scrubbed... URL: From pmiscml at gmail.com Sat Jun 14 22:52:15 2014 From: pmiscml at gmail.com (Paul Sokolovsky) Date: Sat, 14 Jun 2014 23:52:15 +0300 Subject: [Python-Dev] Criticism of execfile() removal in Python3 In-Reply-To: References: <20140610052312.280e49c9@x34f> Message-ID: <20140614235215.621e7571@x34f> Hello, On Tue, 10 Jun 2014 17:36:02 +1000 Nick Coghlan wrote: > On 10 June 2014 12:23, Paul Sokolovsky wrote: > > 1. It hampers interactive mode - instead of short and easy to type > > execfile("file.py") one needs to use exec(open("file.py").read()). 
> > I'm sure that's not going to bother a lot of people - after all, the > > easiest way to execute a Python file is to drop back to shell and > > restart python with file name, using all wonders of tab completion. > > But now imagine that Python interpreter runs on bare hardware, and > > its REPL is the only shell. That's exactly what we have with > > MicroPython's Cortex-M port. But it's not really > > MicroPython-specific, there's CPython port to baremetal either - > > http://www.pycorn.org/ . > > https://docs.python.org/3/library/runpy.html#runpy.run_path > > import runpy > file_globals = runpy.run_path("file.py") Thanks, it's the most productive response surely. So, at least there's alternative to removed execfile(). Unfortunately, I don't think it's good alternative to execfile() in all respects. It clearly provides API for that functionality, but is that solution of least surprise and is it actually known by users at all (to be useful for them)? Googling for "execfile python 3", top 3 hits I see are stackoverflow questions, *none* of which mentions runpy. So, people either don't consider it viable alternative to execfile, or don't know about it at all (my guess it's the latter). Like with previous discussion, its meaning goes beyond just Python realm - there's competition all around. And internets bring funny examples, like for example http://www.red-lang.org/p/contributions.html (scroll down to diagram, or here's direct link: http://3.bp.blogspot.com/-xhOP35Dm99w/UuXFKgY2dlI/AAAAAAAAAGA/YQu98_pPDjw/s1600/reichart-abstraction-diagram.png) So, didn't you know that Ruby can be used for OS-level development, and Python can't? Or that JavaScript DSL capabilities are better than Python's (that's taking into account that JavaScript DSL capabilities are represented by JSON, whose creators were so arrogant as to disallow even usage of comments in it). 
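For reference, the run_path() API quoted at the top of this message behaves like this (a minimal sketch; the script is a throwaway temporary file created just for the demonstration):

```python
import os
import runpy
import tempfile

# write a tiny script, then execute it in a fresh namespace
with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
    f.write("answer = 6 * 7\n")

globs = runpy.run_path(f.name)  # returns the resulting module globals
print(globs["answer"])          # 42
os.unlink(f.name)
```

Unlike the Python 2 execfile(), the executed code does not see or mutate the caller's namespace; the results come back as a dict.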
So, now suppose there's a discussion of how good different languages are for interactive usage (out of the box apparently). It would be a little hard to defend claim that Python is *excellent* interactive language, if its latest series got -1 on that scale, by removing feature which may be indispensable at times. Knowing that, one subconsciously may start to wonder if Ruby or JavaScript are doing it (in wide sense) better than Python. -- Best regards, Paul mailto:pmiscml at gmail.com From markus at unterwaditzer.net Sat Jun 14 23:00:59 2014 From: markus at unterwaditzer.net (Markus Unterwaditzer) Date: Sat, 14 Jun 2014 23:00:59 +0200 Subject: [Python-Dev] Criticism of execfile() removal in Python3 In-Reply-To: <20140610052312.280e49c9@x34f> References: <20140610052312.280e49c9@x34f> Message-ID: <20140614210059.GB20710@chromebot.lan> On Tue, Jun 10, 2014 at 05:23:12AM +0300, Paul Sokolovsky wrote: > Hello, > > I was pleasantly surprised with the response to recent post about > MicroPython implementation details > (https://mail.python.org/pipermail/python-dev/2014-June/134718.html). I > hope that discussion means that posts about alternative implementations > are not unwelcome here, so I would like to bring up another (of many) > issues we faced while implementing MicroPython. > > execfile() builtin function was removed in 3.0. This brings few > problems: > > 1. It hampers interactive mode - instead of short and easy to type > execfile("file.py") one needs to use exec(open("file.py").read()). I'm > sure that's not going to bother a lot of people - after all, the > easiest way to execute a Python file is to drop back to shell and > restart python with file name, using all wonders of tab completion. But > now imagine that Python interpreter runs on bare hardware, and its REPL > is the only shell. That's exactly what we have with MicroPython's > Cortex-M port. But it's not really MicroPython-specific, there's > CPython port to baremetal either - http://www.pycorn.org/ . 
As far as i can see, minimizing the amount of characters to type was never a design goal of the Python language. And because that goal never mattered as much for the designers as it seems to do for you, the reason for it to get removed -- reducing the amount of builtins without reducing functionality -- was the only one left. > 2. Ok, assuming that exec(open().read()) idiom is still a way to go, > there's a problem - it requires to load entire file to memory. But > there can be not enough memory. Consider 1Mb file with 900Kb comments > (autogenerated, for example). execfile() could easily parse it, using > small buffer. But exec() requires to slurp entire file into memory, and > 1Mb is much more than heap sizes that we target. That is a valid concern, but i believe violating the language specification and adding your own execfile implementation (either as a builtin or in a new stdlib module) here is justified, even if it means you will have to modify your existing Python 3 code to use it -- i don't think the majority of software written in Python will be able to run under such memory constraints without major modifications anyway. > Comments, suggestions? Just to set a productive direction, please > kindly don't consider the problems above as MicroPython's. A new (not MicroPython-specific) stdlib module containing functions such as execfile could be considered. Not really for Python-2-compatibility, but for performance-critical situations. I am not sure if this is a good solution. Not at all. Even though it's separated from the builtins, i think it would still sacrifice the purity of the the language (by which i mean having a minimal composable API), because people are going to use it anyway. It reminds me of the situation in Python 2 where developers are trying to use cStringIO with a fallback to StringIO as a matter of principle, not because they actually need that kind of performance. 
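The accelerator-fallback idiom referred to above looks like the following, spelled so that it also runs on Python 3 (where the cStringIO import always fails and the fallback branch is taken):

```python
try:
    from cStringIO import StringIO  # C-accelerated module, CPython 2 only
except ImportError:
    from io import StringIO         # Python 3 (cStringIO no longer exists)

buf = StringIO()
buf.write("no premature optimization required")
print(buf.getvalue())
```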
Another, IMO better idea which shifts the problem to the MicroPython devs is to "just" detect code using exec(open(...).read()) and transparently rewrite it to something more memory-efficient. This is the idea i actually think is a good one. > I very much liked how last discussion went: I was pointed that > https://docs.python.org/3/reference/index.html is not really a CPython > reference, it's a *Python* reference, and there were even motion to > clarify in it some points which came out from MicroPython discussion. > So, what about https://docs.python.org/3/library/index.html - is it > CPython, or Python standard library specification? Assuming the latter, > what we have is that, by removal of previously available feature, > *Python* became less friendly for interactive usage and less scalable. "Less friendly for interactive usage" is a strong and vague statement. If you're going after the amount of characters required to type, yes, absolutely, but by that terms one could declare Bash and Perl to be superior languages. Look at it from a different perspective: There are fewer builtins to remember. 
> > > Thanks, > Paul mailto:pmiscml at gmail.com > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/markus%40unterwaditzer.net From fabiofz at gmail.com Sun Jun 15 00:15:37 2014 From: fabiofz at gmail.com (Fabio Zadrozny) Date: Sat, 14 Jun 2014 19:15:37 -0300 Subject: [Python-Dev] Criticism of execfile() removal in Python3 In-Reply-To: <20140614210059.GB20710@chromebot.lan> References: <20140610052312.280e49c9@x34f> <20140614210059.GB20710@chromebot.lan> Message-ID: On Sat, Jun 14, 2014 at 6:00 PM, Markus Unterwaditzer < markus at unterwaditzer.net> wrote: > On Tue, Jun 10, 2014 at 05:23:12AM +0300, Paul Sokolovsky wrote: > > Hello, > > > > I was pleasantly surprised with the response to recent post about > > MicroPython implementation details > > (https://mail.python.org/pipermail/python-dev/2014-June/134718.html). I > > hope that discussion means that posts about alternative implementations > > are not unwelcome here, so I would like to bring up another (of many) > > issues we faced while implementing MicroPython. > > > > execfile() builtin function was removed in 3.0. This brings few > > problems: > > > > 1. It hampers interactive mode - instead of short and easy to type > > execfile("file.py") one needs to use exec(open("file.py").read()). I'm > > sure that's not going to bother a lot of people - after all, the > > easiest way to execute a Python file is to drop back to shell and > > restart python with file name, using all wonders of tab completion. But > > now imagine that Python interpreter runs on bare hardware, and its REPL > > is the only shell. That's exactly what we have with MicroPython's > > Cortex-M port. But it's not really MicroPython-specific, there's > > CPython port to baremetal either - http://www.pycorn.org/ . 
> > As far as i can see, minimizing the amount of characters to type was never > a > design goal of the Python language. And because that goal never mattered as > much for the designers as it seems to do for you, the reason for it to get > removed -- reducing the amount of builtins without reducing functionality > -- > was the only one left. > > > 2. Ok, assuming that exec(open().read()) idiom is still a way to go, > > there's a problem - it requires to load entire file to memory. But > > there can be not enough memory. Consider 1Mb file with 900Kb comments > > (autogenerated, for example). execfile() could easily parse it, using > > small buffer. But exec() requires to slurp entire file into memory, and > > 1Mb is much more than heap sizes that we target. > > That is a valid concern, but i believe violating the language > specification and > adding your own execfile implementation (either as a builtin or in a new > stdlib > module) here is justified, even if it means you will have to modify your > existing Python 3 code to use it -- i don't think the majority of software > written in Python will be able to run under such memory constraints without > major modifications anyway. > > > Comments, suggestions? Just to set a productive direction, please > > kindly don't consider the problems above as MicroPython's. > > A new (not MicroPython-specific) stdlib module containing functions such as > execfile could be considered. Not really for Python-2-compatibility, but > for > performance-critical situations. > > I am not sure if this is a good solution. Not at all. Even though it's > separated from the builtins, i think it would still sacrifice the purity > of the > the language (by which i mean having a minimal composable API), because > people > are going to use it anyway. 
It reminds me of the situation in Python 2 > where > developers are trying to use cStringIO with a fallback to StringIO as a > matter > of principle, not because they actually need that kind of performance. > > Another, IMO better idea which shifts the problem to the MicroPython devs > is to > "just" detect code using > > exec(open(...).read()) > > and transparently rewrite it to something more memory-efficient. This is > the > idea i actually think is a good one. > > > > I very much liked how last discussion went: I was pointed that > > https://docs.python.org/3/reference/index.html is not really a CPython > > reference, it's a *Python* reference, and there were even motion to > > clarify in it some points which came out from MicroPython discussion. > > So, what about https://docs.python.org/3/library/index.html - is it > > CPython, or Python standard library specification? Assuming the latter, > > what we have is that, by removal of previously available feature, > > *Python* became less friendly for interactive usage and less scalable. > > "Less friendly for interactive usage" is a strong and vague statement. If > you're going after the amount of characters required to type, yes, > absolutely, > but by that terms one could declare Bash and Perl to be superior languages. > Look at it from a different perspective: There are fewer builtins to > remember. > > > > Well, I must say that the exec(open().read()) is not really a proper execfile implementation because it may fail because of encoding issues... (i.e.: one has to check the file encoding to do the open with the proper encoding, otherwise it's possible to end up with gibberish). The PyDev debugger has an implementation (see: https://github.com/fabioz/Pydev/blob/development/plugins/org.python.pydev/pysrc/_pydev_execfile.py) which considers the encoding so that the result is ok (but it still has a bug related to utf-8 with bom: https://sw-brainwy.rhcloud.com/tracker/PyDev/346 which I plan to fix soon...) 
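A sketch of what such an encoding-aware helper can look like — tokenize.open() detects the PEP 263 coding comment and a UTF-8 BOM before decoding. The name and signature are illustrative; this is not the PyDev implementation:

```python
import tokenize

def execfile(path, globs=None):
    """Execute a Python source file, honoring its declared encoding."""
    if globs is None:
        globs = {"__name__": "__main__"}
    globs.setdefault("__file__", path)
    with tokenize.open(path) as f:      # decodes per coding comment / BOM
        source = f.read()
    # compiling with the real filename keeps tracebacks pointing at it
    exec(compile(source, path, "exec"), globs)
    return globs
```

tokenize.open() only appeared in Python 3.2, so code that must run on anything earlier still has to detect the encoding by hand — which is the tricky part being discussed here.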
Personally, it's one thing that I think should be restored as the proper implementation is actually quite tricky and the default recommended solution does not work properly on some situations (and if micropython can provide an optimized implementation which'd conform to Python, that'd be one more point to add it back)... Best Regards, Fabio -------------- next part -------------- An HTML attachment was scrubbed... URL: From Nikolaus at rath.org Sun Jun 15 00:39:19 2014 From: Nikolaus at rath.org (Nikolaus Rath) Date: Sat, 14 Jun 2014 15:39:19 -0700 Subject: [Python-Dev] Why does _pyio.*.readinto have to work with 'b' arrays? Message-ID: <87y4wzm7zc.fsf@vostro.rath.org> Hello, The _pyio.BufferedIOBase class contains the following hack to make sure that you can read-into array objects with format 'b': try: b[:n] = data except TypeError as err: import array if not isinstance(b, array.array): raise err b[:n] = array.array('b', data) I am now wondering if I should implement the same hack in BufferedReader (cf. issue 20578). Is there anything special about 'b' arrays that justifies to treat them this way? Note that readinto is supposed to work with any object implementing the buffer protocol, but the Python implementation only works with bytearrays and (with the above hack) 'b' arrays. 
Even using a 'B' array fails: >>> import _pyio >>> from array import array >>> buf = array('b', b'x' * 10) >>> _pyio.open('/dev/zero', 'rb').readinto(buf) 10 >>> buf = array('B', b'x' * 10) >>> _pyio.open('/dev/zero', 'rb').readinto(buf) Traceback (most recent call last): File "/home/nikratio/clones/cpython/Lib/_pyio.py", line 662, in readinto b[:n] = data TypeError: can only assign array (not "bytes") to array slice During handling of the above exception, another exception occurred: Traceback (most recent call last): File "<stdin>", line 1, in <module> File "/home/nikratio/clones/cpython/Lib/_pyio.py", line 667, in readinto b[:n] = array.array('b', data) TypeError: bad argument type for built-in operation It seems to me that a much cleaner solution would be to simply declare _pyio's readinto to only work with bytearrays, and to explicitly raise a (more helpful) TypeError if anything else is passed in. Best, -Nikolaus -- GPG encrypted emails preferred. Key id: 0xD113FCAC3C4E599F Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F "Time flies like an arrow, fruit flies like a Banana." From greg.ewing at canterbury.ac.nz Sun Jun 15 01:18:51 2014 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sun, 15 Jun 2014 11:18:51 +1200 Subject: [Python-Dev] Criticism of execfile() removal in Python3 In-Reply-To: References: <20140610052312.280e49c9@x34f> <20140614210059.GB20710@chromebot.lan> Message-ID: <539CD85B.1060104@canterbury.ac.nz> Fabio Zadrozny wrote: > Well, I must say that the exec(open().read()) is not really a proper > execfile implementation because it may fail because of encoding > issues... It's not far off, though -- all it needs is an optional encoding parameter.
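One way to accept any writable buffer without such per-type special cases is to go through memoryview; a sketch of what a generic readinto helper could look like (illustrative, not the actual _pyio code):

```python
import array

def readinto_any(read, buf):
    """Fill writable buffer `buf` from callable `read`; return byte count."""
    view = memoryview(buf).cast("B")  # uniform unsigned-byte view
    data = read(len(view))
    n = len(data)
    view[:n] = data
    return n

# both array typecodes from the failing example above behave the same
for typecode in ("b", "B"):
    a = array.array(typecode, b"x" * 10)
    n = readinto_any(lambda k: b"\x00" * k, a)
    print(typecode, n, bytes(a) == b"\x00" * 10)  # prints "b 10 True" then "B 10 True"
```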
-- Greg From Steve.Dower at microsoft.com Sun Jun 15 01:36:15 2014 From: Steve.Dower at microsoft.com (Steve Dower) Date: Sat, 14 Jun 2014 23:36:15 +0000 Subject: [Python-Dev] Criticism of execfile() removal in Python3 In-Reply-To: <539CD85B.1060104@canterbury.ac.nz> References: <20140610052312.280e49c9@x34f> <20140614210059.GB20710@chromebot.lan> , <539CD85B.1060104@canterbury.ac.nz> Message-ID: I think the point is that the encoding may be embedded in the file as a coding comment and there's no obvious way to deal with that. Top-posted from my Windows Phone ________________________________ From: Greg Ewing Sent: 6/14/2014 16:19 To: python-dev at python.org Subject: Re: [Python-Dev] Criticism of execfile() removal in Python3 Fabio Zadrozny wrote: > Well, I must say that the exec(open().read()) is not really a proper > execfile implementation because it may fail because of encoding > issues... It's not far off, though -- all it needs is an optional encoding parameter. -- Greg _______________________________________________ Python-Dev mailing list Python-Dev at python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/steve.dower%40microsoft.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From skip.montanaro at gmail.com Sun Jun 15 02:01:09 2014 From: skip.montanaro at gmail.com (Skip Montanaro) Date: Sat, 14 Jun 2014 19:01:09 -0500 Subject: [Python-Dev] Criticism of execfile() removal in Python3 In-Reply-To: <20140614231144.639bf852@x34f> References: <20140610052312.280e49c9@x34f> <20140610030303.GU10355@ando> <20140614231144.639bf852@x34f> Message-ID: > you say "do this once", but actually it's "do it in each interactive > session again and again", ... That's what your Python startup file is for. I have been running with several tweaked builtin functions for years. Never have to consciously load them. If I wanted execfile badly enough, I'd define it there.
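Concretely, Skip's approach means putting something like this in the file named by the PYTHONSTARTUP environment variable (the path and the UTF-8 assumption here are illustrative). Python runs that file once at the start of every interactive session, so the helper is always available at the prompt:

```python
# contents of e.g. ~/.pythonrc.py, with PYTHONSTARTUP pointing at it
def execfile(path, globs=None):
    # minimal version: assumes the source file is UTF-8
    if globs is None:
        globs = globals()
    with open(path, encoding="utf-8") as f:
        exec(compile(f.read(), path, "exec"), globs)
```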
I don't think I've used execfile more than a handful of times in the 20-odd years I've been using Python. Perhaps our personal approaches to executing code at the interpreter prompt are radically different, but I think if the lack of execfile is such a big deal for you, you might want to check around to see how other people use interactive mode. Skip -------------- next part -------------- An HTML attachment was scrubbed... URL: From benjamin at python.org Sun Jun 15 02:41:44 2014 From: benjamin at python.org (Benjamin Peterson) Date: Sat, 14 Jun 2014 17:41:44 -0700 Subject: [Python-Dev] Why does _pyio.*.readinto have to work with 'b' arrays? In-Reply-To: <87y4wzm7zc.fsf@vostro.rath.org> References: <87y4wzm7zc.fsf@vostro.rath.org> Message-ID: <1402792904.16337.128859457.27BBE77A@webmail.messagingengine.com> On Sat, Jun 14, 2014, at 15:39, Nikolaus Rath wrote: > It seems to me that a much cleaner solution would be to simply declare > _pyio's readinto to only work with bytearrays, and to explicitly raise a > (more helpful) TypeError if anything else is passed in. That seems reasonable. I don't think _pyio's behavior is terribly important compared to the C _io module. From ncoghlan at gmail.com Sun Jun 15 03:28:29 2014 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 15 Jun 2014 11:28:29 +1000 Subject: [Python-Dev] Criticism of execfile() removal in Python3 In-Reply-To: <20140614235215.621e7571@x34f> References: <20140610052312.280e49c9@x34f> <20140614235215.621e7571@x34f> Message-ID: On 15 Jun 2014 06:52, "Paul Sokolovsky" wrote: > > Hello, > > On Tue, 10 Jun 2014 17:36:02 +1000 > Nick Coghlan wrote: > > > On 10 June 2014 12:23, Paul Sokolovsky wrote: > > > 1. It hampers interactive mode - instead of short and easy to type > > > execfile("file.py") one needs to use exec(open("file.py").read()). 
> > > I'm sure that's not going to bother a lot of people - after all, the > > > easiest way to execute a Python file is to drop back to shell and > > > restart python with file name, using all wonders of tab completion. > > > But now imagine that Python interpreter runs on bare hardware, and > > > its REPL is the only shell. That's exactly what we have with > > > MicroPython's Cortex-M port. But it's not really > > > MicroPython-specific, there's CPython port to baremetal either - > > > http://www.pycorn.org/ . > > > > https://docs.python.org/3/library/runpy.html#runpy.run_path > > > > import runpy > > file_globals = runpy.run_path("file.py") > > Thanks, it's the most productive response surely. So, at least there's > alternative to removed execfile(). Unfortunately, I don't think it's > good alternative to execfile() in all respects. It clearly provides API > for that functionality, but is that solution of least surprise and is > it actually known by users at all (to be useful for them)? We don't want people instinctively reaching for execfile (or run_path for that matter). It's almost always the wrong answer to a problem (because it runs code in a weird, ill-defined environment and has undefined behaviour when used inside a function), meeting the definition of "attractive nuisance". We moved reload() to imp.reload() and reduce() to functools.reduce() for similar reasons - they're too rarely the right answer to justify having them globally available by default. > Googling for "execfile python 3", top 3 hits I see are stackoverflow > questions, *none* of which mentions runpy. So, people either don't > consider it viable alternative to execfile, or don't know about it at > all (my guess it's the latter). Given the relative age of the two APIs, that seems likely. Adding answers pointing users to the runpy APIs could be useful. > Like with previous discussion, its meaning goes beyond just Python > realm - there's competition all around. 
And internets bring funny > examples, like for example http://www.red-lang.org/p/contributions.html > (scroll down to diagram, or here's direct link: > http://3.bp.blogspot.com/-xhOP35Dm99w/UuXFKgY2dlI/AAAAAAAAAGA/YQu98_pPDjw/s1600/reichart-abstraction-diagram.png ) > So, didn't you know that Ruby can be used for OS-level development, and > Python can't? Or that JavaScript DSL capabilities are better than > Python's (that's taking into account that JavaScript DSL capabilities > are represented by JSON, whose creators were so arrogant as to disallow > even usage of comments in it). There's a lot of misinformation on the internet. While there is certainly room for the PSF to do more in terms of effectively communicating Python's ubiquity and strengths (and we're working on that), "people with no clue post stuff on the internet" doesn't make a compelling *technical* argument (which is what is needed to get new builtins added). > So, now suppose there's a discussion of how good different languages are > for interactive usage (out of the box apparently). It would be a little > hard to defend claim that Python is *excellent* interactive language, > if its latest series got -1 on that scale, by removing feature which > may be indispensable at times. Knowing that, one subconsciously may > start to wonder if Ruby or JavaScript are doing it (in wide sense) > better than Python. Yes, people get upset when we tell them we consider some aspects of their software designs to be ill-advised. Running other code in the *current* namespace is such a thing - it is typically preferable to run it in a *different* namespace and then access the results, rather than implicitly overwriting the contents of the current namespace. That said, a question still worth asking is whether there is scope for additional runpy APIs that are designed to more easily implement Python 2 and IPython style modes of operation where independent units of code manipulate a shared namespace? 
That's actually a possibility, but any such proposals need to be presented on python-ideas in terms of the *use case* to be addressed, rather than the fact that execfile() happened to be the preferred solution in Python 2. Regards, Nick. > > > -- > Best regards, > Paul mailto:pmiscml at gmail.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Sun Jun 15 03:31:44 2014 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 15 Jun 2014 11:31:44 +1000 Subject: [Python-Dev] Criticism of execfile() removal in Python3 In-Reply-To: References: <20140610052312.280e49c9@x34f> <20140614210059.GB20710@chromebot.lan> <539CD85B.1060104@canterbury.ac.nz> Message-ID: On 15 Jun 2014 09:37, "Steve Dower" wrote: > > I think the point is that the encoding may be embedded in the file as a coding comment and there's no obvious way to deal with that. Opening source files correctly is the intended use case for tokenize.open(). Cheers, Nick. -------------- next part -------------- An HTML attachment was scrubbed... URL: From rymg19 at gmail.com Sun Jun 15 03:22:52 2014 From: rymg19 at gmail.com (Ryan Gonzalez) Date: Sat, 14 Jun 2014 20:22:52 -0500 Subject: [Python-Dev] subprocess shell=True on Windows doesn't escape ^ character In-Reply-To: References: Message-ID: Of course cmd.exe is hardcoded; there are no other shells on Windows! (I'm purposely ignoring MinGW, Cygwin, command.com, etc.) If anything, auto-escaping will break scripts that are already designed to escape carets on Windows. On Sat, Jun 14, 2014 at 2:54 PM, anatoly techtonik wrote: > On Fri, Jun 13, 2014 at 2:55 AM, Ryan Gonzalez wrote: > >> SHELLS ARE NOT CROSS-PLATFORM!!!! Seriously, there are going to be >> differences. If you really must: >> >> escape = lambda s: s.replace('^', '^^') if os.name == 'nt' else s >> > > It is not about generic shell problem, it is about specific behavior that > on Windows Python already uses cmd.exe shell hardcoded in its sources. 
So > for crossplatform behavior on Windows, it should escape symbols on command > passed to cmd.exe that are special to this shell to avoid breaking Python > scripts. What you propose is a bad workaround, because it assumes that all > Python users who use subprocess to execute hg or git should possess apriori > knowledge about default subprocess behaviour with default shell on Windows > and implement workaround for that. > -- > anatoly t. > -- Ryan If anybody ever asks me why I prefer C++ to C, my answer will be simple: "It's becauseslejfp23(@#Q*(E*EIdc-SEGFAULT. Wait, I don't think that was nul-terminated." -------------- next part -------------- An HTML attachment was scrubbed... URL: From greg.ewing at canterbury.ac.nz Sun Jun 15 01:15:27 2014 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sun, 15 Jun 2014 11:15:27 +1200 Subject: [Python-Dev] subprocess shell=True on Windows doesn't escape ^ character In-Reply-To: References: <20140611230030.6F56F250DC4@webabinitio.net> Message-ID: <539CD78F.9020908@canterbury.ac.nz> > On Thu, Jun 12, 2014 at 12:07 PM, Chris Angelico > wrote: > > ISTM what you want is not shell=True, but a separate function that > > follows the system policy for translating a command name into a > > path-to-binary. According to the docs, subprocess.Popen should already be doing this on Unix: On Unix, with shell=False (default): In this case, the Popen class uses os.execvp() to execute the child program. and execvp() searches the user's PATH to find the program. However, it says the Windows version uses CreateProcess, which doesn't use PATH. This seems like an unfortunate platform difference to me. It would be better if PATH were searched on both platforms, or better still, make it an option independent of shell=True. 
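Since 3.3 the stdlib does expose that lookup as a separate, shell-free step: shutil.which() searches PATH (and honors PATHEXT on Windows) on both platforms. A sketch, with the command name as just an example:

```python
import shutil
import subprocess
import sys

exe = shutil.which("git")  # PATH search, same semantics on Unix and Windows
if exe is None:
    print("git is not on PATH")
else:
    # pass the resolved path, so shell=False behaves identically everywhere
    subprocess.check_call([exe, "--version"])

# a path containing a directory component is returned as-is if executable
print(shutil.which(sys.executable) == sys.executable)  # True
```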
-- Greg From Steve.Dower at microsoft.com Sun Jun 15 05:15:12 2014 From: Steve.Dower at microsoft.com (Steve Dower) Date: Sun, 15 Jun 2014 03:15:12 +0000 Subject: [Python-Dev] Criticism of execfile() removal in Python3 In-Reply-To: References: <20140610052312.280e49c9@x34f> <20140614210059.GB20710@chromebot.lan> <539CD85B.1060104@canterbury.ac.nz> , Message-ID: <481b9af010ac4134a3ecbafd32f3be31@BLUPR03MB389.namprd03.prod.outlook.com> So is exec(tokenize.open(file).read()) the actual replacement for execfile()? Not too bad, but still not obvious (or widely promoted - I'd never heard of it). Top-posted from my Windows Phone ________________________________ From: Nick Coghlan Sent: ?6/?14/?2014 18:31 To: Steve Dower Cc: Greg Ewing; python-dev at python.org Subject: Re: [Python-Dev] Criticism of execfile() removal in Python3 On 15 Jun 2014 09:37, "Steve Dower" > wrote: > > I think the point is that the encoding may be embedded in the file as a coding comment and there's no obvious way to deal with that. Opening source files correctly is the intended use case for tokenize.open(). Cheers, Nick. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Sun Jun 15 06:31:36 2014 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 15 Jun 2014 14:31:36 +1000 Subject: [Python-Dev] Why does _pyio.*.readinto have to work with 'b' arrays? In-Reply-To: <1402792904.16337.128859457.27BBE77A@webmail.messagingengine.com> References: <87y4wzm7zc.fsf@vostro.rath.org> <1402792904.16337.128859457.27BBE77A@webmail.messagingengine.com> Message-ID: On 15 June 2014 10:41, Benjamin Peterson wrote: > On Sat, Jun 14, 2014, at 15:39, Nikolaus Rath wrote: >> It seems to me that a much cleaner solution would be to simply declare >> _pyio's readinto to only work with bytearrays, and to explicitly raise a >> (more helpful) TypeError if anything else is passed in. > > That seems reasonable. 
I don't think _pyio's behavior is terribly > important compared to the C _io module. _pyio was written before the various memoryview fixes that were implemented in Python 3.3 - it seems to me it would make more sense to use memoryview to correctly handle arbitrary buffer exporters (we implemented similar fixes for the base64 module in 3.4). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Sun Jun 15 06:42:50 2014 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 15 Jun 2014 14:42:50 +1000 Subject: [Python-Dev] Criticism of execfile() removal in Python3 In-Reply-To: <481b9af010ac4134a3ecbafd32f3be31@BLUPR03MB389.namprd03.prod.outlook.com> References: <20140610052312.280e49c9@x34f> <20140614210059.GB20710@chromebot.lan> <539CD85B.1060104@canterbury.ac.nz> <481b9af010ac4134a3ecbafd32f3be31@BLUPR03MB389.namprd03.prod.outlook.com> Message-ID: On 15 June 2014 13:15, Steve Dower wrote: > So is exec(tokenize.open(file).read()) the actual replacement for > execfile()? Not too bad, but still not obvious (or widely promoted - I'd > never heard of it). Yes, that's pretty close. It's still a dubious idea due to the implicit modification of the local namespace (and the resulting differences in behaviour at function level due to the fact that writing to locals() doesn't actually update the local namespace). That said, the "implicit changes to the local namespace are a bad idea" concern applies to exec() in general, so it was the "it's just a shorthand for a particular use of exec" aspect that tipped in the balance in the demise of execfile (this is also implied by the phrasing of the relevant bullet point in PEP 3100). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From Nikolaus at rath.org Sun Jun 15 06:57:12 2014 From: Nikolaus at rath.org (Nikolaus Rath) Date: Sat, 14 Jun 2014 21:57:12 -0700 Subject: [Python-Dev] Why does _pyio.*.readinto have to work with 'b' arrays? 
In-Reply-To: References: <87y4wzm7zc.fsf@vostro.rath.org> <1402792904.16337.128859457.27BBE77A@webmail.messagingengine.com> Message-ID: <539D27A8.7070505@rath.org> On 06/14/2014 09:31 PM, Nick Coghlan wrote: > On 15 June 2014 10:41, Benjamin Peterson wrote: >> On Sat, Jun 14, 2014, at 15:39, Nikolaus Rath wrote: >>> It seems to me that a much cleaner solution would be to simply declare >>> _pyio's readinto to only work with bytearrays, and to explicitly raise a >>> (more helpful) TypeError if anything else is passed in. >> >> That seems reasonable. I don't think _pyio's behavior is terribly >> important compared to the C _io module. > > _pyio was written before the various memoryview fixes that were > implemented in Python 3.3 - it seems to me it would make more sense to > use memoryview to correctly handle arbitrary buffer exporters (we > implemented similar fixes for the base64 module in 3.4). Definitely. But is there a way to do that without writing C code? My attempts failed: >>> from array import array >>> a = array('b', b'x'*10) >>> am = memoryview(a) >>> am[:3] = b'foo' Traceback (most recent call last): File "<stdin>", line 1, in <module> ValueError: memoryview assignment: lvalue and rvalue have different structures >>> am[:3] = memoryview(b'foo') Traceback (most recent call last): File "<stdin>", line 1, in <module> ValueError: memoryview assignment: lvalue and rvalue have different structures >>> am.format = 'B' Traceback (most recent call last): File "<stdin>", line 1, in <module> AttributeError: attribute 'format' of 'memoryview' objects is not writable The only thing that works is: >>> am[:3] = array('b', b'foo') but that's again specific to a being a 'b'-array. Best, -Nikolaus -- GPG encrypted emails preferred. Key id: 0xD113FCAC3C4E599F Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F “Time flies like an arrow, fruit flies like a Banana.”
From mail at timgolden.me.uk Sun Jun 15 08:07:18 2014 From: mail at timgolden.me.uk (Tim Golden) Date: Sun, 15 Jun 2014 07:07:18 +0100 Subject: [Python-Dev] subprocess shell=True on Windows doesn't escape ^ character In-Reply-To: References: Message-ID: <539D3816.3080402@timgolden.me.uk> On 15/06/2014 02:22, Ryan Gonzalez wrote: > Of course cmd.exe is hardcoded; Of course it's not: (from Lib/subprocess.py) comspec = os.environ.get("COMSPEC", "cmd.exe") I don't often expect, in these post-command.com days, to get anything other than cmd.exe. But alternative command processors are certainly possible. TJG From ncoghlan at gmail.com Sun Jun 15 08:37:48 2014 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 15 Jun 2014 16:37:48 +1000 Subject: [Python-Dev] Why does _pyio.*.readinto have to work with 'b' arrays? In-Reply-To: <539D27A8.7070505@rath.org> References: <87y4wzm7zc.fsf@vostro.rath.org> <1402792904.16337.128859457.27BBE77A@webmail.messagingengine.com> <539D27A8.7070505@rath.org> Message-ID: On 15 June 2014 14:57, Nikolaus Rath wrote: > On 06/14/2014 09:31 PM, Nick Coghlan wrote: >> On 15 June 2014 10:41, Benjamin Peterson wrote: >>> On Sat, Jun 14, 2014, at 15:39, Nikolaus Rath wrote: >>>> It seems to me that a much cleaner solution would be to simply declare >>>> _pyio's readinto to only work with bytearrays, and to explicitly raise a >>>> (more helpful) TypeError if anything else is passed in. >>> >>> That seems reasonable. I don't think _pyio's behavior is terribly >>> important compared to the C _io module. >> >> _pyio was written before the various memoryview fixes that were >> implemented in Python 3.3 - it seems to me it would make more sense to >> use memoryview to correctly handle arbitrary buffer exporters (we >> implemented similar fixes for the base64 module in 3.4). > > Definitely. But is there a way to do that without writing C code? 
Yes, Python level reshaping and typecasting of memory views is one of the key enhancements Stefan implemented for 3.3. >>> from array import array >>> a = array('b', b'x'*10) >>> am = memoryview(a) >>> a array('b', [120, 120, 120, 120, 120, 120, 120, 120, 120, 120]) >>> am[:3] = memoryview(b'foo').cast('b') >>> a array('b', [102, 111, 111, 120, 120, 120, 120, 120, 120, 120]) Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From p.f.moore at gmail.com Sun Jun 15 09:54:50 2014 From: p.f.moore at gmail.com (Paul Moore) Date: Sun, 15 Jun 2014 08:54:50 +0100 Subject: [Python-Dev] subprocess shell=True on Windows doesn't escape ^ character In-Reply-To: <539CD78F.9020908@canterbury.ac.nz> References: <20140611230030.6F56F250DC4@webabinitio.net> <539CD78F.9020908@canterbury.ac.nz> Message-ID: On 15 June 2014 00:15, Greg Ewing wrote: > However, it says the Windows version uses CreateProcess, which > doesn't use PATH. Huh? CreateProcess uses PATH: >py -3.4 Python 3.4.0 (v3.4.0:04f714765c13, Mar 16 2014, 19:25:23) [MSC v.1600 64 bit (AMD64)] on win32 Type "help", "copyright", "credits" or "license" for more information. >>> import subprocess >>> subprocess.check_call(['echo', 'hello']) hello 0 "echo" is an executable "C:\Utils\GnuWin64\echo.exe" which is on PATH but not in the current directory... Paul From mail at timgolden.me.uk Sun Jun 15 10:58:32 2014 From: mail at timgolden.me.uk (Tim Golden) Date: Sun, 15 Jun 2014 09:58:32 +0100 Subject: [Python-Dev] subprocess shell=True on Windows doesn't escape ^ character In-Reply-To: References: <20140611230030.6F56F250DC4@webabinitio.net> <539CD78F.9020908@canterbury.ac.nz> Message-ID: <539D6038.2080309@timgolden.me.uk> On 15/06/2014 08:54, Paul Moore wrote: > On 15 June 2014 00:15, Greg Ewing wrote: >> However, it says the Windows version uses CreateProcess, which >> doesn't use PATH. > > Huh? 
CreateProcess uses PATH: Just to be precise: CreateProcess *doesn't* use PATH if you pass an lpApplicationName parameter. It *does* use PATH if you pass a lpCommandLine parameter without an lpApplicationName parameter. It's possible to do either via the subprocess module, but the latter is the default. If you call: subprocess.Popen(['program.exe', 'a', 'b']) or subprocess.Popen('program.exe a b') Then CreateProcess will be called with a lpCommandLine but no lpApplicationName and PATH will be searched. If, however, you call: subprocess.Popen(['a', 'b'], executable="program.exe") then CreateProcess will be called with lpApplicationName="program.exe" and lpCommandLine="a b" and the PATH will not be searched. TJG From victor.stinner at gmail.com Sun Jun 15 11:31:43 2014 From: victor.stinner at gmail.com (Victor Stinner) Date: Sun, 15 Jun 2014 11:31:43 +0200 Subject: [Python-Dev] Why does _pyio.*.readinto have to work with 'b' arrays? In-Reply-To: <1402792904.16337.128859457.27BBE77A@webmail.messagingengine.com> References: <87y4wzm7zc.fsf@vostro.rath.org> <1402792904.16337.128859457.27BBE77A@webmail.messagingengine.com> Message-ID: Le 15 juin 2014 02:42, "Benjamin Peterson" a écrit : > On Sat, Jun 14, 2014, at 15:39, Nikolaus Rath wrote: > > It seems to me that a much cleaner solution would be to simply declare > > _pyio's readinto to only work with bytearrays, and to explicitly raise a > > (more helpful) TypeError if anything else is passed in. > > That seems reasonable. I don't think _pyio's behavior is terribly > important compared to the C _io module. Which types are accepted by the readinto() method of the C io module? If the C module only accepts bytearray, the array hack must be removed from _pyio. The _pyio module is mostly used for testing purposes, it's much slower. I hope that nobody uses it in production, the module is private (underscore prefix). So it's fine to break backward compatibility to have the same behaviour than the C module.
Victor -------------- next part -------------- An HTML attachment was scrubbed... URL: From greg.ewing at canterbury.ac.nz Sun Jun 15 12:47:54 2014 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sun, 15 Jun 2014 22:47:54 +1200 Subject: [Python-Dev] subprocess shell=True on Windows doesn't escape ^ character In-Reply-To: References: <20140611230030.6F56F250DC4@webabinitio.net> <539CD78F.9020908@canterbury.ac.nz> Message-ID: <539D79DA.8030608@canterbury.ac.nz> Paul Moore wrote: > Huh? CreateProcess uses PATH: Hmm, in that case Microsoft's documentation is lying, or subprocess is doing something itself before passing the command name to CreateProcess. Anyway, looks like there's no problem. -- Greg From Nikolaus at rath.org Sun Jun 15 21:03:28 2014 From: Nikolaus at rath.org (Nikolaus Rath) Date: Sun, 15 Jun 2014 12:03:28 -0700 Subject: [Python-Dev] Why does _pyio.*.readinto have to work with 'b' arrays? In-Reply-To: (Nick Coghlan's message of "Sun, 15 Jun 2014 16:37:48 +1000") References: <87y4wzm7zc.fsf@vostro.rath.org> <1402792904.16337.128859457.27BBE77A@webmail.messagingengine.com> <539D27A8.7070505@rath.org> Message-ID: <87vbs2m1vj.fsf@vostro.rath.org> Nick Coghlan writes: > On 15 June 2014 14:57, Nikolaus Rath wrote: >> On 06/14/2014 09:31 PM, Nick Coghlan wrote: >>> On 15 June 2014 10:41, Benjamin Peterson wrote: >>>> On Sat, Jun 14, 2014, at 15:39, Nikolaus Rath wrote: >>>>> It seems to me that a much cleaner solution would be to simply declare >>>>> _pyio's readinto to only work with bytearrays, and to explicitly raise a >>>>> (more helpful) TypeError if anything else is passed in. >>>> >>>> That seems reasonable. I don't think _pyio's behavior is terribly >>>> important compared to the C _io module. 
>>> >>> _pyio was written before the various memoryview fixes that were >>> implemented in Python 3.3 - it seems to me it would make more sense to >>> use memoryview to correctly handle arbitrary buffer exporters (we >>> implemented similar fixes for the base64 module in 3.4). >> >> Definitely. But is there a way to do that without writing C code? > > Yes, Python level reshaping and typecasting of memory views is one of > the key enhancements Stefan implemented for 3.3. [..] Ah, nice. I'll use that. Thank you Stefan :-). Best, -Nikolaus -- GPG encrypted emails preferred. Key id: 0xD113FCAC3C4E599F Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F “Time flies like an arrow, fruit flies like a Banana.” From Nikolaus at rath.org Sun Jun 15 21:05:09 2014 From: Nikolaus at rath.org (Nikolaus Rath) Date: Sun, 15 Jun 2014 12:05:09 -0700 Subject: [Python-Dev] Why does _pyio.*.readinto have to work with 'b' arrays? In-Reply-To: (Victor Stinner's message of "Sun, 15 Jun 2014 11:31:43 +0200") References: <87y4wzm7zc.fsf@vostro.rath.org> <1402792904.16337.128859457.27BBE77A@webmail.messagingengine.com> Message-ID: <87sin6m1sq.fsf@vostro.rath.org> Victor Stinner writes: > Le 15 juin 2014 02:42, "Benjamin Peterson" a écrit : >> On Sat, Jun 14, 2014, at 15:39, Nikolaus Rath wrote: >> > It seems to me that a much cleaner solution would be to simply declare >> > _pyio's readinto to only work with bytearrays, and to explicitly raise a >> > (more helpful) TypeError if anything else is passed in. >> >> That seems reasonable. I don't think _pyio's behavior is terribly >> important compared to the C _io module. > > Which types are accepted by the readinto() method of the C io module? Everything implementing the buffer protocol. > If the C module only accepts bytearray, the array hack must be removed > from _pyio. _pyio currently accepts only bytearray and 'b'-type arrays. But it seems with memoryview.cast() we now have a way to make it behave like the C module.
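The memoryview.cast() approach discussed here can be sketched as a standalone function (an illustration of the idea only, not the actual _pyio method):

```python
from array import array

def readinto_any(buf, data):
    """Copy data into any writable buffer exporter, returning the byte count.

    Viewing the caller's buffer as unsigned bytes ('B', the format that
    bytes objects export) sidesteps the "different structures" errors that
    a 'b'-format array raises on direct memoryview assignment.
    """
    view = memoryview(buf).cast('B')
    n = min(len(view), len(data))
    view[:n] = data[:n]
    return n

target = array('b', b'\x00' * 10)   # a 'b'-format array, not a bytearray
filled = readinto_any(target, b'hello')
```

Here `filled` is 5 and the first five bytes of `target` hold b'hello'; the same function accepts a bytearray or any other writable exporter.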
Best, -Nikolaus -- GPG encrypted emails preferred. Key id: 0xD113FCAC3C4E599F Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F “Time flies like an arrow, fruit flies like a Banana.” From chris.barker at noaa.gov Mon Jun 16 19:40:03 2014 From: chris.barker at noaa.gov (Chris Barker) Date: Mon, 16 Jun 2014 10:40:03 -0700 Subject: [Python-Dev] Criticism of execfile() removal in Python3 In-Reply-To: <20140614231144.639bf852@x34f> References: <20140610052312.280e49c9@x34f> <20140610030303.GU10355@ando> <20140614231144.639bf852@x34f> Message-ID: On Sat, Jun 14, 2014 at 1:11 PM, Paul Sokolovsky wrote: > > > 1. It hampers interactive mode - instead of short and easy to type > > > execfile("file.py") one needs to use exec(open("file.py").read()). > > > If the amount of typing is the problem, that's easy to solve: > > > > # do this once > > def execfile(name): > > exec(open("file.py").read()) > FWIW, when I started using python (15?) years ago -- the first thing I looked for was a way to "just run a file", at the interactive prompt, like I had in MATLAB. I found and used execfile(). However, it wasn't long before I discovered that execfile() was really kind of a pain: you've got namespaces, and all sorts of stuff that made it often not work like I wanted, and was a pain to type. I stopped using it altogether. More recently, I discovered iPython and its "run" function -- very nice, it does the obvious stuff for you the way you'd expect. My conclusions: 1) runfile() is not really very useful, it's fine to have removed it. 2) the built-in interactive python interpreter is really pretty lame. If you want a good interactive experience, you need something more anyway (iPython, for instance) -- putting execfile() back is only one tiny improvement that's not worth it. So if this is about micropython -- I think it would serve the project very well to have a micropython-specific interactive mode. iPython is fabulous, though I imagine too heavyweight.
But perhaps you could borrow some things from it -- like "run", for example. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From ethan at stoneleaf.us Mon Jun 16 19:52:18 2014 From: ethan at stoneleaf.us (Ethan Furman) Date: Mon, 16 Jun 2014 10:52:18 -0700 Subject: [Python-Dev] Criticism of execfile() removal in Python3 In-Reply-To: References: <20140610052312.280e49c9@x34f> <20140610030303.GU10355@ando> <20140614231144.639bf852@x34f> Message-ID: <539F2ED2.5080105@stoneleaf.us> On 06/16/2014 10:40 AM, Chris Barker wrote: > > My conclusions: > > 1) runfile() is not really very useful, it's fine to have removed it. s/runfile/execfile -- ~Ethan~ From victor.stinner at gmail.com Mon Jun 16 23:12:18 2014 From: victor.stinner at gmail.com (Victor Stinner) Date: Mon, 16 Jun 2014 23:12:18 +0200 Subject: [Python-Dev] Windows XP, Python 3.5 and PEP 11 Message-ID: Hi, I would like to know if Python 3.5 will still support Windows XP or not. Almost all flavors of Windows XP reached end-of-life in April, 2014 except "Windows XP Embedded". There is even a hack to use Windows upgrades on the desktop flavor using the embedded flavor (by changing a key in the registry). Extracts of the Wikipedia page: "As of January 2014, at least 49% of all computers in China still ran XP.
" "In January 2014, it was estimated that more than 95% of the 3 million automated teller machines in the world were still running Windows XP (which largely replaced IBM's OS/2 as the predominant operating system on ATMs)" http://en.wikipedia.org/wiki/Windows_XP A few months ago, I installed an ISO of Windows XP, downloaded from MSDN, to investigate a bug (something related to timer and HPET), but then I realized that I can use my Windows 7 VM to reproduce the issue. Now I cannot use my Windows XP VM anymore because I have to enter a product key (before I had a delay of 30 days), but I don't have this product key and my MSDN account expired. I don't want to waste my time and money with the registration thing, so I just gave up. Any of you plan to invest time on issues specific to Windows XP and produce binaries working on Windows XP? Or can we just provide binaries without testing them? For example, it looks like the following issue is specific to Windows XP: http://bugs.python.org/issue6926 Oh, and the PEP 11: http://legacy.python.org/dev/peps/pep-0011/#microsoft-windows "Microsoft has established a policy called product support lifecycle (...) Python's Windows support now follows this lifecycle." Victor From ncoghlan at gmail.com Tue Jun 17 00:39:29 2014 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 17 Jun 2014 08:39:29 +1000 Subject: [Python-Dev] Criticism of execfile() removal in Python3 In-Reply-To: References: <20140610052312.280e49c9@x34f> <20140610030303.GU10355@ando> <20140614231144.639bf852@x34f> Message-ID: On 17 Jun 2014 03:42, "Chris Barker" wrote: > > On Sat, Jun 14, 2014 at 1:11 PM, Paul Sokolovsky wrote: > >> >> > > 1. It hampers interactive mode - instead of short and easy to type >> > > execfile("file.py") one needs to use exec(open("file.py").read()). 
>> >> > >> > If the amount of typing is the problem, that's easy to solve: >> > >> > # do this once >> > def execfile(name): >> > exec(open("file.py").read()) > > > FWIW, when I started using python (15?) years ago -- the first thing I looked for was a way to "just run a file", at the interactive prompt, like I had in MATLAB. I found and used execfile(). Yes, if people are looking for a MATLAB replacement, they want IPython rather than the default REPL. The default one is deliberately minimal, IPython is designed to be a comprehensive numeric and scientific workspace. Cheers, Nick. -------------- next part -------------- An HTML attachment was scrubbed... URL: From zachary.ware+pydev at gmail.com Tue Jun 17 05:08:28 2014 From: zachary.ware+pydev at gmail.com (Zachary Ware) Date: Mon, 16 Jun 2014 22:08:28 -0500 Subject: [Python-Dev] Windows XP, Python 3.5 and PEP 11 In-Reply-To: References: Message-ID: On Mon, Jun 16, 2014 at 4:12 PM, Victor Stinner wrote: > Hi, > > I would like to know if Python 3.5 will still support Windows XP or > not. Almost all flavors of Windows XP reached the end-of-life in > April, 2014 except "Windows XP Embedded". There is even an hack to use > Windows upgrades on the desktop flavor using the embedded flavor (by > changing a key in the registry). Extracts of the Wikipedia page: This was recently discussed in the "Moving Python 3.5 on Windows to a new compiler" thread, where Martin declared XP support to be ended [1]. I believe Tim Golden is the only resident Windows dev from whom I haven't seen at least implicit agreement that XP doesn't need further support, so I'd say our support for XP is well and truly dead :) In any case, surely anyone stuck with XP can be happy with Python 3.4. I'm perfectly fine with 3.2 on Win2k! 
-- Zach [1] https://mail.python.org/pipermail/python-dev/2014-June/134903.html From mail at timgolden.me.uk Tue Jun 17 07:01:29 2014 From: mail at timgolden.me.uk (Tim Golden) Date: Tue, 17 Jun 2014 06:01:29 +0100 Subject: [Python-Dev] Windows XP, Python 3.5 and PEP 11 In-Reply-To: References: Message-ID: <539FCBA9.2010903@timgolden.me.uk> On 17/06/2014 04:08, Zachary Ware wrote: > On Mon, Jun 16, 2014 at 4:12 PM, Victor Stinner > wrote: >> Hi, >> >> I would like to know if Python 3.5 will still support Windows XP or >> not. Almost all flavors of Windows XP reached the end-of-life in >> April, 2014 except "Windows XP Embedded". There is even a hack to use >> Windows upgrades on the desktop flavor using the embedded flavor (by >> changing a key in the registry). Extracts of the Wikipedia page: > > This was recently discussed in the "Moving Python 3.5 on Windows to a > new compiler" thread, where Martin declared XP support to be ended > [1]. I believe Tim Golden is the only resident Windows dev from whom > I haven't seen at least implicit agreement that XP doesn't need > further support, so I'd say our support for XP is well and truly dead > :) > > In any case, surely anyone stuck with XP can be happy with Python 3.4. > I'm perfectly fine with 3.2 on Win2k! > I think we're justified in dropping XP support, for all the reasons others have given. Like most people, I suppose, I'm supporting WinXP in various ways (including embedded) because "not supported" != "not working". But those are all running 2.x versions of Python. It'll be good to be able to stretch a little on the Windows API front without having to double-think about where a particular API came in.
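Once XP support is dropped, code that wants the newer Windows APIs can guard on the reported OS version; this is a hypothetical launcher-style check for illustration, not something CPython itself does:

```python
import sys

def vista_or_later():
    """True on Windows Vista (6.0) or newer; XP reports 5.1/5.2.

    Non-Windows platforms are simply not subject to this check, so they
    pass trivially.
    """
    if not sys.platform.startswith('win'):
        return True
    major, minor = sys.getwindowsversion()[:2]
    return (major, minor) >= (6, 0)
```

A 3.5-era installer or launcher could refuse to proceed when this returns False instead of failing later on a missing API.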
TJG From victor.stinner at gmail.com Tue Jun 17 09:03:54 2014 From: victor.stinner at gmail.com (Victor Stinner) Date: Tue, 17 Jun 2014 09:03:54 +0200 Subject: [Python-Dev] Windows XP, Python 3.5 and PEP 11 In-Reply-To: <539FCBA9.2010903@timgolden.me.uk> References: <539FCBA9.2010903@timgolden.me.uk> Message-ID: 2014-06-17 7:01 GMT+02:00 Tim Golden : > On 17/06/2014 04:08, Zachary Ware wrote: >> This was recently discussed in the "Moving Python 3.5 on Windows to a >> new compiler" thread, where Martin declared XP support to be ended >> [1]. I believe Tim Golden is the only resident Windows dev from whom >> I haven't seen at least implicit agreement that XP doesn't need >> further support, so I'd say our support for XP is well and truly dead >> :) >> >> In any case, surely anyone stuck with XP can be happy with Python 3.4. >> I'm perfectly fine with 3.2 on Win2k! >> > > I think we're justified in dropping XP support, for all the reasons others > have given. Would you be ok to make this official by adding Windows XP explicitly to the PEP 11? (I can do the change, I'm just asking for a confirmation.) > Like most people, I suppose, I'm support WinXP in various ways > (including embedded) because "not supported" != "not working". But those are > all running 2.x versions of Python. I'm ok to provide a best-effort support of Windows XP on Python 2.7 (and maybe also Python 3.4), especially if there are Windows XP buildbots. We can drop Windows XP support in Python 3.5 only. Victor From victor.stinner at gmail.com Tue Jun 17 09:11:45 2014 From: victor.stinner at gmail.com (Victor Stinner) Date: Tue, 17 Jun 2014 09:11:45 +0200 Subject: [Python-Dev] Commit "avoid a deadlock with the interpreter head lock and the GIL during finalization" Message-ID: Hi, I just saw a change in Python finalization related to threads. I'm not sure that it is correct to not call tstate_delete_common(). Is this change related to an issue? I don't see any specific test. 
--- changeset 91234:5ccb6901cf95 3.4 avoid a deadlock with the interpreter head lock and the GIL during finalization author Benjamin Peterson date Mon, 16 Jun 2014 23:07:49 -0700 (61 minutes ago) parents d1d1ed421717 children 2ed64ea19d81 fceb3a907260 files Python/pystate.c diffstat 1 files changed, 8 insertions(+), 0 deletions(-) [+] http://hg.python.org/cpython/rev/5ccb6901cf95 --- Victor From breamoreboy at yahoo.co.uk Tue Jun 17 09:53:09 2014 From: breamoreboy at yahoo.co.uk (Mark Lawrence) Date: Tue, 17 Jun 2014 08:53:09 +0100 Subject: [Python-Dev] Windows XP, Python 3.5 and PEP 11 In-Reply-To: References: <539FCBA9.2010903@timgolden.me.uk> Message-ID: On 17/06/2014 08:03, Victor Stinner wrote: > 2014-06-17 7:01 GMT+02:00 Tim Golden : >> On 17/06/2014 04:08, Zachary Ware wrote: >>> This was recently discussed in the "Moving Python 3.5 on Windows to a >>> new compiler" thread, where Martin declared XP support to be ended >>> [1]. I believe Tim Golden is the only resident Windows dev from whom >>> I haven't seen at least implicit agreement that XP doesn't need >>> further support, so I'd say our support for XP is well and truly dead >>> :) >>> >>> In any case, surely anyone stuck with XP can be happy with Python 3.4. >>> I'm perfectly fine with 3.2 on Win2k! >>> >> >> I think we're justified in dropping XP support, for all the reasons others >> have given. > > Would you be ok to make this official by adding Windows XP explicitly > to the PEP 11? (I can do the change, I'm just asking for a > confirmation.) > From PEP 11 the entire "Microsoft Windows" section. Please see the third paragraph. "Microsoft has established a policy called product support lifecycle [1]. Each product's lifecycle has a mainstream support phase, where the product is generally commercially available, and an extended support phase, where paid support is still available, and certain bug fixes are released (in particular security fixes). Python's Windows support now follows this lifecycle. 
A new feature release X.Y.0 will support all Windows releases whose extended support phase is not yet expired. Subsequent bug fix releases will support the same Windows releases as the original feature release (even if the extended support phase has ended). Because of this policy, no further Windows releases need to be listed in this PEP. Each feature release is built by a specific version of Microsoft Visual Studio. That version should have mainstream support when the release is made. Developers of extension modules will generally need to use the same Visual Studio release; they are concerned both with the availability of the versions they need to use, and with keeping the zoo of versions small. The Python source tree will keep unmaintained build files for older Visual Studio releases, for which patches will be accepted. Such build files will be removed from the source tree 3 years after the extended support for the compiler has ended (but continue to remain available in revision control)." -- My fellow Pythonistas, ask not what our language can do for you, ask what you can do for our language. Mark Lawrence --- This email is free from viruses and malware because avast! Antivirus protection is active. http://www.avast.com From chris.barker at noaa.gov Tue Jun 17 17:59:13 2014 From: chris.barker at noaa.gov (Chris Barker) Date: Tue, 17 Jun 2014 08:59:13 -0700 Subject: [Python-Dev] Criticism of execfile() removal in Python3 In-Reply-To: References: <20140610052312.280e49c9@x34f> <20140610030303.GU10355@ando> <20140614231144.639bf852@x34f> Message-ID: On Mon, Jun 16, 2014 at 3:39 PM, Nick Coghlan wrote: > > FWIW, when I started using python (15?) years ago -- the first thing I > looked for was a way to "just run a file", at the interactive prompt, like > I had in MATLAB. I found and used execfile(). > > Yes, if people are looking for a MATLAB replacement, they want IPython > rather than the default REPL. 
> I didn't mean to distract the conversation here -- what I meant was that even before iPython existed, I still dropped using execfile("") -- it was hardly ever the right thing. And for the micropython example, I'm proposing that a micropython interactive environment would be a really nice thing to build -- and worth doing, even if execfile() was still there. By the way: iPython, while coming from, and heavily used by, the scientific/numeric computing community, is a great tool for all sorts of other python development as well. But probably too heavyweight for micropython. -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From ayates at hp.com Tue Jun 17 18:41:23 2014 From: ayates at hp.com (Yates, Andy (CS Houston, TX)) Date: Tue, 17 Jun 2014 16:41:23 +0000 Subject: [Python-Dev] Issue 21671: CVE-2014-0224 OpenSSL upgrade to 1.0.1h on Windows required Message-ID: <8E2E2A615DD11C4CAFD572AA1370481675707068@G9W0341.americas.hpqcorp.net> Python Dev, Andy here. I have a Windows product based on Python and I'm getting hammered to release a version that includes the fix in OpenSSL 1.0.1h. My product is built on a Windows system using Python installed from the standard Python installer at Python.org. I would be grateful if I could get some advice on my options. Will Python.org be releasing a Windows installer with the fix any time soon or will it be at the next scheduled release in November? If it is November, there's no way I can wait that long. Now what? Would it be best to build my own Python? Is it possible to drop in new OpenSSL versions on Windows without rebuilding Python? Looking for some guidance on how to handle these OpenSSL issues on Windows. Thanks!
Andy Yates -------------- next part -------------- An HTML attachment was scrubbed... URL: From Steve.Dower at microsoft.com Tue Jun 17 20:27:30 2014 From: Steve.Dower at microsoft.com (Steve Dower) Date: Tue, 17 Jun 2014 18:27:30 +0000 Subject: [Python-Dev] Issue 21671: CVE-2014-0224 OpenSSL upgrade to 1.0.1h on Windows required In-Reply-To: <8E2E2A615DD11C4CAFD572AA1370481675707068@G9W0341.americas.hpqcorp.net> References: <8E2E2A615DD11C4CAFD572AA1370481675707068@G9W0341.americas.hpqcorp.net> Message-ID: <81f84430ce0242e5bfa5b2264777df56@BLUPR03MB389.namprd03.prod.outlook.com> Yates, Andy (CS Houston, TX) wrote: > Python Dev, > Andy here. I have a Windows product based on Python and I'm getting hammered to > release a version that includes the fix in OpenSSL 1.0.1h. My product is built > on a Windows system using Python installed from the standard Python installer at > Python.org. I would be grateful if I could get some advice on my options. Will > Python.org be releasing a Windows installer with the fix any time soon or will > it be at the next scheduled release in November? If it is November, there's no > way I can wait that long. Now what? Would it be best to build my own Python? Is > it possible to drop in new OpenSSL versions on Windows without rebuilding > Python? Looking for some guidance on how to handle these OpenSSL issues on > Windows. You'll only need to rebuild the _ssl and _hashlib extension modules with the new OpenSSL version. The easiest way to do this is to build from source (which has already been updated for 1.0.1h if you use the externals scripts in Tools\buildbot), and you should just be able to drop _ssl.pyd and _hashlib.pyd on top of a normal install. Aside: I wonder if it's worth changing to dynamically linking to OpenSSL? It would make this kind of in-place upgrade easier when people need to do it. Any thoughts? (Does OpenSSL even support it?) Cheers, Steve > Thanks! 
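One way to confirm that replacement _ssl/_hashlib modules actually took effect is to ask the interpreter which OpenSSL it was linked against; `require_openssl` below is an illustrative helper, with the 1.0.1 target reflecting this thread's CVE fix:

```python
import ssl

def require_openssl(major, minor, fix):
    """Fail loudly if the linked OpenSSL is older than (major, minor, fix).

    ssl.OPENSSL_VERSION_INFO starts with (major, minor, fix); on 1.0.x
    builds the letter suffix (the 'h' in 1.0.1h) is encoded in the
    following patch field.
    """
    if ssl.OPENSSL_VERSION_INFO[:3] < (major, minor, fix):
        raise RuntimeError('linked against %s, which is too old'
                           % ssl.OPENSSL_VERSION)

require_openssl(1, 0, 1)
print(ssl.OPENSSL_VERSION)   # the human-readable version string
```

Running this after dropping in the rebuilt _ssl.pyd/_hashlib.pyd gives a quick sanity check that the upgraded library is the one actually loaded.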
> Andy Yates From mal at egenix.com Tue Jun 17 20:55:54 2014 From: mal at egenix.com (M.-A. Lemburg) Date: Tue, 17 Jun 2014 20:55:54 +0200 Subject: [Python-Dev] Issue 21671: CVE-2014-0224 OpenSSL upgrade to 1.0.1h on Windows required In-Reply-To: <81f84430ce0242e5bfa5b2264777df56@BLUPR03MB389.namprd03.prod.outlook.com> References: <8E2E2A615DD11C4CAFD572AA1370481675707068@G9W0341.americas.hpqcorp.net> <81f84430ce0242e5bfa5b2264777df56@BLUPR03MB389.namprd03.prod.outlook.com> Message-ID: <53A08F3A.30908@egenix.com> On 17.06.2014 20:27, Steve Dower wrote: > Yates, Andy (CS Houston, TX) wrote: >> Python Dev, >> Andy here. I have a Windows product based on Python and I'm getting hammered to >> release a version that includes the fix in OpenSSL 1.0.1h. My product is built >> on a Windows system using Python installed from the standard Python installer at >> Python.org. I would be grateful if I could get some advice on my options. Will >> Python.org be releasing a Windows installer with the fix any time soon or will >> it be at the next scheduled release in November? If it is November, there's no >> way I can wait that long. Now what? Would it be best to build my own Python? Is >> it possible to drop in new OpenSSL versions on Windows without rebuilding >> Python? Looking for some guidance on how to handle these OpenSSL issues on >> Windows. > > You'll only need to rebuild the _ssl and _hashlib extension modules with the new OpenSSL version. The easiest way to do this is to build from source (which has already been updated for 1.0.1h if you use the externals scripts in Tools\buildbot), and you should just be able to drop _ssl.pyd and _hashlib.pyd on top of a normal install. > > Aside: I wonder if it's worth changing to dynamically linking to OpenSSL? It would make this kind of in-place upgrade easier when people need to do it. Any thoughts? (Does OpenSSL even support it?) 
Yes, no problem at all, but you'd still have to either do a new release every time a new OpenSSL problem is found (don't think that's an option for Python) or provide new compiled versions compatible with the Python modules needing the OpenSSL libs or instructions on how to build these. Note that the hash routines are rarely affected by these OpenSSL bugs. They usually only affect the SSL/TLS protocol parts. Alternatively, you could make use of our pyOpenSSL distribution, which includes pyOpenSSL and the OpenSSL libs (also for Windows): http://www.egenix.com/products/python/pyOpenSSL/ We created this to address the problem of having to update OpenSSL rather often. It doesn't support Python 3 yet, but on the plus side, you do get OpenSSL libs which are compiled with the same compiler versions used for the Python.org installers. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try our new mxODBC.Connect Python Database Interface for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. 
Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From nad at acm.org Tue Jun 17 21:03:40 2014 From: nad at acm.org (Ned Deily) Date: Tue, 17 Jun 2014 12:03:40 -0700 Subject: [Python-Dev] Issue 21671: CVE-2014-0224 OpenSSL upgrade to 1.0.1h on Windows required References: <8E2E2A615DD11C4CAFD572AA1370481675707068@G9W0341.americas.hpqcorp.net> <81f84430ce0242e5bfa5b2264777df56@BLUPR03MB389.namprd03.prod.outlook.com> Message-ID: In article <81f84430ce0242e5bfa5b2264777df56 at BLUPR03MB389.namprd03.prod.outlook.com >, Steve Dower wrote: > You'll only need to rebuild the _ssl and _hashlib extension modules with the > new OpenSSL version. The easiest way to do this is to build from source > (which has already been updated for 1.0.1h if you use the externals scripts > in Tools\buildbot), and you should just be able to drop _ssl.pyd and > _hashlib.pyd on top of a normal install. Should we consider doing a re-spin of the Windows installers for 2.7.7 with 1.0.1h? Or consider doing a 2.7.8 in the near future to address this and various 2.7.7 regressions that have been identified so far (Issues 21652 and 21672)? > Aside: I wonder if it's worth changing to dynamically linking to OpenSSL? It > would make this kind of in-place upgrade easier when people need to do it. > Any thoughts? (Does OpenSSL even support it?) OpenSSL is often dynamically linked in Python builds on various other platforms, for example, on Linux or OS X. 
-- Ned Deily, nad at acm.org From benjamin at python.org Tue Jun 17 21:07:06 2014 From: benjamin at python.org (Benjamin Peterson) Date: Tue, 17 Jun 2014 12:07:06 -0700 Subject: [Python-Dev] Issue 21671: CVE-2014-0224 OpenSSL upgrade to 1.0.1h on Windows required In-Reply-To: References: <8E2E2A615DD11C4CAFD572AA1370481675707068@G9W0341.americas.hpqcorp.net> <81f84430ce0242e5bfa5b2264777df56@BLUPR03MB389.namprd03.prod.outlook.com> Message-ID: <1403032026.581.129869537.3E053BB6@webmail.messagingengine.com> On Tue, Jun 17, 2014, at 12:03, Ned Deily wrote: > In article > <81f84430ce0242e5bfa5b2264777df56 at BLUPR03MB389.namprd03.prod.outlook.com > >, > Steve Dower wrote: > > You'll only need to rebuild the _ssl and _hashlib extension modules with the > > new OpenSSL version. The easiest way to do this is to build from source > > (which has already been updated for 1.0.1h if you use the externals scripts > > in Tools\buildbot), and you should just be able to drop _ssl.pyd and > > _hashlib.pyd on top of a normal install. > > Should we consider doing a re-spin of the Windows installers for 2.7.7 > with 1.0.1h? Or consider doing a 2.7.8 in the near future to address > this and various 2.7.7 regressions that have been identified so far > (Issues 21652 and 21672)? I think we should do a 2.7.8 soon to pick up the openssl upgrade and recent CGI security fix. I would like to see those two regressions fixed first, though. From antoine at python.org Tue Jun 17 22:36:23 2014 From: antoine at python.org (Antoine Pitrou) Date: Tue, 17 Jun 2014 16:36:23 -0400 Subject: [Python-Dev] Issue 21671: CVE-2014-0224 OpenSSL upgrade to 1.0.1h on Windows required In-Reply-To: <53A08F3A.30908@egenix.com> References: <8E2E2A615DD11C4CAFD572AA1370481675707068@G9W0341.americas.hpqcorp.net> <81f84430ce0242e5bfa5b2264777df56@BLUPR03MB389.namprd03.prod.outlook.com> <53A08F3A.30908@egenix.com> Message-ID: Le 17/06/2014 14:55, M.-A. 
Lemburg a écrit : > > Alternatively, you could make use of our pyOpenSSL distribution, > which includes pyOpenSSL and the OpenSSL libs (also for Windows): > > http://www.egenix.com/products/python/pyOpenSSL/ > > We created this to address the problem of having to update > OpenSSL rather often. This is very nice, but does it also upgrade the OpenSSL version used by the _ssl and _hashlib modules? Regards Antoine. From mal at egenix.com Tue Jun 17 22:58:45 2014 From: mal at egenix.com (M.-A. Lemburg) Date: Tue, 17 Jun 2014 22:58:45 +0200 Subject: [Python-Dev] Issue 21671: CVE-2014-0224 OpenSSL upgrade to 1.0.1h on Windows required In-Reply-To: References: <8E2E2A615DD11C4CAFD572AA1370481675707068@G9W0341.americas.hpqcorp.net> <81f84430ce0242e5bfa5b2264777df56@BLUPR03MB389.namprd03.prod.outlook.com> <53A08F3A.30908@egenix.com> Message-ID: <53A0AC05.8050708@egenix.com> On 17.06.2014 22:36, Antoine Pitrou wrote: > Le 17/06/2014 14:55, M.-A. Lemburg a écrit : >> >> Alternatively, you could make use of our pyOpenSSL distribution, >> which includes pyOpenSSL and the OpenSSL libs (also for Windows): >> >> http://www.egenix.com/products/python/pyOpenSSL/ >> >> We created this to address the problem of having to update >> OpenSSL rather often. > > This is very nice, but does it also upgrade the OpenSSL version used by the _ssl and _hashlib modules? On Unix, it will if you load pyOpenSSL before importing _ssl or _hashlib (and those modules are built as shared libs). Alternatively, you can set LD_LIBRARY_PATH to lib/python2.7/OpenSSL to have the system linker use the embedded libs before starting Python. Then it will always use the up-to-date libs. On Windows, this won't work, because _ssl and _hashlib are statically linked against the OpenSSL libs. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Jun 17 2014) >>> Python Projects, Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope/Plone.Database.Adapter ... 
http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ 2014-06-17: Released eGenix PyRun 2.0.0 ... http://egenix.com/go58 2014-06-09: Released eGenix pyOpenSSL 0.13.3 ... http://egenix.com/go57 2014-07-02: Python Meeting Duesseldorf ... 15 days to go eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From ncoghlan at gmail.com Wed Jun 18 00:00:49 2014 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 18 Jun 2014 08:00:49 +1000 Subject: [Python-Dev] Criticism of execfile() removal in Python3 In-Reply-To: References: <20140610052312.280e49c9@x34f> <20140610030303.GU10355@ando> <20140614231144.639bf852@x34f> Message-ID: On 18 Jun 2014 01:59, "Chris Barker" wrote: > > By the way: iPython, while coming from, and heavily used by, the scientific/numeric computing community, is a great tool for all sorts of other python development as well. But probably too heavyweight for micropython. (we're drifting off topic, so this will be my last addition to this subthread) Yes, as great as IPython is, when it's considered out of scope for the standard installers, it's unlikely to be a good fit for a version of Python aimed at running *on* a microcontroller. Running on a Raspberry Pi or remote PC and *talking* to an associated microcontroller is a different story, though. Cheers, Nick. > > -CHB > > > > > -- > > Christopher Barker, Ph.D. > Oceanographer > > Emergency Response Division > NOAA/NOS/OR&R (206) 526-6959 voice > 7600 Sand Point Way NE (206) 526-6329 fax > Seattle, WA 98115 (206) 526-6317 main reception > > Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From cory at lukasa.co.uk Wed Jun 18 09:18:24 2014 From: cory at lukasa.co.uk (Cory Benfield) Date: Wed, 18 Jun 2014 08:18:24 +0100 Subject: [Python-Dev] Issue 21671: CVE-2014-0224 OpenSSL upgrade to 1.0.1h on Windows required In-Reply-To: <8E2E2A615DD11C4CAFD572AA1370481675707068@G9W0341.americas.hpqcorp.net> References: <8E2E2A615DD11C4CAFD572AA1370481675707068@G9W0341.americas.hpqcorp.net> Message-ID: On 17 June 2014 17:41, Yates, Andy (CS Houston, TX) wrote: > Is it possible to drop in new OpenSSL versions > on Windows without rebuilding Python? If you think this is a problem you're going to have more than once, you'll want to look hard at whether it's worth using pyOpenSSL (either the egenix version or the PyCA one[1]) instead, and delivering binary releases with a bundled copy of OpenSSL. PyOpenSSL from PyCA is actually considering bundling OpenSSL on Windows anyway[2], so you might find this problem goes away. [1] https://github.com/pyca/pyopenssl [2] https://github.com/pyca/cryptography/issues/1121 From martin at v.loewis.de Wed Jun 18 11:32:33 2014 From: martin at v.loewis.de (=?windows-1252?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 18 Jun 2014 11:32:33 +0200 Subject: [Python-Dev] Issue 21671: CVE-2014-0224 OpenSSL upgrade to 1.0.1h on Windows required In-Reply-To: <8E2E2A615DD11C4CAFD572AA1370481675707068@G9W0341.americas.hpqcorp.net> References: <8E2E2A615DD11C4CAFD572AA1370481675707068@G9W0341.americas.hpqcorp.net> Message-ID: <53A15CB1.5070401@v.loewis.de> Am 17.06.14 18:41, schrieb Yates, Andy (CS Houston, TX): > Python Dev, > > Andy here. I have a Windows product based on Python and I?m getting > hammered to release a version that includes the fix in OpenSSL 1.0.1h. > My product is built on a Windows system using Python installed from the > standard Python installer at Python.org. I would be grateful if I could > get some advice on my options. Can you please report - what version of Python you are distributing? 
- why it absolutely has to be 1.0.1h that is included? According to the CVE, 0.9.8za and 1.0.0m would work as well (and in our case, would be preferred for older versions of Python). Regards, Martin From martin at v.loewis.de Wed Jun 18 11:46:46 2014 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 18 Jun 2014 11:46:46 +0200 Subject: [Python-Dev] Issue 21671: CVE-2014-0224 OpenSSL upgrade to 1.0.1h on Windows required In-Reply-To: <81f84430ce0242e5bfa5b2264777df56@BLUPR03MB389.namprd03.prod.outlook.com> References: <8E2E2A615DD11C4CAFD572AA1370481675707068@G9W0341.americas.hpqcorp.net> <81f84430ce0242e5bfa5b2264777df56@BLUPR03MB389.namprd03.prod.outlook.com> Message-ID: <53A16006.3090801@v.loewis.de> Am 17.06.14 20:27, schrieb Steve Dower: > You'll only need to rebuild the _ssl and _hashlib extension modules > with the new OpenSSL version. The easiest way to do this is to build > from source (which has already been updated for 1.0.1h if you use the > externals scripts in Tools\buildbot), and you should just be able to > drop _ssl.pyd and _hashlib.pyd on top of a normal install. > > Aside: I wonder if it's worth changing to dynamically linking to > OpenSSL? It would make this kind of in-place upgrade easier when > people need to do it. Any thoughts? (Does OpenSSL even support it?) We originally considered using prebuilt binaries, such as http://slproweb.com/products/Win32OpenSSL.html This is tricky because of CRT issues: they will likely bind to a different version of the CRT, and a) it is unclear whether this would reliably work, and b) requires the Python installer to include a different version of the CRT, which we would not have a license to include (as the CRT redistribution license only applies to the version of the CRT that Python was built with) There was also the desire to use the same compiler for all code distributed, to use the same optimizations on all of it. 
In addition, for OpenSSL, there is compile time configuration wrt. to the algorithms built into the binaries where Python's build deviates from the default. Having a separate project to build a DLL within pcbuild.sln was never implemented. Doing so possibly increases the risk of DLL hell, if Python picks up the wrong version of OpenSSL (e.g. if Python gets embedded into some other application). Regards, Martin From Steve.Dower at microsoft.com Wed Jun 18 15:07:02 2014 From: Steve.Dower at microsoft.com (Steve Dower) Date: Wed, 18 Jun 2014 13:07:02 +0000 Subject: [Python-Dev] Issue 21671: CVE-2014-0224 OpenSSL upgrade to 1.0.1h on Windows required In-Reply-To: <53A16006.3090801@v.loewis.de> References: <8E2E2A615DD11C4CAFD572AA1370481675707068@G9W0341.americas.hpqcorp.net> <81f84430ce0242e5bfa5b2264777df56@BLUPR03MB389.namprd03.prod.outlook.com>, <53A16006.3090801@v.loewis.de> Message-ID: <5fd7795d324f4bc59b2b09bb217502cc@BLUPR03MB389.namprd03.prod.outlook.com> Yeah, the fact that it really has to be our own copy of the DLL negates the advantage. If someone can rebuild that, they could rebuild the modules that statically link it. Cheers, Steve Top-posted from my Windows Phone ________________________________ From: Martin v. L?wis Sent: ?6/?18/?2014 2:46 To: Steve Dower; Yates, Andy (CS Houston, TX); Python-Dev at python.org Subject: Re: [Python-Dev] Issue 21671: CVE-2014-0224 OpenSSL upgrade to 1.0.1h on Windows required Am 17.06.14 20:27, schrieb Steve Dower: > You'll only need to rebuild the _ssl and _hashlib extension modules > with the new OpenSSL version. The easiest way to do this is to build > from source (which has already been updated for 1.0.1h if you use the > externals scripts in Tools\buildbot), and you should just be able to > drop _ssl.pyd and _hashlib.pyd on top of a normal install. > > Aside: I wonder if it's worth changing to dynamically linking to > OpenSSL? It would make this kind of in-place upgrade easier when > people need to do it. 
Any thoughts? (Does OpenSSL even support it?) We originally considered using prebuilt binaries, such as http://slproweb.com/products/Win32OpenSSL.html This is tricky because of CRT issues: they will likely bind to a different version of the CRT, and a) it is unclear whether this would reliably work, and b) requires the Python installer to include a different version of the CRT, which we would not have a license to include (as the CRT redistribution license only applies to the version of the CRT that Python was built with) There was also the desire to use the same compiler for all code distributed, to use the same optimizations on all of it. In addition, for OpenSSL, there is compile time configuration wrt. to the algorithms built into the binaries where Python's build deviates from the default. Having a separate project to build a DLL within pcbuild.sln was never implemented. Doing so possibly increases the risk of DLL hell, if Python picks up the wrong version of OpenSSL (e.g. if Python gets embedded into some other application). Regards, Martin -------------- next part -------------- An HTML attachment was scrubbed... URL: From ayates at hp.com Thu Jun 19 20:06:51 2014 From: ayates at hp.com (Yates, Andy (CS Houston, TX)) Date: Thu, 19 Jun 2014 18:06:51 +0000 Subject: [Python-Dev] Issue 21671: CVE-2014-0224 OpenSSL upgrade to 1.0.1h on Windows required In-Reply-To: <1403032026.581.129869537.3E053BB6@webmail.messagingengine.com> References: <8E2E2A615DD11C4CAFD572AA1370481675707068@G9W0341.americas.hpqcorp.net> <81f84430ce0242e5bfa5b2264777df56@BLUPR03MB389.namprd03.prod.outlook.com> <1403032026.581.129869537.3E053BB6@webmail.messagingengine.com> Message-ID: <8E2E2A615DD11C4CAFD572AA1370481675708954@G9W0341.americas.hpqcorp.net> Thanks for all the good information. We ended up building _ssl and _hashlib and dropping those into the existing Python on our build server. That seems to be working fine. >From my perspective ssl libraries are a special case. 
I think I could handle any other included library having a flaw for weeks or months, but my management and customers are sensitive to releasing software with known ssl vulnerabilities. For Windows Python it looks like the only option for updating OpenSSL is to build from source. For us that turned out to be no big deal. However, it may be beyond the reach of some, either technically or due to the lack of access to Dev Studio. There's also some concern that a custom build of Python may not have some secret sauce or complier switch that could cause unexpected behavior. That said, I'd like to see Python spin within a short period of time after a recognized OpenSSL vulnerability is fixed if is statically linked. This would limit exposure to the unsuspecting user who downloads Windows Python from Python.org. The next best thing would be to dynamically link to Windows OpenSSL DLLs allowing users to drop in which ever version they like. Thanks again!! Andy -----Original Message----- From: Python-Dev [mailto:python-dev-bounces+ayates=hp.com at python.org] On Behalf Of Benjamin Peterson Sent: Tuesday, June 17, 2014 2:07 PM To: Ned Deily; python-dev at python.org Subject: Re: [Python-Dev] Issue 21671: CVE-2014-0224 OpenSSL upgrade to 1.0.1h on Windows required On Tue, Jun 17, 2014, at 12:03, Ned Deily wrote: > In article > <81f84430ce0242e5bfa5b2264777df56 at BLUPR03MB389.namprd03.prod.outlook.c > om > >, > Steve Dower wrote: > > You'll only need to rebuild the _ssl and _hashlib extension modules > > with the new OpenSSL version. The easiest way to do this is to build > > from source (which has already been updated for 1.0.1h if you use > > the externals scripts in Tools\buildbot), and you should just be > > able to drop _ssl.pyd and _hashlib.pyd on top of a normal install. > > Should we consider doing a re-spin of the Windows installers for 2.7.7 > with 1.0.1h? 
Or consider doing a 2.7.8 in the near future to address > this and various 2.7.7 regressions that have been identified so far > (Issues 21652 and 21672)? I think we should do a 2.7.8 soon to pick up the openssl upgrade and recent CGI security fix. I would like to see those two regressions fixed first, though. _______________________________________________ Python-Dev mailing list Python-Dev at python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/ayates%40hp.com From joseph.martinot-lagarde at m4x.org Thu Jun 19 21:39:20 2014 From: joseph.martinot-lagarde at m4x.org (Joseph Martinot-Lagarde) Date: Thu, 19 Jun 2014 21:39:20 +0200 Subject: [Python-Dev] Criticism of execfile() removal in Python3 In-Reply-To: <481b9af010ac4134a3ecbafd32f3be31@BLUPR03MB389.namprd03.prod.outlook.com> References: <20140610052312.280e49c9@x34f> <20140614210059.GB20710@chromebot.lan> <539CD85B.1060104@canterbury.ac.nz> , <481b9af010ac4134a3ecbafd32f3be31@BLUPR03MB389.namprd03.prod.outlook.com> Message-ID: <53A33C68.2070006@m4x.org> Le 15/06/2014 05:15, Steve Dower a ?crit : > So is exec(tokenize.open(file).read()) the actual replacement for > execfile()? Not too bad, but still not obvious (or widely promoted - I'd > never heard of it). > Another way is to open the file in binary, then exec() checks itself if an encoding is defined in the file. This is what is used in spyder: exec(open(file, 'rb').read()) Here is the discussion for reference: https://bitbucket.org/spyder-ide/spyderlib/pull-request/3/execution-on-current-spyder-interpreter/diff This behavior is not indicated in the documentation but is somehow confirmed on stackoverflow: http://stackoverflow.com/questions/6357361/alternative-to-execfile-in-python-3-2/6357418?noredirect=1#comment30467918_6357418 --- Ce courrier ?lectronique ne contient aucun virus ou logiciel malveillant parce que la protection avast! Antivirus est active. 
http://www.avast.com From p.f.moore at gmail.com Thu Jun 19 22:46:02 2014 From: p.f.moore at gmail.com (Paul Moore) Date: Thu, 19 Jun 2014 21:46:02 +0100 Subject: [Python-Dev] Criticism of execfile() removal in Python3 In-Reply-To: <53A33C68.2070006@m4x.org> References: <20140610052312.280e49c9@x34f> <20140614210059.GB20710@chromebot.lan> <539CD85B.1060104@canterbury.ac.nz> <481b9af010ac4134a3ecbafd32f3be31@BLUPR03MB389.namprd03.prod.outlook.com> <53A33C68.2070006@m4x.org> Message-ID: On 19 June 2014 20:39, Joseph Martinot-Lagarde wrote: > Another way is to open the file in binary, then exec() checks itself if an > encoding is defined in the file. This is what is used in spyder: > > exec(open(file, 'rb').read()) > > Here is the discussion for reference: > https://bitbucket.org/spyder-ide/spyderlib/pull-request/3/execution-on-current-spyder-interpreter/diff It would be good to document this. Could you open a docs bug to get this added? Paul From status at bugs.python.org Fri Jun 20 18:07:58 2014 From: status at bugs.python.org (Python tracker) Date: Fri, 20 Jun 2014 18:07:58 +0200 (CEST) Subject: [Python-Dev] Summary of Python tracker Issues Message-ID: <20140620160758.1EDB456920@psf.upfronthosting.co.za> ACTIVITY SUMMARY (2014-06-13 - 2014-06-20) Python tracker at http://bugs.python.org/ To view or respond to any of the issues listed below, click on the issue. Do NOT respond to this message. 
Issues counts and deltas: open 4655 ( -7) closed 28932 (+73) total 33587 (+66) Open issues with patches: 2152 Issues opened (49) ================== #8110: subprocess.py doesn't correctly detect Windows machines http://bugs.python.org/issue8110 reopened by r.david.murray #21750: mock_open data is visible only once for the life of the class http://bugs.python.org/issue21750 opened by pkoning #21753: Windows cmd.exe character escaping function http://bugs.python.org/issue21753 opened by Jim.Jewett #21754: Add tests for turtle.TurtleScreenBase http://bugs.python.org/issue21754 opened by ingrid #21755: test_importlib.test_locks fails --without-threads http://bugs.python.org/issue21755 opened by berker.peksag #21756: IDLE - ParenMatch fails to find closing paren of multi-line st http://bugs.python.org/issue21756 opened by taleinat #21760: inspect documentation describes module type inaccurately http://bugs.python.org/issue21760 opened by eric.snow #21761: language reference describes the role of module.__file__ inacc http://bugs.python.org/issue21761 opened by eric.snow #21762: update the import machinery to only use __spec__ http://bugs.python.org/issue21762 opened by eric.snow #21763: Clarify requirements for file-like objects http://bugs.python.org/issue21763 opened by nikratio #21765: Idle: make 3.x HyperParser work with non-ascii identifiers. 
http://bugs.python.org/issue21765 opened by terry.reedy #21767: singledispatch docs should explicitly mention support for abst http://bugs.python.org/issue21767 opened by ncoghlan #21768: Fix a NameError in test_pydoc http://bugs.python.org/issue21768 opened by Claudiu.Popa #21769: Fix a NameError in test_descr http://bugs.python.org/issue21769 opened by Claudiu.Popa #21770: Module not callable in script_helper.py http://bugs.python.org/issue21770 opened by Claudiu.Popa #21772: platform.uname() not EINTR safe http://bugs.python.org/issue21772 opened by Tor.Colvin #21775: shutil.copytree() crashes copying to VFAT on Linux: AttributeE http://bugs.python.org/issue21775 opened by gward #21776: distutils.upload uses the wrong order of exceptions http://bugs.python.org/issue21776 opened by Claudiu.Popa #21777: Separate out documentation of binary sequence methods http://bugs.python.org/issue21777 opened by ncoghlan #21778: PyBuffer_FillInfo() from 3.3 http://bugs.python.org/issue21778 opened by arigo #21779: test_multiprocessing_spawn fails when ran with -Werror http://bugs.python.org/issue21779 opened by serhiy.storchaka #21780: make unicodedata module 64-bit safe http://bugs.python.org/issue21780 opened by haypo #21781: make _ssl module 64-bit clean http://bugs.python.org/issue21781 opened by haypo #21782: hashable documentation error: shouldn't mention id http://bugs.python.org/issue21782 opened by Giacomo.Alzetta #21783: smtpd.py does not allow multiple helo/ehlo commands http://bugs.python.org/issue21783 opened by zvyn #21784: __init__.py can be a directory http://bugs.python.org/issue21784 opened by abraithwaite #21785: __getitem__ and __setitem__ try to be smart when invoked with http://bugs.python.org/issue21785 opened by kt #21786: Use assertEqual in test_pydoc http://bugs.python.org/issue21786 opened by Claudiu.Popa #21787: Idle: make 3.x Hyperparser.get_expression recognize ... 
http://bugs.python.org/issue21787 opened by terry.reedy #21788: Rework Python finalization http://bugs.python.org/issue21788 opened by haypo #21790: Change blocksize in http.client to the value of resource.getpa http://bugs.python.org/issue21790 opened by dbrecht #21791: Proper return status of os.WNOHANG is not always (0, 0) http://bugs.python.org/issue21791 opened by eradman #21793: httplib client/server status refactor http://bugs.python.org/issue21793 opened by dbrecht #21795: smtpd.SMTPServer should announce 8BITMIME when supported http://bugs.python.org/issue21795 opened by zvyn #21796: tempfile.py", line 83, in once_lock = _allocate_lock( http://bugs.python.org/issue21796 opened by pythonbug1shal #21799: Py_SetPath() gives compile error: undefined reference to '__im http://bugs.python.org/issue21799 opened by Pat.Le.Cat #21800: Implement RFC 6855 (IMAP Support for UTF-8) in imaplib. http://bugs.python.org/issue21800 opened by zvyn #21801: inspect.signature doesn't always return a signature http://bugs.python.org/issue21801 opened by Claudiu.Popa #21802: Reader of BufferedRWPair is not closed if writer's close() fai http://bugs.python.org/issue21802 opened by serhiy.storchaka #21803: Remove macro indirections in complexobject http://bugs.python.org/issue21803 opened by pitrou #21804: Implement thr UTF8 command (RFC 6856) in poplib. 
http://bugs.python.org/issue21804 opened by zvyn #21806: Add tests for turtle.TPen class http://bugs.python.org/issue21806 opened by ingrid #21807: SysLogHandler closes TCP connection after first message http://bugs.python.org/issue21807 opened by Omer.Katz #21809: Building Python3 on VMS - External repository http://bugs.python.org/issue21809 opened by John.Malmberg #21811: Anticipate fixes to 3.x and 2.7 for OS X 10.10 Yosemite suppor http://bugs.python.org/issue21811 opened by ned.deily #21812: turtle.shapetransform doesn't transform the turtle on the firs http://bugs.python.org/issue21812 opened by Lita.Cho #21813: Enhance doc of os.stat_result http://bugs.python.org/issue21813 opened by haypo #21814: object.__setattr__ or super(...).__setattr__? http://bugs.python.org/issue21814 opened by b9 #21815: imaplib truncates some untagged responses http://bugs.python.org/issue21815 opened by rafales Most recent 15 issues with no replies (15) ========================================== #21814: object.__setattr__ or super(...).__setattr__? http://bugs.python.org/issue21814 #21812: turtle.shapetransform doesn't transform the turtle on the firs http://bugs.python.org/issue21812 #21806: Add tests for turtle.TPen class http://bugs.python.org/issue21806 #21804: Implement thr UTF8 command (RFC 6856) in poplib. http://bugs.python.org/issue21804 #21803: Remove macro indirections in complexobject http://bugs.python.org/issue21803 #21802: Reader of BufferedRWPair is not closed if writer's close() fai http://bugs.python.org/issue21802 #21801: inspect.signature doesn't always return a signature http://bugs.python.org/issue21801 #21800: Implement RFC 6855 (IMAP Support for UTF-8) in imaplib. 
http://bugs.python.org/issue21800 #21799: Py_SetPath() gives compile error: undefined reference to '__im http://bugs.python.org/issue21799 #21796: tempfile.py", line 83, in once_lock = _allocate_lock( http://bugs.python.org/issue21796 #21795: smtpd.SMTPServer should announce 8BITMIME when supported http://bugs.python.org/issue21795 #21791: Proper return status of os.WNOHANG is not always (0, 0) http://bugs.python.org/issue21791 #21787: Idle: make 3.x Hyperparser.get_expression recognize ... http://bugs.python.org/issue21787 #21783: smtpd.py does not allow multiple helo/ehlo commands http://bugs.python.org/issue21783 #21781: make _ssl module 64-bit clean http://bugs.python.org/issue21781 Most recent 15 issues waiting for review (15) ============================================= #21813: Enhance doc of os.stat_result http://bugs.python.org/issue21813 #21811: Anticipate fixes to 3.x and 2.7 for OS X 10.10 Yosemite suppor http://bugs.python.org/issue21811 #21806: Add tests for turtle.TPen class http://bugs.python.org/issue21806 #21804: Implement thr UTF8 command (RFC 6856) in poplib. 
http://bugs.python.org/issue21804 #21803: Remove macro indirections in complexobject http://bugs.python.org/issue21803 #21802: Reader of BufferedRWPair is not closed if writer's close() fai http://bugs.python.org/issue21802 #21801: inspect.signature doesn't always return a signature http://bugs.python.org/issue21801 #21793: httplib client/server status refactor http://bugs.python.org/issue21793 #21790: Change blocksize in http.client to the value of resource.getpa http://bugs.python.org/issue21790 #21786: Use assertEqual in test_pydoc http://bugs.python.org/issue21786 #21781: make _ssl module 64-bit clean http://bugs.python.org/issue21781 #21780: make unicodedata module 64-bit safe http://bugs.python.org/issue21780 #21777: Separate out documentation of binary sequence methods http://bugs.python.org/issue21777 #21776: distutils.upload uses the wrong order of exceptions http://bugs.python.org/issue21776 #21772: platform.uname() not EINTR safe http://bugs.python.org/issue21772 Top 10 most discussed issues (10) ================================= #14534: Add method to mark unittest.TestCases as "do not run". http://bugs.python.org/issue14534 14 msgs #21763: Clarify requirements for file-like objects http://bugs.python.org/issue21763 14 msgs #10740: sqlite3 module breaks transactions and potentially corrupts da http://bugs.python.org/issue10740 10 msgs #21741: Convert most of the test suite to using unittest.main() http://bugs.python.org/issue21741 10 msgs #19495: Enhancement for timeit: measure time to run blocks of code usi http://bugs.python.org/issue19495 8 msgs #21772: platform.uname() not EINTR safe http://bugs.python.org/issue21772 8 msgs #15993: Windows: 3.3.0-rc2.msi: test_buffer fails http://bugs.python.org/issue15993 7 msgs #21765: Idle: make 3.x HyperParser work with non-ascii identifiers. 
http://bugs.python.org/issue21765 6 msgs #21784: __init__.py can be a directory http://bugs.python.org/issue21784 6 msgs #5207: extend strftime/strptime format for RFC3339 and RFC2822 http://bugs.python.org/issue5207 5 msgs Issues closed (67) ================== #3425: posixmodule.c always using res = utime(path, NULL) http://bugs.python.org/issue3425 closed by r.david.murray #5904: strftime docs do not explain locale effect on result string http://bugs.python.org/issue5904 closed by r.david.murray #6133: LOAD_CONST followed by LOAD_ATTR can be optimized to just be a http://bugs.python.org/issue6133 closed by terry.reedy #6916: Remove deprecated items from asynchat http://bugs.python.org/issue6916 closed by giampaolo.rodola #6966: Ability to refer to arguments in TestCase.fail* methods http://bugs.python.org/issue6966 closed by r.david.murray #9693: asynchat push_callable() patch http://bugs.python.org/issue9693 closed by giampaolo.rodola #9727: Add callbacks to be invoked when locale changes http://bugs.python.org/issue9727 closed by loewis #9972: PyGILState_XXX missing in Python builds without threads http://bugs.python.org/issue9972 closed by ned.deily #10002: Installer doesn't install on Windows Server 2008 DataCenter R2 http://bugs.python.org/issue10002 closed by loewis #10084: SSL support for asyncore http://bugs.python.org/issue10084 closed by giampaolo.rodola #10136: kill_python doesn't work with short path http://bugs.python.org/issue10136 closed by zach.ware #10310: signed:1 bitfields rarely make sense http://bugs.python.org/issue10310 closed by berker.peksag #10524: Patch to add Pardus to supported dists in platform http://bugs.python.org/issue10524 closed by berker.peksag #11287: Add context manager support to dbm modules http://bugs.python.org/issue11287 closed by Claudiu.Popa #11394: Tools/demo, etc. 
are not installed http://bugs.python.org/issue11394 closed by terry.reedy #11736: windows installers ssl module / openssl broken for some sites http://bugs.python.org/issue11736 closed by loewis #11792: asyncore module print to stdout http://bugs.python.org/issue11792 closed by giampaolo.rodola #12617: Mutable Sequence Type can work not only with iterable in slice http://bugs.python.org/issue12617 closed by Claudiu.Popa #13102: xml.dom.minidom does not support default namespaces http://bugs.python.org/issue13102 closed by ezio.melotti #13779: os.walk: bottom-up http://bugs.python.org/issue13779 closed by python-dev #16587: Py_Initialize breaks wprintf on Windows http://bugs.python.org/issue16587 closed by haypo #18612: More elaborate documentation on how list comprehensions and ge http://bugs.python.org/issue18612 closed by uglemat #18703: To change the doc of html/faq/gui.html http://bugs.python.org/issue18703 closed by r.david.murray #19362: Documentation for len() fails to mention that it works on sets http://bugs.python.org/issue19362 closed by terry.reedy #19493: Report skipped ctypes tests as skipped http://bugs.python.org/issue19493 closed by zach.ware #19768: Not so correct error message when giving incorrect type to max http://bugs.python.org/issue19768 closed by rhettinger #19898: No tests for dequereviter_new http://bugs.python.org/issue19898 closed by rhettinger #20062: Remove emacs page from devguide http://bugs.python.org/issue20062 closed by ezio.melotti #20068: collections.Counter documentation leaves out interesting useca http://bugs.python.org/issue20068 closed by rhettinger #20091: An index entry for __main__ in "30.5 runpy" is missing http://bugs.python.org/issue20091 closed by orsenthil #20457: Use partition and enumerate make getopt easier http://bugs.python.org/issue20457 closed by ezio.melotti #20708: commands has no "RANDOM" environment? 
http://bugs.python.org/issue20708 closed by zach.ware #20880: Windows installation problem with 3.3.5 http://bugs.python.org/issue20880 closed by BreamoreBoy #20915: Add "pip" section to experts list in devguide http://bugs.python.org/issue20915 closed by ezio.melotti #21205: Add __qualname__ attribute to Python generators and change def http://bugs.python.org/issue21205 closed by haypo #21326: asyncio: request clearer error message when event loop closed http://bugs.python.org/issue21326 closed by haypo #21559: OverflowError should not happen for integer operations http://bugs.python.org/issue21559 closed by terry.reedy #21595: asyncio: Creating many subprocess generates lots of internal B http://bugs.python.org/issue21595 closed by python-dev #21669: Custom error messages when print & exec are used as statements http://bugs.python.org/issue21669 closed by ncoghlan #21686: IDLE - Test hyperparser http://bugs.python.org/issue21686 closed by terry.reedy #21690: re documentation: re.compile links to re.search / re.match ins http://bugs.python.org/issue21690 closed by ezio.melotti #21694: IDLE - Test ParenMatch http://bugs.python.org/issue21694 closed by terry.reedy #21719: Returning Windows file attribute information via os.stat() http://bugs.python.org/issue21719 closed by zach.ware #21722: teach distutils "upload" to exit with code != 0 when error occ http://bugs.python.org/issue21722 closed by pitrou #21723: Float maxsize is treated as infinity in asyncio.Queue http://bugs.python.org/issue21723 closed by haypo #21726: Unnecessary line in documentation http://bugs.python.org/issue21726 closed by terry.reedy #21730: test_socket fails --without-threads http://bugs.python.org/issue21730 closed by terry.reedy #21742: WatchedFileHandler can fail due to race conditions or file ope http://bugs.python.org/issue21742 closed by python-dev #21744: itertools.islice() goes over all the pre-initial elements even http://bugs.python.org/issue21744 closed by rhettinger #21751: 
Expand zipimport to support bzip2 and lzma http://bugs.python.org/issue21751 closed by serhiy.storchaka #21752: Document Backwards Incompatible change to logging in 3.4 http://bugs.python.org/issue21752 closed by python-dev #21757: Can't reenable menus in Tkinter on Mac http://bugs.python.org/issue21757 closed by ned.deily #21758: Not so correct documentation about asyncio.subprocess_shell me http://bugs.python.org/issue21758 closed by python-dev #21759: URL Typo in Documentation FAQ http://bugs.python.org/issue21759 closed by python-dev #21764: Document that IOBase.__del__ calls self.close http://bugs.python.org/issue21764 closed by python-dev #21766: CGIHTTPServer File Disclosure http://bugs.python.org/issue21766 closed by python-dev #21771: name of 2nd parameter to itertools.groupby() http://bugs.python.org/issue21771 closed by rhettinger #21773: Fix a NameError in test_enum http://bugs.python.org/issue21773 closed by haypo #21774: Fix a NameError in xml.dom.minidom http://bugs.python.org/issue21774 closed by rhettinger #21789: Broken link to PEP 263 in Python 2.7 error message http://bugs.python.org/issue21789 closed by ned.deily #21792: Spam http://bugs.python.org/issue21792 closed by SilentGhost #21794: stack frame contains name of wrapper method, not that of wrapp http://bugs.python.org/issue21794 closed by zach.ware #21797: mmap read of single byte accesses more that just that byte http://bugs.python.org/issue21797 closed by jcea #21798: Allow adding Path or str to Path http://bugs.python.org/issue21798 closed by pitrou #21805: Argparse Revert config_file defaults http://bugs.python.org/issue21805 closed by r.david.murray #21808: 65001 code page not supported http://bugs.python.org/issue21808 closed by r.david.murray #21810: SIGSEGV in PyObject_Malloc when ARENAS_USE_MMAP http://bugs.python.org/issue21810 closed by neologix From ezio.melotti at gmail.com Fri Jun 20 19:30:55 2014 From: ezio.melotti at gmail.com (Ezio Melotti) Date: Fri, 20 Jun 2014 20:30:55 
+0300 Subject: [Python-Dev] Tracker Stats Message-ID: Hi, I added a new "stats" page to the bug tracker: http://bugs.python.org/issue?@template=stats The page can be reached from the sidebar of the bug tracker: Summaries -> Stats The data are updated once a week, together with the Summary of Python tracker issues. Best Regards, Ezio Melotti From raymond.hettinger at gmail.com Fri Jun 20 20:23:54 2014 From: raymond.hettinger at gmail.com (Raymond Hettinger) Date: Fri, 20 Jun 2014 11:23:54 -0700 Subject: [Python-Dev] Tracker Stats In-Reply-To: References: Message-ID: <320108CE-AAEE-4B57-BD89-281BDB84A07D@gmail.com> On Jun 20, 2014, at 10:30 AM, Ezio Melotti wrote: > I added a new "stats" page to the bug tracker: > http://bugs.python.org/issue?@template=stats > The page can be reached from the sidebar of the bug tracker: Summaries -> Stats > The data are updated once a week, together with the Summary of Python > tracker issues. Thank you. That gives nice visibility to all the work being done on the tracker. Raymond -------------- next part -------------- An HTML attachment was scrubbed... URL: From pjenvey at underboss.org Fri Jun 20 22:32:10 2014 From: pjenvey at underboss.org (Philip Jenvey) Date: Fri, 20 Jun 2014 13:32:10 -0700 Subject: [Python-Dev] PyPy3 2.3.1 released Message-ID: <42C29176-D101-483F-97DC-91C18443D393@underboss.org> ===================== PyPy3 2.3.1 - Fulcrum ===================== We're pleased to announce the first stable release of PyPy3. PyPy3 targets Python 3 (3.2.5) compatibility. We would like to thank all of the people who donated_ to the `py3k proposal`_ for supporting the work that went into this. You can download the PyPy3 2.3.1 release here: http://pypy.org/download.html#pypy3-2-3-1 Highlights ========== * The first stable release of PyPy3: support for Python 3! 
* The stdlib has been updated to Python 3.2.5 * Additional support for the u'unicode' syntax (`PEP 414`_) from Python 3.3 * Updates from the default branch, such as incremental GC and various JIT improvements * Resolved some notable JIT performance regressions from PyPy2: - Re-enabled the previously disabled collection (list/dict/set) strategies - Resolved performance of iteration over range objects - Resolved handling of Python 3's exception __context__ unnecessarily forcing frame object overhead .. _`PEP 414`: http://legacy.python.org/dev/peps/pep-0414/ What is PyPy? ============== PyPy is a very compliant Python interpreter, almost a drop-in replacement for CPython 2.7.6 or 3.2.5. It's fast due to its integrated tracing JIT compiler. This release supports x86 machines running Linux 32/64, Mac OS X 64, Windows, and OpenBSD, as well as newer ARM hardware (ARMv6 or ARMv7, with VFPv3) running Linux. While we support 32 bit python on Windows, work on the native Windows 64 bit python is still stalling, we would welcome a volunteer to `handle that`_. .. _`handle that`: http://doc.pypy.org/en/latest/windows.html#what-is-missing-for-a-full-64-bit-translation How to use PyPy? ================= We suggest using PyPy from a `virtualenv`_. Once you have a virtualenv installed, you can follow instructions from `pypy documentation`_ on how to proceed. This document also covers other `installation schemes`_. .. _donated: http://morepypy.blogspot.com/2012/01/py3k-and-numpy-first-stage-thanks-to.html .. _`py3k proposal`: http://pypy.org/py3donate.html .. _`pypy documentation`: http://doc.pypy.org/en/latest/getting-started.html#installing-using-virtualenv .. _`virtualenv`: http://www.virtualenv.org/en/latest/ .. 
_`installation schemes`: http://doc.pypy.org/en/latest/getting-started.html#installing-pypy Cheers, the PyPy team -- Philip Jenvey From ericsnowcurrently at gmail.com Sat Jun 21 01:18:16 2014 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Fri, 20 Jun 2014 17:18:16 -0600 Subject: [Python-Dev] PyPy3 2.3.1 released In-Reply-To: <42C29176-D101-483F-97DC-91C18443D393@underboss.org> References: <42C29176-D101-483F-97DC-91C18443D393@underboss.org> Message-ID: On Fri, Jun 20, 2014 at 2:32 PM, Philip Jenvey wrote: > ===================== > PyPy3 2.3.1 - Fulcrum > ===================== > > We're pleased to announce the first stable release of PyPy3. PyPy3 > targets Python 3 (3.2.5) compatibility. Awesome! -eric From ncoghlan at gmail.com Sat Jun 21 03:24:58 2014 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 21 Jun 2014 11:24:58 +1000 Subject: [Python-Dev] PyPy3 2.3.1 released In-Reply-To: <42C29176-D101-483F-97DC-91C18443D393@underboss.org> References: <42C29176-D101-483F-97DC-91C18443D393@underboss.org> Message-ID: On 21 Jun 2014 06:39, "Philip Jenvey" wrote: > > ===================== > PyPy3 2.3.1 - Fulcrum > ===================== > > We're pleased to announce the first stable release of PyPy3. PyPy3 > targets Python 3 (3.2.5) compatibility. Congratulations, that's another critical milestone in the Python 3 migration reached! :) Cheers, Nick. -------------- next part -------------- An HTML attachment was scrubbed... URL: From wizzat at gmail.com Sat Jun 21 05:12:19 2014 From: wizzat at gmail.com (Mark Roberts) Date: Fri, 20 Jun 2014 20:12:19 -0700 Subject: [Python-Dev] PyPy3 2.3.1 released In-Reply-To: <42C29176-D101-483F-97DC-91C18443D393@underboss.org> References: <42C29176-D101-483F-97DC-91C18443D393@underboss.org> Message-ID: <2125E459-F4A3-4F2E-A8E1-77263D0BEE5C@gmail.com> That's fantastic! 
Great job - that's a lot of work :) -Mark > On Jun 20, 2014, at 13:32, Philip Jenvey wrote: > > ===================== > PyPy3 2.3.1 - Fulcrum > ===================== > > We're pleased to announce the first stable release of PyPy3. PyPy3 > targets Python 3 (3.2.5) compatibility. From mal at egenix.com Sat Jun 21 12:27:17 2014 From: mal at egenix.com (M.-A. Lemburg) Date: Sat, 21 Jun 2014 12:27:17 +0200 Subject: [Python-Dev] Python 2.7 patch levels turning two digit Message-ID: <53A55E05.5020906@egenix.com> With PEP 466 and the constant flow of OpenSSL security fixes which are currently being handled via Python patch level releases, we will soon reach 2.7.10 and quickly go beyond that (also see http://bugs.python.org/issue21308). This opens up a potential backwards incompatibility with existing tools that assume the Python release version number to use the "x.y.z" single digit approach, e.g. code that uses sys.version[:5] for the Python version or relies on the lexicographic ordering of the version string (sys.version > '2.7.2'). Some questions we should probably ask ourselves (I've added my thoughts inline): * Is it a good strategy to ship new Python releases for every single OpenSSL security release or is there a better way to handle these 3rd party issues ?
I think we should link to the OpenSSL libs dynamically rather than statically in Python 2.7 for Windows so that it's possible to provide drop-in updates for such issues. * Should we try to avoid two digit patch level release numbers by using some other mechanism such as e.g. a release date after 2.7.9 ? Grepping through our code, this will introduce some breakage, but not much. Most older code branches on minor versions, not patch levels. More recent code uses sys.python_info so is not affected. * Should we make use of the potential breakage with 2.7.10 to introduce a new Windows compiler version for Python 2.7 ? I think this would be a good chance to update the compiler to whatever we use for Python 3 at the time. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Jun 21 2014) >>> Python Projects, Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope/Plone.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ 2014-06-17: Released eGenix PyRun 2.0.0 ... http://egenix.com/go58 2014-06-09: Released eGenix pyOpenSSL 0.13.3 ... http://egenix.com/go57 2014-07-02: Python Meeting Duesseldorf ... 11 days to go eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From ncoghlan at gmail.com Sat Jun 21 12:51:54 2014 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 21 Jun 2014 20:51:54 +1000 Subject: [Python-Dev] Python 2.7 patch levels turning two digit In-Reply-To: <53A55E05.5020906@egenix.com> References: <53A55E05.5020906@egenix.com> Message-ID: On 21 June 2014 20:27, M.-A. 
Lemburg wrote: > With PEP 466 and the constant flow of OpenSSL security fixes > which are currently being handled via Python patch level releases, > we will soon reach 2.7.10 and quickly go beyond that (also see > http://bugs.python.org/issue21308). > > This opens up a potential backwards incompatibility with existing > tools that assume the Python release version number to use the > "x.y.z" single digit approach, e.g. code that uses sys.version[:5] > for the Python version or relies on the lexicographic ordering > of the version string (sys.version > '2.7.2'). Such code has an easy fix available, though, as sys.version_info has existed since 2.0, and handles two digit micro releases just fine. The docs for sys.version also have this explicit disclaimer: "Do not extract version information out of it, rather, use version_info and the functions provided by the platform module." Making it harder to tell whether or not someone's Python installation is affected by an OpenSSL CVE is also an undesirable outcome. On a Linux distro, folks will check the distro package database directly for the OpenSSL version, but on Windows, no such centralised audit mechanism is available by default. With OpenSSL statically linked, Python versions can just be mapped to OpenSSL versions (so, for example, 2.7.7 has 1.0.1g) Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From barry at python.org Sat Jun 21 18:40:29 2014 From: barry at python.org (Barry Warsaw) Date: Sat, 21 Jun 2014 12:40:29 -0400 Subject: [Python-Dev] Python 2.7 patch levels turning two digit In-Reply-To: <53A55E05.5020906@egenix.com> References: <53A55E05.5020906@egenix.com> Message-ID: <20140621124029.1314ead6@limelight.wooz.org> On Jun 21, 2014, at 12:27 PM, M.-A. Lemburg wrote: >This opens up a potential backwards incompatibility with existing >tools that assume the Python release version number to use the >"x.y.z" single digit approach, e.g. 
code that uses sys.version[:5] >for the Python version or relies on the lexicographic ordering >of the version string (sys.version > '2.7.2'). Patient: Doctor, it hurts when I do this. Doctor: Don't do that! > * Should we try to avoid two digit patch level release numbers > by using some other mechanism such as e.g. a release date > after 2.7.9 ? > > Grepping through our code, this will introduce some breakage, > but not much. Most older code branches on minor versions, > not patch levels. More recent code uses sys.python_info so > is not affected. s/sys.python_info/sys.version_info/ and yes the latter has been preferred for a long time now. Given that 2.7 is a long term support release, it's inevitable that we'll break the 2-digit micro release number barrier. So be it. A 2.7.10 isn't the end of the world. -Barry From mal at egenix.com Sat Jun 21 18:57:57 2014 From: mal at egenix.com (M.-A. Lemburg) Date: Sat, 21 Jun 2014 18:57:57 +0200 Subject: [Python-Dev] Python 2.7 patch levels turning two digit In-Reply-To: References: <53A55E05.5020906@egenix.com> Message-ID: <53A5B995.6040802@egenix.com> On 21.06.2014 12:51, Nick Coghlan wrote: > On 21 June 2014 20:27, M.-A. Lemburg wrote: >> With PEP 466 and the constant flow of OpenSSL security fixes >> which are currently being handled via Python patch level releases, >> we will soon reach 2.7.10 and quickly go beyond that (also see >> http://bugs.python.org/issue21308). >> >> This opens up a potential backwards incompatibility with existing >> tools that assume the Python release version number to use the >> "x.y.z" single digit approach, e.g. code that uses sys.version[:5] >> for the Python version or relies on the lexicographic ordering >> of the version string (sys.version > '2.7.2'). > > Such code has an easy fix available, though, as sys.version_info has > existed since 2.0, and handles two digit micro releases just fine. 
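A short sketch of the contrast being discussed here (the 2.7.10 version string below is a fabricated example, not taken from any real build):

```python
import sys

# Fragile: slicing sys.version assumes single-digit version components.
version_string = "2.7.10 (default, Jun 22 2014, 10:00:00)"  # hypothetical sys.version
print(version_string[:5])  # "2.7.1" -- silently reports the wrong version

# Robust: sys.version_info is a tuple compared element-wise,
# so a two-digit micro release orders correctly.
print(sys.version_info >= (2, 0))  # True on any supported interpreter
print((2, 7, 10) > (2, 7, 9))      # True -- tuple comparison gets the ordering right
```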
The > docs for sys.version also have this explicit disclaimer: "Do not > extract version information out of it, rather, use version_info and > the functions provided by the platform module." I don't think that's a good argument. Of course, there are better ways to figure out the version number, but fact is, existing code, even in the stdlib, does use and parse the sys.version string version. During Python's lifetime, we've always avoided two digit version numbers, so people have been relying on this, even if it was never (AFAIK) documented anywhere. > Making it harder to tell whether or not someone's Python installation > is affected by an OpenSSL CVE is also an undesirable outcome. On a > Linux distro, folks will check the distro package database directly > for the OpenSSL version, but on Windows, no such centralised audit > mechanism is available by default. With OpenSSL statically linked, > Python versions can just be mapped to OpenSSL versions (so, for > example, 2.7.7 has 1.0.1g) I have to disagree here as well :-) If people cannot upgrade to a higher patch level for whatever reason (say a patch level release introduced some other bugs), but still need to upgrade to the current OpenSSL version, they'd be stuck if we continue to bind the Python version number to some OpenSSL release version. We should definitely make it possible to address OpenSSL bugs without having to upgrade Python and it's not hard to do: just replace the static binding with dynamic binding and include the two OpenSSL DLLs with the Windows installer. People can then drop in new versions of those DLLs as needed, without having the core devs do a complete new release every time someone finds a new problem those libs. Security libs simply have a much higher release rate (if they are well maintained) than most other software. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Jun 21 2014) >>> Python Projects, Consulting and Support ... 
http://www.egenix.com/ >>> mxODBC.Zope/Plone.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ 2014-06-17: Released eGenix PyRun 2.0.0 ... http://egenix.com/go58 2014-06-09: Released eGenix pyOpenSSL 0.13.3 ... http://egenix.com/go57 2014-07-02: Python Meeting Duesseldorf ... 11 days to go eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From nad at acm.org Sat Jun 21 20:47:08 2014 From: nad at acm.org (Ned Deily) Date: Sat, 21 Jun 2014 11:47:08 -0700 Subject: [Python-Dev] Python 2.7 patch levels turning two digit References: <53A55E05.5020906@egenix.com> <53A5B995.6040802@egenix.com> Message-ID: In article <53A5B995.6040802 at egenix.com>, "M.-A. Lemburg" wrote: > > Making it harder to tell whether or not someone's Python installation > > is affected by an OpenSSL CVE is also an undesirable outcome. On a > > Linux distro, folks will check the distro package database directly > > for the OpenSSL version, but on Windows, no such centralised audit > > mechanism is available by default. With OpenSSL statically linked, > > Python versions can just be mapped to OpenSSL versions (so, for > > example, 2.7.7 has 1.0.1g) > > I have to disagree here as well :-) > > If people cannot upgrade to a higher patch level for whatever > reason (say a patch level release introduced some other bugs), > but still need to upgrade to the current OpenSSL version, they'd > be stuck if we continue to bind the Python version number to > some OpenSSL release version. 
> > We should definitely make it possible to address OpenSSL > bugs without having to upgrade Python and it's not hard to > do: just replace the static binding with dynamic binding > and include the two OpenSSL DLLs with the Windows installer. > > People can then drop in new versions of those DLLs > as needed, without having the core devs do a complete > new release every time someone finds a new problem those > libs. Security libs simply have a much higher release > rate (if they are well maintained) than most other > software. I agree that with Nick and Barry that, due to the extended support period for 2.7, we have no choice but to bite the bullet and deal with micro levels exceeding 9. On the other hand, it would also be good to be better able to deal with third-party library revisions that only affect the Windows or OS X binary installers we supply. I don't know that we've ever had a process/policy in place to do that, certainly not recently. Even for statically linked libraries, we presumably could supply replacement re-linked files or even carefully repacked installers with the updated files. This might be something to discuss and add to PEP 101 or a new PEP. Up to now, this hasn't been a major concern since there have usually been other reasons to do full releases as well, e.g. source regressions. Given the still relatively high churn rate for changes going into 2.7 and the growing distance between the 2.7 and 3.x code bases (among other things, leading to more frequent inadvertent backporting errors), we'll probably need to keep making relatively frequent 2.7 releases unless we can further slow down the 2.7 change rate. 
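As a side note to the audit question raised in this thread: since Python 2.7 the ssl module exposes the OpenSSL version it was linked against, so an installation can be checked directly rather than by mapping Python release numbers to OpenSSL releases (the outputs shown are illustrative, not guaranteed values):

```python
import ssl

# Version string of the OpenSSL library the interpreter was built against.
print(ssl.OPENSSL_VERSION)       # e.g. "OpenSSL 1.0.1g 7 Apr 2014"

# Tuple form, convenient for programmatic comparison.
print(ssl.OPENSSL_VERSION_INFO)  # e.g. (1, 0, 1, 7, 15)
```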
-- Ned Deily, nad at acm.org From rosuav at gmail.com Sat Jun 21 22:34:23 2014 From: rosuav at gmail.com (Chris Angelico) Date: Sun, 22 Jun 2014 06:34:23 +1000 Subject: [Python-Dev] Python 2.7 patch levels turning two digit In-Reply-To: <53A5B995.6040802@egenix.com> References: <53A55E05.5020906@egenix.com> <53A5B995.6040802@egenix.com> Message-ID: On Sun, Jun 22, 2014 at 2:57 AM, M.-A. Lemburg wrote: > On 21.06.2014 12:51, Nick Coghlan wrote: >> Such code has an easy fix available, though, as sys.version_info has >> existed since 2.0, and handles two digit micro releases just fine. The >> docs for sys.version also have this explicit disclaimer: "Do not >> extract version information out of it, rather, use version_info and >> the functions provided by the platform module." > > I don't think that's a good argument. Of course, there are > better ways to figure out the version number, but fact is, > existing code, even in the stdlib, does use and parse > the sys.version string version. > > During Python's lifetime, we've always avoided two digit > version numbers, so people have been relying on this, even > if it was never (AFAIK) documented anywhere. It's going to be a broken-code-breaking change that's introduced in a point release, but since PEP 404 implicitly says that there won't be a 2.10.0, there's no way around that. Although actually, a glance at the stdlib suggests that 2.10.0 (or 3.10.0) would break a lot more than 2.7.10 would break - there are places where sys.version[:3] is used (or equivalents like "... %.3s ..." % sys.version), or a whole-string comparison is done against a two-part version string (eg: sys.version >= "2.6"), and at least one place that checks sys.version[0] for the major version number, but I didn't find any that look at sys.version[:5] or equivalent. Everything that cares about the three-part version number seems to either look at sys.version.split()[0] or sys.version_info. Do you know where this problematic code is? 
I checked this in the 2.7.3 stdlib as packaged on my Debian Wheezy system, for what it's worth. ChrisA From phd at phdru.name Sat Jun 21 22:58:04 2014 From: phd at phdru.name (Oleg Broytman) Date: Sat, 21 Jun 2014 22:58:04 +0200 Subject: [Python-Dev] Python 2.7 patch levels turning two digit In-Reply-To: References: <53A55E05.5020906@egenix.com> <53A5B995.6040802@egenix.com> Message-ID: <20140621205804.GA12098@phdru.name> On Sun, Jun 22, 2014 at 06:34:23AM +1000, Chris Angelico wrote: > Do you know where this problematic code is? In many places: https://encrypted.google.com/search?q=%22sys.version[%3A3]%22 https://encrypted.google.com/search?q=%22sys.version[%3A5]%22 Oleg. -- Oleg Broytman http://phdru.name/ phd at phdru.name Programmers don't die, they just GOSUB without RETURN. From mal at egenix.com Sat Jun 21 23:37:21 2014 From: mal at egenix.com (M.-A. Lemburg) Date: Sat, 21 Jun 2014 23:37:21 +0200 Subject: [Python-Dev] Python 2.7 patch levels turning two digit In-Reply-To: References: <53A55E05.5020906@egenix.com> <53A5B995.6040802@egenix.com> Message-ID: <53A5FB11.5020302@egenix.com> On 21.06.2014 22:34, Chris Angelico wrote: > On Sun, Jun 22, 2014 at 2:57 AM, M.-A. Lemburg wrote: >> On 21.06.2014 12:51, Nick Coghlan wrote: >>> Such code has an easy fix available, though, as sys.version_info has >>> existed since 2.0, and handles two digit micro releases just fine. The >>> docs for sys.version also have this explicit disclaimer: "Do not >>> extract version information out of it, rather, use version_info and >>> the functions provided by the platform module." >> >> I don't think that's a good argument. Of course, there are >> better ways to figure out the version number, but fact is, >> existing code, even in the stdlib, does use and parse >> the sys.version string version. >> >> During Python's lifetime, we've always avoided two digit >> version numbers, so people have been relying on this, even >> if it was never (AFAIK) documented anywhere. 
> > It's going to be a broken-code-breaking change that's introduced in a > point release, but since PEP 404 implicitly says that there won't be a > 2.10.0, there's no way around that. Although actually, a glance at the > stdlib suggests that 2.10.0 (or 3.10.0) would break a lot more than > 2.7.10 would break - there are places where sys.version[:3] is used > (or equivalents like "... %.3s ..." % sys.version), or a whole-string > comparison is done against a two-part version string (eg: sys.version >> = "2.6"), and at least one place that checks sys.version[0] for the > major version number, but I didn't find any that look at > sys.version[:5] or equivalent. Everything that cares about the > three-part version number seems to either look at > sys.version.split()[0] or sys.version_info. Do you know where this > problematic code is? > > I checked this in the 2.7.3 stdlib as packaged on my Debian Wheezy > system, for what it's worth. There are no places in the stdlib that parse sys.version in a way that would break with 2.7.10, AFAIK. I was just referring to the statement that Nick quoted. sys.version *is* used for parsing the Python version or using parts of it to build e.g. filenames and that's really no surprise. That said, and I also included this in my answers to the questions that Nick removed in his reply, I don't think that a lot of code would be affected by this. I do believe that we can use this potential breakage as a chance for improvement. See the last question (listed here again)... 1. Is it a good strategy to ship two Python releases for every single OpenSSL security release or is there a better way to handle these 3rd party issues ? 2. Should we try to avoid two digit patch level release numbers by using some other mechanism such as e.g. a release date after 2.7.9 ? 3. Should we make use of the potential breakage with 2.7.10 to introduce a new Windows compiler version for Python 2.7 ? My answers to these are: 1.
We should use dynamic linking instead and not let OpenSSL bugs trigger Python releases; 2. It's not a big problem; 3. Yes, please, since it is difficult for people to develop and debug their extensions with a 2008 compiler, when the rest of the world has long moved on. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Jun 21 2014) >>> Python Projects, Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope/Plone.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ 2014-06-17: Released eGenix PyRun 2.0.0 ... http://egenix.com/go58 2014-06-09: Released eGenix pyOpenSSL 0.13.3 ... http://egenix.com/go57 2014-07-02: Python Meeting Duesseldorf ... 11 days to go eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From phil at riverbankcomputing.com Sat Jun 21 23:57:38 2014 From: phil at riverbankcomputing.com (Phil Thompson) Date: Sat, 21 Jun 2014 22:57:38 +0100 Subject: [Python-Dev] Python 2.7 patch levels turning two digit In-Reply-To: <53A5FB11.5020302@egenix.com> References: <53A55E05.5020906@egenix.com> <53A5B995.6040802@egenix.com> <53A5FB11.5020302@egenix.com> Message-ID: <880dda6fcfa7666894993fd515d889d3@www.riverbankcomputing.com> On 21/06/2014 10:37 pm, M.-A. Lemburg wrote: > That said, and I also included this in my answers to the questions > that Nick removed in his reply, I don't think that a lot of > code would be affected by this. I do believe that we can use > this potential breakage as a chance for improvement. See the last > question (listed here again)... > > 1.
Is it a good strategy to ship two Python releases for every > single OpenSSL security release or is there a better way to > handle these 3rd party issues ? Isn't this only a packaging issue? There is no change to the Python API or implementation, so there is no need to change the version number. So just make new Windows packages. The precedent is to add a dash and a package number. I can't remember what version this was applied to before - but I got a +1 from Guido for suggesting it :) Phil From ethan at stoneleaf.us Sat Jun 21 23:48:34 2014 From: ethan at stoneleaf.us (Ethan Furman) Date: Sat, 21 Jun 2014 14:48:34 -0700 Subject: [Python-Dev] Python 2.7 patch levels turning two digit In-Reply-To: <53A5FB11.5020302@egenix.com> References: <53A55E05.5020906@egenix.com> <53A5B995.6040802@egenix.com> <53A5FB11.5020302@egenix.com> Message-ID: <53A5FDB2.1080000@stoneleaf.us> On 06/21/2014 02:37 PM, M.-A. Lemburg wrote: > > My answers to these are: 1. We should use dynamic linking > instead and not let OpenSSL bugs trigger Python releases; 2. > It's not a big problem; 3. Yes, please, since it is difficult > for people to develop and debug their extensions with a > 2008 compiler, when the rest of the world has long moved on. +1 (assuming not incredibly difficult and those that can are willing ;) -- ~Ethan~ From Steve.Dower at microsoft.com Sun Jun 22 00:00:14 2014 From: Steve.Dower at microsoft.com (Steve Dower) Date: Sat, 21 Jun 2014 22:00:14 +0000 Subject: [Python-Dev] Python 2.7 patch levels turning two digit In-Reply-To: <53A5FB11.5020302@egenix.com> References: <53A55E05.5020906@egenix.com> <53A5B995.6040802@egenix.com> , <53A5FB11.5020302@egenix.com> Message-ID: <51068485f2924c599a6fe238ea81c8bd@BLUPR03MB389.namprd03.prod.outlook.com> We can always lie about the version in sys.version.
doesn't make it a great idea, but it works). Changing compiler without changing at least the install directory and preventing in place upgrades is a really bad idea, and with those mitigations is only pretty bad. I'm torn here, because I know the current situation hurts, but it'd probably only move to VC10 which will hurt just as much in a few years... there are better tooling solutions (yes, I'm working on some behind the scenes). A separate distro of _ssl and _hashlib wouldn't be too hard and has the same effect as a dynamically linked OpenSSL. Maybe we can make these PyPI updateable? Top-posted from my Windows Phone ________________________________ From: M.-A. Lemburg Sent: ?6/?21/?2014 14:38 To: Chris Angelico Cc: Python-Dev Subject: Re: [Python-Dev] Python 2.7 patch levels turning two digit On 21.06.2014 22:34, Chris Angelico wrote: > On Sun, Jun 22, 2014 at 2:57 AM, M.-A. Lemburg wrote: >> On 21.06.2014 12:51, Nick Coghlan wrote: >>> Such code has an easy fix available, though, as sys.version_info has >>> existed since 2.0, and handles two digit micro releases just fine. The >>> docs for sys.version also have this explicit disclaimer: "Do not >>> extract version information out of it, rather, use version_info and >>> the functions provided by the platform module." >> >> I don't think that's a good argument. Of course, there are >> better ways to figure out the version number, but fact is, >> existing code, even in the stdlib, does use and parse >> the sys.version string version. >> >> During Python's lifetime, we've always avoided two digit >> version numbers, so people have been relying on this, even >> if it was never (AFAIK) documented anywhere. > > It's going to be a broken-code-breaking change that's introduced in a > point release, but since PEP 404 implicitly says that there won't be a > 2.10.0, there's no way around that. 
Although actually, a glance at the > stdlib suggests that 2.10.0 (or 3.10.0) would break a lot more than > 2.7.10 would break - there are places where sys.version[:3] is used > (or equivalents like "... %.3s ..." % sys.version), or a whole-string > comparison is done against a two-part version string (eg: sys.version >> = "2.6"), and at least one place that checks sys.version[0] for the > major version number, but I didn't find any that look at > sys.version[:5] or equivalent. Everything that cares about the > three-part version number seems to either look at > sys.version.split()[0] or sys.version_info. Do you know where this > problematic code is? > > I checked this in the 2.7.3 stdlib as packaged on my Debian Wheezy > system, for what it's worth. There are no places in the stdlib that parse sys.version in a way that would break with 2.7.10, AFAIK. I was just referring to the statement that Nick quoted. sys.version *is* used for parsing the Python version or using parts of it to build e.g. filenames and that's really no surprise. That said, and I also included this in my answers to the questions that Nick removed in his reply, I don't think that a lot of code would be affected by this. I do believe that we can use this potential breakage as a chance for improvement. See the last question (listed here again)... 1. Is it a good strategy to ship two Python releases for every single OpenSSL security release or is there a better way to handle these 3rd party issues ? 2. Should we try to avoid two digit patch level release numbers by using some other mechanism such as e.g. a release date after 2.7.9 ? 3. Should we make use of the potential breakage with 2.7.10 to introduce a new Windows compiler version for Python 2.7 ? My answers to these are: 1. We should use dynamic linking instead and not let OpenSSL bugs trigger Python releases; 2. It's not a big problem; 3.
Yes, please, since it is difficult for people to develop and debug their extensions with a 2008 compiler, when the rest of the world has long moved on. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Jun 21 2014) >>> Python Projects, Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope/Plone.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ 2014-06-17: Released eGenix PyRun 2.0.0 ... http://egenix.com/go58 2014-06-09: Released eGenix pyOpenSSL 0.13.3 ... http://egenix.com/go57 2014-07-02: Python Meeting Duesseldorf ... 11 days to go eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ _______________________________________________ Python-Dev mailing list Python-Dev at python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/steve.dower%40microsoft.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From donald at stufft.io Sun Jun 22 00:18:05 2014 From: donald at stufft.io (Donald Stufft) Date: Sat, 21 Jun 2014 18:18:05 -0400 Subject: [Python-Dev] Python 2.7 patch levels turning two digit In-Reply-To: <51068485f2924c599a6fe238ea81c8bd@BLUPR03MB389.namprd03.prod.outlook.com> References: <53A55E05.5020906@egenix.com> <53A5B995.6040802@egenix.com> , <53A5FB11.5020302@egenix.com> <51068485f2924c599a6fe238ea81c8bd@BLUPR03MB389.namprd03.prod.outlook.com> Message-ID: On Jun 21, 2014, at 6:00 PM, Steve Dower wrote: > We can always lie about the version in sys.version. 
Existing code is unaffected and new code will have to use version_info (Windows developers will know that Windows pulls tricks like this every other version... doesn't make it a great idea, but it works). > > Changing compiler without changing at least the install directory and preventing in place upgrades is a really bad idea, and with those mitigations is only pretty bad. I'm torn here, because I know the current situation hurts, but it'd probably only move to VC10 which will hurt just as much in a few years... there are better tooling solutions (yes, I'm working on some behind the scenes). > > A separate distro of _ssl and _hashlib wouldn't be too hard and has the same effect as a dynamically linked OpenSSL. Maybe we can make these PyPI updateable? Stuff from PyPI installs later on in the sys.path than the stdlib. I wish it were different but it means without sys.path shenanigans you can't replace the stdlib with something from PyPI. > > Top-posted from my Windows Phone > From: M.-A. Lemburg > Sent: 6/21/2014 14:38 > To: Chris Angelico > Cc: Python-Dev > Subject: Re: [Python-Dev] Python 2.7 patch levels turning two digit > > On 21.06.2014 22:34, Chris Angelico wrote: > > On Sun, Jun 22, 2014 at 2:57 AM, M.-A. Lemburg wrote: > >> On 21.06.2014 12:51, Nick Coghlan wrote: > >>> Such code has an easy fix available, though, as sys.version_info has > >>> existed since 2.0, and handles two digit micro releases just fine. The > >>> docs for sys.version also have this explicit disclaimer: "Do not > >>> extract version information out of it, rather, use version_info and > >>> the functions provided by the platform module." > >> > >> I don't think that's a good argument. Of course, there are > >> better ways to figure out the version number, but fact is, > >> existing code, even in the stdlib, does use and parse > >> the sys.version string version.
> >> > >> During Python's lifetime, we've always avoided two digit > >> version numbers, so people have been relying on this, even > >> if it was never (AFAIK) documented anywhere. > > > > It's going to be a broken-code-breaking change that's introduced in a > > point release, but since PEP 404 implicitly says that there won't be a > > 2.10.0, there's no way around that. Although actually, a glance at the > > stdlib suggests that 2.10.0 (or 3.10.0) would break a lot more than > > 2.7.10 would break - there are places where sys.version[:3] is used > > (or equivalents like "... %.3s ..." % sys.version), or a whole-string > > comparison is done against a two-part version string (eg: sys.version > >> = "2.6"), and at least one place that checks sys.version[0] for the > > major version number, but I didn't find any that look at > > sys.version[:5] or equivalent. Everything that cares about the > > three-part version number seems to either look at > > sys.version.split()[0] or sys.version_info. Do you know where this > > problematic code is? > > > > I checked this in the 2.7.3 stdlib as packaged on my Debian Wheezy > > system, for what it's worth. > > There are no places in the stdlib that parse sys.version in a > way that would break with 2.7.10, AFAIK. I was just referring > to the statement that Nick quoted. sys.version *is* used for > parsing the Python version or using parts of it to build > e.g. filenames and that's really no surprise. > > That said, and I also included this in my answers to the questions > that Nick removed in his reply, I don't think that a lot of > code would be affected by this. I do believe that we can use > this potential breakage as a chance for improvement. See the last > question (listed here again)... > > 1. Is it a good strategy to ship two Python releases for every > single OpenSSL security release or is there a better way to > handle these 3rd party issues ? > > 2.
Should we try to avoid two digit patch level release numbers > by using some other mechanism such as e.g. a release date > after 2.7.9 ? > > 3. Should we make use of the potential breakage with 2.7.10 > to introduce a new Windows compiler version for Python 2.7 ? > > My answers to these are: 1. We should use dynamic linking > instead and not let OpenSSL bugs trigger Python releases; 2. > It's not a big problem; 3. Yes, please, since it is difficult > for people to develop and debug their extensions with a > 2008 compiler, when the rest of the world has long moved on. > > -- > Marc-Andre Lemburg > eGenix.com > > Professional Python Services directly from the Source (#1, Jun 21 2014) > >>> Python Projects, Consulting and Support ... http://www.egenix.com/ > >>> mxODBC.Zope/Plone.Database.Adapter ... http://zope.egenix.com/ > >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ > ________________________________________________________________________ > 2014-06-17: Released eGenix PyRun 2.0.0 ... http://egenix.com/go58 > 2014-06-09: Released eGenix pyOpenSSL 0.13.3 ... http://egenix.com/go57 > 2014-07-02: Python Meeting Duesseldorf ... 11 days to go > > eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 > D-40764 Langenfeld, Germany. CEO Dipl.-Math. 
Marc-Andre Lemburg > Registered at Amtsgericht Duesseldorf: HRB 46611 > http://www.egenix.com/company/contact/ > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/steve.dower%40microsoft.com > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/donald%40stufft.io ----------------- Donald Stufft PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 801 bytes Desc: Message signed with OpenPGP using GPGMail URL: From rosuav at gmail.com Sun Jun 22 01:10:41 2014 From: rosuav at gmail.com (Chris Angelico) Date: Sun, 22 Jun 2014 09:10:41 +1000 Subject: [Python-Dev] Python 2.7 patch levels turning two digit In-Reply-To: <51068485f2924c599a6fe238ea81c8bd@BLUPR03MB389.namprd03.prod.outlook.com> References: <53A55E05.5020906@egenix.com> <53A5B995.6040802@egenix.com> <53A5FB11.5020302@egenix.com> <51068485f2924c599a6fe238ea81c8bd@BLUPR03MB389.namprd03.prod.outlook.com> Message-ID: On Sun, Jun 22, 2014 at 8:00 AM, Steve Dower wrote: > We can always lie about the version in sys.version. Existing code is > unaffected and new code will have to use version_info (Windows developers > will know that Windows pulls tricks like this every other version... doesn't > make it a great idea, but it works). I'd prefer a change of format to an outright lie. Something like "2.7._10" which will sort after "2.7.9". But ideally, nothing at all - just go smoothly to "2.7.10" and let broken code be broken. 
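A quick sanity check of the orderings at stake here (pure string and tuple comparisons; nothing depends on a 2.7.10 actually existing):

```python
# Lexicographic string comparison misorders a two-digit micro release,
# while tuple comparison (the shape of sys.version_info) orders correctly.
assert "2.7.10" < "2.7.9"      # '1' sorts before '9', so the order is wrong
assert (2, 7, 10) > (2, 7, 9)  # numeric comparison gets it right

# The suggested "_" spelling restores string ordering: '_' sorts after '9'.
assert "2.7._10" > "2.7.9"
```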
It'll think it's running on 2.7.1, and if anything needs to distinguish between 2.7.1 and 2.7.x, hopefully it's using version_info. ChrisA From rosuav at gmail.com Sun Jun 22 01:11:34 2014 From: rosuav at gmail.com (Chris Angelico) Date: Sun, 22 Jun 2014 09:11:34 +1000 Subject: [Python-Dev] Python 2.7 patch levels turning two digit In-Reply-To: <53A5FB11.5020302@egenix.com> References: <53A55E05.5020906@egenix.com> <53A5B995.6040802@egenix.com> <53A5FB11.5020302@egenix.com> Message-ID: On Sun, Jun 22, 2014 at 7:37 AM, M.-A. Lemburg wrote: > There are no places in the stdlib that parse sys.version in a > way that would break wtih 2.7.10, AFAIK. I was just referring > to the statement that Nick quoted. sys.version *is* used for > parsing the Python version or using parts of it to build > e.g. filenames and that's really no surprise. Right, good to know. I thought you were implying that stuff would break. Yes, stuff definitely does parse out the version number from sys.version, lots of that happens. ChrisA From breamoreboy at yahoo.co.uk Sun Jun 22 11:17:11 2014 From: breamoreboy at yahoo.co.uk (Mark Lawrence) Date: Sun, 22 Jun 2014 10:17:11 +0100 Subject: [Python-Dev] subprocess shell=True on Windows doesn't escape ^ character In-Reply-To: References: Message-ID: On 11/06/2014 21:26, anatoly techtonik wrote: > I am banned from tracker, so I post the bug here: > The OP's approach to the Python community is beautifully summarised here http://bugs.python.org/issue8940 -- My fellow Pythonistas, ask not what our language can do for you, ask what you can do for our language. Mark Lawrence --- This email is free from viruses and malware because avast! Antivirus protection is active. 
http://www.avast.com From martin at v.loewis.de Mon Jun 23 08:09:32 2014 From: martin at v.loewis.de ("Martin v. Löwis") Date: Mon, 23 Jun 2014 08:09:32 +0200 Subject: [Python-Dev] Python 2.7 patch levels turning two digit In-Reply-To: <53A55E05.5020906@egenix.com> References: <53A55E05.5020906@egenix.com> Message-ID: <53A7C49C.1090107@v.loewis.de> > * Is it a good strategy to ship two Python releases for every > single OpenSSL security release or is there a better way to > handle these 3rd party issues ? At least for Windows, a new release certainly needs to be made. It could be possible to produce MSI patch files, but this would still be a new release. > I think we should link to the OpenSSL libs dynamically rather > than statically in Python 2.7 for Windows so that it's possible > to provide drop-in updates for such issues. It is possible to provide drop-in updates regardless of whether the OpenSSL libs are dynamically linked, as the _ssl module itself is a dynamic lib. > * Should we try to avoid two digit patch level release numbers > by using some other mechanism such as e.g. a release date > after 2.7.9 ? If it was for me, then yes, certainly: the development of 2.7 should just stop :-) > * Should we make use of the potential breakage with 2.7.10 > to introduce a new Windows compiler version for Python 2.7 ? Assuming it is a good idea to continue producing Windows binaries for 2.7, I think it would be a bad idea to switch compilers. It will cause severe breakage of 2.7 installations, much more problematic than switching to two-digit version numbers.
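As an aside on the drop-in-update point: whichever way the libraries end up linked, the OpenSSL a given interpreter actually uses can be inspected at runtime — the ssl module has exposed this since Python 2.7. A minimal sketch (the values in the comments are examples only; any real build will differ):

```python
import ssl

# Version of the OpenSSL library the _ssl module was compiled against
# and loaded at runtime; handy for verifying that a drop-in update of
# the SSL libraries actually took effect.
print(ssl.OPENSSL_VERSION)       # e.g. 'OpenSSL 1.0.1h 5 Jun 2014'
print(ssl.OPENSSL_VERSION_INFO)  # e.g. (1, 0, 1, 8, 15)
```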
Regards, Martin From francismb at email.de Mon Jun 23 17:52:33 2014 From: francismb at email.de (francis) Date: Mon, 23 Jun 2014 17:52:33 +0200 Subject: [Python-Dev] Tracker Stats In-Reply-To: References: Message-ID: <53A84D41.6070508@email.de> > Hi, > I added a new "stats" page to the bug tracker: > http://bugs.python.org/issue?@template=stats Thanks Ezio, Two questions: how hard would it be to add (or enhance) a chart with the "open issues type enhancement" and "open issues type bug" info? In the summaries there is a link to "Issues with patch", means that the ones not listed there are in "needs patch" or "new" status? Regards, francis From donald at stufft.io Mon Jun 23 18:09:28 2014 From: donald at stufft.io (Donald Stufft) Date: Mon, 23 Jun 2014 12:09:28 -0400 Subject: [Python-Dev] Python 2.7 patch levels turning two digit In-Reply-To: <53A7C49C.1090107@v.loewis.de> References: <53A55E05.5020906@egenix.com> <53A7C49C.1090107@v.loewis.de> Message-ID: On Jun 23, 2014, at 2:09 AM, Martin v. Löwis wrote: >> >> * Should we make use of the potential breakage with 2.7.10 >> to introduce a new Windows compiler version for Python 2.7 ? > > Assuming it is a good idea to continue producing Windows binaries > for 2.7, I think it would be a bad idea to switch compilers. It will > cause severe breakage of 2.7 installations, much more problematic > than switching to two-digit version numbers. I agree with this, we've just finally started getting things to the point where it makes a lot of sense for binary distributions for Windows. Breaking all of them on 2.7 would be very bad. ----------------- Donald Stufft PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 801 bytes Desc: Message signed with OpenPGP using GPGMail URL: From mal at egenix.com Mon Jun 23 21:27:47 2014 From: mal at egenix.com (M.-A.
Lemburg) Date: Mon, 23 Jun 2014 21:27:47 +0200 Subject: [Python-Dev] Python 2.7 patch levels turning two digit In-Reply-To: References: <53A55E05.5020906@egenix.com> <53A7C49C.1090107@v.loewis.de> Message-ID: <53A87FB3.2000100@egenix.com> On 23.06.2014 18:09, Donald Stufft wrote: > > On Jun 23, 2014, at 2:09 AM, Martin v. L?wis wrote: > >>> >>> * Should we make use of the potential breakage with 2.7.10 >>> to introduce a new Windows compiler version for Python 2.7 ? >> >> Assuming it is a good idea to continue producing Windows binaries >> for 2.7, I think it would be a bad idea to switch compilers. It will >> cause severe breakage of 2.7 installations, much more problematic >> than switching to two-digit version numbers. > > I agree with this, we?ve just finally started getting things to the point where > it makes a lot of sense for binary distributions for Windows. Breaking all > of them on 2.7 would be very bad. Not sure what you mean. We've had binary wininst distributions for Windows for more than a decade, and egg and msi distributions for 8 years :-) But without access to the VS 2008 compiler that is needed to compile those extensions, it will become increasingly difficult for package authors to provide such binary packages, so we have to ask ourselves: What's worse: breaking old Windows binaries for Python 2.7 or not having updated and new Windows binaries for Python 2.7 at all in a few years ? Switching to a newer compiler will make things easier for everyone and we'd see more binary packages for Windows again. Given that Python 2.7 support was extended for another 5 years at the recent Python Language Summit to 2020, we have to face this breakage sooner or later anyway. Extended support for VS 2008 will end in 2018 (but then: Python developers usually don't have extended support contracts with MS). Service pack support has already ended in 2009. Depending on how you see it, using such an old compiler also poses security risks. 
The last security update for VS 2008 dates back to 2011 (http://support.microsoft.com/kb/2538243). -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Jun 23 2014) >>> Python Projects, Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope/Plone.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ 2014-06-17: Released eGenix PyRun 2.0.0 ... http://egenix.com/go58 2014-07-02: Python Meeting Duesseldorf ... 9 days to go 2014-07-21: EuroPython 2014, Berlin, Germany ... 28 days to go eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From nad at acm.org Mon Jun 23 21:53:14 2014 From: nad at acm.org (Ned Deily) Date: Mon, 23 Jun 2014 12:53:14 -0700 Subject: [Python-Dev] Python 2.7 patch levels turning two digit References: <53A55E05.5020906@egenix.com> <53A7C49C.1090107@v.loewis.de> <53A87FB3.2000100@egenix.com> Message-ID: In article <53A87FB3.2000100 at egenix.com>, "M.-A. Lemburg" wrote: [...] > But without access to the VS 2008 compiler that is needed to > compile those extensions, it will become increasingly difficult > for package authors to provide such binary packages, so we have to > ask ourselves: > > What's worse: breaking old Windows binaries for Python 2.7 > or not having updated and new Windows binaries for Python 2.7 > at all in a few years ? > > Switching to a newer compiler will make things easier for everyone > and we'd see more binary packages for Windows again. It does seem like a conundrum. 
As I have no deep Windows experience to be able to have an appreciation of all of the technical issues involved, I ask out of ignorance: would it be possible and desirable to provide a transition period of n 2.7.x maintenance releases (where n is between 1 and, say, 3) where we would provide 2 sets of Windows installers, one set (32- and 64-bit) with the older compiler and CRT and one with the newer, and campaign to get users and packagers who provide binary extensions to migrate? Would that mitigate the pain, assuming that Steve (or someone else) would be willing to build the additional installers for the transition period? I've done something similar on a smaller scale with the OS X 32-bit installer for 2.7.x but that impact is much less as the audience for that installer is much smaller. -- Ned Deily, nad at acm.org From antoine at python.org Mon Jun 23 22:04:29 2014 From: antoine at python.org (Antoine Pitrou) Date: Mon, 23 Jun 2014 16:04:29 -0400 Subject: [Python-Dev] Python 2.7 patch levels turning two digit In-Reply-To: <53A87FB3.2000100@egenix.com> References: <53A55E05.5020906@egenix.com> <53A7C49C.1090107@v.loewis.de> <53A87FB3.2000100@egenix.com> Message-ID: Le 23/06/2014 15:27, M.-A. Lemburg a ?crit : > > Not sure what you mean. We've had binary wininst distributions > for Windows for more than a decade, and egg and msi distributions > for 8 years :-) > > But without access to the VS 2008 compiler that is needed to > compile those extensions, It does seem to be available: http://www.microsoft.com/en-us/download/details.aspx?id=13276 What am I missing? Regards Antoine. From rdmurray at bitdance.com Mon Jun 23 22:12:24 2014 From: rdmurray at bitdance.com (R. 
David Murray) Date: Mon, 23 Jun 2014 16:12:24 -0400 Subject: [Python-Dev] Tracker Stats In-Reply-To: <53A84D41.6070508@email.de> References: <53A84D41.6070508@email.de> Message-ID: <20140623201225.0DA80250DE6@webabinitio.net> On Mon, 23 Jun 2014 17:52:33 +0200, francis wrote: > > > Hi, > > I added a new "stats" page to the bug tracker: > > http://bugs.python.org/issue?@template=stats > Thanks Ezio, > > Two questions: > how hard would it be to add (or enhance) a chart with the > "open issues type enhancement" and "open issues type bug" > info ? > > In the summaries there is a link to "Issues with patch", > means that the ones not listed there are in "needs patch" > or "new" status? The stats graphs are based on the data generated for the weekly issue report. I have a patched version of that report that adds the bug/enhancement info. I'll try to dig it up this week; someone ping me if I forget :) I think the patch will need to be updated based on Ezio's changes. --David From donald at stufft.io Mon Jun 23 22:20:30 2014 From: donald at stufft.io (Donald Stufft) Date: Mon, 23 Jun 2014 16:20:30 -0400 Subject: [Python-Dev] Python 2.7 patch levels turning two digit In-Reply-To: <53A87FB3.2000100@egenix.com> References: <53A55E05.5020906@egenix.com> <53A7C49C.1090107@v.loewis.de> <53A87FB3.2000100@egenix.com> Message-ID: <91E82F8F-339A-403C-8EEA-997E27FBEE59@stufft.io> On Jun 23, 2014, at 3:27 PM, M.-A. Lemburg wrote: > On 23.06.2014 18:09, Donald Stufft wrote: >> >> On Jun 23, 2014, at 2:09 AM, Martin v. Löwis wrote: >> >>>> >>>> * Should we make use of the potential breakage with 2.7.10 >>>> to introduce a new Windows compiler version for Python 2.7 ? >>> >>> Assuming it is a good idea to continue producing Windows binaries >>> for 2.7, I think it would be a bad idea to switch compilers. It will >>> cause severe breakage of 2.7 installations, much more problematic >>> than switching to two-digit version numbers.
>> >> I agree with this, we've just finally started getting things to the point where >> it makes a lot of sense for binary distributions for Windows. Breaking all >> of them on 2.7 would be very bad. Err, sorry that "We" was with my pip hat on. > > Not sure what you mean. We've had binary wininst distributions > for Windows for more than a decade, and egg and msi distributions > for 8 years :-) Nonetheless, changing the compiler will not only break pip, but every automated installer tool (easy_install, buildout) that I'm aware of. The blowback for binary installation is going to be huge I think. > > But without access to the VS 2008 compiler that is needed to > compile those extensions, it will become increasingly difficult > for package authors to provide such binary packages, so we have to > ask ourselves: > > What's worse: breaking old Windows binaries for Python 2.7 > or not having updated and new Windows binaries for Python 2.7 > at all in a few years ? > > Switching to a newer compiler will make things easier for everyone > and we'd see more binary packages for Windows again. At the risk of getting Guido to post his slide again, I still think the solution to the old compiler is to just roll a 2.8 with minimal changes. It could even be a good place to move the ssl backport changes to as well, since they were the riskier set of changes in PEP 466. But either way, if a compiler does change in a 2.7 release we'll need to update a lot of tooling to cope with that, so any plan to do that should include that and a timeline for adoption of that. > > Switching to a newer compiler will make things easier for everyone > and we'd see more binary packages for Windows again. > > Given that Python 2.7 support was extended for another 5 years at the > recent Python Language Summit to 2020, we have to face this > breakage sooner or later anyway. Extended support for VS 2008 > will end in 2018 (but then: Python developers usually don't have > extended support contracts with MS). Service pack support has already > ended in 2009.
> > Depending on how you see it, using such an old compiler also > poses security risks. The last security update for VS 2008 dates > back to 2011 (http://support.microsoft.com/kb/2538243). > > -- > Marc-Andre Lemburg > eGenix.com > > Professional Python Services directly from the Source (#1, Jun 23 2014) >>>> Python Projects, Consulting and Support ... http://www.egenix.com/ >>>> mxODBC.Zope/Plone.Database.Adapter ... http://zope.egenix.com/ >>>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ > ________________________________________________________________________ > 2014-06-17: Released eGenix PyRun 2.0.0 ... http://egenix.com/go58 > 2014-07-02: Python Meeting Duesseldorf ... 9 days to go > 2014-07-21: EuroPython 2014, Berlin, Germany ... 28 days to go > > eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 > D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg > Registered at Amtsgericht Duesseldorf: HRB 46611 > http://www.egenix.com/company/contact/ ----------------- Donald Stufft PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 801 bytes Desc: Message signed with OpenPGP using GPGMail URL: From barry at python.org Mon Jun 23 22:31:03 2014 From: barry at python.org (Barry Warsaw) Date: Mon, 23 Jun 2014 16:31:03 -0400 Subject: [Python-Dev] Python 2.7 patch levels turning two digit In-Reply-To: <91E82F8F-339A-403C-8EEA-997E27FBEE59@stufft.io> References: <53A55E05.5020906@egenix.com> <53A7C49C.1090107@v.loewis.de> <53A87FB3.2000100@egenix.com> <91E82F8F-339A-403C-8EEA-997E27FBEE59@stufft.io> Message-ID: <20140623163103.75073882@anarchist.wooz.org> On Jun 23, 2014, at 04:20 PM, Donald Stufft wrote: >At the risk of getting Guido to post his slide again, I still think the >solution to the old compiler is to just roll a 2.8 with minimal changes. No. 
It's not going to happen, for all the reasons discussed previously. Python 2.8 is not a solution to anything. If a new, incompatible compiler suite is required, why can't there just be multiple Windows downloads on https://www.python.org/download/releases/2.7.7/ ? Well, one reason is that you'd have to convince MvL or someone else to take over the work that would require, but that's gotta be *much* lighter weight than releasing a Python 2.8. -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: not available URL: From ethan at stoneleaf.us Mon Jun 23 22:12:15 2014 From: ethan at stoneleaf.us (Ethan Furman) Date: Mon, 23 Jun 2014 13:12:15 -0700 Subject: [Python-Dev] Python 2.7 patch levels turning two digit In-Reply-To: References: <53A55E05.5020906@egenix.com> <53A7C49C.1090107@v.loewis.de> <53A87FB3.2000100@egenix.com> Message-ID: <53A88A1F.308@stoneleaf.us> On 06/23/2014 01:04 PM, Antoine Pitrou wrote: > Le 23/06/2014 15:27, M.-A. Lemburg a écrit : >> >> Not sure what you mean. We've had binary wininst distributions >> for Windows for more than a decade, and egg and msi distributions >> for 8 years :-) >> >> But without access to the VS 2008 compiler that is needed to >> compile those extensions, > > It does seem to be available: > http://www.microsoft.com/en-us/download/details.aspx?id=13276 > > What am I missing? Is that VS 2008 /with/ the SP, or just the SP? -- ~Ethan~ From martin at v.loewis.de Mon Jun 23 22:40:30 2014 From: martin at v.loewis.de ("Martin v. Löwis") Date: Mon, 23 Jun 2014 22:40:30 +0200 Subject: [Python-Dev] Python 2.7 patch levels turning two digit In-Reply-To: References: <53A55E05.5020906@egenix.com> <53A7C49C.1090107@v.loewis.de> <53A87FB3.2000100@egenix.com> Message-ID: <53A890BE.3000102@v.loewis.de> Am 23.06.14 22:04, schrieb Antoine Pitrou: > Le 23/06/2014 15:27, M.-A.
Lemburg a écrit : >> >> Not sure what you mean. We've had binary wininst distributions >> for Windows for more than a decade, and egg and msi distributions >> for 8 years :-) >> >> But without access to the VS 2008 compiler that is needed to >> compile those extensions, > > It does seem to be available: > http://www.microsoft.com/en-us/download/details.aspx?id=13276 > > What am I missing? I believe (without testing) that this is just the service pack. Installing it requires a pre-existing installation of Visual Studio 2008, or else the installer will refuse to do anything. Note that it also won't install on top of Visual Studio Express: you need a licensed copy of Visual Studio to install the service pack. Visual Studio 2008 still *is* available to MSDN users. It's just not available through regular sales channels anymore. Regards, Martin From donald at stufft.io Mon Jun 23 22:43:31 2014 From: donald at stufft.io (Donald Stufft) Date: Mon, 23 Jun 2014 16:43:31 -0400 Subject: [Python-Dev] Python 2.7 patch levels turning two digit In-Reply-To: <20140623163103.75073882@anarchist.wooz.org> References: <53A55E05.5020906@egenix.com> <53A7C49C.1090107@v.loewis.de> <53A87FB3.2000100@egenix.com> <91E82F8F-339A-403C-8EEA-997E27FBEE59@stufft.io> <20140623163103.75073882@anarchist.wooz.org> Message-ID: On Jun 23, 2014, at 4:31 PM, Barry Warsaw wrote: > On Jun 23, 2014, at 04:20 PM, Donald Stufft wrote: > >> At the risk of getting Guido to post his slide again, I still think the >> solution to the old compiler is to just roll a 2.8 with minimal changes. > > No. It's not going to happen, for all the reasons discussed previously. > Python 2.8 is not a solution to anything. > > If a new, incompatible compiler suite is required, why can't there just be > multiple Windows downloads on https://www.python.org/download/releases/2.7.7/ > ?
Well, one reason is that you'd have to convince MvL or someone else to take > over the work that would require, but that's gotta be *much* lighter weight > than releasing a Python 2.8. As far as I am aware, a 2.7 with a different compiler, even if it's just an option, is an attractive nuisance. None of the tooling right now differentiates between binary compatibility by anything other than "CPython 2.7". The end result of having a 2.7 which is built with the old compiler, and a 2.7 built with the new compiler, is that you'll end up with binary distributions which work sometimes, if you're lucky and the creator of the binary distribution and you happened to pick the same "variant" of 2.7. The most likely result is that all the binary distributions will *mostly* still depend on using the old compiler because of the corpus of existing binary packages that depend on that. Which means that the 2.7 with new compiler will exist entirely to act as a footgun to anyone who picks it and also wants to use binary packages. > > -Barry > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/donald%40stufft.io ----------------- Donald Stufft PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA -------------- next part -------------- A non-text attachment was scrubbed...
Name: signature.asc Type: application/pgp-signature Size: 801 bytes Desc: Message signed with OpenPGP using GPGMail URL: From martin at v.loewis.de Mon Jun 23 22:42:40 2014 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon, 23 Jun 2014 22:42:40 +0200 Subject: [Python-Dev] Python 2.7 patch levels turning two digit In-Reply-To: <20140623163103.75073882@anarchist.wooz.org> References: <53A55E05.5020906@egenix.com> <53A7C49C.1090107@v.loewis.de> <53A87FB3.2000100@egenix.com> <91E82F8F-339A-403C-8EEA-997E27FBEE59@stufft.io> <20140623163103.75073882@anarchist.wooz.org> Message-ID: <53A89140.80609@v.loewis.de> Am 23.06.14 22:31, schrieb Barry Warsaw: > On Jun 23, 2014, at 04:20 PM, Donald Stufft wrote: > >> At the risk of getting Guido to post his slide again, I still think the >> solution to the old compiler is to just roll a 2.8 with minimal changes. > > No. It's not going to happen, for all the reasons discussed previously. > Python 2.8 is not a solution to anything. > > If a new, incompatible compiler suite is required, why can't there just be > multiple Windows downloads on https://www.python.org/download/releases/2.7.7/ > ? Well, on reason is that you'd have to convince MvL or someone else to take > over the work that would require, but that's gotta be *much* lighter weight > than releasing a Python 2.8. See my other message. It's actually heavier, since it requires changes to distutils, PyPI, pip, buildout etc., all which know how to deal with Python minor version numbers, but are unaware of the notion of competing ABIs on Windows (except that they know how to deal with 32-bit vs. 64-bit). 
Regards, Martin From martin at v.loewis.de Mon Jun 23 22:31:41 2014 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon, 23 Jun 2014 22:31:41 +0200 Subject: [Python-Dev] Python 2.7 patch levels turning two digit In-Reply-To: References: <53A55E05.5020906@egenix.com> <53A7C49C.1090107@v.loewis.de> <53A87FB3.2000100@egenix.com> Message-ID: <53A88EAD.6040600@v.loewis.de> Am 23.06.14 21:53, schrieb Ned Deily: > It does seem like a conundrum. As I have no deep Windows experience to > be able to have an appreciation of all of the technical issues involved, > I ask out of ignorance: would it be possible and desirable to provide a > transition period of n 2.7.x maintenance releases (where n is between 1 > and, say, 3) where we would provide 2 sets of Windows installers, one > set (32- and 64-bit) with the older compiler and CRT and one with the > newer, and campaign to get users and packagers who provide binary > extensions to migrate? The question is how exactly you implement the transition. I see two alternatives: 1. "Hijack" the 2.7 name space, in particular the name "python27.dll", along with registry keys, the .pyd extension, etc. Doing so would permit users to mix binaries from different compilers, and doing so would lead to crashes. Users would have to be careful to either install packages built for the old compiler or packages for the new compiler, and never mix. 2. "Sandbox" the 2.7 name space; come up with new names for everything. E.g. "python27vs13.dll", ".pydvs13" (or "_vs13.pyd"), "C:\Python27vs13", along with the corresponding changes to PyPI, pip, buildout, etc. which would need to learn to look for the right variant of a Python 2.7 package. This should work, but might take several years to implement: you need to find all the places in existing code that refer to the "old" names. 
If you do it right, you are done about the time when VS 13 becomes unavailable, so you'd then do another such transition to VS 2015, which promises forward-binary compatibility to future releases of Visual Studio. > Would that mitigate the pain, assuming that > Steve (or someone else) would be willing to build the additional > installers for the transition period? I've done something similar on a > smaller scale with the OS X 32-bit installer for 2.7.x but that impact > is much less as the audience for that installer is much smaller. Well, the question really is whether precompiled extension modules available from PyPI would work on both compilers. I understand that for OSX, you typically don't have precompiled binaries for the extension modules, so installation compiles the modules from scratch. This is easier, as it can use the ABI of the Python which will be installed to. If you go the "parallel ABIs" route, extension authors have to provide two parallel sets of packages as well. Given 32-bit and 64-bit packages, this will make actually two additional packages - just as if they had to support another Python version. Regards, Martin From Steve.Dower at microsoft.com Mon Jun 23 22:31:19 2014 From: Steve.Dower at microsoft.com (Steve Dower) Date: Mon, 23 Jun 2014 20:31:19 +0000 Subject: [Python-Dev] Python 2.7 patch levels turning two digit In-Reply-To: References: <53A55E05.5020906@egenix.com> <53A7C49C.1090107@v.loewis.de> <53A87FB3.2000100@egenix.com> Message-ID: <6622306b01cc4446a2df6ae90c3c087c@BLUPR03MB389.namprd03.prod.outlook.com> > Antoine Pitrou wrote: > Le 23/06/2014 15:27, M.-A. Lemburg a ?crit : >> >> Not sure what you mean. 
We've had binary wininst distributions for >> Windows for more than a decade, and egg and msi distributions for 8 >> years :-) >> >> But without access to the VS 2008 compiler that is needed to compile >> those extensions, > > It does seem to be available: > http://www.microsoft.com/en-us/download/details.aspx?id=13276 > > What am I missing? That's the service pack, which will only install if you already have VS 2008 installed. The only official source for VS 2008 these days is through an MSDN subscription, though there's a link floating around that will get to an ISO of VC 2008 Express (but it could disappear or move at any time, which would break the link for good). It's also possible to get VC9 standalone through the Windows SDK for Windows 7 and .NET 3.5, but this installer has bugs, and distutils does not detect VC installs properly (it only detects Visual Studio and then assumes VC). This is fixable with a few extra files and registry keys, but not simple. The best answer here is making VC9 available in a long-term, unsupported manner (support is the main MSFT concern - simply throwing products out there and forgetting about them is very counter-cultural). I'm working on getting people to recognize the importance of keeping the old compilers available, but it's an uphill battle. Obviously I'll post here as soon as I have something I can officially share. :) Cheers, Steve > Regards > > Antoine. From donald at stufft.io Mon Jun 23 22:55:55 2014 From: donald at stufft.io (Donald Stufft) Date: Mon, 23 Jun 2014 16:55:55 -0400 Subject: [Python-Dev] Python 2.7 patch levels turning two digit In-Reply-To: <53A88EAD.6040600@v.loewis.de> References: <53A55E05.5020906@egenix.com> <53A7C49C.1090107@v.loewis.de> <53A87FB3.2000100@egenix.com> <53A88EAD.6040600@v.loewis.de> Message-ID: <14DE41E2-5314-4E49-BE93-85EEEDDDEEAD@stufft.io> On Jun 23, 2014, at 4:31 PM, Martin v. 
Löwis wrote: >> >> Would that mitigate the pain, assuming that >> Steve (or someone else) would be willing to build the additional >> installers for the transition period? I've done something similar on a >> smaller scale with the OS X 32-bit installer for 2.7.x but that impact >> is much less as the audience for that installer is much smaller. > > Well, the question really is whether precompiled extension modules > available from PyPI would work on both compilers. I understand that > for OSX, you typically don't have precompiled binaries for the extension > modules, so installation compiles the modules from scratch. This is > easier, as it can use the ABI of the Python which will be installed > to. > > If you go the "parallel ABIs" route, extension authors have to provide > two parallel sets of packages as well. Given 32-bit and 64-bit packages, > this will make actually two additional packages - just as if they had > to support another Python version. As far as I know, stuff on OSX is generally built for "X compiler or later" so binary compatibility is kept as long as you're using an "or later" but I could be wrong about that. Using binary packages on OSX is a much less frequent thing I think though since getting a working compiler toolchain is easier there. ----------------- Donald Stufft PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA -------------- next part -------------- A non-text attachment was scrubbed...
Name: signature.asc Type: application/pgp-signature Size: 801 bytes Desc: Message signed with OpenPGP using GPGMail URL: From martin at v.loewis.de Mon Jun 23 22:49:32 2014 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon, 23 Jun 2014 22:49:32 +0200 Subject: [Python-Dev] Python 2.7 patch levels turning two digit In-Reply-To: <20140623163103.75073882@anarchist.wooz.org> References: <53A55E05.5020906@egenix.com> <53A7C49C.1090107@v.loewis.de> <53A87FB3.2000100@egenix.com> <91E82F8F-339A-403C-8EEA-997E27FBEE59@stufft.io> <20140623163103.75073882@anarchist.wooz.org> Message-ID: <53A892DC.7080106@v.loewis.de> Am 23.06.14 22:31, schrieb Barry Warsaw: > Well, on reason is that you'd have to convince MvL or someone else to take > over the work that would require, but that's gotta be *much* lighter weight > than releasing a Python 2.8. Just to point this out in a separate message: it will have to be somebody else. I stepped down as the Windows release maintainer for 2.7 when I learned about the extended life of 2.7, much because I feared that exactly the thing would happen that we see happen now - and I didn't want to be the one who would have to deal with it. It is a mess, and it will get bigger the more time passes. Playing-the-role-of-Cassandra-ly y'rs, Martin From mal at egenix.com Mon Jun 23 23:07:02 2014 From: mal at egenix.com (M.-A. Lemburg) Date: Mon, 23 Jun 2014 23:07:02 +0200 Subject: [Python-Dev] Python 2.7 patch levels turning two digit In-Reply-To: <91E82F8F-339A-403C-8EEA-997E27FBEE59@stufft.io> References: <53A55E05.5020906@egenix.com> <53A7C49C.1090107@v.loewis.de> <53A87FB3.2000100@egenix.com> <91E82F8F-339A-403C-8EEA-997E27FBEE59@stufft.io> Message-ID: <53A896F6.2030401@egenix.com> On 23.06.2014 22:20, Donald Stufft wrote: > > On Jun 23, 2014, at 3:27 PM, M.-A. Lemburg wrote: > >> On 23.06.2014 18:09, Donald Stufft wrote: >>> >>> On Jun 23, 2014, at 2:09 AM, Martin v. 
Löwis wrote: >>> >>>>> >>>>>> * Should we make use of the potential breakage with 2.7.10 >>>>>> to introduce a new Windows compiler version for Python 2.7 ? >>>>> >>>>> Assuming it is a good idea to continue producing Windows binaries >>>>> for 2.7, I think it would be a bad idea to switch compilers. It will >>>>> cause severe breakage of 2.7 installations, much more problematic >>>>> than switching to two-digit version numbers. >>>> >>>> I agree with this, we've just finally started getting things to the point where >>>> it makes a lot of sense for binary distributions for Windows. Breaking all >>>> of them on 2.7 would be very bad. >> >> Err, sorry that "We" was with my pip hat on. >> >>> >>> Not sure what you mean. We've had binary wininst distributions >>> for Windows for more than a decade, and egg and msi distributions >>> for 8 years :-) >> >> Nonetheless, changing the compiler will not only break pip, but every >> automated installer tool (easy_install, buildout) that I'm aware of. The >> blow back for binary installation is going to be huge I think. >> >>> But without access to the VS 2008 compiler that is needed to >>> compile those extensions, it will become increasingly difficult >>> for package authors to provide such binary packages, so we have to >>> ask ourselves: >>> >>> What's worse: breaking old Windows binaries for Python 2.7 >>> or not having updated and new Windows binaries for Python 2.7 >>> at all in a few years ? >> >> At the risk of getting Guido to post his slide again, I still think the >> solution to the old compiler is to just roll a 2.8 with minimal changes. >> It could even be a good place to move to the ssl backport changes >> too since they were the riskier set of changes in PEP466. >> >> But either way, if a compiler does change in a 2.7 release we'll need >> to update a lot of tooling to cope with that, so any plan to do that should >> include that and a timeline for adoption of that.
Sure, and we'd need to hash out possible solutions to minimize breakage, but first we'll have to see whether we want to consider this step or not. BTW: It's strange that I'm arguing for breaking things. I'm usually on the other side of such arguments :-) -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try our new mxODBC.Connect Python Database Interface for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From nad at acm.org Mon Jun 23 23:14:41 2014 From: nad at acm.org (Ned Deily) Date: Mon, 23 Jun 2014 14:14:41 -0700 Subject: [Python-Dev] Python 2.7 patch levels turning two digit References: <53A55E05.5020906@egenix.com> <53A7C49C.1090107@v.loewis.de> <53A87FB3.2000100@egenix.com> <53A88EAD.6040600@v.loewis.de> <14DE41E2-5314-4E49-BE93-85EEEDDDEEAD@stufft.io> Message-ID: In article <14DE41E2-5314-4E49-BE93-85EEEDDDEEAD at stufft.io>, Donald Stufft wrote: > On Jun 23, 2014, at 4:31 PM, Martin v. Lowis wrote: > > >> > >> Would that mitigate the pain, assuming that > >> Steve (or someone else) would be willing to build the additional > >> installers for the transition period? I've done something similar on a > >> smaller scale with the OS X 32-bit installer for 2.7.x but that impact > >> is much less as the audience for that installer is much smaller. > > > > Well, the question really is whether precompiled extension modules > > available from PyPI would work on both compilers. 
I understand that > > for OSX, you typically don't have precompiled binaries for the extension > > modules, so installation compiles the modules from scratch. This is > > easier, as it can use the ABI of the Python which will be installed > > to. > > > > If you go the "parallel ABIs" route, extension authors have to provide > > two parallel sets of packages as well. Given 32-bit and 64-bit packages, > > this will make actually two additional packages - just as if they had > > to support another Python version. > > As far as I know, stuff on OSX is generally built for "X compiler or later" > so binary compatibility is kept as long as you're using an "or later" but > I could be wrong about that. Using binary packages on OSX is a much > less frequent thing I think though since getting a working compiler toolchain > is easier there. Both points are generally true on OS X so, yes, binary extensions are much less of an issue there. Where binary distributions on OS X are most used, I think, is when there are dependencies on third-party non-Python libraries that are not shipped by Apple with OS X. But, yes, if we were to go down the route of two sets of installers, that could mean two sets of third-party packages. I suppose there could potentially be some pip / wheel / possibly Distutils help by conditioning the platform name or other component used to generate the egg / wheel / and/or bdist file names on the CRT version (or compiler version), much as what we do today with OS X deployment target. Again, I'm speculating in ignorance here. If that were feasible, things built with the old toolchain could have unchanged names. And, clearly, we would want to keep that "n" number of releases with two sets of installers to be as small as possible, like 1. 
While there would be a certain amount of unavoidable disruption for some folks, others *might* welcome the opportunity to no longer have to keep around old versions of the tool chain, particularly if they now could use the same tool chain to produce binaries for both Py2 and Py3. -- Ned Deily, nad at acm.org From donald at stufft.io Mon Jun 23 23:15:05 2014 From: donald at stufft.io (Donald Stufft) Date: Mon, 23 Jun 2014 17:15:05 -0400 Subject: [Python-Dev] Python 2.7 patch levels turning two digit In-Reply-To: <53A896F6.2030401@egenix.com> References: <53A55E05.5020906@egenix.com> <53A7C49C.1090107@v.loewis.de> <53A87FB3.2000100@egenix.com> <91E82F8F-339A-403C-8EEA-997E27FBEE59@stufft.io> <53A896F6.2030401@egenix.com> Message-ID: <685B2505-BABC-41DE-9A2E-BE64D2CE6AF8@stufft.io> On Jun 23, 2014, at 5:07 PM, M.-A. Lemburg wrote: > On 23.06.2014 22:20, Donald Stufft wrote: >> >> On Jun 23, 2014, at 3:27 PM, M.-A. Lemburg wrote: >> >>> On 23.06.2014 18:09, Donald Stufft wrote: >>>> >>>> On Jun 23, 2014, at 2:09 AM, Martin v. Löwis wrote: >>>> >>>>>> >>>>>> * Should we make use of the potential breakage with 2.7.10 >>>>>> to introduce a new Windows compiler version for Python 2.7 ? >>>>> >>>>> Assuming it is a good idea to continue producing Windows binaries >>>>> for 2.7, I think it would be a bad idea to switch compilers. It will >>>>> cause severe breakage of 2.7 installations, much more problematic >>>>> than switching to two-digit version numbers. >>>> >>>> I agree with this, we've just finally started getting things to the point where >>>> it makes a lot of sense for binary distributions for Windows. Breaking all >>>> of them on 2.7 would be very bad. >> >> Err, sorry that "We" was with my pip hat on. >> >>> >>> Not sure what you mean.
We've had binary wininst distributions >>> for Windows for more than a decade, and egg and msi distributions >>> for 8 years :-) >> >> Nonetheless, changing the compiler will not only break pip, but every >> automated installer tool (easy_install, buildout) that I'm aware of. The >> blow back for binary installation is going to be huge I think. >> >>> But without access to the VS 2008 compiler that is needed to >>> compile those extensions, it will become increasingly difficult >>> for package authors to provide such binary packages, so we have to >>> ask ourselves: >>> >>> What's worse: breaking old Windows binaries for Python 2.7 >>> or not having updated and new Windows binaries for Python 2.7 >>> at all in a few years ? >> >> At the risk of getting Guido to post his slide again, I still think the >> solution to the old compiler is to just roll a 2.8 with minimal changes. >> It could even be a good place to move to the ssl backport changes >> too since they were the riskier set of changes in PEP466. >> >> But either way, if a compiler does change in a 2.7 release we'll need >> to update a lot of tooling to cope with that, so any plan to do that should >> include that and a timeline for adoption of that. > > Sure, and we'd need to hash out possible solutions to minimize > breakage, but first we'll have to see whether we want to consider > this step or not. > > > BTW: It's strange that I'm arguing for breaking things. I'm usually > on the other side of such arguments :-) Ok. I'm just making sure that any proposal to do this includes how it plans to work around/minimize that. I agree with Martin (I think) that trying to fix the entire ecosystem to cope with that change is going to be far more work than folks realize and that it needs to be an explicit part of the discussion when deciding how to solve the problem.
Normally when I see someone suggest that switching compilers in 2.7.x is likely to be less work than releasing a 2.8, it normally appears to me they haven't looked at the impact on the packaging tooling. > > -- > Marc-Andre Lemburg > eGenix.com > > Professional Python Services directly from the Source >>>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ > ________________________________________________________________________ > > ::: Try our new mxODBC.Connect Python Database Interface for free ! :::: > > > eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 > D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg > Registered at Amtsgericht Duesseldorf: HRB 46611 > http://www.egenix.com/company/contact/ ----------------- Donald Stufft PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 801 bytes Desc: Message signed with OpenPGP using GPGMail URL: From ckaynor at zindagigames.com Mon Jun 23 23:14:44 2014 From: ckaynor at zindagigames.com (Chris Kaynor) Date: Mon, 23 Jun 2014 14:14:44 -0700 Subject: [Python-Dev] Python 2.7 patch levels turning two digit In-Reply-To: <53A896F6.2030401@egenix.com> References: <53A55E05.5020906@egenix.com> <53A7C49C.1090107@v.loewis.de> <53A87FB3.2000100@egenix.com> <91E82F8F-339A-403C-8EEA-997E27FBEE59@stufft.io> <53A896F6.2030401@egenix.com> Message-ID: Not being a Python developer, I normally just lurk on Py-Dev, but I figured I'd throw this out there for this thread: Recent versions of Maya embed Python 2.x, and the newer versions of Maya (I believe 2012 was the first version) embed a Python 2.7 compiled with VS 2010.
From my experience, most C extensions work across compiler versions, however when they don't, it's generally a fairly difficult to debug issue - at least unless you know what to look for in the call stacks, and have access to the symbol files. Chris On Mon, Jun 23, 2014 at 2:07 PM, M.-A. Lemburg wrote: > On 23.06.2014 22:20, Donald Stufft wrote: > > > > On Jun 23, 2014, at 3:27 PM, M.-A. Lemburg wrote: > > > >> On 23.06.2014 18:09, Donald Stufft wrote: > >>> > >>> On Jun 23, 2014, at 2:09 AM, Martin v. L?wis > wrote: > >>> > >>>>> > >>>>> * Should we make use of the potential breakage with 2.7.10 > >>>>> to introduce a new Windows compiler version for Python 2.7 ? > >>>> > >>>> Assuming it is a good idea to continue producing Windows binaries > >>>> for 2.7, I think it would be a bad idea to switch compilers. It will > >>>> cause severe breakage of 2.7 installations, much more problematic > >>>> than switching to two-digit version numbers. > >>> > >>> I agree with this, we?ve just finally started getting things to the > point where > >>> it makes a lot of sense for binary distributions for Windows. Breaking > all > >>> of them on 2.7 would be very bad. > > > > Err, sorry that ?We? was with my pip hat on. > > > >> > >> Not sure what you mean. We've had binary wininst distributions > >> for Windows for more than a decade, and egg and msi distributions > >> for 8 years :-) > > > > Nonetheless, changing the compiler will not only break pip, but every > > automated installer tool (easy_install, buildout) that i?m aware of. The > > blow back for binary installation is going to be huge I think. 
> > > >> But without access to the VS 2008 compiler that is needed to > >> compile those extensions, it will become increasingly difficult > >> for package authors to provide such binary packages, so we have to > >> ask ourselves: > >> > >> What's worse: breaking old Windows binaries for Python 2.7 > >> or not having updated and new Windows binaries for Python 2.7 > >> at all in a few years ? > > > > At the risk of getting Guido to post his slide again, I still think the > > solution to the old compiler is to just roll a 2.8 with minimal changes. > > It could even be a good place to move to the ssl backport changes > > too since they were the riskier set of changes in PEP466. > > > > But either way, if a compiler does change in a 2.7 release we?ll need > > to update a lot of tooling to cope with that, so any plan to do that > should > > include that and a timeline for adoption of that. > > Sure, and we'd need to hash out possible solutions to minimize > breakage, but first we'll have to see whether we want to consider > this step or not. > > > BTW: It's strange that I'm arguing for breaking things. I'm usually > on the other side of such arguments :-) > > -- > Marc-Andre Lemburg > eGenix.com > > Professional Python Services directly from the Source > >>> Python/Zope Consulting and Support ... http://www.egenix.com/ > >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ > >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ > ________________________________________________________________________ > > ::: Try our new mxODBC.Connect Python Database Interface for free ! :::: > > > eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 > D-40764 Langenfeld, Germany. CEO Dipl.-Math. 
Marc-Andre Lemburg > Registered at Amtsgericht Duesseldorf: HRB 46611 > http://www.egenix.com/company/contact/ > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/ckaynor%40zindagigames.com > -------------- next part -------------- An HTML attachment was scrubbed... URL: From barry at python.org Mon Jun 23 23:22:27 2014 From: barry at python.org (Barry Warsaw) Date: Mon, 23 Jun 2014 17:22:27 -0400 Subject: [Python-Dev] Python 2.7 patch levels turning two digit In-Reply-To: <685B2505-BABC-41DE-9A2E-BE64D2CE6AF8@stufft.io> References: <53A55E05.5020906@egenix.com> <53A7C49C.1090107@v.loewis.de> <53A87FB3.2000100@egenix.com> <91E82F8F-339A-403C-8EEA-997E27FBEE59@stufft.io> <53A896F6.2030401@egenix.com> <685B2505-BABC-41DE-9A2E-BE64D2CE6AF8@stufft.io> Message-ID: <20140623172227.3c58884a@anarchist.wooz.org> On Jun 23, 2014, at 05:15 PM, Donald Stufft wrote: >Normally when I see someone suggest that switching compilers >in 2.7.x is likely to be less work than releasing a 2.8, it normally >appears to me they haven't looked at the impact on the packaging >tooling. Just to be clear, releasing a Python 2.8 has enormous impact outside of just the amount of work to do it. It's an exceedingly bad idea. -Barry -------------- next part -------------- A non-text attachment was scrubbed...
Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: not available URL: From donald at stufft.io Mon Jun 23 23:28:23 2014 From: donald at stufft.io (Donald Stufft) Date: Mon, 23 Jun 2014 17:28:23 -0400 Subject: [Python-Dev] Python 2.7 patch levels turning two digit In-Reply-To: <20140623172227.3c58884a@anarchist.wooz.org> References: <53A55E05.5020906@egenix.com> <53A7C49C.1090107@v.loewis.de> <53A87FB3.2000100@egenix.com> <91E82F8F-339A-403C-8EEA-997E27FBEE59@stufft.io> <53A896F6.2030401@egenix.com> <685B2505-BABC-41DE-9A2E-BE64D2CE6AF8@stufft.io> <20140623172227.3c58884a@anarchist.wooz.org> Message-ID: <09813B4B-373A-4058-B5FE-887939E07B55@stufft.io> On Jun 23, 2014, at 5:22 PM, Barry Warsaw wrote: > On Jun 23, 2014, at 05:15 PM, Donald Stufft wrote: > >> Normally when I see someone suggest that switching compilers >> in 2.7.x is likely to be less work than releasing a 2.8, it normally >> appears to me they haven't looked at the impact on the packaging >> tooling. > > Just to be clear, releasing a Python 2.8 has enormous impact outside of just > the amount of work to do it. It's an exceedingly bad idea. Can you clarify? Also FWIW I'm not really married to the 2.8 thing; it's mostly that, on Windows, the X.Y release prior to the ABI thing in 3.x _was_ the ABI, so all the tooling builds on that. So you need to either 1) Stick with the old compiler 2) Release 2.8 3) Do all the work to fix all the tooling to cope with the fact that X.Y isn't the ABI on 2.x anymore I don't think a reasonable option is: 4) Just switch compilers and leave it on someone else's doorstep to fix the entire packaging tool chain to cope. ----------------- Donald Stufft PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA -------------- next part -------------- A non-text attachment was scrubbed...
Name: signature.asc Type: application/pgp-signature Size: 801 bytes Desc: Message signed with OpenPGP using GPGMail URL: From ethan at stoneleaf.us Mon Jun 23 23:19:13 2014 From: ethan at stoneleaf.us (Ethan Furman) Date: Mon, 23 Jun 2014 14:19:13 -0700 Subject: [Python-Dev] Python 2.7 patch levels turning two digit In-Reply-To: <53A5FDB2.1080000@stoneleaf.us> References: <53A55E05.5020906@egenix.com> <53A5B995.6040802@egenix.com> <53A5FB11.5020302@egenix.com> <53A5FDB2.1080000@stoneleaf.us> Message-ID: <53A899D1.6050306@stoneleaf.us> On 06/21/2014 02:48 PM, Ethan Furman wrote: > On 06/21/2014 02:37 PM, M.-A. Lemburg wrote: >> >> My answers to these are: 1. We should use dynamic linking >> instead and not let OpenSSL bugs trigger Python releases; 2. >> It's not a big problem; 3. Yes, please, since it is difficult >> for people to develop and debug their extensions with a >> 2008 compiler, when the rest of the world has long moved on. > > +1 (assuming not incredibly difficult and those that can are willing ;) Revising this to: +1, -0, -1 It seems to me the intention of supporting 2.7 for so long was not to give ourselves additional nightmares, but to provide a basic level of support for those who need a longer time before migrating. One of the reasons to migrate is to avoid future pain (pain is an excellent motivator -- it's why we don't go to the doctor when we're healthy, right? ;). If getting new or updated modules becomes more painful, then that's motivation to upgrade -- not motivation for us to make both our lives (with the extra work) and everyone else's lives (why isn't this module working? oh, wrong compiler) more difficult.
-- ~Ethan~ From barry at python.org Mon Jun 23 23:47:27 2014 From: barry at python.org (Barry Warsaw) Date: Mon, 23 Jun 2014 17:47:27 -0400 Subject: [Python-Dev] Python 2.7 patch levels turning two digit In-Reply-To: <09813B4B-373A-4058-B5FE-887939E07B55@stufft.io> References: <53A55E05.5020906@egenix.com> <53A7C49C.1090107@v.loewis.de> <53A87FB3.2000100@egenix.com> <91E82F8F-339A-403C-8EEA-997E27FBEE59@stufft.io> <53A896F6.2030401@egenix.com> <685B2505-BABC-41DE-9A2E-BE64D2CE6AF8@stufft.io> <20140623172227.3c58884a@anarchist.wooz.org> <09813B4B-373A-4058-B5FE-887939E07B55@stufft.io> Message-ID: <20140623174727.5ab8feb4@anarchist.wooz.org> On Jun 23, 2014, at 05:28 PM, Donald Stufft wrote: >Can you clarify? What support guarantees will we make about Python 2.8? Will it be supported as long as Python 2.7? Longer? Will we now have two long-term support versions, or change *years* of expectations that users should transition off of Python 2.7 onto Python 2.8? Will all the LTS policies for 2.7 (e.g. PEP 466) be retired for 2.7 and/or adopted completely into 2.8? What should Linux distros do? Should they support both 2.7 and 2.8, or begin the long and potentially arduous process of certifying and transitioning to 2.8? What about other operating systems and package managers, including commercial redistributors? Who is going to do the work to make sure patches are forward-ported from 2.7 to 2.8? Who is going to be the 2.8 release manager? Will they be strong enough to reject any and all new features that wouldn't have already made it into 2.7 (due to the already approved, narrow exemptions)? Or will we open the floodgates to Just One More Little New Feature To Make It Easier To Port to Python 3? How will we manage the PR surrounding our backtracking on Python 2.8? How will we manage expectations that it's only released to support a new Windows compiler? Should non-Windows users just ignore it (much like the Python 1.6 releases were mostly ignored)?
How do you know which tools, workflows, and processes will break with a Python 2.8 release? What assumptions about 2.7 being EOL for Python 2 are baked into the ecosystems outside of core Python? I could probably go on, but I'm exhausted. -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: not available URL: From rosuav at gmail.com Mon Jun 23 23:48:06 2014 From: rosuav at gmail.com (Chris Angelico) Date: Tue, 24 Jun 2014 07:48:06 +1000 Subject: [Python-Dev] Python 2.7 patch levels turning two digit In-Reply-To: <53A89140.80609@v.loewis.de> References: <53A55E05.5020906@egenix.com> <53A7C49C.1090107@v.loewis.de> <53A87FB3.2000100@egenix.com> <91E82F8F-339A-403C-8EEA-997E27FBEE59@stufft.io> <20140623163103.75073882@anarchist.wooz.org> <53A89140.80609@v.loewis.de> Message-ID: On Tue, Jun 24, 2014 at 6:42 AM, "Martin v. Löwis" wrote: > See my other message. It's actually heavier, since it requires changes > to distutils, PyPI, pip, buildout etc., all of which know how to deal with > Python minor version numbers, but are unaware of the notion of competing > ABIs on Windows (except that they know how to deal with 32-bit vs. 64-bit). Is it possible to hijack the "deal with 32-bit vs 64-bit"-ness of things to handle the different compilers? So, for instance, there might be a "32-bit-NewCompiler" and a "64-bit-NewCompiler", two new architectures, just as if someone came out with a 128-bit Windows and built Python 2.7 for it. Would packaging be able to handle that more easily than a compiler change within the same architecture?
ChrisA From donald at stufft.io Tue Jun 24 00:04:26 2014 From: donald at stufft.io (Donald Stufft) Date: Mon, 23 Jun 2014 18:04:26 -0400 Subject: [Python-Dev] Python 2.7 patch levels turning two digit In-Reply-To: References: <53A55E05.5020906@egenix.com> <53A7C49C.1090107@v.loewis.de> <53A87FB3.2000100@egenix.com> <91E82F8F-339A-403C-8EEA-997E27FBEE59@stufft.io> <20140623163103.75073882@anarchist.wooz.org> <53A89140.80609@v.loewis.de> Message-ID: <7D629B76-5800-40D6-A071-D26CB0F794E7@stufft.io> On Jun 23, 2014, at 5:48 PM, Chris Angelico wrote: > On Tue, Jun 24, 2014 at 6:42 AM, "Martin v. Löwis" wrote: >> See my other message. It's actually heavier, since it requires changes >> to distutils, PyPI, pip, buildout etc., all of which know how to deal with >> Python minor version numbers, but are unaware of the notion of competing >> ABIs on Windows (except that they know how to deal with 32-bit vs. 64-bit). > > Is it possible to hijack the "deal with 32-bit vs 64-bit"-ness of > things to handle the different compilers? So, for instance, there > might be a "32-bit-NewCompiler" and a "64-bit-NewCompiler", two new > architectures, just as if someone came out with a 128-bit Windows and > built Python 2.7 for it. Would packaging be able to handle that more > easily than a compiler change within the same architecture? > > ChrisA > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/donald%40stufft.io I'm not sure about this, FWIW. I'd have to look at the implementations of stuff to see how they'd cope with a new thing like that. ----------------- Donald Stufft PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA -------------- next part -------------- A non-text attachment was scrubbed...
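[Editor's illustration: Chris's suggestion amounts to encoding the compiler in the platform portion of the compatibility tags that wheel-based tooling already compares (PEP 425). A minimal sketch of that matching, where the compiler-qualified platform names like `win32_vc140` are hypothetical, invented for illustration only - real tools define no such tags:]

```python
# Sketch of PEP 425-style tag matching. The compiler-qualified platform
# tags (e.g. "win32_vc140") are hypothetical, standing in for Chris's
# "new architecture" idea; real tooling only knows plain "win32" etc.

def parse_wheel_tags(filename):
    """Split a simple wheel filename into (name, version, python, abi, platform)."""
    stem = filename[:-len(".whl")]
    name, version, python, abi, platform = stem.split("-")
    return name, version, python, abi, platform

def is_compatible(filename, supported):
    """A wheel is installable if its (python, abi, platform) triple is supported."""
    _, _, python, abi, platform = parse_wheel_tags(filename)
    return (python, abi, platform) in supported

# An installer built for a hypothetical VC14-compiled 2.7 would advertise a
# different platform string, so wheels built with the old compiler no longer match:
supported = {("cp27", "none", "win32_vc140")}
print(is_compatible("demo-1.0-cp27-none-win32_vc140.whl", supported))  # True
print(is_compatible("demo-1.0-cp27-none-win32.whl", supported))        # False
```

The point of the sketch is that the *mechanism* is just string comparison; the hard part Donald alludes to is teaching every producer and consumer of those strings about the new values.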
Name: signature.asc Type: application/pgp-signature Size: 801 bytes Desc: Message signed with OpenPGP using GPGMail URL: From amk at amk.ca Tue Jun 24 00:25:14 2014 From: amk at amk.ca (A.M. Kuchling) Date: Mon, 23 Jun 2014 18:25:14 -0400 Subject: [Python-Dev] Tracker Stats In-Reply-To: <20140623201225.0DA80250DE6@webabinitio.net> References: <53A84D41.6070508@email.de> <20140623201225.0DA80250DE6@webabinitio.net> Message-ID: <20140623222514.GA74324@datlandrewk.home> On Mon, Jun 23, 2014 at 04:12:24PM -0400, R. David Murray wrote: > The stats graphs are based on the data generated for the > weekly issue report. I have a patched version of that > report that adds the bug/enhancement info. After PyCon, I started working on a scraper that would produce a bunch of different lists and charts. My ideas were: * pie charts of issues by status and type. * list or histogram of open library issues by module, perhaps limited to the top N modules * list of N oldest issues with no subsequent activity (the unreviewed ones) * list of N people with the most open issues assigned to them The idea is to provide charts that help us direct effort to particular subsets of bugs. 
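[Editor's illustration: the last two lists in AMK's sketch are simple sorts once the issue data is in hand. A minimal version over made-up issue records - the real data would come from the roundup tracker, not from literals like these:]

```python
from datetime import date

# Made-up issue records standing in for real tracker data.
issues = [
    {"id": 101, "assignee": "alice", "last_activity": date(2010, 3, 1), "messages": 1},
    {"id": 102, "assignee": "bob",   "last_activity": date(2013, 7, 9), "messages": 4},
    {"id": 103, "assignee": "alice", "last_activity": date(2009, 1, 5), "messages": 1},
    {"id": 104, "assignee": None,    "last_activity": date(2014, 6, 1), "messages": 2},
]

def oldest_unreviewed(issues, n):
    """Issues with no follow-up (a single message), oldest activity first."""
    untouched = [i for i in issues if i["messages"] == 1]
    return sorted(untouched, key=lambda i: i["last_activity"])[:n]

def top_assignees(issues, n):
    """People with the most open issues assigned to them."""
    counts = {}
    for i in issues:
        if i["assignee"]:
            counts[i["assignee"]] = counts.get(i["assignee"], 0) + 1
    return sorted(counts.items(), key=lambda kv: kv[1], reverse=True)[:n]

print([i["id"] for i in oldest_unreviewed(issues, 2)])  # [103, 101]
print(top_assignees(issues, 1))                         # [('alice', 2)]
```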
--amk From ncoghlan at gmail.com Tue Jun 24 01:42:26 2014 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 24 Jun 2014 09:42:26 +1000 Subject: [Python-Dev] Python 2.7 patch levels turning two digit In-Reply-To: <09813B4B-373A-4058-B5FE-887939E07B55@stufft.io> References: <53A55E05.5020906@egenix.com> <53A7C49C.1090107@v.loewis.de> <53A87FB3.2000100@egenix.com> <91E82F8F-339A-403C-8EEA-997E27FBEE59@stufft.io> <53A896F6.2030401@egenix.com> <685B2505-BABC-41DE-9A2E-BE64D2CE6AF8@stufft.io> <20140623172227.3c58884a@anarchist.wooz.org> <09813B4B-373A-4058-B5FE-887939E07B55@stufft.io> Message-ID: On 24 Jun 2014 07:29, "Donald Stufft" wrote: > > > On Jun 23, 2014, at 5:22 PM, Barry Warsaw wrote: > > > On Jun 23, 2014, at 05:15 PM, Donald Stufft wrote: > > > >> Normally when I see someone suggest that switching compilers > >> in 2.7.x is likely to be less work than releasing a 2.8, it normally > >> appears to me they haven't looked at the impact on the packaging > >> tooling. > > > > Just to be clear, releasing a Python 2.8 has enormous impact outside of just > > the amount of work to do it. It's an exceedingly bad idea. > > Can you clarify? > > Also FWIW I'm not really married to the 2.8 thing; it's mostly that, on Windows, the X.Y release > prior to the ABI thing in 3.x _was_ the ABI, so all the tooling builds on that. So you need to > either > > 1) Stick with the old compiler This is what we're going with. Steve is working on making that more manageable from the Visual Studio side, and there are some folks in the numeric/scientific community looking at improving the usability of the MinGW toolchain for the purpose of building Python 2.7 C extensions. > 2) Release 2.8 Impractical for the various reasons Barry listed. > 3) Do all the work to fix all the tooling to cope with the fact that X.Y isn't the ABI on 2.x anymore Impractical for the various reasons you listed.
> I don't think a reasonable option is: > > 4) Just switch compilers and leave it on someone else's doorstep to fix the entire packaging > tool chain to cope. Agreed. We discussed this option in detail when the Stackless folks asked about it a while ago, and the conclusion was that the risk of obscure breakage was just too high. Cheers, Nick. > > ----------------- > Donald Stufft > PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ncoghlan%40gmail.com > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ezio.melotti at gmail.com Tue Jun 24 03:50:53 2014 From: ezio.melotti at gmail.com (Ezio Melotti) Date: Tue, 24 Jun 2014 04:50:53 +0300 Subject: [Python-Dev] Tracker Stats In-Reply-To: <53A84D41.6070508@email.de> References: <53A84D41.6070508@email.de> Message-ID: On Mon, Jun 23, 2014 at 6:52 PM, francis wrote: > >> Hi, >> I added a new "stats" page to the bug tracker: >> http://bugs.python.org/issue?@template=stats > > Thanks Ezio, > > Two questions: > how hard would it be to add (or enhance) a chart with the > "open issues type enhancement" and "open issues type bug" > info ? > Not particularly hard, but I won't have time to get back to this project for a while (contributions are welcome though!). > In the summaries there is a link to "Issues with patch"; > does that mean that the ones not listed there are in "needs patch" > or "new" status? That summary lists all the issues with the "patch" keyword, and the ones not listed simply don't have it. The keyword is added automatically whenever an attachment is added to the issue, so there might be false positives (e.g. if the attachment is a script to reproduce the issue rather than a patch, or if the available patches are outdated).
There might also be issues with patches that are not included in the summary (e.g. if someone accidentally removed the keyword), but that shouldn't be very common. From the first graph you can see that out of the 4500+ open issues, about 2000 have a patch. We need more reviewers and committers :) Best Regards, Ezio Melotti > > Regards, > francis > From ezio.melotti at gmail.com Tue Jun 24 04:33:52 2014 From: ezio.melotti at gmail.com (Ezio Melotti) Date: Tue, 24 Jun 2014 05:33:52 +0300 Subject: [Python-Dev] Tracker Stats In-Reply-To: <20140623222514.GA74324@datlandrewk.home> References: <53A84D41.6070508@email.de> <20140623201225.0DA80250DE6@webabinitio.net> <20140623222514.GA74324@datlandrewk.home> Message-ID: On Tue, Jun 24, 2014 at 1:25 AM, A.M. Kuchling wrote: > On Mon, Jun 23, 2014 at 04:12:24PM -0400, R. David Murray wrote: >> The stats graphs are based on the data generated for the >> weekly issue report. I have a patched version of that >> report that adds the bug/enhancement info. > > After PyCon, I started working on a scraper that would produce a bunch > of different lists and charts. My ideas were: > > * pie charts of issues by status and type. > > * list or histogram of open library issues by module, perhaps limited to the > top N modules > We don't have module-specific tags yet (see the core-workflow ML for discussions about that), but I have other scripts that analyze all the patches and divide them by module. I didn't have time to integrate this in the tracker though. > * list of N oldest issues with no subsequent activity (the unreviewed ones) > You can search for issues with only one message: http://bugs.python.org/issue?%40sort0=activity&%40sort1=&%40group0=&%40group1=&%40columns=title%2Cid%2Cactivity%2Cstatus&%40filter=status%2Cmessage_count&status=1&message_count=1&%40pagesize=50&%40startwith=0 > * list of N people with the most open issues assigned to them > And then poke them with a goad until they fix them?
:) > The idea is to provide charts that help us direct effort to particular > subsets of bugs. > If someone wants to experiment with and/or improve the tracker stats, this is how it works: 1) The roundup-summary script [0] analyzes the issues once a week and produces the weekly report and a static JSON file [1]; 2) The stats page [2] requests the JSON file and uses the data to generate the charts client-side. There are two ways to improve it: 1) the easy way is just to use the roundup-summary script to expose more of its data or to find new data and add it to the JSON file (and possibly to the summary too); 2) the hard way is to decouple the roundup-summary script and the stats page, and either make another weekly (or daily/hourly) script to generate the JSON file, or a template page that generates the data in real time. Once the data are in the JSON file, it is quite easy to use jqPlot [3] to make any kind of chart. Keep in mind that some things are trivial to get out of the DB (e.g. the number of issues for each status/type), but other things are a bit more complicated (e.g. things involving specific periods of time), and currently roundup-summary takes a few minutes to analyze all the issues. I also tried to include just a few useful charts on the stats page -- at first I had several more charts but then I removed them. Feel free to ping me on IRC (#python-dev at Freenode) if you have questions.
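[Editor's illustration: since the stats page is just jqPlot drawing a static JSON file, adding a chart is mostly a matter of reshaping that data into the [label, value] pairs a pie renderer consumes. The field names below are invented for illustration; the actual schema of issue.stats.json may differ:]

```python
import json

# Hypothetical payload shaped like a weekly stats dump; the real
# issue.stats.json schema is not shown in the thread, so these keys are assumed.
payload = json.loads('{"open_by_type": {"bug": 2600, "enhancement": 1500, "other": 450}}')

def chart_series(stats):
    """Turn a {label: count} mapping into [(label, value), ...] pairs,
    largest first, as a jqPlot-style pie series."""
    return sorted(stats.items(), key=lambda kv: kv[1], reverse=True)

series = chart_series(payload["open_by_type"])
total = sum(v for _, v in series)
print(series)  # [('bug', 2600), ('enhancement', 1500), ('other', 450)]
print(total)   # 4550
```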
Best Regards, Ezio Melotti [0]: http://hg.python.org/tracker/python-dev/file/default/scripts/roundup-summary [1]: http://bugs.python.org/@@file/issue.stats.json [2]: http://hg.python.org/tracker/python-dev/file/bbbe6c190a99/html/issue.stats.html#l20 [3]: http://www.jqplot.com/tests/ > --amk From storchaka at gmail.com Tue Jun 24 10:22:11 2014 From: storchaka at gmail.com (Serhiy Storchaka) Date: Tue, 24 Jun 2014 11:22:11 +0300 Subject: [Python-Dev] Fix Unicode-disabled build of Python 2.7 Message-ID: I submitted a number of patches which fixes currently broken Unicode-disabled build of Python 2.7 (built with --disable-unicode configure option). I suppose this was broken in 2.7 when C implementation of the io module was introduced. http://bugs.python.org/issue21833 -- main patch which fixes the io module and adds helpers for testing. http://bugs.python.org/issue21834 -- a lot of minor fixes for tests. Following issues fix different modules and related tests: http://bugs.python.org/issue21854 -- cookielib http://bugs.python.org/issue21838 -- ctypes http://bugs.python.org/issue21855 -- decimal http://bugs.python.org/issue21839 -- distutils http://bugs.python.org/issue21843 -- doctest http://bugs.python.org/issue21851 -- gettext http://bugs.python.org/issue21844 -- HTMLParser http://bugs.python.org/issue21850 -- httplib and SimpleHTTPServer http://bugs.python.org/issue21842 -- IDLE http://bugs.python.org/issue21853 -- inspect http://bugs.python.org/issue21848 -- logging http://bugs.python.org/issue21849 -- multiprocessing http://bugs.python.org/issue21852 -- optparse http://bugs.python.org/issue21840 -- os.path http://bugs.python.org/issue21845 -- plistlib http://bugs.python.org/issue21836 -- sqlite3 http://bugs.python.org/issue21837 -- tarfile http://bugs.python.org/issue21835 -- Tkinter http://bugs.python.org/issue21847 -- xmlrpc http://bugs.python.org/issue21841 -- xml.sax http://bugs.python.org/issue21846 -- zipfile Most fixes are trivial and are only several lines 
of a code. From victor.stinner at gmail.com Tue Jun 24 10:55:21 2014 From: victor.stinner at gmail.com (Victor Stinner) Date: Tue, 24 Jun 2014 10:55:21 +0200 Subject: [Python-Dev] Fix Unicode-disabled build of Python 2.7 In-Reply-To: References: Message-ID: Hi, I don't know anyone building Python without Unicode. I would prefer to modify configure to raise an error, and drop #ifdef in the code. (Stop supporting building Python 2 without Unicode.) Building Python 2 without Unicode support is not an innocent change. Python is moving strongly to Unicode: Python 3 uses Unicode by default. So to me it sounds really weird to work on building Python 2 without Unicode support. It means that you may have "Python 2" and "Python 2 without Unicode" which are not exactly the same language. IMO u"unicode" is part of the Python 2 language. --disable-unicode is an old option added while Python 1.5 was very slowly moving to Unicode. I have the same opinion on --without-thread option (we should stop supporting it, this option is useless). I worked in the embedded world, Python used for the UI of a TV set top box. Even if the hardware was slow and old, Python was compiled with threads and Unicode. Unicode was mandatory to handle correctly letters with diacritics, threads were used to handle network and D-Bus for examples. Victor 2014-06-24 10:22 GMT+02:00 Serhiy Storchaka : > I submitted a number of patches which fixes currently broken > Unicode-disabled build of Python 2.7 (built with --disable-unicode configure > option). I suppose this was broken in 2.7 when C implementation of the io > module was introduced. > > http://bugs.python.org/issue21833 -- main patch which fixes the io module > and adds helpers for testing. > > http://bugs.python.org/issue21834 -- a lot of minor fixes for tests. 
> > Following issues fix different modules and related tests: > > http://bugs.python.org/issue21854 -- cookielib > http://bugs.python.org/issue21838 -- ctypes > http://bugs.python.org/issue21855 -- decimal > http://bugs.python.org/issue21839 -- distutils > http://bugs.python.org/issue21843 -- doctest > http://bugs.python.org/issue21851 -- gettext > http://bugs.python.org/issue21844 -- HTMLParser > http://bugs.python.org/issue21850 -- httplib and SimpleHTTPServer > http://bugs.python.org/issue21842 -- IDLE > http://bugs.python.org/issue21853 -- inspect > http://bugs.python.org/issue21848 -- logging > http://bugs.python.org/issue21849 -- multiprocessing > http://bugs.python.org/issue21852 -- optparse > http://bugs.python.org/issue21840 -- os.path > http://bugs.python.org/issue21845 -- plistlib > http://bugs.python.org/issue21836 -- sqlite3 > http://bugs.python.org/issue21837 -- tarfile > http://bugs.python.org/issue21835 -- Tkinter > http://bugs.python.org/issue21847 -- xmlrpc > http://bugs.python.org/issue21841 -- xml.sax > http://bugs.python.org/issue21846 -- zipfile > > Most fixes are trivial and are only several lines of a code. > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/victor.stinner%40gmail.com From skip at pobox.com Tue Jun 24 13:04:41 2014 From: skip at pobox.com (Skip Montanaro) Date: Tue, 24 Jun 2014 06:04:41 -0500 Subject: [Python-Dev] Fix Unicode-disabled build of Python 2.7 In-Reply-To: References: Message-ID: I can't see any reason to make a backwards-incompatible change to Python 2 to only support Unicode. You're bound to break somebody's setup. Wouldn't it be better to fix bugs as Serhiy has done? 
Skip From antoine at python.org Tue Jun 24 13:47:37 2014 From: antoine at python.org (Antoine Pitrou) Date: Tue, 24 Jun 2014 07:47:37 -0400 Subject: [Python-Dev] Fix Unicode-disabled build of Python 2.7 In-Reply-To: References: Message-ID: On 24/06/2014 07:04, Skip Montanaro wrote: > I can't see any reason to make a backwards-incompatible change to > Python 2 to only support Unicode. You're bound to break somebody's > setup. Apparently, that setup would already have been broken for years. Regards Antoine. From victor.stinner at gmail.com Tue Jun 24 13:50:25 2014 From: victor.stinner at gmail.com (Victor Stinner) Date: Tue, 24 Jun 2014 13:50:25 +0200 Subject: [Python-Dev] Fix Unicode-disabled build of Python 2.7 In-Reply-To: References: Message-ID: 2014-06-24 13:04 GMT+02:00 Skip Montanaro : > I can't see any reason to make a backwards-incompatible change to > Python 2 to only support Unicode. You're bound to break somebody's > setup. Wouldn't it be better to fix bugs as Serhiy has done? According to the long list of issues, I don't think that it's possible to compile and use the Python stdlib when Python is compiled without Unicode support. So I'm not sure that we can say that it's a backward-incompatible change. Who is somebody? Who compiles Python without Unicode support? Which version of Python? With Python 2.6, ./configure --disable-unicode fails with: "checking what type to use for unicode... configure: error: invalid value for --enable-unicode. Use either ucs2 or ucs4 (lowercase)." So I'm not sure that anyone has used this option recently. The configure script was fixed 2 years ago in Python 2.7 (2 years after the release of Python 2.7.0): http://hg.python.org/cpython/rev/d7aff4423172 http://bugs.python.org/issue21833 "./configure --disable-unicode" works on Python 2.5.6: the unicode type doesn't exist, and u'abc' is a bytes string. It works with Python 2.7.7+ too.
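[Editor's illustration: code that must run both on a --disable-unicode build and on a normal one cannot reference the `unicode` builtin unconditionally. The usual guard looks like the following - an illustrative pattern, not taken from the patches under discussion; it happens to run on Python 3 as well, where the `unicode` builtin is likewise absent:]

```python
# On a --disable-unicode Python 2 (and on Python 3) the "unicode" builtin
# does not exist, so portable code guards on its availability.
try:
    unicode
    HAVE_UNICODE = True
except NameError:
    unicode = str          # fall back to the plain string type
    HAVE_UNICODE = False

def to_text(value):
    """Coerce a value to the best available text type on this build."""
    if isinstance(value, unicode):
        return value
    return unicode(value)

print(HAVE_UNICODE, to_text(42))
```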
Victor From storchaka at gmail.com Tue Jun 24 14:10:07 2014 From: storchaka at gmail.com (Serhiy Storchaka) Date: Tue, 24 Jun 2014 15:10:07 +0300 Subject: [Python-Dev] Fix Unicode-disabled build of Python 2.7 In-Reply-To: References: Message-ID: On 24.06.14 14:50, Victor Stinner wrote: > 2014-06-24 13:04 GMT+02:00 Skip Montanaro : >> I can't see any reason to make a backwards-incompatible change to >> Python 2 to only support Unicode. You're bound to break somebody's >> setup. Wouldn't it be better to fix bugs as Serhiy has done? > > According to the long list of issues, I don't think that it's possible > to compile and use the Python stdlib when Python is compiled without > Unicode support. So I'm not sure that we can say that it's a > backward-incompatible change. Python has about 300 modules; my patches fix about 30 of them (only 8 of which cause a compile error). And that's almost all of them. Left are only pickle, json, etree, email, and the unicode-specific modules (codecs, unicodedata and encodings). Besides pickle, I'm not sure that the others can be fixed. The fact that only a small fraction of modules needs fixes means that Python without Unicode support can be pretty usable. The main problem was with testing itself. The test suite depends on tempfile, which now uses io.open, which didn't work without Unicode support (at least since 2.7). From tjreedy at udel.edu Tue Jun 24 16:24:01 2014 From: tjreedy at udel.edu (Terry Reedy) Date: Tue, 24 Jun 2014 10:24:01 -0400 Subject: [Python-Dev] Fix Unicode-disabled build of Python 2.7 In-Reply-To: References: Message-ID: On 6/24/2014 4:22 AM, Serhiy Storchaka wrote: > I submitted a number of patches which fix the currently broken > Unicode-disabled build of Python 2.7 (built with the --disable-unicode > configure option). I suppose this was broken in 2.7 when the C > implementation of the io module was introduced. > > http://bugs.python.org/issue21833 -- main patch which fixes the io > module and adds helpers for testing.
> > http://bugs.python.org/issue21834 -- a lot of minor fixes for tests. > > Following issues fix different modules and related tests: This list and more to follow suggests that --disable-unicode was somewhat broken long before 2.7 and the introduction of _io. > http://bugs.python.org/issue21854 -- cookielib > http://bugs.python.org/issue21838 -- ctypes > http://bugs.python.org/issue21855 -- decimal > http://bugs.python.org/issue21839 -- distutils > http://bugs.python.org/issue21843 -- doctest > http://bugs.python.org/issue21851 -- gettext > http://bugs.python.org/issue21844 -- HTMLParser > http://bugs.python.org/issue21850 -- httplib and SimpleHTTPServer > http://bugs.python.org/issue21842 -- IDLE > http://bugs.python.org/issue21853 -- inspect > http://bugs.python.org/issue21848 -- logging > http://bugs.python.org/issue21849 -- multiprocessing > http://bugs.python.org/issue21852 -- optparse > http://bugs.python.org/issue21840 -- os.path > http://bugs.python.org/issue21845 -- plistlib > http://bugs.python.org/issue21836 -- sqlite3 > http://bugs.python.org/issue21837 -- tarfile > http://bugs.python.org/issue21835 -- Tkinter > http://bugs.python.org/issue21847 -- xmlrpc > http://bugs.python.org/issue21841 -- xml.sax > http://bugs.python.org/issue21846 -- zipfile > > Most fixes are trivial and are only several lines of a code. > -- Terry Jan Reedy From benjamin at python.org Tue Jun 24 18:06:10 2014 From: benjamin at python.org (Benjamin Peterson) Date: Tue, 24 Jun 2014 09:06:10 -0700 Subject: [Python-Dev] Fix Unicode-disabled build of Python 2.7 In-Reply-To: References: Message-ID: <1403625970.6550.133062453.693ECDEA@webmail.messagingengine.com> If Serhiy wants to spend his time supporting this arcane feature, he can do that. It doesn't really seem worth risking regressions to do this, though. On Tue, Jun 24, 2014, at 01:55, Victor Stinner wrote: > Hi, > > I don't know anyone building Python without Unicode. 
I would prefer to > modify configure to raise an error, and drop #ifdef in the code. (Stop > supporting building Python 2 without Unicode.) > > Building Python 2 without Unicode support is not an innocent change. > Python is moving strongly to Unicode: Python 3 uses Unicode by > default. So to me it sounds really weird to work on building Python 2 > without Unicode support. It means that you may have "Python 2" and > "Python 2 without Unicode" which are not exactly the same language. > IMO u"unicode" is part of the Python 2 language. > > --disable-unicode is an old option added while Python 1.5 was very > slowly moving to Unicode. > > I have the same opinion on --without-thread option (we should stop > supporting it, this option is useless). I worked in the embedded > world, Python used for the UI of a TV set top box. Even if the > hardware was slow and old, Python was compiled with threads and > Unicode. Unicode was mandatory to handle correctly letters with > diacritics, threads were used to handle network and D-Bus for > examples. > > Victor > > > 2014-06-24 10:22 GMT+02:00 Serhiy Storchaka : > > I submitted a number of patches which fixes currently broken > > Unicode-disabled build of Python 2.7 (built with --disable-unicode configure > > option). I suppose this was broken in 2.7 when C implementation of the io > > module was introduced. > > > > http://bugs.python.org/issue21833 -- main patch which fixes the io module > > and adds helpers for testing. > > > > http://bugs.python.org/issue21834 -- a lot of minor fixes for tests. 
> > > > Following issues fix different modules and related tests: > > > > http://bugs.python.org/issue21854 -- cookielib > > http://bugs.python.org/issue21838 -- ctypes > > http://bugs.python.org/issue21855 -- decimal > > http://bugs.python.org/issue21839 -- distutils > > http://bugs.python.org/issue21843 -- doctest > > http://bugs.python.org/issue21851 -- gettext > > http://bugs.python.org/issue21844 -- HTMLParser > > http://bugs.python.org/issue21850 -- httplib and SimpleHTTPServer > > http://bugs.python.org/issue21842 -- IDLE > > http://bugs.python.org/issue21853 -- inspect > > http://bugs.python.org/issue21848 -- logging > > http://bugs.python.org/issue21849 -- multiprocessing > > http://bugs.python.org/issue21852 -- optparse > > http://bugs.python.org/issue21840 -- os.path > > http://bugs.python.org/issue21845 -- plistlib > > http://bugs.python.org/issue21836 -- sqlite3 > > http://bugs.python.org/issue21837 -- tarfile > > http://bugs.python.org/issue21835 -- Tkinter > > http://bugs.python.org/issue21847 -- xmlrpc > > http://bugs.python.org/issue21841 -- xml.sax > > http://bugs.python.org/issue21846 -- zipfile > > > > Most fixes are trivial and are only several lines of a code. 
> > > > _______________________________________________ > > Python-Dev mailing list > > Python-Dev at python.org > > https://mail.python.org/mailman/listinfo/python-dev > > Unsubscribe: > > https://mail.python.org/mailman/options/python-dev/victor.stinner%40gmail.com > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/benjamin%40python.org From francismb at email.de Tue Jun 24 20:43:41 2014 From: francismb at email.de (francis) Date: Tue, 24 Jun 2014 20:43:41 +0200 Subject: [Python-Dev] Tracker Stats In-Reply-To: References: <53A84D41.6070508@email.de> Message-ID: <53A9C6DD.7070505@email.de> On 06/24/2014 03:50 AM, Ezio Melotti wrote: > From the first graph you can see that out of the 4500+ open issues, > about 2000 have a patch. One would like to start with the ones that are bugs ;-) and see some status line trying to drop to 0 (is that possible :-) ?) > We need more reviewers and committers :) more patch writers: yes, more patch reviewers: yes, more committers: ?? automate!! :-) Regards, francis From rdmurray at bitdance.com Tue Jun 24 20:58:09 2014 From: rdmurray at bitdance.com (R. David Murray) Date: Tue, 24 Jun 2014 14:58:09 -0400 Subject: [Python-Dev] Tracker Stats In-Reply-To: <53A9C6DD.7070505@email.de> References: <53A84D41.6070508@email.de> <53A9C6DD.7070505@email.de> Message-ID: <20140624185809.ED50B250E00@webabinitio.net> On Tue, 24 Jun 2014 20:43:41 +0200, francis wrote: > On 06/24/2014 03:50 AM, Ezio Melotti wrote: > > From the first graph you can see that out of the 4500+ open issues, > > about 2000 have a patch. > One would like to start with the ones that are bugs ;-) and see some > status line trying to drop to 0 (is that possible :-) ?)
> > > We need more reviewers and committers :) > more patch writers: yes, > more patch reviewers: yes, Anyone can review patches, in case that isn't clear. > more committers: ?? automate!! :-) That's a goal of the python-workflow interest group. Unfortunately between billable work and GSOC mentoring I haven't had time to do much there lately. Our first goal is to make the review step easier to manage (know which patches really need review, be able to list patches where community review is thought to be complete) by improving the tracker, then we'll look at creating the patch gating system Nick has talked about previously. Still needs a committer to approve the patch, but it should increase the throughput considerably. In the meantime, something that would help would be if people would do reviews and say on the issue "I think this is commit ready" and have the issue moved to 'commit review' stage. Do that a few times where people who are already triagers/committers agree with you, and you'll get triage privileges on the tracker. --David From nad at acm.org Tue Jun 24 21:54:29 2014 From: nad at acm.org (Ned Deily) Date: Tue, 24 Jun 2014 12:54:29 -0700 Subject: [Python-Dev] Fix Unicode-disabled build of Python 2.7 References: <1403625970.6550.133062453.693ECDEA@webmail.messagingengine.com> Message-ID: In article <1403625970.6550.133062453.693ECDEA at webmail.messagingengine.com>, Benjamin Peterson wrote: > If Serhiy wants to spend his time supporting this arcane feature, he can > do that. It doesn't really seem worth risking regressions to do this, > though. That's why I'm concerned about applying these 20+ patches that touch many parts of the code base. I don't have any objection to the "arcane feature" per se and I appreciate the obvious effort that Serhiy put into the patches but, at this stage of the life of Python 2, our overriding concern should be stability. That's really why most users of Python 2.7 continue to use it. 
As I see it, maintenance mode is a promise from us to our users that we will try our best, in general, to only make changes that fix serious problems, either due to bugs in Python itself or changes in the external world (new OS releases, etc). We don't automatically fix all bugs. Any time we make a change, we're making an engineering decision with cost-benefit tradeoffs. The more lines of code changed, the greater the risk that we introduce new bugs; inadvertently adding regressions has been an issue over a number of the 2.7.x releases, including the most recent one. The cost-benefit of this set of changes seems to me to be:

Costs:
- Code changes in many modules:
  - careful review -> additional work for multiple core developers
  - careful testing on all platforms, including this option that we don't currently test at all, AFAIK -> added work for platform experts
  - risk of regressions not caught prior to release, at worst requiring another early followup release -> added work for release team, third-party packagers, users
- possibly making backporting of other issues more difficult due to merge conflicts
- possible invalidation of waiting-for-review patches, forcing patch refreshes and retests -> added work for potential contributors
- possible invalidation of user local patches -> added work for users
- may encourage use of an apparently little-used feature that has no equivalent in Python 3, another incentive to stay with Py2?

Benefit:
- Fixes a documented feature that may be of benefit to users of Python in applications with very limited memory available, although there aren't any open issues from users requesting this (AFAIK). No benefit to the overwhelming majority of Python users, who only use Unicode-enabled builds.

That just doesn't seem like a good trade-off to me.
I'll certainly abide by the release manager's decision but I think we all need to be thinking more about these kinds of cost-benefit tradeoffs and recognize that there are often non-obvious costs of making changes, costs that can affect our entire community. Yes, we are committed to maintaining Python 2.7 for multiple years but that doesn't mean we have to fix every open issue or even most open issues. Any or all of the above costs may apply to any changes we make. For many of our users, the best maintenance policy for Python 2.7 would be the least change possible. -- Ned Deily, nad at acm.org From ethan at stoneleaf.us Tue Jun 24 22:10:48 2014 From: ethan at stoneleaf.us (Ethan Furman) Date: Tue, 24 Jun 2014 13:10:48 -0700 Subject: [Python-Dev] Fix Unicode-disabled build of Python 2.7 In-Reply-To: References: <1403625970.6550.133062453.693ECDEA@webmail.messagingengine.com> Message-ID: <53A9DB48.4050506@stoneleaf.us> On 06/24/2014 12:54 PM, Ned Deily wrote: > > Yes, we are committed to maintaining > Python 2.7 for multiple years but that doesn't mean we have to fix every > open issue or even most open issues. Any or all of the above costs may > apply to any changes we make. For many of our users, the best > maintenance policy for Python 2.7 would be the least change possible. +1 We need to keep 2.7 running, but we don't need to kill ourselves doing it. If a bug has been there for a while, the affected users are probably working around it by now. ;) -- ~Ethan~ From jimjjewett at gmail.com Tue Jun 24 23:03:27 2014 From: jimjjewett at gmail.com (Jim J. Jewett) Date: Tue, 24 Jun 2014 14:03:27 -0700 (PDT) Subject: [Python-Dev] Fix Unicode-disabled build of Python 2.7 In-Reply-To: Message-ID: <53a9e79f.455ce00a.4e6e.4680@mx.google.com> On 6/24/2014 4:22 AM, Serhiy Storchaka wrote: > I submitted a number of patches which fixes currently broken > Unicode-disabled build of Python 2.7 (built with --disable-unicode > configure option). 
I suppose this was broken in 2.7 when C > implementation of the io module was introduced. It has frequently been broken. Without a buildbot, it will continue to break. I have given at least a quick look at all your proposed changes; most are fixes to test code, such as skip decorators. People checked in tests without the right guards because it did work on their own builds, and on all stable buildbots. That will probably continue to happen unless/until a --disable-unicode buildbot is added. It would be good to fix the tests (and actual library issues). Unfortunately, some of the specifically proposed changes (such as defining and using _unicode instead of unicode within Python code) look to me as though they would trigger problems in the normal build (where the unicode object *does* exist, but would no longer be used). Other changes, such as the use of \x escapes, appear correct, but make the tests harder to read -- and might end up removing a test for correct unicode functionality across different spellings. Even if we assume that the tests are fine, and I'm just an idiot who misread them, the fact that there is any confusion means that these particular changes may be tricky enough to be a bad tradeoff for 2.7. It *might* work if you could make a more focused change. For example, instead of leaving the 'unicode' name unbound, provide an object that simply returns false for isinstance and raises a UnicodeError for any other method call. Even *this* might be too aggressive for 2.7, but the fact that it would only appear in the --disable-unicode builds, and would make them more similar to the regular build, are points in its favor. Before doing that, though, please document what the --disable-unicode mode is actually *supposed* to do when interacting with byte-streams that a standard defines as UTF-8.
(For example, are the changes to _xml_dumps and _xml_loads at http://bugs.python.org/file35758/multiprocessing.patch correct, or do those functions assume they get bytes as input, or should the functions raise an exception any time they are called?) -jJ -- If there are still threading problems with my replies, please email me with details, so that I can try to resolve them. -jJ From ncoghlan at gmail.com Wed Jun 25 01:15:27 2014 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 25 Jun 2014 09:15:27 +1000 Subject: [Python-Dev] Fix Unicode-disabled build of Python 2.7 In-Reply-To: <53A9DB48.4050506@stoneleaf.us> References: <1403625970.6550.133062453.693ECDEA@webmail.messagingengine.com> <53A9DB48.4050506@stoneleaf.us> Message-ID: On 25 Jun 2014 07:05, "Ethan Furman" wrote: > > On 06/24/2014 12:54 PM, Ned Deily wrote: >> >> >> Yes, we are committed to maintaining >> Python 2.7 for multiple years but that doesn't mean we have to fix every >> open issue or even most open issues. Any or all of the above costs may >> apply to any changes we make. For many of our users, the best >> maintenance policy for Python 2.7 would be the least change possible. > > > +1 > > We need to keep 2.7 running, but we don't need to kill ourselves doing it. If a bug has been there for a while, the affected users are probably working around it by now. ;) Aye, in this case, I'm in the "officially deprecate the feature" camp. Don't actively try to break it further, just slap a warning in the docs to say it is no longer a supported configuration. In my own personal case, I not only wasn't aware that there was still an option to turn off the Unicode support, but I also wouldn't really class a build with it turned off as still being Python. As Jim noted, there are quite a lot of APIs that don't make sense if there's no Unicode type available. Cheers, Nick. 
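Jim's "more focused change" proposal above (keep the name unicode bound, but to a stand-in that answers isinstance checks with False and loudly rejects any real use) could be sketched like this. This is purely an illustration of the idea, not code from any patch; it uses an explicit metaclass so the isinstance hook works:

```python
class _DisabledUnicodeMeta(type):
    # isinstance(x, unicode) remains a valid question; the answer is
    # simply always False in a build without a real unicode type.
    def __instancecheck__(cls, instance):
        return False

    # Any attempt to actually construct a unicode object fails loudly
    # instead of raising a confusing NameError.
    def __call__(cls, *args, **kwargs):
        raise UnicodeError("Python was built without Unicode support")

# In a real --disable-unicode build this object would be installed as
# the builtin 'unicode'; here it is just a module-level stand-in.
unicode_stub = _DisabledUnicodeMeta('unicode', (object,), {})

print(isinstance("abc", unicode_stub))   # False: type checks keep working
try:
    unicode_stub("abc")
except UnicodeError as exc:
    print(exc)
```

The point of the design is that code written for a normal build keeps running up to the moment it truly needs Unicode, at which point it fails with a clear error rather than a NameError.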
> > -- > ~Ethan~ From skip at pobox.com Wed Jun 25 14:20:49 2014 From: skip at pobox.com (Skip Montanaro) Date: Wed, 25 Jun 2014 07:20:49 -0500 Subject: [Python-Dev] Fix Unicode-disabled build of Python 2.7 In-Reply-To: References: <1403625970.6550.133062453.693ECDEA@webmail.messagingengine.com> <53A9DB48.4050506@stoneleaf.us> Message-ID: On Tue, Jun 24, 2014 at 6:15 PM, Nick Coghlan wrote: > Aye, in this case, I'm in the "officially deprecate the feature" camp. Definitely preferable to the suggestion to remove the configure flag. Skip From storchaka at gmail.com Wed Jun 25 14:55:35 2014 From: storchaka at gmail.com (Serhiy Storchaka) Date: Wed, 25 Jun 2014 15:55:35 +0300 Subject: [Python-Dev] Fix Unicode-disabled build of Python 2.7 In-Reply-To: <53a9e79f.455ce00a.4e6e.4680@mx.google.com> References: <53a9e79f.455ce00a.4e6e.4680@mx.google.com> Message-ID: 25.06.14 00:03, Jim J. Jewett wrote: > It would be good to fix the tests (and actual library issues). > Unfortunately, some of the specifically proposed changes (such as > defining and using _unicode instead of unicode within python code) > look to me as though they would trigger problems in the normal build > (where the unicode object *does* exist, but would no longer be used). This is an idiom recommended by MvL [1] and widely used (19 times in the source code). [1] http://bugs.python.org/issue8767#msg159473 > Other changes, such as the use of \x escapes, appear correct, but make > the tests harder to read -- and might end up removing a test for > correct unicode functionality across different spellings.
> > Even if we assume that the tests are fine, and I'm just an idiot who > misread them, the fact that there is any confusion means that these > particular changes may be tricky enough to be a bad tradeoff for 2.7. > > It *might* work if you could make a more focused change. For example, > instead of leaving the 'unicode' name unbound, provide an object that > simply returns false for isinstance and raises a UnicodeError for any > other method call. Even *this* might be too aggressive for 2.7, but the > fact that it would only appear in the --disable-unicode builds, and > would make them more similar to the regular build, are points in its > favor. No, the existing code uses a different approach. "unicode" doesn't exist, while encode/decode methods exist but are useless. If my memory doesn't fail me, there is even a special explanatory comment about this historical decision somewhere. This decision was made many years ago. > Before doing that, though, please document what the --disable-unicode > mode is actually *supposed* to do when interacting with byte-streams > that a standard defines as UTF-8. (For example, are the changes to > _xml_dumps and _xml_loads at > http://bugs.python.org/file35758/multiprocessing.patch > correct, or do those functions assume they get bytes as input, or > should the functions raise an exception any time they are called?) Looking more carefully, I see that there is a bug in the Unicode-enabled build (wrong backporting from 3.x). In 2.x, xmlrpclib.dumps already produces a UTF-8-encoded string; in 3.x, xmlrpc.client.dumps produces a unicode string. multiprocessing should fail with a non-ASCII str or unicode. A side benefit of my patches is that they expose existing errors in the Unicode-enabled build.
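The xmlrpclib/multiprocessing mistake described here is the classic double-encode bug: calling .encode('utf-8') on data that is already UTF-8-encoded bytes. Seen in isolation (illustrative values, not the actual stdlib code):

```python
text = u'caf\xe9'                 # a unicode string
data = text.encode('utf-8')       # correct: unicode -> UTF-8 bytes
assert data == b'caf\xc3\xa9'

# The backporting mistake: encoding what is *already* bytes.  Python 2
# implicitly decodes the str as ASCII first, so the call only blows up
# (with UnicodeDecodeError) once non-ASCII data shows up, which is why
# such bugs lurk; Python 3 refuses outright, since bytes has no .encode.
try:
    data.encode('utf-8')
except (AttributeError, UnicodeDecodeError):
    print("double encode rejected")
```

This is also why pure-ASCII test data hides the bug on Python 2: the redundant encode succeeds silently until real non-ASCII input arrives.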
From storchaka at gmail.com Wed Jun 25 14:58:02 2014 From: storchaka at gmail.com (Serhiy Storchaka) Date: Wed, 25 Jun 2014 15:55:35 +0300 Subject: [Python-Dev] Fix Unicode-disabled build of Python 2.7 In-Reply-To: References: <1403625970.6550.133062453.693ECDEA@webmail.messagingengine.com> Message-ID: 24.06.14 22:54, Ned Deily wrote: > Benefit: > - Fixes documented feature that may be of benefit to users of Python in > applications with very limited memory available, although there aren't > any open issues from users requesting this (AFAIK). No benefit to the > overwhelming majority of Python users, who only use Unicode-enabled > builds. Another benefit: the patches exposed several bugs in the code (mainly errors in backporting from 3.x). From victor.stinner at gmail.com Wed Jun 25 15:29:01 2014 From: victor.stinner at gmail.com (Victor Stinner) Date: Wed, 25 Jun 2014 15:29:01 +0200 Subject: [Python-Dev] Fix Unicode-disabled build of Python 2.7 In-Reply-To: References: <1403625970.6550.133062453.693ECDEA@webmail.messagingengine.com> Message-ID: 2014-06-25 14:58 GMT+02:00 Serhiy Storchaka : > 24.06.14 22:54, Ned Deily wrote: > >> Benefit: >> - Fixes documented feature that may be of benefit to users of Python in >> applications with very limited memory available, although there aren't >> any open issues from users requesting this (AFAIK). No benefit to the >> overwhelming majority of Python users, who only use Unicode-enabled >> builds. > > > Another benefit: the patches exposed several bugs in the code (mainly errors in > backporting from 3.x). Oh, interesting. Do you have examples of such bugs?
Victor From storchaka at gmail.com Wed Jun 25 16:00:42 2014 From: storchaka at gmail.com (Serhiy Storchaka) Date: Wed, 25 Jun 2014 17:00:42 +0300 Subject: [Python-Dev] Fix Unicode-disabled build of Python 2.7 In-Reply-To: References: <1403625970.6550.133062453.693ECDEA@webmail.messagingengine.com> Message-ID: 25.06.14 16:29, Victor Stinner wrote: > 2014-06-25 14:58 GMT+02:00 Serhiy Storchaka : >> Another benefit: the patches exposed several bugs in the code (mainly errors in >> backporting from 3.x). > > Oh, interesting. Do you have examples of such bugs? In posixpath, the branches for unicode and str should be reversed. In multiprocessing, .encode('utf-8') is applied to an already UTF-8-encoded str (this is a unicode string in Python 3). And there is a similar error in at least one other place. Tests for bytearray actually test bytes, not bytearray. That is what I remember. From nad at acm.org Wed Jun 25 20:35:36 2014 From: nad at acm.org (Ned Deily) Date: Wed, 25 Jun 2014 11:35:36 -0700 Subject: [Python-Dev] cpython (3.3): Closes #20872: dbm/gdbm/ndbm close methods are not documented References: <3gz1lK1lYkz7Lk0@mail.python.org> Message-ID: In article <3gz1lK1lYkz7Lk0 at mail.python.org>, jesus.cea wrote: > http://hg.python.org/cpython/rev/cf156cfb12e7 > changeset: 91398:cf156cfb12e7 > branch: 3.3 > parent: 91384:92d691c3ca00 > user: Jesus Cea > date: Wed Jun 25 13:05:31 2014 +0200 > summary: > Closes #20872: dbm/gdbm/ndbm close methods are not documented The 3.3 branch is open only to security fixes. Please don't backport other patches to there.
https://docs.python.org/devguide/devcycle.html#summary -- Ned Deily, nad at acm.org From jcea at jcea.es Thu Jun 26 00:56:39 2014 From: jcea at jcea.es (Jesus Cea) Date: Thu, 26 Jun 2014 00:56:39 +0200 Subject: [Python-Dev] cpython (3.3): Closes #20872: dbm/gdbm/ndbm close methods are not documented In-Reply-To: References: <3gz1lK1lYkz7Lk0@mail.python.org> Message-ID: <53AB53A7.6050403@jcea.es> On 25/06/14 20:35, Ned Deily wrote: > The 3.3 branch is open only to security fixes. Please don't backport > other patches to there. > > https://docs.python.org/devguide/devcycle.html#summary Ned, I am aware. It is a doc-only fix, like fixing a typo or correcting an incorrect statement. If that is against policy, let me know. That said, it looks like the 3.3 documentation is no longer "sphinxed" to the webpage, which actually makes the point loud and clear. I have a browser tab open to check a 3.3 doc fix and it is not showing. Thanks for the heads up. Sorry for the inconvenience. -- Jesús Cea Avión _/_/ _/_/_/ _/_/_/ jcea at jcea.es - http://www.jcea.es/ _/_/ _/_/ _/_/ _/_/ _/_/ Twitter: @jcea _/_/ _/_/ _/_/_/_/_/ jabber / xmpp:jcea at jabber.org _/_/ _/_/ _/_/ _/_/ _/_/ "Things are not so easy" _/_/ _/_/ _/_/ _/_/ _/_/ _/_/ "My name is Dump, Core Dump" _/_/_/ _/_/_/ _/_/ _/_/ "El amor es poner tu felicidad en la felicidad de otro" ("Love is putting your happiness in the happiness of another") - Leibniz
From ncoghlan at gmail.com Thu Jun 26 01:28:35 2014 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 26 Jun 2014 09:28:35 +1000 Subject: [Python-Dev] Fix Unicode-disabled build of Python 2.7 In-Reply-To: References: <1403625970.6550.133062453.693ECDEA@webmail.messagingengine.com> Message-ID: On 26 Jun 2014 01:13, "Serhiy Storchaka" wrote: > > 25.06.14 16:29, Victor Stinner wrote: >> >> 2014-06-25 14:58 GMT+02:00 Serhiy Storchaka : >>> >>> Another benefit: the patches exposed several bugs in the code (mainly errors in >>> backporting from 3.x). >> >> >> Oh, interesting. Do you have examples of such bugs? > > > In posixpath, the branches for unicode and str should be reversed. > In multiprocessing, .encode('utf-8') is applied to an already UTF-8-encoded str (this is a unicode string in Python 3). And there is a similar error in at least one other place. Tests for bytearray actually test bytes, not bytearray. That is what I remember. OK, *that* sounds like an excellent reason to keep the Unicode disabled builds functional, and make sure they stay that way with a buildbot: to help make sure we're not accidentally running afoul of the implicit interoperability between str and unicode when backporting fixes from Python 3. Helping to ensure correct handling of str values makes this capability something of benefit to *all* Python 2 users, not just those that turn off the Unicode support. It also makes it a potentially useful testing tool when assessing str/unicode handling in general. Regards, Nick.
From nad at acm.org Thu Jun 26 08:38:46 2014 From: nad at acm.org (Ned Deily) Date: Wed, 25 Jun 2014 23:38:46 -0700 Subject: [Python-Dev] cpython (3.3): Closes #20872: dbm/gdbm/ndbm close methods are not documented References: <3gz1lK1lYkz7Lk0@mail.python.org> <53AB53A7.6050403@jcea.es> Message-ID: In article <53AB53A7.6050403 at jcea.es>, Jesus Cea wrote: > On 25/06/14 20:35, Ned Deily wrote: > > The 3.3 branch is open only to security fixes. Please don't backport > > other patches to there. > > > > https://docs.python.org/devguide/devcycle.html#summary > > Ned, I am aware. It is a doc-only fix, like fixing a typo or correcting > an incorrect statement. If that is against policy, let me know. My understanding is that doc changes are treated the same as any other code changes. As you noticed, after a release leaves maintenance mode, its documentation is no longer updated on the web site. -- Ned Deily, nad at acm.org From storchaka at gmail.com Thu Jun 26 09:15:06 2014 From: storchaka at gmail.com (Serhiy Storchaka) Date: Thu, 26 Jun 2014 10:15:06 +0300 Subject: [Python-Dev] Fix Unicode-disabled build of Python 2.7 In-Reply-To: References: <1403625970.6550.133062453.693ECDEA@webmail.messagingengine.com> Message-ID: 26.06.14 02:28, Nick Coghlan wrote: > OK, *that* sounds like an excellent reason to keep the Unicode disabled > builds functional, and make sure they stay that way with a buildbot: to > help make sure we're not accidentally running afoul of the implicit > interoperability between str and unicode when backporting fixes from > Python 3. > > Helping to ensure correct handling of str values makes this capability > something of benefit to *all* Python 2 users, not just those that turn > off the Unicode support. It also makes it a potentially useful testing > tool when assessing str/unicode handling in general. Do you want to do some patch reviews?
From antoine at python.org Thu Jun 26 13:04:53 2014 From: antoine at python.org (Antoine Pitrou) Date: Thu, 26 Jun 2014 07:04:53 -0400 Subject: [Python-Dev] Fix Unicode-disabled build of Python 2.7 In-Reply-To: References: <1403625970.6550.133062453.693ECDEA@webmail.messagingengine.com> Message-ID: On 25/06/2014 19:28, Nick Coghlan wrote: > > OK, *that* sounds like an excellent reason to keep the Unicode disabled > builds functional, and make sure they stay that way with a buildbot: to > help make sure we're not accidentally running afoul of the implicit > interoperability between str and unicode when backporting fixes from > Python 3. > > Helping to ensure correct handling of str values makes this capability > something of benefit to *all* Python 2 users, not just those that turn > off the Unicode support. It also makes it a potentially useful testing > tool when assessing str/unicode handling in general. Hmmm... From my perspective, trying to enforce unicode-disabled builds will only lower the (already low) chance that I may want to write / backport bug fixes for 2.7. For the same reason, I agree with Victor that we should ditch the threading-disabled builds. It's too much of a hassle for no actual, practical benefit. People who want a threadless unicodeless Python can install Python 1.5.2 for all I care. Regards Antoine. From rosuav at gmail.com Thu Jun 26 14:49:40 2014 From: rosuav at gmail.com (Chris Angelico) Date: Thu, 26 Jun 2014 22:49:40 +1000 Subject: [Python-Dev] Fix Unicode-disabled build of Python 2.7 In-Reply-To: References: <1403625970.6550.133062453.693ECDEA@webmail.messagingengine.com> Message-ID: On Thu, Jun 26, 2014 at 9:04 PM, Antoine Pitrou wrote: > For the same reason, I agree with Victor that we should ditch the > threading-disabled builds. It's too much of a hassle for no actual, > practical benefit. People who want a threadless unicodeless Python can > install Python 1.5.2 for all I care. Or some other implementation of Python.
It's looking like micropython will be permanently supporting a non-Unicode build (although I stepped away from the project after a strong disagreement over what would and would not make sense, and haven't been following it since). If someone wants a Python that doesn't have stuff that the core CPython devs treat as essential, s/he probably wants something like uPy anyway. ChrisA From benjamin at python.org Thu Jun 26 18:21:46 2014 From: benjamin at python.org (Benjamin Peterson) Date: Thu, 26 Jun 2014 09:21:46 -0700 Subject: [Python-Dev] cpython (3.3): Closes #20872: dbm/gdbm/ndbm close methods are not documented In-Reply-To: References: <3gz1lK1lYkz7Lk0@mail.python.org> <53AB53A7.6050403@jcea.es> Message-ID: <1403799706.1921.134890741.5AFCC6E2@webmail.messagingengine.com> On Wed, Jun 25, 2014, at 23:38, Ned Deily wrote: > In article <53AB53A7.6050403 at jcea.es>, Jesus Cea wrote: > > > On 25/06/14 20:35, Ned Deily wrote: > > > The 3.3 branch is open only to security fixes. Please don't backport > > > other patches to there. > > > > > > https://docs.python.org/devguide/devcycle.html#summary > > > > Ned, I am aware. It is a doc-only fix, like fixing a typo or correcting > > an incorrect statement. It that is against policy, let me know. > > My understanding is that doc changes are treated the same as any other > code changes. As you noticed, after a release leaves maintenance mode, > its documentation is no longer updated on the web site. To echo Ned, committing a doc change to 3.3 isn't the end of the world. We just want to make sure energy is focused on the 3 branches we do fully maintain. 
From petertbrady at gmail.com Thu Jun 26 17:38:50 2014 From: petertbrady at gmail.com (Peter Brady) Date: Thu, 26 Jun 2014 09:38:50 -0600 Subject: [Python-Dev] C version of functools.lru_cache Message-ID: Hello python devs, I was recently in need of some faster caching and thought this would be a good opportunity to familiarize myself with the Python/C api so I wrote a C extension for the lru_cache in functools. The source is at https://github.com/pbrady/fastcache.git and I've posted it as a package on PyPI (fastcache). There are some simple benchmarks on the github page showing about 9x speedup. I would like to submit this for incorporation into the standard library. Is there any interest in this? I suspect it probably requires some changes/cleanup especially since I haven't addressed thread-safety at all. Thanks, Peter. P.S. This was the motivation for the faster caching https://github.com/sympy/sympy/pull/7464. -------------- next part -------------- An HTML attachment was scrubbed... URL: From benjamin at python.org Thu Jun 26 18:33:29 2014 From: benjamin at python.org (Benjamin Peterson) Date: Thu, 26 Jun 2014 09:33:29 -0700 Subject: [Python-Dev] C version of functools.lru_cache In-Reply-To: References: Message-ID: <1403800409.4541.134895341.6BD26137@webmail.messagingengine.com> You might look at https://bugs.python.org/issue14373 On Thu, Jun 26, 2014, at 08:38, Peter Brady wrote: > Hello python devs, > > I was recently in need of some faster caching and thought this would be a > good opportunity to familiarize myself with the Python/C api so I wrote a > C > extension for the lru_cache in functools. The source is at > https://github.com/pbrady/fastcache.git and I've posted it as a package > on > PyPI (fastcache). There are some simple benchmarks on the github page > showing about 9x speedup. I would like to submit this for incorporation > into the standard library. Is there any interest in this? 
I suspect it > probably requires some changes/cleanup especially since I haven't > addressed > thread-safety at all. > > Thanks, > Peter. > > P.S. This was the motivation for the faster caching > https://github.com/sympy/sympy/pull/7464. > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/benjamin%40python.org From petertbrady at gmail.com Thu Jun 26 19:23:06 2014 From: petertbrady at gmail.com (Peter Brady) Date: Thu, 26 Jun 2014 11:23:06 -0600 Subject: [Python-Dev] C version of functools.lru_cache In-Reply-To: <1403800409.4541.134895341.6BD26137@webmail.messagingengine.com> References: <1403800409.4541.134895341.6BD26137@webmail.messagingengine.com> Message-ID: Looks like it's already in the works! Nevermind On Thu, Jun 26, 2014 at 10:33 AM, Benjamin Peterson wrote: > You might look at https://bugs.python.org/issue14373 > > On Thu, Jun 26, 2014, at 08:38, Peter Brady wrote: > > Hello python devs, > > > > I was recently in need of some faster caching and thought this would be a > > good opportunity to familiarize myself with the Python/C api so I wrote a > > C > > extension for the lru_cache in functools. The source is at > > https://github.com/pbrady/fastcache.git and I've posted it as a package > > on > > PyPI (fastcache). There are some simple benchmarks on the github page > > showing about 9x speedup. I would like to submit this for incorporation > > into the standard library. Is there any interest in this? I suspect it > > probably requires some changes/cleanup especially since I haven't > > addressed > > thread-safety at all. > > > > Thanks, > > Peter. > > > > P.S. This was the motivation for the faster caching > > https://github.com/sympy/sympy/pull/7464. 
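For context, functools.lru_cache (added to Python 3.2 as pure Python; issue 14373 and fastcache aim to provide drop-in C implementations) is used like this:

```python
from functools import lru_cache

@lru_cache(maxsize=None)   # unbounded cache; a bounded LRU is the default
def fib(n):
    return n if n < 2 else fib(n - 1) + fib(n - 2)

print(fib(30))           # 832040, computed in linear rather than exponential time
print(fib.cache_info())  # hit/miss statistics; identical API for a C version
```

Because the decorator is pure API surface (the wrapped callable plus cache_info/cache_clear), a C reimplementation can be swapped in without touching user code, which is what makes the ~9x speedup attractive.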
From gregory.szorc at gmail.com Thu Jun 26 20:34:03 2014 From: gregory.szorc at gmail.com (Gregory Szorc) Date: Thu, 26 Jun 2014 11:34:03 -0700 Subject: [Python-Dev] Binary CPython distribution for Linux Message-ID: <53AC679B.1000408@gmail.com> I'm an advocate of getting users and projects to move to modern Python versions. I believe dropping support for end-of-lifed Python versions is important for the health of the Python community. If you've done any amount of Python 3 porting work, you know things get much harder the more 2.x legacy versions you need to support. I led the successful charge to drop support for Python 2.6 and below from Firefox's build system. I failed to win the argument that Mercurial should drop 2.4 and 2.5 [1]. A few years ago, I started a similar conversation with the LLVM project [2]. I wrote a blog post on the subject [3] that even got Slashdotted [4] (although I don't think that's the honor it was a decade ago). While much of the opposition to dropping Python <2.7 stems from the RHEL community (they still have 2.4 in extended support and 2.7 wasn't in a release until a few weeks ago), a common objection from the users is "I can't install a different Python" or "it's too difficult to install a different Python." The former is a legit complaint - if you are on shared hosting and don't have root, as easy as it is to add an alternate package repository that provides 2.7 (or newer), you don't have the permissions so you can't do it. This leaves users with attempting a userland install of Python. Personally, I think installing Python in userland is relatively simple. Tools like pyenv make this turnkey.
Worst case you fall back to configure + make. But I'm an experienced developer and have a compiler toolchain and library dependencies on my machine. What about less experienced users or people that don't have the necessary build dependencies? And, even if they do manage to find or build a Python distribution, we all know that there's enough finicky behavior with things like site-packages default paths to cause many headaches, even for experienced Python hackers. I'd like to propose a solution to this problem: a pre-built distribution of CPython for Linux available via www.python.org in the list of downloads for a particular release [5]. This distribution could be downloaded and unarchived into the user's home directory and users could start running it immediately by setting an environment variable or two, creating a symlink, or even running a basic installer script. This would hopefully remove the hurdles of obtaining a (sane) Python distribution on Linux. This would allow projects to more easily drop end-of-life Python versions and would speed adoption of modern Python, including Python 3 (because porting is much easier if you only have to target 2.7). I understand there may be technical challenges with doing this for some distributions and with producing a universal binary distribution. I would settle for a binary distribution that was targeted towards RHEL users and variant distros, as that is the user population that I perceive to be the most conservative and responsible for holding modern Python adoption back. 
[1] http://permalink.gmane.org/gmane.comp.version-control.mercurial.devel/68902 [2] http://lists.cs.uiuc.edu/pipermail/llvmdev/2012-December/056545.html [3] http://gregoryszorc.com/blog/2014/01/08/why-do-projects-support-old-python-releases/ [4] http://developers.slashdot.org/story/14/01/09/1940232/why-do-projects-continue-to-support-old-python-releases [5] https://www.python.org/download/releases/2.7.7/ From joseph.martinot-lagarde at m4x.org Thu Jun 26 21:23:10 2014 From: joseph.martinot-lagarde at m4x.org (Joseph Martinot-Lagarde) Date: Thu, 26 Jun 2014 21:23:10 +0200 Subject: [Python-Dev] Binary CPython distribution for Linux In-Reply-To: <53AC679B.1000408@gmail.com> References: <53AC679B.1000408@gmail.com> Message-ID: <53AC731E.5010604@m4x.org> Le 26/06/2014 20:34, Gregory Szorc a ?crit : > I'm an advocate of getting users and projects to move to modern Python > versions. I believe dropping support for end-of-lifed Python versions is > important for the health of the Python community. If you've done any > amount of Python 3 porting work, you know things get much harder the > more 2.x legacy versions you need to support. > > I led the successful charge to drop support for Python 2.6 and below > from Firefox's build system. I failed to win the argument that Mercurial > should drop 2.4 and 2.5 [1]. A few years ago, I started a similar > conversation with the LLVM project [2]. I wrote a blog post on the > subject [3] that even got Slashdotted [4] (although I don't think that's > the honor it was a decade ago). > > While much of the opposition to dropping Python <2.7 stems from the RHEL > community (they still have 2.4 in extended support and 2.7 wasn't in a > release until a few weeks ago), a common objection from the users is "I > can't install a different Python" or "it's too difficult to install a > different Python." 
> [snip]
> > [1] http://permalink.gmane.org/gmane.comp.version-control.mercurial.devel/68902
> > [2] http://lists.cs.uiuc.edu/pipermail/llvmdev/2012-December/056545.html
> > [3] http://gregoryszorc.com/blog/2014/01/08/why-do-projects-support-old-python-releases/
> > [4] http://developers.slashdot.org/story/14/01/09/1940232/why-do-projects-continue-to-support-old-python-releases
> > [5] https://www.python.org/download/releases/2.7.7/

Just today I installed Anaconda
(https://store.continuum.io/cshop/anaconda/) on Linux servers running
CentOS 6.4. It installs in a directory anywhere in the filesystem (no
need to be root), and using it globally was just a matter of prepending
a folder to the PATH.

Of course Anaconda is oriented towards scientific applications, but it
is proof that a pre-built binary installer works and can be simple to
use. If someone wants to try it without all the scientific libraries,
they provide Miniconda (http://conda.pydata.org/miniconda.html), which
contains only Python and the Python package manager conda.

Joseph

From a.cavallo at cavallinux.eu  Thu Jun 26 22:00:38 2014
From: a.cavallo at cavallinux.eu (Antonio Cavallo)
Date: Thu, 26 Jun 2014 21:00:38 +0100
Subject: [Python-Dev] Binary CPython distribution for Linux
In-Reply-To: <53AC731E.5010604@m4x.org>
References: <53AC679B.1000408@gmail.com> <53AC731E.5010604@m4x.org>
Message-ID: <53AC7BE6.6060207@cavallinux.eu>

I have a little pet project for building rpms of Python 2.7 (it should
be trivial to port to 3.x):

https://build.opensuse.org/project/show/home:cavallo71:opt-python-modules

If there's enough interest I can help to integrate it with python.org.

>> I understand there may be technical challenges with doing this for some
>> distributions and with producing a universal binary distribution.

openSUSE has provided the VMs to build binaries for multiple platforms
for a very long time already.
> Of course Anaconda is oriented towards scientific applications, but it
> is proof that a pre-built binary installer works and can be simple to
> use.

RPMs are the "blessed" way to install software on Linux: they support
what most sysadmins expect (easy to list the installed packages, easy to
validate whether tampering with a package occurred, which file belongs
to which package, etc.).

Anaconda might appeal to some groups of users, but for company-wide
deployment, rpm is the best technical solution given its integration
into Linux.

I hope this helps,
Antonio

From joseph.martinot-lagarde at m4x.org  Thu Jun 26 23:27:39 2014
From: joseph.martinot-lagarde at m4x.org (Joseph Martinot-Lagarde)
Date: Thu, 26 Jun 2014 23:27:39 +0200
Subject: [Python-Dev] Binary CPython distribution for Linux
In-Reply-To: <53AC7BE6.6060207@cavallinux.eu>
References: <53AC679B.1000408@gmail.com> <53AC731E.5010604@m4x.org>
 <53AC7BE6.6060207@cavallinux.eu>
Message-ID: <53AC904B.7090907@m4x.org>

Le 26/06/2014 22:00, Antonio Cavallo a écrit :
> [snip]
> RPMs are the "blessed" way to install software on Linux: they support
> what most sysadmins expect [...] for company-wide deployment, rpm is
> the best technical solution given its integration into Linux.

1. Not all Linux distros use rpm (Debian, Ubuntu, Arch Linux...)
2. You need to be root to install an rpm.

Btw, Anaconda is multiplatform and can be installed on Linux, Windows
and Mac.
Joseph

From benhoyt at gmail.com  Fri Jun 27 00:59:45 2014
From: benhoyt at gmail.com (Ben Hoyt)
Date: Thu, 26 Jun 2014 18:59:45 -0400
Subject: [Python-Dev] PEP 471 -- os.scandir() function -- a better and
 faster directory iterator
Message-ID:

Hi Python dev folks,

I've written a PEP proposing a specific os.scandir() API for a directory
iterator that returns the stat-like info from the OS, the main advantage
of which is to speed up os.walk() and similar operations between 4-20x,
depending on your OS and file system. Full details, background info, and
context links are in the PEP, which Victor Stinner has uploaded at the
following URL, and I've also copied inline below.

http://legacy.python.org/dev/peps/pep-0471/

Would love feedback on the PEP, but also of course on the proposal
itself.

-Ben


PEP: 471
Title: os.scandir() function -- a better and faster directory iterator
Version: $Revision$
Last-Modified: $Date$
Author: Ben Hoyt
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 30-May-2014
Python-Version: 3.5

Abstract
========

This PEP proposes including a new directory iteration function,
``os.scandir()``, in the standard library. This new function adds useful
functionality and increases the speed of ``os.walk()`` by 2-10 times
(depending on the platform and file system) by significantly reducing
the number of times ``stat()`` needs to be called.

Rationale
=========

Python's built-in ``os.walk()`` is significantly slower than it needs to
be, because -- in addition to calling ``os.listdir()`` on each directory
-- it executes the system call ``os.stat()`` or ``GetFileAttributes()``
on each file to determine whether the entry is a directory or not.

But the underlying system calls -- ``FindFirstFile`` / ``FindNextFile``
on Windows and ``readdir`` on Linux and OS X -- already tell you whether
the files returned are directories or not, so no further system calls
are needed.
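The stat()-per-entry cost described in the Rationale is easy to see from
Python itself; this sketch uses the os.scandir() that eventually shipped
in Python 3.5 (so it post-dates this thread), and the directory layout
is made up purely for the demonstration:

```python
import os
import tempfile

# A made-up tree: three subdirectories and three files.
root = tempfile.mkdtemp()
for d in ("pkg", "docs", "tests"):
    os.mkdir(os.path.join(root, d))
for f in ("setup.py", "README", "LICENSE"):
    open(os.path.join(root, f), "w").close()

# listdir-style traversal: one directory scan PLUS a stat-family call
# (os.path.isdir) for every single entry -- the "2N" pattern.
extra_stats = 0
listdir_dirs = []
for name in os.listdir(root):
    extra_stats += 1  # os.path.isdir() does a stat() under the hood
    if os.path.isdir(os.path.join(root, name)):
        listdir_dirs.append(name)

# scandir-style traversal: the scan itself already carries the type
# information, so no per-entry stat is needed -- the "N" pattern.
scandir_dirs = [e.name for e in os.scandir(root) if e.is_dir()]

print(extra_stats)                                   # 6 avoidable stat calls
print(sorted(listdir_dirs) == sorted(scandir_dirs))  # True
```

The count only tallies Python-level isdir() calls, but each of those is
exactly the extra OS-level stat the PEP is trying to eliminate.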
In short, you can reduce the number of system calls from approximately
2N to N, where N is the total number of files and directories in the
tree. (And because directory trees are usually much wider than they are
deep, it's often much better than this.)

In practice, removing all those extra system calls makes ``os.walk()``
about **8-9 times as fast on Windows**, and about **2-3 times as fast on
Linux and Mac OS X**. So we're not talking about micro-optimizations.
See more `benchmarks`_.

.. _`benchmarks`: https://github.com/benhoyt/scandir#benchmarks

Somewhat relatedly, many people (see Python `Issue 11406`_) are also
keen on a version of ``os.listdir()`` that yields filenames as it
iterates instead of returning them as one big list. This improves memory
efficiency for iterating very large directories.

So as well as providing a ``scandir()`` iterator function for calling
directly, Python's existing ``os.walk()`` function could be sped up a
huge amount.

.. _`Issue 11406`: http://bugs.python.org/issue11406

Implementation
==============

The implementation of this proposal was written by Ben Hoyt (initial
version) and Tim Golden (who helped a lot with the C extension module).
It lives on GitHub at `benhoyt/scandir`_.

.. _`benhoyt/scandir`: https://github.com/benhoyt/scandir

Note that this module has been used and tested (see "Use in the wild"
section in this PEP), so it's more than a proof-of-concept. However, it
is marked as beta software and is not extensively battle-tested. It will
need some cleanup and more thorough testing before going into the
standard library, as well as integration into `posixmodule.c`.
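The os.walk() speedup comes from exactly this substitution. A minimal
sketch of a scandir-based walk, using the os.scandir() that later landed
in Python 3.5, and deliberately omitting the real os.walk()'s topdown,
onerror, and symlink handling (the function name and directory layout
are invented for the example):

```python
import os
import tempfile

def scandir_walk(top):
    """Minimal os.walk()-style generator built on os.scandir().

    Sketch only: no onerror, followlinks, or topdown handling like
    the real os.walk() has.
    """
    dirs, files = [], []
    for entry in os.scandir(top):
        # Type info comes from the directory scan; no extra stat here.
        (dirs if entry.is_dir() else files).append(entry.name)
    yield top, dirs, files
    for name in dirs:
        yield from scandir_walk(os.path.join(top, name))

# Tiny made-up tree to walk.
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "a", "b"))
open(os.path.join(root, "a", "notes.txt"), "w").close()

for top, dirs, files in scandir_walk(root):
    print(top, dirs, files)
```

Each directory costs one scan instead of one scan plus one stat per
entry, which is where the benchmark numbers above come from.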
Specifics of proposal
=====================

Specifically, this PEP proposes adding a single function to the ``os``
module in the standard library, ``scandir``, that takes a single,
optional string as its argument::

    scandir(path='.') -> generator of DirEntry objects

Like ``listdir``, ``scandir`` calls the operating system's directory
iteration system calls to get the names of the files in the ``path``
directory, but it's different from ``listdir`` in two ways:

* Instead of bare filename strings, it returns lightweight ``DirEntry``
  objects that hold the filename string and provide simple methods that
  allow access to the stat-like data the operating system returned.

* It returns a generator instead of a list, so that ``scandir`` acts as
  a true iterator instead of returning the full list immediately.

``scandir()`` yields a ``DirEntry`` object for each file and directory
in ``path``. Just like ``listdir``, the ``'.'`` and ``'..'``
pseudo-directories are skipped, and the entries are yielded in
system-dependent order. Each ``DirEntry`` object has the following
attributes and methods:

* ``name``: the entry's filename, relative to ``path`` (corresponds to
  the return values of ``os.listdir``)

* ``is_dir()``: like ``os.path.isdir()``, but requires no system calls
  on most systems (Linux, Windows, OS X)

* ``is_file()``: like ``os.path.isfile()``, but requires no system calls
  on most systems (Linux, Windows, OS X)

* ``is_symlink()``: like ``os.path.islink()``, but requires no system
  calls on most systems (Linux, Windows, OS X)

* ``lstat()``: like ``os.lstat()``, but requires no system calls on
  Windows

The ``DirEntry`` attribute and method names were chosen to be the same
as those in the new ``pathlib`` module for consistency.
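Because scandir is specified as a true iterator, a large directory can
be sampled without materialising the whole listing. A short sketch
against the os.scandir() that eventually shipped in Python 3.5 (the file
names are invented; the explicit close() of a half-consumed scan was
added in 3.6):

```python
import os
import tempfile
from itertools import islice

# An invented directory with many files and one subdirectory.
root = tempfile.mkdtemp()
for i in range(100):
    open(os.path.join(root, "file%03d.txt" % i), "w").close()
os.mkdir(os.path.join(root, "sub"))

it = os.scandir(root)
sample = list(islice(it, 5))   # only five entries are pulled from the OS
for entry in sample:
    # name and is_dir() come straight from the directory scan.
    print(entry.name, entry.is_dir())
it.close()                     # release the OS directory handle (3.6+)
```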
Notes on caching
----------------

The ``DirEntry`` objects are relatively dumb -- the ``name`` attribute
is obviously always cached, and the ``is_X`` and ``lstat`` methods cache
their values (immediately on Windows via ``FindNextFile``, and on first
use on Linux / OS X via a ``stat`` call) and never refetch from the
system.

For this reason, ``DirEntry`` objects are intended to be used and thrown
away after iteration, not stored in long-lived data structures with the
methods called again and again. If a user wants to do that (for example,
for watching a file's size change), they'll need to call the regular
``os.lstat()`` or ``os.path.getsize()`` functions which force a new
system call each time.

Examples
========

Here's a good usage pattern for ``scandir``. This is in fact almost
exactly how the scandir module's faster ``os.walk()`` implementation
uses it::

    dirs = []
    non_dirs = []
    for entry in scandir(path):
        if entry.is_dir():
            dirs.append(entry)
        else:
            non_dirs.append(entry)

The above ``os.walk()``-like code will be significantly faster with
scandir on both Windows and Linux or OS X.

Or, for getting the total size of files in a directory tree -- showing
use of the ``DirEntry.lstat()`` method::

    def get_tree_size(path):
        """Return total size of files in path and subdirs."""
        size = 0
        for entry in scandir(path):
            if entry.is_dir():
                sub_path = os.path.join(path, entry.name)
                size += get_tree_size(sub_path)
            else:
                size += entry.lstat().st_size
        return size

Note that ``get_tree_size()`` will get a huge speed boost on Windows,
because no extra stat calls are needed, but on Linux and OS X the size
information is not returned by the directory iteration functions, so
this function won't gain anything there.

Support
=======

The scandir module on GitHub has been forked and used quite a bit (see
"Use in the wild" in this PEP), but there's also been a fair bit of
direct support for a scandir-like function from core developers and
others on the python-dev and python-ideas mailing lists.
A sampling:

* **Nick Coghlan**, a core Python developer: "I've had the local Red Hat
  release engineering team express their displeasure at having to stat
  every file in a network mounted directory tree for info that is
  present in the dirent structure, so a definite +1 to os.scandir from
  me, so long as it makes that info available." [`source1 `_]

* **Tim Golden**, a core Python developer, supports scandir enough to
  have spent time refactoring and significantly improving scandir's C
  extension module. [`source2 `_]

* **Christian Heimes**, a core Python developer: "+1 for something like
  yielddir()" [`source3 `_] and "Indeed! I'd like to see the feature in
  3.4 so I can remove my own hack from our code base." [`source4 `_]

* **Gregory P. Smith**, a core Python developer: "As 3.4beta1 happens
  tonight, this isn't going to make 3.4 so i'm bumping this to 3.5. I
  really like the proposed design outlined above." [`source5 `_]

* **Guido van Rossum** on the possibility of adding scandir to Python
  3.5 (as it was too late for 3.4): "The ship has likewise sailed for
  adding scandir() (whether to os or pathlib). By all means experiment
  and get it ready for consideration for 3.5, but I don't want to add it
  to 3.4." [`source6 `_]

Support for this PEP itself (meta-support?) was given by Nick Coghlan on
python-dev: "A PEP reviewing all this for 3.5 and proposing a specific
os.scandir API would be a good thing." [`source7 `_]

Use in the wild
===============

To date, ``scandir`` is definitely useful, but has been clearly marked
"beta", so it's uncertain how much use of it there is in the wild. Ben
Hoyt has had several reports from people using it. For example:

* Chris F: "I am processing some pretty large directories and was half
  expecting to have to modify getdents. So thanks for saving me the
  effort." [via personal email]

* bschollnick: "I wanted to let you know about this, since I am using
  Scandir as a building block for this code.
  Here's a good example of scandir making a radical performance
  improvement over os.listdir." [`source8 `_]

* Avram L: "I'm testing our scandir for a project I'm working on. Seems
  pretty solid, so first thing, just want to say nice work!" [via
  personal email]

Others have `requested a PyPI package`_ for it, which has been created.
See `PyPI package`_.

.. _`requested a PyPI package`: https://github.com/benhoyt/scandir/issues/12
.. _`PyPI package`: https://pypi.python.org/pypi/scandir

GitHub stats don't mean too much, but scandir does have several
watchers, issues, forks, etc. Here's the run-down of the stats as of
June 5, 2014:

* Watchers: 17
* Stars: 48
* Forks: 15
* Issues: 2 open, 19 closed

**However, the much larger point is this:** if this PEP is accepted,
``os.walk()`` can easily be reimplemented using ``scandir`` rather than
``listdir`` and ``stat``, increasing the speed of ``os.walk()`` very
significantly. There are thousands of developers, scripts, and
production code that would benefit from this large speedup of
``os.walk()``. For example, on GitHub, there are almost as many uses of
``os.walk`` (194,000) as there are of ``os.mkdir`` (230,000).

Open issues and optional things
===============================

There are a few open issues or optional additions:

Should scandir be in its own module?
------------------------------------

Should the function be included in the standard library in a new module,
``scandir.scandir()``, or just as ``os.scandir()`` as discussed? The
preference of this PEP's author (Ben Hoyt) would be ``os.scandir()``, as
it's just a single function.

Should there be a way to access the full path?
----------------------------------------------

Should ``DirEntry``'s have a way to get the full path without using
``os.path.join(path, entry.name)``? This is a pretty common pattern, and
it may be useful to add pathlib-like ``str(entry)`` functionality. This
functionality has also been requested in `issue 13`_ on GitHub.
.. _`issue 13`: https://github.com/benhoyt/scandir/issues/13

Should it expose Windows wildcard functionality?
------------------------------------------------

Should ``scandir()`` have a way of exposing the wildcard functionality
in the Windows ``FindFirstFile`` / ``FindNextFile`` functions? The
scandir module on GitHub exposes this as a ``windows_wildcard`` keyword
argument, allowing Windows power users the option to pass a custom
wildcard to ``FindFirstFile``, which may avoid the need to use
``fnmatch`` or similar on the resulting names. It is named the unwieldy
``windows_wildcard`` to remind you you're writing power-user,
Windows-only code if you use it.

This boils down to whether ``scandir`` should be about exposing all of
the system's directory iteration features, or simply providing a fast,
simple, cross-platform directory iteration API. This PEP's author votes
for not including ``windows_wildcard`` in the standard library version,
because even though it could be useful in rare cases (say the Windows
Dropbox client?), it'd be too easy to use it just because you're a
Windows developer, and create code that is not cross-platform.

Possible improvements
=====================

There are many possible improvements one could make to scandir, but here
is a short list of some this PEP's author has in mind:

* scandir could potentially be further sped up by calling ``readdir`` /
  ``FindNextFile`` say 50 times per ``Py_BEGIN_ALLOW_THREADS`` block so
  that it stays in the C extension module for longer, and may be
  somewhat faster as a result. This approach hasn't been tested, but was
  suggested on Issue 11406 by Antoine Pitrou.
  [`source9 `_]

Previous discussion
===================

* `Original thread Ben Hoyt started on python-ideas`_ about speeding up
  ``os.walk()``

* Python `Issue 11406`_, which includes the original proposal for a
  scandir-like function

* `Further thread Ben Hoyt started on python-dev`_ that refined the
  ``scandir()`` API, including Nick Coghlan's suggestion of scandir
  yielding ``DirEntry``-like objects

* `Final thread Ben Hoyt started on python-dev`_ to discuss the
  interaction between scandir and the new ``pathlib`` module

* `Question on StackOverflow`_ about why ``os.walk()`` is slow and
  pointers on how to fix it (this inspired the author of this PEP early
  on)

* `BetterWalk`_, this PEP's author's previous attempt at this, on which
  the scandir code is based

.. _`Original thread Ben Hoyt started on python-ideas`: https://mail.python.org/pipermail/python-ideas/2012-November/017770.html
.. _`Further thread Ben Hoyt started on python-dev`: https://mail.python.org/pipermail/python-dev/2013-May/126119.html
.. _`Final thread Ben Hoyt started on python-dev`: https://mail.python.org/pipermail/python-dev/2013-November/130572.html
.. _`Question on StackOverflow`: http://stackoverflow.com/questions/2485719/very-quickly-getting-total-size-of-folder
.. _`BetterWalk`: https://github.com/benhoyt/betterwalk

Copyright
=========

This document has been placed in the public domain.
..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End:

From python at mrabarnett.plus.com  Fri Jun 27 01:28:20 2014
From: python at mrabarnett.plus.com (MRAB)
Date: Fri, 27 Jun 2014 00:28:20 +0100
Subject: [Python-Dev] PEP 471 -- os.scandir() function -- a better and
 faster directory iterator
In-Reply-To: References: Message-ID: <53ACAC94.1050206@mrabarnett.plus.com>

On 2014-06-26 23:59, Ben Hoyt wrote:
> Hi Python dev folks,
>
> I've written a PEP proposing a specific os.scandir() API for a
> directory iterator that returns the stat-like info from the OS, the
> main advantage of which is to speed up os.walk() and similar
> operations between 4-20x, depending on your OS and file system. Full
> details, background info, and context links are in the PEP, which
> Victor Stinner has uploaded at the following URL, and I've also
> copied inline below.
>
> http://legacy.python.org/dev/peps/pep-0471/
>
> Would love feedback on the PEP, but also of course on the proposal
> itself.
>
[snip]
Personally, I'd prefer the name 'iterdir' because it emphasises that
it's an iterator.

From timothy.c.delaney at gmail.com  Fri Jun 27 01:36:28 2014
From: timothy.c.delaney at gmail.com (Tim Delaney)
Date: Fri, 27 Jun 2014 09:36:28 +1000
Subject: [Python-Dev] PEP 471 -- os.scandir() function -- a better and
 faster directory iterator
In-Reply-To: <53ACAC94.1050206@mrabarnett.plus.com>
References: <53ACAC94.1050206@mrabarnett.plus.com>
Message-ID:

On 27 June 2014 09:28, MRAB wrote:

> Personally, I'd prefer the name 'iterdir' because it emphasises that
> it's an iterator.

Exactly what I was going to post (with the added note that there's an
obvious symmetry with listdir).
+1 for iterdir rather than scandir

Other than that:

+1 for adding scandir to the stdlib
-1 for windows_wildcard (it would be an attractive nuisance to write
   windows-only code)

Tim Delaney

From pmiscml at gmail.com  Fri Jun 27 02:07:46 2014
From: pmiscml at gmail.com (Paul Sokolovsky)
Date: Fri, 27 Jun 2014 03:07:46 +0300
Subject: [Python-Dev] PEP 471 -- os.scandir() function -- a better and
 faster directory iterator
In-Reply-To: References: Message-ID: <20140627030746.15641d7e@x34f>

Hello,

On Thu, 26 Jun 2014 18:59:45 -0400
Ben Hoyt wrote:

> Hi Python dev folks,
>
> I've written a PEP proposing a specific os.scandir() API for a
> directory iterator that returns the stat-like info from the OS, the
> main advantage of which is to speed up os.walk() and similar
> operations between 4-20x, depending on your OS and file system. Full
> details, background info, and context links are in the PEP, which
> Victor Stinner has uploaded at the following URL, and I've also copied
> inline below.

I noticed the obvious inefficiency of os.walk() implemented in terms of
os.listdir() when I worked on the "os" module for MicroPython. I
essentially did what your PEP suggests - introduced an internal
generator function (ilistdir_ex() in
https://github.com/micropython/micropython-lib/blob/master/os/os/__init__.py#L85
), in terms of which both os.listdir() and os.walk() are implemented.

With my MicroPython hat on, os.scandir() would make things only worse.
With the current interface, one can either have an inefficient
implementation (like CPython chose) or an efficient implementation (like
MicroPython chose) - all transparently. os.scandir() supposedly opens up
an efficient implementation for everyone, but at the price of bloating
the API and introducing heavy-weight objects to wrap the info. The PEP
calls them "lightweight DirEntry objects", but that cannot be true,
because all Python objects are heavy-weight, especially those which have
methods.
It would be better if os.scandir() was specified to return a struct
(named tuple) compatible with the return value of os.stat() (with only
the fields relevant to the underlying readdir()-like system call). The
grounds for that are obvious: it's an already existing data interface in
module "os", which is also based on an open standard for operating
systems - POSIX - so if one is to expect something about file
attributes, it's what one can reasonably base expectations on.

But reusing the os.stat struct is glaringly not what's proposed. And
it's clear where that comes from - "[DirEntry.]lstat(): like os.lstat(),
but requires no system calls on Windows". Nice, but OS "FooBar" can do
much more than Windows - it has a system call to send a file by email,
right when scanning a directory containing it. So, why not have a
DirEntry.send_by_email(recipient) method? I hear the answer - it's
because CPython strives to support Windows well, while it doesn't care
about the "FooBar" OS.

And then it again leads to the question I posed several times - where's
the line between "CPython" and "Python"? Is it grounded for CPython to
add to (or remove from) the Python stdlib something which is useful for
its users, but useless or complicating for other Python implementations?
Especially taking into account that there's the "win32api" module
allowing Windows users to use all the wonders of its API? Especially
that the os.stat struct is itself pretty extensible
(https://docs.python.org/3.4/library/os.html#os.stat : "On other Unix
systems (such as FreeBSD), the following attributes may be available
...", "On Mac OS systems...") - so extra fields can be added for Windows
just the same, if really needed.

> > http://legacy.python.org/dev/peps/pep-0471/
> >
> > Would love feedback on the PEP, but also of course on the proposal
> > -Ben > [] -- Best regards, Paul mailto:pmiscml at gmail.com From ethan at stoneleaf.us Fri Jun 27 01:43:43 2014 From: ethan at stoneleaf.us (Ethan Furman) Date: Thu, 26 Jun 2014 16:43:43 -0700 Subject: [Python-Dev] PEP 471 -- os.scandir() function -- a better and faster directory iterator In-Reply-To: References: <53ACAC94.1050206@mrabarnett.plus.com> Message-ID: <53ACB02F.4020402@stoneleaf.us> On 06/26/2014 04:36 PM, Tim Delaney wrote: > On 27 June 2014 09:28, MRAB wrote: >> >> Personally, I'd prefer the name 'iterdir' because it emphasises that >> it's an iterator. > > Exactly what I was going to post (with the added note that thee's an obvious symmetry with listdir). > > +1 for iterdir rather than scandir > > Other than that: > > +1 for adding [it] to the stdlib +1 for all of above -- ~Ethan~ From benjamin at python.org Fri Jun 27 02:35:21 2014 From: benjamin at python.org (Benjamin Peterson) Date: Thu, 26 Jun 2014 17:35:21 -0700 Subject: [Python-Dev] PEP 471 -- os.scandir() function -- a better and faster directory iterator In-Reply-To: <20140627030746.15641d7e@x34f> References: <20140627030746.15641d7e@x34f> Message-ID: <1403829321.29631.135045201.6BD5CF6A@webmail.messagingengine.com> On Thu, Jun 26, 2014, at 17:07, Paul Sokolovsky wrote: > > With my MicroPython hat on, os.scandir() would make things only worse. > With current interface, one can either have inefficient implementation > (like CPython chose) or efficient implementation (like MicroPython > chose) - all transparently. os.scandir() supposedly opens up efficient > implementation for everyone, but at the price of bloating API and > introducing heavy-weight objects to wrap info. PEP calls it > "lightweight DirEntry objects", but that cannot be true, because all > Python objects are heavy-weight, especially those which have methods. Why do you think methods make an object more heavyweight? namedtuples have methods. 
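Paul's alternative, yielding names plus an os.stat-compatible partial
struct, can be sketched concretely. The names ilistdir and DirStat below
are hypothetical, loosely modeled on the MicroPython helper he linked;
it is built here on the os.scandir() that shipped in Python 3.5, and
symlinks are lumped in with regular files for brevity:

```python
import os
import stat as stat_mod
import tempfile
from collections import namedtuple

# Hypothetical partial-stat record reusing os.stat_result field names,
# holding only what a readdir()-style scan actually returns.
DirStat = namedtuple("DirStat", "st_mode st_ino")

def ilistdir(path="."):
    """Yield (name, DirStat) pairs -- a sketch of the tuple-based API
    argued for above. Symlinks are treated as regular files here."""
    for entry in os.scandir(path):
        mode = stat_mod.S_IFDIR if entry.is_dir() else stat_mod.S_IFREG
        yield entry.name, DirStat(st_mode=mode, st_ino=entry.inode())

# Invented layout for the demonstration.
root = tempfile.mkdtemp()
os.mkdir(os.path.join(root, "sub"))
open(os.path.join(root, "file.txt"), "w").close()

info = dict(ilistdir(root))
print(stat_mod.S_ISDIR(info["sub"].st_mode))       # True
print(stat_mod.S_ISREG(info["file.txt"].st_mode))  # True
```

Consumers then use the familiar stat module helpers (S_ISDIR and
friends) or plain numeric indexing, with no new object type to learn.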
From pmiscml at gmail.com  Fri Jun 27 02:47:08 2014
From: pmiscml at gmail.com (Paul Sokolovsky)
Date: Fri, 27 Jun 2014 03:47:08 +0300
Subject: [Python-Dev] PEP 471 -- os.scandir() function -- a better and
 faster directory iterator
In-Reply-To: <1403829321.29631.135045201.6BD5CF6A@webmail.messagingengine.com>
References: <20140627030746.15641d7e@x34f>
 <1403829321.29631.135045201.6BD5CF6A@webmail.messagingengine.com>
Message-ID: <20140627034708.04dc58f1@x34f>

Hello,

On Thu, 26 Jun 2014 17:35:21 -0700
Benjamin Peterson wrote:

> On Thu, Jun 26, 2014, at 17:07, Paul Sokolovsky wrote:
> > [snip]
>
> Why do you think methods make an object more heavyweight?

Because you need to call them. And if the only thing they do is return
an object field, the call overhead is rather noticeable.

> namedtuples have methods.

Yes, unfortunately. But fortunately, a named tuple is a subclass of
tuple, so a user who cares about efficiency can just use numeric
indexing, which has existed for os.stat values all the time, blissfully
ignoring the cruft which has been accumulating there since Python 1.5
times.
-- Best regards, Paul mailto:pmiscml at gmail.com From rymg19 at gmail.com Fri Jun 27 03:01:18 2014 From: rymg19 at gmail.com (Ryan) Date: Thu, 26 Jun 2014 20:01:18 -0500 Subject: [Python-Dev] PEP 471 -- os.scandir() function -- a better and faster directory iterator In-Reply-To: References: <53ACAC94.1050206@mrabarnett.plus.com> Message-ID: <00665839-bca9-4a5d-ae09-b75b6a2abb0e@email.android.com> +1 for scandir. -1 for iterdir (scandir sounds fancier). -99999999 for windows_wildcard. Tim Delaney wrote: >On 27 June 2014 09:28, MRAB wrote: > >> Personally, I'd prefer the name 'iterdir' because it emphasises that >> it's an iterator. > > >Exactly what I was going to post (with the added note that there's an >obvious symmetry with listdir). > >+1 for iterdir rather than scandir > >Other than that: > >+1 for adding scandir to the stdlib >-1 for windows_wildcard (it would be an attractive nuisance to write >windows-only code) > >Tim Delaney > > >------------------------------------------------------------------------ > >_______________________________________________ >Python-Dev mailing list >Python-Dev at python.org >https://mail.python.org/mailman/listinfo/python-dev >Unsubscribe: >https://mail.python.org/mailman/options/python-dev/rymg19%40gmail.com -- Sent from my Android phone with K-9 Mail. Please excuse my brevity. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From benhoyt at gmail.com Fri Jun 27 03:37:50 2014 From: benhoyt at gmail.com (Ben Hoyt) Date: Thu, 26 Jun 2014 21:37:50 -0400 Subject: [Python-Dev] PEP 471 -- os.scandir() function -- a better and faster directory iterator In-Reply-To: <53ACB02F.4020402@stoneleaf.us> References: <53ACAC94.1050206@mrabarnett.plus.com> <53ACB02F.4020402@stoneleaf.us> Message-ID: I don't mind iterdir() and would take it :-), but I'll just say why I chose the name scandir() -- though it wasn't my suggestion originally: iterdir() sounds like just an iterator version of listdir(), kinda like keys() and iterkeys() in Python 2. Whereas in actual fact the return values are quite different (DirEntry objects vs strings), and so the name change reflects that difference a little. I'm also -1 on windows_wildcard. I think it's asking for trouble, and wouldn't gain much on Windows in most cases anyway. -Ben On Thu, Jun 26, 2014 at 7:43 PM, Ethan Furman wrote: > On 06/26/2014 04:36 PM, Tim Delaney wrote: >> >> On 27 June 2014 09:28, MRAB wrote: >>> >>> >>> Personally, I'd prefer the name 'iterdir' because it emphasises that >>> it's an iterator. >> >> >> Exactly what I was going to post (with the added note that there's an >> obvious symmetry with listdir). 
>> >> +1 for iterdir rather than scandir >> >> Other than that: >> >> +1 for adding [it] to the stdlib > > > +1 for all of above > > -- > ~Ethan~ > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/benhoyt%40gmail.com From python at mrabarnett.plus.com Fri Jun 27 03:50:38 2014 From: python at mrabarnett.plus.com (MRAB) Date: Fri, 27 Jun 2014 02:50:38 +0100 Subject: [Python-Dev] PEP 471 -- os.scandir() function -- a better and faster directory iterator In-Reply-To: References: <53ACAC94.1050206@mrabarnett.plus.com> <53ACB02F.4020402@stoneleaf.us> Message-ID: <53ACCDEE.9070906@mrabarnett.plus.com> On 2014-06-27 02:37, Ben Hoyt wrote: > I don't mind iterdir() and would take it :-), but I'll just say why I > chose the name scandir() -- though it wasn't my suggestion originally: > > iterdir() sounds like just an iterator version of listdir(), kinda > like keys() and iterkeys() in Python 2. Whereas in actual fact the > return values are quite different (DirEntry objects vs strings), and > so the name change reflects that difference a little. > [snip] The re module has 'findall', which returns a list of strings, and 'finditer', which returns an iterator that yields match objects, so there's a precedent. :-) From benhoyt at gmail.com Fri Jun 27 03:52:43 2014 From: benhoyt at gmail.com (Ben Hoyt) Date: Thu, 26 Jun 2014 21:52:43 -0400 Subject: [Python-Dev] PEP 471 -- os.scandir() function -- a better and faster directory iterator In-Reply-To: <20140627030746.15641d7e@x34f> References: <20140627030746.15641d7e@x34f> Message-ID: > os.listdir() when I worked on "os" module for MicroPython. 
I essentially > did what your PEP suggests - introduced internal generator function > (ilistdir_ex() in > https://github.com/micropython/micropython-lib/blob/master/os/os/__init__.py#L85 > ), in terms of which both os.listdir() and os.walk() are implemented. Nice (though I see the implementation is very *nix specific). > With my MicroPython hat on, os.scandir() would make things only worse. > With current interface, one can either have inefficient implementation > (like CPython chose) or efficient implementation (like MicroPython > chose) - all transparently. os.scandir() supposedly opens up efficient > implementation for everyone, but at the price of bloating API and > introducing heavy-weight objects to wrap info. PEP calls it > "lightweight DirEntry objects", but that cannot be true, because all > Python objects are heavy-weight, especially those which have methods. It's a fair point that os.walk() can be implemented efficiently without adding a new function and API. However, often you'll want more info, like the file size, which scandir() can give you via DirEntry.lstat(), which is free on Windows. So opening up this efficient API is beneficial. In CPython, I think the DirEntry objects are as lightweight as stat_result objects. I'm an embedded developer by background, so I know the constraints here, but I really don't think Python's development should be tailored to fit MicroPython. If os.scandir() is not very efficient on MicroPython, so be it -- 99% of all desktop/server users will gain from it. > It would be better if os.scandir() was specified to return a struct > (named tuple) compatible with return value of os.stat() (with only > fields relevant to underlying readdir()-like system call). The grounds > for that are obvious: it's already existing data interface in module > "os", which is also based on open standard for operating systems - > POSIX, so if one is to expect something about file attributes, it's > what one can reasonably base expectations on. 
Yes, we considered this early on (see the python-ideas and python-dev threads referenced in the PEP), but decided it wasn't a great API to overload stat_result further, and have most of the attributes None or not present on Linux. > Especially that os.stat struct is itself pretty extensible > (https://docs.python.org/3.4/library/os.html#os.stat : "On other Unix > systems (such as FreeBSD), the following attributes may be > available ...", "On Mac OS systems...", - so extra fields can be added > for Windows just the same, if really needed). Yes. Incidentally, I just submitted an (accepted) patch for Python 3.5 that adds the full Win32 file attribute data to stat_result objects on Windows (see https://docs.python.org/3.5/whatsnew/3.5.html#os). However, for scandir() to be useful, you also need the name. My original version of this directory iterator returned two-tuples of (name, stat_result). But most people didn't like the API, and I don't really either. You could overload stat_result with a .name attribute in this case, but it still isn't a nice API to have most of the attributes None, and then you have to test for that, etc. So basically we tweaked the API to do what was best, and ended up with it returning DirEntry objects with is_file() and similar methods. Hope that helps give a bit more context. If you haven't read the relevant python-ideas and python-dev threads, those are interesting too. -Ben From greg at krypto.org Fri Jun 27 04:04:16 2014 From: greg at krypto.org (Gregory P. Smith) Date: Thu, 26 Jun 2014 19:04:16 -0700 Subject: [Python-Dev] PEP 471 -- os.scandir() function -- a better and faster directory iterator In-Reply-To: References: <53ACAC94.1050206@mrabarnett.plus.com> <53ACB02F.4020402@stoneleaf.us> Message-ID: +1 on getting this in for 3.5. If the only objection people are having is the stupid paint color of the name I don't care what it's called! scandir matches the libc API of the same name. iterdir also makes sense to anyone reading it. 
Whoever checks this in can pick one and be done with it. We have other Python APIs with iter in the name and tend not to be trying to mirror C so much these days, so the iterdir folks do have a valid point. I'm not a huge fan of the DirEntry object and the method calls on it instead of simply yielding tuples of (filename, partially_filled_in_stat_result) but I don't *really* care which is used as they both work fine and it is trivial to wrap with another generator expression to turn it into exactly what you want anyways. Python not having the ability to operate on large directories means Python simply cannot be used for common system maintenance tasks. Python being slow to walk a file system due to unnecessary stat calls (often each an entire io op. requiring a disk seek!) due to the existing information that it throws away not being used via listdir is similarly a problem. This addresses both. IMNSHO, it is a single function, it belongs in the os module right next to listdir. -gps On Thu, Jun 26, 2014 at 6:37 PM, Ben Hoyt wrote: > I don't mind iterdir() and would take it :-), but I'll just say why I > chose the name scandir() -- though it wasn't my suggestion originally: > > iterdir() sounds like just an iterator version of listdir(), kinda > like keys() and iterkeys() in Python 2. Whereas in actual fact the > return values are quite different (DirEntry objects vs strings), and > so the name change reflects that difference a little. > > I'm also -1 on windows_wildcard. I think it's asking for trouble, and > wouldn't gain much on Windows in most cases anyway. > > -Ben > > On Thu, Jun 26, 2014 at 7:43 PM, Ethan Furman wrote: > > On 06/26/2014 04:36 PM, Tim Delaney wrote: > >> > >> On 27 June 2014 09:28, MRAB wrote: > >>> > >>> > >>> Personally, I'd prefer the name 'iterdir' because it emphasises that > >>> it's an iterator. > >> > >> > >> Exactly what I was going to post (with the added note that there's an > >> obvious symmetry with listdir). 
> >> > >> +1 for iterdir rather than scandir > >> > >> Other than that: > >> > >> +1 for adding [it] to the stdlib > > > > > > +1 for all of above > > > > -- > > ~Ethan~ > > > > _______________________________________________ > > Python-Dev mailing list > > Python-Dev at python.org > > https://mail.python.org/mailman/listinfo/python-dev > > Unsubscribe: > > https://mail.python.org/mailman/options/python-dev/benhoyt%40gmail.com > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/greg%40krypto.org > -------------- next part -------------- An HTML attachment was scrubbed... URL: From steve at pearwood.info Fri Jun 27 04:08:41 2014 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 27 Jun 2014 12:08:41 +1000 Subject: [Python-Dev] PEP 471 -- os.scandir() function -- a better and faster directory iterator In-Reply-To: <20140627030746.15641d7e@x34f> References: <20140627030746.15641d7e@x34f> Message-ID: <20140627020841.GD13014@ando> On Fri, Jun 27, 2014 at 03:07:46AM +0300, Paul Sokolovsky wrote: > With my MicroPython hat on, os.scandir() would make things only worse. > With current interface, one can either have inefficient implementation > (like CPython chose) or efficient implementation (like MicroPython > chose) - all transparently. os.scandir() supposedly opens up efficient > implementation for everyone, but at the price of bloating API and > introducing heavy-weight objects to wrap info. os.scandir is not part of the Python API, it is not a built-in function. It is part of the CPython standard library. That means (in my opinion) that there is an expectation that other Pythons should provide it, but not an absolute requirement. Especially for the os module, which by definition is platform-specific. In my opinion that means you have four options: 1. 
provide os.scandir, with exactly the same semantics as on CPython; 2. provide os.scandir, but change its semantics to be more lightweight (e.g. return an ordinary tuple, as you already suggest); 3. don't provide os.scandir at all; or 4. do something different depending on whether the platform is Linux or an embedded system. I would consider any of those acceptable for a library feature, but not for a language feature. [...] > But reusing os.stat struct is glaringly not what's proposed. And > it's clear where that comes from - "[DirEntry.]lstat(): like os.lstat(), > but requires no system calls on Windows". Nice, but OS "FooBar" can do > much more than Windows - it has a system call to send a file by email, > right when scanning a directory containing it. So, why not have a > DirEntry.send_by_email(recipient) method? I hear the answer - it's > because CPython strives to support Windows well, while it doesn't care > about "FooBar" OS. Correct. If there is sufficient demand for FooBar, then CPython may support it. Until then, FooBarPython can support it, and offer whatever platform-specific features are needed within its standard library. > And then it again leads to the question I posed several times - where's > the line between "CPython" and "Python"? Is it grounded for CPython to add > (or remove) to Python stdlib something which is useful for its users, > but useless or complicating for other Python implementations? I think so. And other implementations are free to do the same thing. Of course there is an expectation that the standard library of most implementations will be broadly similar, but not that they will be identical. 
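[Editor's aside: code that wants to run unchanged across implementations can treat the proposed function as optional and fall back to listdir(). A minimal sketch — `iter_names` is a hypothetical helper, not part of any stdlib:]

```python
import os

def iter_names(path="."):
    """Yield entry names, using os.scandir() where the platform offers it.

    Falls back to os.listdir() on implementations that don't provide
    scandir (the function is only proposed for CPython 3.5 here).
    """
    scandir = getattr(os, "scandir", None)
    if scandir is not None:
        # scandir yields DirEntry objects; we extract just the names
        for entry in scandir(path):
            yield entry.name
    else:
        # listdir returns plain name strings directly
        for name in os.listdir(path):
            yield name
```

Either branch yields the same name strings, so callers are insulated from which option (1-4 above) an implementation picked.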
I am surprised that both Jython and IronPython offer a non-functioning dis module: you can import it successfully, but if there's a way to actually use it, I haven't found it: steve at orac:~$ jython Jython 2.5.1+ (Release_2_5_1, Aug 4 2010, 07:18:19) [OpenJDK Server VM (Sun Microsystems Inc.)] on java1.6.0_27 Type "help", "copyright", "credits" or "license" for more information. >>> import dis >>> dis.dis(lambda x: x+1) Traceback (most recent call last): File "<stdin>", line 1, in <module> File "/usr/share/jython/Lib/dis.py", line 42, in dis disassemble(x) File "/usr/share/jython/Lib/dis.py", line 64, in disassemble linestarts = dict(findlinestarts(co)) File "/usr/share/jython/Lib/dis.py", line 183, in findlinestarts byte_increments = [ord(c) for c in code.co_lnotab[0::2]] AttributeError: 'tablecode' object has no attribute 'co_lnotab' IronPython gives a different exception: steve at orac:~$ ipy IronPython 2.6 Beta 2 DEBUG (2.6.0.20) on .NET 2.0.50727.1433 Type "help", "copyright", "credits" or "license" for more information. >>> import dis >>> dis.dis(lambda x: x+1) Traceback (most recent call last): TypeError: don't know how to disassemble code objects It's quite annoying, I would rather they had just removed the module altogether. Better still would have been to disassemble code objects to whatever byte code the Java and .Net platforms use. But there's surely no requirement to disassemble to CPython byte code! 
-- Steven From steve at pearwood.info Fri Jun 27 04:21:15 2014 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 27 Jun 2014 12:21:15 +1000 Subject: [Python-Dev] PEP 471 -- os.scandir() function -- a better and faster directory iterator In-Reply-To: References: <53ACAC94.1050206@mrabarnett.plus.com> <53ACB02F.4020402@stoneleaf.us> Message-ID: <20140627022115.GE13014@ando> On Thu, Jun 26, 2014 at 09:37:50PM -0400, Ben Hoyt wrote: > I don't mind iterdir() and would take it :-), but I'll just say why I > chose the name scandir() -- though it wasn't my suggestion originally: > > iterdir() sounds like just an iterator version of listdir(), kinda > like keys() and iterkeys() in Python 2. Whereas in actual fact the > return values are quite different (DirEntry objects vs strings), and > so the name change reflects that difference a little. +1 I think that's a good objective reason to prefer scandir, which suits me, because my subjective opinion is that "iterdir" is an inelegant and less than attractive name. -- Steven From v+python at g.nevcal.com Fri Jun 27 04:43:34 2014 From: v+python at g.nevcal.com (Glenn Linderman) Date: Thu, 26 Jun 2014 19:43:34 -0700 Subject: [Python-Dev] PEP 471 -- os.scandir() function -- a better and faster directory iterator In-Reply-To: References: Message-ID: <53ACDA56.9050803@g.nevcal.com> I'm generally +1, with opinions noted below on these two topics. On 6/26/2014 3:59 PM, Ben Hoyt wrote: > Should there be a way to access the full path? > ---------------------------------------------- > > Should ``DirEntry``'s have a way to get the full path without using > ``os.path.join(path, entry.name)``? This is a pretty common pattern, > and it may be useful to add pathlib-like ``str(entry)`` functionality. > This functionality has also been requested in `issue 13`_ on GitHub. > > .. _`issue 13`:https://github.com/benhoyt/scandir/issues/13 +1 > Should it expose Windows wildcard functionality? 
> ------------------------------------------------ > > Should ``scandir()`` have a way of exposing the wildcard functionality > in the Windows ``FindFirstFile`` / ``FindNextFile`` functions? The > scandir module on GitHub exposes this as a ``windows_wildcard`` > keyword argument, allowing Windows power users the option to pass a > custom wildcard to ``FindFirstFile``, which may avoid the need to use > ``fnmatch`` or similar on the resulting names. It is named the > unwieldy ``windows_wildcard`` to remind you you're writing > power-user, Windows-only code if you use it. > > This boils down to whether ``scandir`` should be about exposing all of > the system's directory iteration features, or simply providing a fast, > simple, cross-platform directory iteration API. > > This PEP's author votes for not including ``windows_wildcard`` in the > standard library version, because even though it could be useful in > rare cases (say the Windows Dropbox client?), it'd be too easy to use > it just because you're a Windows developer, and create code that is > not cross-platform. Because another common pattern is to check whether a name matches a pattern, I think it would be good to have a feature that provides such. I do that in my own private directory listing extensions, and some command-line tools also expose it to the user. Where exposed to the user, I use -p windows-pattern and -P regexp. My implementation converts the windows-pattern to a regexp, and then uses common code, but for this particular API, because the windows_wildcard can be optimized by the Windows API call used, it would make more sense to pass windows_wildcard directly to FindFirst on Windows, but on *nix convert it to a regexp. Both Windows and *nix would call re to process pattern matches except for the case on Windows of having a Windows pattern passed in. The alternate parameter could simply be called wildcard, and would be a regexp. If desired, other flavors of wildcard bsd_wildcard? 
could also be implemented, but I'm not sure there are any benefits to them, as there are, as far as I am aware, no optimizations for those patterns in those systems. -------------- next part -------------- An HTML attachment was scrubbed... URL: From p.f.moore at gmail.com Fri Jun 27 08:47:21 2014 From: p.f.moore at gmail.com (Paul Moore) Date: Fri, 27 Jun 2014 07:47:21 +0100 Subject: [Python-Dev] PEP 471 -- os.scandir() function -- a better and faster directory iterator In-Reply-To: References: Message-ID: On 26 June 2014 23:59, Ben Hoyt wrote: > Would love feedback on the PEP, but also of course on the proposal itself. A solid +1 from me. Some specific points: - I'm in favour of it being in the os module. It's more discoverable there, as well as the other reasons mentioned. - I prefer scandir as the name, for the reason you gave (the output isn't the same as an iterator version of listdir) - I'm mildly against windows_wildcard (even though I'm a windows user) - You mention the caching behaviour of DirEntry objects. The limitations should be clearly covered in the final docs, as it's the sort of thing people will get wrong otherwise. Paul From bkabrda at redhat.com Fri Jun 27 09:07:23 2014 From: bkabrda at redhat.com (Bohuslav Kabrda) Date: Fri, 27 Jun 2014 03:07:23 -0400 (EDT) Subject: [Python-Dev] Binary CPython distribution for Linux In-Reply-To: <53AC679B.1000408@gmail.com> References: <53AC679B.1000408@gmail.com> Message-ID: <1792609645.45624621.1403852843057.JavaMail.zimbra@redhat.com> ----- Original Message ----- > While much of the opposition to dropping Python <2.7 stems from the RHEL > community (they still have 2.4 in extended support and 2.7 wasn't in a > release until a few weeks ago), a common objection from the users is "I > can't install a different Python" or "it's too difficult to install a > different Python." 
The former is a legit complaint - if you are on > shared hosting and don't have root, as easy as it is to add an alternate > package repository that provides 2.7 (or newer), you don't have the > permissions so you can't do it. It's not true that 2.7 wasn't released until few weeks ago. It was released few weeks ago as part of RHEL 7, but Red Hat has been shipping Red Hat Software Collections (RHSCL) 1.0, that contain Python 2.7 and Python 3.3, for almost a year now [1] - RHSCL is installable on RHEL 6; RHSCL 1.1 (also with 2.7 and 3.3) has been released few weeks ago and is supported on RHEL 6 and 7. Also, these collections now have their community rebuilds at [2], so you can just download them without needing to talk to Red Hat at all. But yeah, these are all RPMs, so you have to be root to install them. > I'd like to propose a solution to this problem: a pre-built distribution > of CPython for Linux available via www.python.org in the list of > downloads for a particular release [5]. This distribution could be > downloaded and unarchived into the user's home directory and users could > start running it immediately by setting an environment variable or two, > creating a symlink, or even running a basic installer script. This would > hopefully remove the hurdles of obtaining a (sane) Python distribution > on Linux. This would allow projects to more easily drop end-of-life > Python versions and would speed adoption of modern Python, including > Python 3 (because porting is much easier if you only have to target 2.7). > > I understand there may be technical challenges with doing this for some > distributions and with producing a universal binary distribution. I > would settle for a binary distribution that was targeted towards RHEL > users and variant distros, as that is the user population that I > perceive to be the most conservative and responsible for holding modern > Python adoption back. 
Speaking with my Fedora/RHEL/RHSCL Python maintainer's hat on, prebuilding Python is not as easy a task as it may seem :) Someone has to write the build scripts (e.g. a sort of specfile, but rpm/specfile wouldn't really work for you, since you want to install in users' home dirs). Someone has to update them when a new Python comes out, so in the worst case you end up with slightly different build scripts for different versions of Python. Someone has to do rebuilds when there is a CVE. Or a bug. Or a user requests a feature that makes sense. Someone has to do that for *each packaged version* - and each packaged version needs to be maintained for some amount of time so that it all actually makes sense. Maintaining a prebuilt distribution of Python is a time-consuming task even if you do it just for one Linux distro. If you want to maintain a *universal* prebuilt Python distribution, then you'll find out that it's a) undoable, b) consumes so many resources and is so fragile that it's probably not worth it. You could just bundle all Python dependencies into your distribution to make it "easier", but that would just make the result grow in size (perhaps significantly) and you would then also need to update/bugfix/securityfix the bundled dependencies (which would consume even more time). Please don't take this as a criticism of your ideas, I see what you're trying to solve. I just think the way you're trying to solve it is unachievable or would consume so many community resources that it would end up unmaintained and buggy most of the time. 
[1] http://developerblog.redhat.com/2013/09/12/rhscl1-ga/ [2] https://www.softwarecollections.org/en/scls/ From victor.stinner at gmail.com Fri Jun 27 09:44:17 2014 From: victor.stinner at gmail.com (Victor Stinner) Date: Fri, 27 Jun 2014 09:44:17 +0200 Subject: [Python-Dev] PEP 471 -- os.scandir() function -- a better and faster directory iterator In-Reply-To: References: Message-ID: Hi, You wrote a great PEP Ben, thanks :-) But it's now time for comments! > But the underlying system calls -- ``FindFirstFile`` / > ``FindNextFile`` on Windows and ``readdir`` on Linux and OS X -- What about FreeBSD, OpenBSD, NetBSD, Solaris, etc. They don't provide readdir? You should add a link to FindFirstFile doc: http://msdn.microsoft.com/en-us/library/windows/desktop/aa364418%28v=vs.85%29.aspx It looks like WIN32_FIND_DATA has a dwFileAttributes field. So we should mimic stat_result's recent addition: the new stat_result.file_attributes field. Add DirEntry.file_attributes which would only be available on Windows. The Windows structure also contains FILETIME ftCreationTime; FILETIME ftLastAccessTime; FILETIME ftLastWriteTime; DWORD nFileSizeHigh; DWORD nFileSizeLow; It would be nice to expose them as well. I'm no longer surprised that the exact API is different depending on the OS for functions of the os module. > * Instead of bare filename strings, it returns lightweight > ``DirEntry`` objects that hold the filename string and provide > simple methods that allow access to the stat-like data the operating > system returned. Does your implementation use a free list to avoid the cost of memory allocation? A short free list of 10 or maybe just 1 may help. The free list may be stored directly in the generator object. > ``scandir()`` yields a ``DirEntry`` object for each file and directory > in ``path``. Just like ``listdir``, the ``'.'`` and ``'..'`` > pseudo-directories are skipped, and the entries are yielded in > system-dependent order. 
Each ``DirEntry`` object has the following > attributes and methods: Does it also support bytes filenames on UNIX? Python now supports undecodable filenames thanks to PEP 383 (surrogateescape). I prefer to use the same type for filenames on Linux and Windows, so Unicode is better. But some users might prefer bytes for other reasons. > The ``DirEntry`` attribute and method names were chosen to be the same > as those in the new ``pathlib`` module for consistency. Great! That's exactly what I expected :-) Consistency with other modules. > Notes on caching > ---------------- > > The ``DirEntry`` objects are relatively dumb -- the ``name`` attribute > is obviously always cached, and the ``is_X`` and ``lstat`` methods > cache their values (immediately on Windows via ``FindNextFile``, and > on first use on Linux / OS X via a ``stat`` call) and never refetch > from the system. > > For this reason, ``DirEntry`` objects are intended to be used and > thrown away after iteration, not stored in long-lived data structures > and the methods called again and again. > > If a user wants to do that (for example, for watching a file's size > change), they'll need to call the regular ``os.lstat()`` or > ``os.path.getsize()`` functions which force a new system call each > time. Crazy idea: would it be possible to "convert" a DirEntry object to a pathlib.Path object without losing the cache? I guess that pathlib.Path expects a full stat_result object. 
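[Editor's aside: the caching behaviour the PEP describes can be sketched as a tiny pure-Python model. `CachedEntry` is a hypothetical toy class, not the PEP's actual implementation: the lstat result is fetched from the OS once, on first use, and every later call answers from the cache.]

```python
import os

class CachedEntry:
    """Toy model of DirEntry caching: ``name`` is always available,
    ``lstat()`` hits the OS at most once and then reuses the result."""

    def __init__(self, dirpath, name):
        self.name = name
        self._path = os.path.join(dirpath, name)
        self._lstat = None  # filled in lazily on first use

    def lstat(self):
        if self._lstat is None:     # first call: one os.lstat() system call
            self._lstat = os.lstat(self._path)
        return self._lstat          # later calls: cached, possibly stale

# Demonstrate on the current directory, if it has any entries.
names = os.listdir(".")
if names:
    entry = CachedEntry(".", names[0])
    first = entry.lstat().st_size   # system call happens here
    second = entry.lstat().st_size  # served from the cache
    assert first == second
```

The flip side, as the quoted "Notes on caching" text says, is staleness: if the file changes after the first `lstat()`, the cached value does not reflect it.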
> Or, for getting the total size of files in a directory tree -- showing > use of the ``DirEntry.lstat()`` method:: > > def get_tree_size(path): > """Return total size of files in path and subdirs.""" > size = 0 > for entry in scandir(path): > if entry.is_dir(): > sub_path = os.path.join(path, entry.name) > size += get_tree_size(sub_path) > else: > size += entry.lstat().st_size > return size > > Note that ``get_tree_size()`` will get a huge speed boost on Windows, > because no extra stat calls are needed, but on Linux and OS X the size > information is not returned by the directory iteration functions, so > this function won't gain anything there. I don't understand how you can build a full lstat() result without really calling stat. I see that WIN32_FIND_DATA contains the size, but here you call lstat(). If you know that it's not a symlink, you already know the size, but you still have to call stat() to retrieve all fields required to build a stat_result, no? > Support > ======= > > The scandir module on GitHub has been forked and used quite a bit (see > "Use in the wild" in this PEP), Do you plan to continue to maintain your module for Python < 3.5, but upgrade your module for the final PEP? > Should scandir be in its own module? > ------------------------------------ > > Should the function be included in the standard library in a new > module, ``scandir.scandir()``, or just as ``os.scandir()`` as > discussed? The preference of this PEP's author (Ben Hoyt) would be > ``os.scandir()``, as it's just a single function. Yes, put it in the os module which is already bloated :-) > Should there be a way to access the full path? > ---------------------------------------------- > > Should ``DirEntry``'s have a way to get the full path without using > ``os.path.join(path, entry.name)``? This is a pretty common pattern, > and it may be useful to add pathlib-like ``str(entry)`` functionality. > This functionality has also been requested in `issue 13`_ on GitHub. > > .. 
_`issue 13`: https://github.com/benhoyt/scandir/issues/13 I think that it would be very convenient to store the directory name in the DirEntry. It should be light, it's just a reference. And provide a fullname() method which would just return os.path.join(path, entry.name) without trying to resolve path to get an absolute path. > Should it expose Windows wildcard functionality? > ------------------------------------------------ > > Should ``scandir()`` have a way of exposing the wildcard functionality > in the Windows ``FindFirstFile`` / ``FindNextFile`` functions? The > scandir module on GitHub exposes this as a ``windows_wildcard`` > keyword argument, allowing Windows power users the option to pass a > custom wildcard to ``FindFirstFile``, which may avoid the need to use > ``fnmatch`` or similar on the resulting names. It is named the > unwieldy ``windows_wildcard`` to remind you you're writing > power-user, Windows-only code if you use it. > > This boils down to whether ``scandir`` should be about exposing all of > the system's directory iteration features, or simply providing a fast, > simple, cross-platform directory iteration API. Would it be hard to implement the wildcard feature on UNIX to compare the performance of scandir('*.jpg') with and without the wildcard built into os.scandir? I implemented it in C for the tracemalloc module (Filter object): http://hg.python.org/features/tracemalloc Get the revision 69fd2d766005 and search for match_filename_joker() in Modules/_tracemalloc.c. The function matches the filename backward because in most cases, the last letter is enough to reject a filename (ex: "*.jpg" => reject filenames not ending with "g"). The filename is normalized before matching the pattern: converted to lowercase and / is replaced with \ on Windows. It was decided to drop the Filter object to keep the tracemalloc module as simple as possible. Charles-François was not convinced by the speedup. 
But the tracemalloc case is different because the OS didn't provide an API for that. Victor From nad at acm.org Fri Jun 27 11:14:52 2014 From: nad at acm.org (Ned Deily) Date: Fri, 27 Jun 2014 02:14:52 -0700 Subject: [Python-Dev] buildbot.python.org down? Message-ID: The buildbot web site seems to have been down for some hours and still is as of 0915 UTC. I'm not sure who is watching over it but I'll ping the infrastructure team as well. -- Ned Deily, nad at acm.org From ncoghlan at gmail.com Fri Jun 27 12:54:18 2014 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 27 Jun 2014 20:54:18 +1000 Subject: [Python-Dev] Binary CPython distribution for Linux In-Reply-To: <1792609645.45624621.1403852843057.JavaMail.zimbra@redhat.com> References: <53AC679B.1000408@gmail.com> <1792609645.45624621.1403852843057.JavaMail.zimbra@redhat.com> Message-ID: On 27 Jun 2014 17:33, "Bohuslav Kabrda" wrote: > > It's not true that 2.7 wasn't released until a few weeks ago. It was released a few weeks ago as part of RHEL 7, but Red Hat has been shipping Red Hat Software Collections (RHSCL) 1.0, which contains Python 2.7 and Python 3.3, for almost a year now [1] - RHSCL is installable on RHEL 6; RHSCL 1.1 (also with 2.7 and 3.3) was released a few weeks ago and is supported on RHEL 6 and 7. Also, these collections now have their community rebuilds at [2], so you can just download them without needing to talk to Red Hat at all. But yeah, these are all RPMs, so you have to be root to install them. Indeed, while there are still some rough edges, software collections look like the best approach to doing maintainable system installs of Python runtimes other than the system Python into Fedora/RHEL/CentOS et al (and I say that while wearing both my upstream and downstream hats).
Collections solve this problem in a general (rather than CPython specific) way, since they can be used to get upgraded versions of language runtimes, databases, web servers, etc, all without risking the stability of the OS itself. I hope to see someone put together collections for PyPy and PyPy3 as well. The approaches used for runtime isolation of software collections should also be applicable to Debian systems, but (as far as I am aware) the tooling to build them as debs rather than RPMs doesn't exist yet. > Please don't take this as a criticism of your ideas, I see what you're trying to solve. I just think the way you're trying to solve it is unachievable or would consume so much community resources, that it would end up unmaintained and buggy most of the time. For prebuilt userland installs on Linux, I think "miniconda" is the current best available approach. It has its challenges (especially around its handling of security concerns), but it's designed to offer a full cross platform package management system that makes it well suited to the task of managing prebuilt language runtimes in user space. Cheers, Nick. > > -- > Regards, > Bohuslav "Slavek" Kabrda. > > [1] http://developerblog.redhat.com/2013/09/12/rhscl1-ga/ > [2] https://www.softwarecollections.org/en/scls/ > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ncoghlan%40gmail.com -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From chris.barker at noaa.gov Fri Jun 27 02:09:01 2014 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Thu, 26 Jun 2014 17:09:01 -0700 Subject: [Python-Dev] PEP 471 -- os.scandir() function -- a better and faster directory iterator In-Reply-To: References: <53ACAC94.1050206@mrabarnett.plus.com> Message-ID: <3153899029119710294@unknownmsgid> On Jun 26, 2014, at 4:38 PM, Tim Delaney wrote: On 27 June 2014 09:28, MRAB wrote: > > -1 for windows_wildcard (it would be an attractive nuisance to write windows-only code) Could you emulate it on other platforms? +1 on the rest of it. -Chris -------------- next part -------------- An HTML attachment was scrubbed... URL: From pmiscml at gmail.com Fri Jun 27 13:48:17 2014 From: pmiscml at gmail.com (Paul Sokolovsky) Date: Fri, 27 Jun 2014 14:48:17 +0300 Subject: [Python-Dev] PEP 471 -- os.scandir() function -- a better and faster directory iterator In-Reply-To: References: <20140627030746.15641d7e@x34f> Message-ID: <20140627144817.2290a544@x34f> Hello, On Thu, 26 Jun 2014 21:52:43 -0400 Ben Hoyt wrote: [] > It's a fair point that os.walk() can be implemented efficiently > without adding a new function and API. However, often you'll want more > info, like the file size, which scandir() can give you via > DirEntry.lstat(), which is free on Windows. So opening up this > efficient API is beneficial. > > In CPython, I think the DirEntry objects are as lightweight as > stat_result objects. > > I'm an embedded developer by background, so I know the constraints > here, but I really don't think Python's development should be tailored > to fit MicroPython. If os.scandir() is not very efficient on > MicroPython, so be it -- 99% of all desktop/server users will gain > from it. Surely, tailoring Python to MicroPython's needs is not at all what I suggest. It was an example of an alternative implementation which optimizes os.walk() without the need for any additional public module APIs.
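For reference, the kind of scandir()-based os.walk() being discussed can be sketched roughly as follows (a simplified version using the os.scandir() spelling the PEP proposes; it ignores the real os.walk()'s topdown, onerror, and followlinks options):

```python
import os

def simple_walk(top):
    """Bare-bones sketch of os.walk() on top of a scandir()-style API.

    Classifies entries without any per-entry stat() call on platforms
    where the directory iteration already returns the file type.
    """
    dirs, files = [], []
    for entry in os.scandir(top):
        if entry.is_dir(follow_symlinks=False):
            dirs.append(entry.name)
        else:
            files.append(entry.name)
    yield top, dirs, files
    for name in dirs:
        # Recurse into subdirectories after yielding the current level.
        yield from simple_walk(os.path.join(top, name))
```

This is only a sketch of the technique, not the stdlib implementation; the real os.walk() also has to cope with permission errors and symlink loops.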
Vice versa, the high-level nature of an API call like os.walk() and the underspecification of low-level details (like which function is implemented in terms of which others) allow MicroPython to provide an optimized implementation even with its resource constraints. So, the power of high-level interfaces and underspecification should not be underestimated ;-). But I don't want to argue that os.scandir() is "not needed", because that's hardly productive. Something I'd like to prototype in uPy and ideally lead further up to PEP status is to add iterator-based string methods, and I can pretty much expect a "we lived without it" response, so I don't want to go the same way regarding the addition of other iterator-based APIs - it's clear that more iterator/generator-based APIs are a good direction for Python to evolve. > > It would be better if os.scandir() was specified to return a struct > > (named tuple) compatible with return value of os.stat() (with only > > fields relevant to underlying readdir()-like system call). The > > grounds for that are obvious: it's already existing data interface > > in module "os", which is also based on open standard for operating > > systems - POSIX, so if one is to expect something about file > > attributes, it's what one can reasonably base expectations on. > > Yes, we considered this early on (see the python-ideas and python-dev > threads referenced in the PEP), but decided it wasn't a great API to > overload stat_result further, and have most of the attributes None or > not present on Linux. > [] > > However, for scandir() to be useful, you also need the name. My > original version of this directory iterator returned two-tuples of > (name, stat_result). But most people didn't like the API, and I don't > really either. You could overload stat_result with a .name attribute > in this case, but it still isn't a nice API to have most of the > attributes None, and then you have to test for that, etc.
Yes, returning (name, stat_result) would be my first motion too; I don't see why someone wouldn't like a pair of two values, with each value of obvious type and semantics within the "os" module. Regarding the stat result, os.stat() provides full information about a file, and intuitively, one may expect that os.scandir() would provide a subset of that info, asymptotically reaching the volume of what os.stat() may provide, depending on OS capabilities. So, if a truly OS-independent interface is wanted to salvage more data from directory scanning, using the os.stat struct as the data interface is hard to ignore. But well, if it was rejected already, what can be said? Perhaps, at least, the PEP could be extended to explicitly mention other approaches which were discussed and rejected, not just link to a discussion archive (from experience with reading other PEPs, they oftentimes contained such subsections, so I hope this suggestion is not ungrounded). > > So basically we tweaked the API to do what was best, and ended up with > it returning DirEntry objects with is_file() and similar methods. > > Hope that helps give a bit more context. If you haven't read the > relevant python-ideas and python-dev threads, those are interesting > too. > > -Ben -- Best regards, Paul mailto:pmiscml at gmail.com From pmiscml at gmail.com Fri Jun 27 14:13:13 2014 From: pmiscml at gmail.com (Paul Sokolovsky) Date: Fri, 27 Jun 2014 15:13:13 +0300 Subject: [Python-Dev] PEP 471 -- os.scandir() function -- a better and faster directory iterator In-Reply-To: <20140627020841.GD13014@ando> References: <20140627030746.15641d7e@x34f> <20140627020841.GD13014@ando> Message-ID: <20140627151313.6f2ff34d@x34f> Hello, On Fri, 27 Jun 2014 12:08:41 +1000 Steven D'Aprano wrote: > On Fri, Jun 27, 2014 at 03:07:46AM +0300, Paul Sokolovsky wrote: > > > With my MicroPython hat on, os.scandir() would make things only
With current interface, one can either have inefficient > > implementation (like CPython chose) or efficient implementation > > (like MicroPython chose) - all transparently. os.scandir() > > supposedly opens up efficient implementation for everyone, but at > > the price of bloating API and introducing heavy-weight objects to > > wrap info. > > os.scandir is not part of the Python API, it is not a built-in > function. It is part of the CPython standard library. Ok, so standard library also has API, and that's the API being discussed. > That means (in > my opinion) that there is an expectation that other Pythons should > provide it, but not an absolute requirement. Especially for the os > module, which by definition is platform-specific. Yes, that's intuitive, but not strict and formal, so is subject to interpretations. As a developer working on alternative Python implementation, I'd like to have better understanding of what needs to be done to be a compliant implementation (in particular, because I need to pass that info down to the users). So, I was told that https://docs.python.org/3/reference/index.html describes Python, not CPython. Next step is figuring out whether https://docs.python.org/3/library/index.html describes Python or CPython, and if the latter, how to separate Python's stdlib essence from extended library CPython provides? > In my opinion that > means you have four options: > > 1. provide os.scandir, with exactly the same semantics as on CPython; > > 2. provide os.scandir, but change its semantics to be more > lightweight (e.g. return an ordinary tuple, as you already suggest); > > 3. don't provide os.scandir at all; or > > 4. do something different depending on whether the platform is Linux > or an embedded system. > > I would consider any of those acceptable for a library feature, but > not for a language feature. Good, thanks. 
If that represents the shared opinion of (C)Python developers (so, there won't be claims like "MicroPython is not Python because it doesn't provide os.scandir()" (or hundreds of other missing stdlib functions ;-) )) that's good enough already. With that in mind, I wish that any Python implementation were as complete and as efficient as possible, and one way to achieve that is to not add stdlib entities without real need (be it more API calls or more data types). So, I'm glad to know that os.scandir() passed through Occam's razor in this respect and is specified the way it is really for the common good. [] -- Best regards, Paul mailto:pmiscml at gmail.com From j.wielicki at sotecware.net Fri Jun 27 12:28:27 2014 From: j.wielicki at sotecware.net (Jonas Wielicki) Date: Fri, 27 Jun 2014 12:28:27 +0200 Subject: [Python-Dev] PEP 471 -- os.scandir() function -- a better and faster directory iterator In-Reply-To: <53ACCDEE.9070906@mrabarnett.plus.com> References: <53ACAC94.1050206@mrabarnett.plus.com> <53ACB02F.4020402@stoneleaf.us> <53ACCDEE.9070906@mrabarnett.plus.com> Message-ID: <53AD474B.4020204@sotecware.net> On 27.06.2014 03:50, MRAB wrote: > On 2014-06-27 02:37, Ben Hoyt wrote: >> I don't mind iterdir() and would take it :-), but I'll just say why I >> chose the name scandir() -- though it wasn't my suggestion originally: >> >> iterdir() sounds like just an iterator version of listdir(), kinda >> like keys() and iterkeys() in Python 2. Whereas in actual fact the >> return values are quite different (DirEntry objects vs strings), and >> so the name change reflects that difference a little. >> > [snip] > > The re module has 'findall', which returns a list of strings, and > 'finditer', which returns an iterator that yields match objects, so > there's a precedent. :-) A bad precedent in my opinion though -- I was just recently bitten by that, and I find it very untypical for Python.
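The asymmetry in question is easy to demonstrate:

```python
import re

text = "cat cot cut"

# findall() returns a list of matched strings...
as_strings = re.findall(r"c.t", text)

# ...while finditer() yields match objects, which must be unpacked
# with .group() to get the same strings back.
as_matches = [m.group() for m in re.finditer(r"c.t", text)]

assert as_strings == as_matches == ["cat", "cot", "cut"]
```

(Note that findall() returns tuples rather than full-match strings as soon as the pattern contains capturing groups, which adds to the asymmetry.)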
regards, Jonas From j.wielicki at sotecware.net Fri Jun 27 12:44:35 2014 From: j.wielicki at sotecware.net (Jonas Wielicki) Date: Fri, 27 Jun 2014 12:44:35 +0200 Subject: [Python-Dev] PEP 471 -- os.scandir() function -- a better and faster directory iterator In-Reply-To: References: Message-ID: <53AD4B13.8070100@sotecware.net> On 27.06.2014 00:59, Ben Hoyt wrote: > Specifics of proposal > ===================== > [snip] Each ``DirEntry`` object has the following > attributes and methods: > [snip] > Notes on caching > ---------------- > > The ``DirEntry`` objects are relatively dumb -- the ``name`` attribute > is obviously always cached, and the ``is_X`` and ``lstat`` methods > cache their values (immediately on Windows via ``FindNextFile``, and > on first use on Linux / OS X via a ``stat`` call) and never refetch > from the system. I find this behaviour a bit misleading: using methods and having them return cached results. How much (implementation and/or performance and/or memory) overhead would be incurred by using property-like access here? I think this would underline the static nature of the data. This would break the semantics with respect to pathlib, but they're only marginally equal anyways -- and as far as I understand it, pathlib won't cache, so I think this has a fair point here. regards, jwi From status at bugs.python.org Fri Jun 27 18:07:57 2014 From: status at bugs.python.org (Python tracker) Date: Fri, 27 Jun 2014 18:07:57 +0200 (CEST) Subject: [Python-Dev] Summary of Python tracker Issues Message-ID: <20140627160757.D267E56A2F@psf.upfronthosting.co.za> ACTIVITY SUMMARY (2014-06-20 - 2014-06-27) Python tracker at http://bugs.python.org/ To view or respond to any of the issues listed below, click on the issue. Do NOT respond to this message.
Issues counts and deltas: open 4643 (-12) closed 29004 (+72) total 33647 (+60) Open issues with patches: 2162 Issues opened (50) ================== #6916: Remove deprecated items from asynchat http://bugs.python.org/issue6916 reopened by ezio.melotti #10312: intcatcher() can deadlock http://bugs.python.org/issue10312 reopened by Claudiu.Popa #21817: `concurrent.futures.ProcessPoolExecutor` swallows tracebacks http://bugs.python.org/issue21817 opened by cool-RR #21818: cookielib documentation references Cookie module, not cookieli http://bugs.python.org/issue21818 opened by Ajtag #21820: unittest: unhelpful truncating of long strings. http://bugs.python.org/issue21820 opened by cjw296 #21821: The function cygwinccompiler.is_cygwingcc leads to FileNotFoun http://bugs.python.org/issue21821 opened by paugier #21822: KeyboardInterrupt during Thread.join hangs that Thread http://bugs.python.org/issue21822 opened by tupl #21825: Embedding-Python example code from documentation crashes http://bugs.python.org/issue21825 opened by Pat.Le.Cat #21826: Performance issue (+fix) AIX ctypes.util with no /sbin/ldconfi http://bugs.python.org/issue21826 opened by tw.bert #21827: textwrap.dedent() fails when largest common whitespace is a su http://bugs.python.org/issue21827 opened by robertjli #21830: ssl.wrap_socket fails on Windows 7 when specifying ca_certs http://bugs.python.org/issue21830 opened by David.M.Noriega #21833: Fix unicodeless build of Python http://bugs.python.org/issue21833 opened by serhiy.storchaka #21834: Fix a number of tests in unicodeless build http://bugs.python.org/issue21834 opened by serhiy.storchaka #21835: Fix Tkinter in unicodeless build http://bugs.python.org/issue21835 opened by serhiy.storchaka #21836: Fix sqlite3 in unicodeless build http://bugs.python.org/issue21836 opened by serhiy.storchaka #21837: Fix tarfile in unicodeless build http://bugs.python.org/issue21837 opened by serhiy.storchaka #21838: Fix ctypes in unicodeless build 
http://bugs.python.org/issue21838 opened by serhiy.storchaka #21839: Fix distutils in unicodeless build http://bugs.python.org/issue21839 opened by serhiy.storchaka #21840: Fix os.path in unicodeless build http://bugs.python.org/issue21840 opened by serhiy.storchaka #21841: Fix xml.sax in unicodeless build http://bugs.python.org/issue21841 opened by serhiy.storchaka #21842: Fix IDLE in unicodeless build http://bugs.python.org/issue21842 opened by serhiy.storchaka #21843: Fix doctest in unicodeless build http://bugs.python.org/issue21843 opened by serhiy.storchaka #21844: Fix HTMLParser in unicodeless build http://bugs.python.org/issue21844 opened by serhiy.storchaka #21845: Fix plistlib in unicodeless build http://bugs.python.org/issue21845 opened by serhiy.storchaka #21846: Fix zipfile in unicodeless build http://bugs.python.org/issue21846 opened by serhiy.storchaka #21847: Fix xmlrpc in unicodeless build http://bugs.python.org/issue21847 opened by serhiy.storchaka #21848: Fix logging in unicodeless build http://bugs.python.org/issue21848 opened by serhiy.storchaka #21849: Fix multiprocessing for non-ascii data http://bugs.python.org/issue21849 opened by serhiy.storchaka #21850: Fix httplib and SimpleHTTPServer in unicodeless build http://bugs.python.org/issue21850 opened by serhiy.storchaka #21851: Fix gettext in unicodeless build http://bugs.python.org/issue21851 opened by serhiy.storchaka #21852: Fix optparse in unicodeless build http://bugs.python.org/issue21852 opened by serhiy.storchaka #21853: Fix inspect in unicodeless build http://bugs.python.org/issue21853 opened by serhiy.storchaka #21854: Fix cookielib in unicodeless build http://bugs.python.org/issue21854 opened by serhiy.storchaka #21855: Fix decimal in unicodeless build http://bugs.python.org/issue21855 opened by serhiy.storchaka #21856: memoryview: no overflow on large slice values (start, stop, st http://bugs.python.org/issue21856 opened by haypo #21857: assert that functions clearing the current 
exception are not c http://bugs.python.org/issue21857 opened by haypo #21859: Add Python implementation of FileIO http://bugs.python.org/issue21859 opened by serhiy.storchaka #21860: Correct FileIO docstrings http://bugs.python.org/issue21860 opened by serhiy.storchaka #21861: io class name are hardcoded in reprs http://bugs.python.org/issue21861 opened by serhiy.storchaka #21862: cProfile command-line should accept "-m module_name" as an alt http://bugs.python.org/issue21862 opened by pitrou #21863: Display module names of C functions in cProfile http://bugs.python.org/issue21863 opened by pitrou #21864: Error in documentation of point 9.8 'Exceptions are classes to http://bugs.python.org/issue21864 opened by Peibolvig #21865: Improve invalid category exception for warnings.filterwarnings http://bugs.python.org/issue21865 opened by berker.peksag #21866: zipfile.ZipFile.close() doesn't respect allowZip64 http://bugs.python.org/issue21866 opened by bgilbert #21867: Turtle returns TypeError when undobuffer is set to 0 (aka no u http://bugs.python.org/issue21867 opened by Lita.Cho #21868: Tbuffer in turtle allows negative size http://bugs.python.org/issue21868 opened by Lita.Cho #21869: Clean up quopri, correct method names encodestring and decodes http://bugs.python.org/issue21869 opened by orsenthil #21871: Python 2.7.7 regression in mimetypes read_windows_registry http://bugs.python.org/issue21871 opened by agolde #21872: LZMA library sometimes fails to decompress a file http://bugs.python.org/issue21872 opened by vnummela #21874: test_strptime fails on rhel/centos/fedora systems http://bugs.python.org/issue21874 opened by boblfoot Most recent 15 issues with no replies (15) ========================================== #21874: test_strptime fails on rhel/centos/fedora systems http://bugs.python.org/issue21874 #21871: Python 2.7.7 regression in mimetypes read_windows_registry http://bugs.python.org/issue21871 #21865: Improve invalid category exception for 
warnings.filterwarnings http://bugs.python.org/issue21865 #21861: io class name are hardcoded in reprs http://bugs.python.org/issue21861 #21859: Add Python implementation of FileIO http://bugs.python.org/issue21859 #21855: Fix decimal in unicodeless build http://bugs.python.org/issue21855 #21854: Fix cookielib in unicodeless build http://bugs.python.org/issue21854 #21853: Fix inspect in unicodeless build http://bugs.python.org/issue21853 #21852: Fix optparse in unicodeless build http://bugs.python.org/issue21852 #21851: Fix gettext in unicodeless build http://bugs.python.org/issue21851 #21850: Fix httplib and SimpleHTTPServer in unicodeless build http://bugs.python.org/issue21850 #21847: Fix xmlrpc in unicodeless build http://bugs.python.org/issue21847 #21846: Fix zipfile in unicodeless build http://bugs.python.org/issue21846 #21845: Fix plistlib in unicodeless build http://bugs.python.org/issue21845 #21843: Fix doctest in unicodeless build http://bugs.python.org/issue21843 Most recent 15 issues waiting for review (15) ============================================= #21868: Tbuffer in turtle allows negative size http://bugs.python.org/issue21868 #21865: Improve invalid category exception for warnings.filterwarnings http://bugs.python.org/issue21865 #21863: Display module names of C functions in cProfile http://bugs.python.org/issue21863 #21862: cProfile command-line should accept "-m module_name" as an alt http://bugs.python.org/issue21862 #21860: Correct FileIO docstrings http://bugs.python.org/issue21860 #21859: Add Python implementation of FileIO http://bugs.python.org/issue21859 #21857: assert that functions clearing the current exception are not c http://bugs.python.org/issue21857 #21855: Fix decimal in unicodeless build http://bugs.python.org/issue21855 #21854: Fix cookielib in unicodeless build http://bugs.python.org/issue21854 #21853: Fix inspect in unicodeless build http://bugs.python.org/issue21853 #21852: Fix optparse in unicodeless build 
http://bugs.python.org/issue21852 #21851: Fix gettext in unicodeless build http://bugs.python.org/issue21851 #21850: Fix httplib and SimpleHTTPServer in unicodeless build http://bugs.python.org/issue21850 #21849: Fix multiprocessing for non-ascii data http://bugs.python.org/issue21849 #21848: Fix logging in unicodeless build http://bugs.python.org/issue21848 Top 10 most discussed issues (10) ================================= #14460: In re's positive lookbehind assertion repetition works http://bugs.python.org/issue14460 10 msgs #21163: asyncio doesn't warn if a task is destroyed during its executi http://bugs.python.org/issue21163 10 msgs #21765: Idle: make 3.x HyperParser work with non-ascii identifiers. http://bugs.python.org/issue21765 9 msgs #21820: unittest: unhelpful truncating of long strings. http://bugs.python.org/issue21820 9 msgs #6916: Remove deprecated items from asynchat http://bugs.python.org/issue6916 8 msgs #11406: There is no os.listdir() equivalent returning generator instea http://bugs.python.org/issue11406 7 msgs #12750: datetime.strftime('%s') should respect tzinfo http://bugs.python.org/issue12750 7 msgs #19351: python msi installers - silent mode http://bugs.python.org/issue19351 7 msgs #20092: type() constructor should bind __int__ to __index__ when __ind http://bugs.python.org/issue20092 6 msgs #21331: Reversing an encoding with unicode-escape returns a different http://bugs.python.org/issue21331 6 msgs Issues closed (70) ================== #2213: build_tkinter.py does not handle paths with spaces http://bugs.python.org/issue2213 closed by loewis #4346: PyObject_CallMethod changes the exception message already set http://bugs.python.org/issue4346 closed by python-dev #4613: Can't figure out where SyntaxError: can not delete variable 'x http://bugs.python.org/issue4613 closed by ned.deily #4735: An error occurred during the installation of assembly http://bugs.python.org/issue4735 closed by zach.ware #5235: distutils seems to only work with 
VC++ 2008 (9.0) http://bugs.python.org/issue5235 closed by loewis #6305: islice doesn't accept large stop values http://bugs.python.org/issue6305 closed by loewis #6362: multiprocessing: handling of errno after signals in sem_acquir http://bugs.python.org/issue6362 closed by loewis #8192: SQLite3 PRAGMA table_info doesn't respect database on Win32 http://bugs.python.org/issue8192 closed by loewis #8343: improve re parse error messages for named groups http://bugs.python.org/issue8343 closed by rhettinger #10217: python-2.7.amd64.msi install fails http://bugs.python.org/issue10217 closed by zach.ware #10747: Include version info in Windows shortcuts http://bugs.python.org/issue10747 closed by loewis #10798: test_concurrent_futures fails on FreeBSD http://bugs.python.org/issue10798 closed by haypo #11974: Class definition gotcha.. should this be documented somewhere? http://bugs.python.org/issue11974 closed by rhettinger #12066: Empty ('') xmlns attribute is not properly handled by xml.dom. 
http://bugs.python.org/issue12066 closed by ned.deily #12860: http client attempts to send a readable object twice http://bugs.python.org/issue12860 closed by ned.deily #13143: os.path.islink documentation is ambiguous http://bugs.python.org/issue13143 closed by python-dev #14457: Unattended Install doesn't populate registry http://bugs.python.org/issue14457 closed by loewis #14477: Rietveld test issue http://bugs.python.org/issue14477 closed by loewis #14540: Crash in Modules/_ctypes/libffi/src/dlmalloc.c on ia64-hp-hpux http://bugs.python.org/issue14540 closed by pda #14561: python-2.7.2-r3 suffers test failure at test_mhlib http://bugs.python.org/issue14561 closed by ned.deily #15588: quopri: encodestring and decodestring handle bytes, not string http://bugs.python.org/issue15588 closed by orsenthil #16667: timezone docs need "versionadded: 3.2" http://bugs.python.org/issue16667 closed by python-dev #16976: Asyncore/asynchat hangs when used with ssl sockets http://bugs.python.org/issue16976 closed by giampaolo.rodola #17170: string method lookup is too slow http://bugs.python.org/issue17170 closed by pitrou #17424: help() should use the class signature http://bugs.python.org/issue17424 closed by yselivanov #17449: dev guide appears not to cover the benchmarking suite http://bugs.python.org/issue17449 closed by python-dev #19145: Inconsistent behaviour in itertools.repeat when using negative http://bugs.python.org/issue19145 closed by rhettinger #19897: Use python as executable instead of python3 in Python 2 docs http://bugs.python.org/issue19897 closed by berker.peksag #20155: Regression test test_httpservers fails, hangs on Windows http://bugs.python.org/issue20155 closed by r.david.murray #20295: imghdr add openexr support http://bugs.python.org/issue20295 closed by r.david.murray #20446: ipaddress: hash similarities for ipv4 and ipv6 http://bugs.python.org/issue20446 closed by tim.peters #20753: disable test_robotparser test that uses an invalid URL 
http://bugs.python.org/issue20753 closed by orsenthil #20756: Segmentation fault with unoconv http://bugs.python.org/issue20756 closed by Sworddragon #20872: dbm/gdbm/ndbm close methods are not document http://bugs.python.org/issue20872 closed by python-dev #20939: test_geturl of test_urllibnet fails with 'https://www.python.o http://bugs.python.org/issue20939 closed by ned.deily #21030: pip usable only by administrators on Windows and SELinux http://bugs.python.org/issue21030 closed by loewis #21158: Windows installer service could not be accessed http://bugs.python.org/issue21158 closed by loewis #21216: getaddrinfo is wrongly considered thread safe on linux http://bugs.python.org/issue21216 closed by gregory.p.smith #21441: Buffer Protocol Documentation Error http://bugs.python.org/issue21441 closed by python-dev #21476: Inconsistent behaviour between BytesParser.parse and Parser.pa http://bugs.python.org/issue21476 closed by r.david.murray #21491: race condition in SocketServer.py ForkingMixIn collect_childre http://bugs.python.org/issue21491 closed by neologix #21532: 2.7.7rc1 msi is lacking libpython27.a http://bugs.python.org/issue21532 closed by loewis #21635: difflib.SequenceMatcher stores matching blocks as tuples, not http://bugs.python.org/issue21635 closed by rhettinger #21670: Add repr to shelve.Shelf http://bugs.python.org/issue21670 closed by rhettinger #21672: Python for Windows 2.7.7: Path Configuration File No Longer Wo http://bugs.python.org/issue21672 closed by python-dev #21684: inspect.signature bind doesn't include defaults or empty tuple http://bugs.python.org/issue21684 closed by yselivanov #21716: 3.4.1 download page link for OpenPGP signatures has no sigs http://bugs.python.org/issue21716 closed by ned.deily #21729: Use `with` statement in dbm.dumb http://bugs.python.org/issue21729 closed by serhiy.storchaka #21768: Fix a NameError in test_pydoc http://bugs.python.org/issue21768 closed by terry.reedy #21769: Fix a NameError in test_descr 
http://bugs.python.org/issue21769 closed by terry.reedy #21770: Module not callable in script_helper.py http://bugs.python.org/issue21770 closed by terry.reedy #21786: Use assertEqual in test_pydoc http://bugs.python.org/issue21786 closed by rhettinger #21799: python34.dll is not installed http://bugs.python.org/issue21799 closed by loewis #21801: inspect.signature doesn't always return a signature http://bugs.python.org/issue21801 closed by python-dev #21807: SysLogHandler closes TCP connection after first message http://bugs.python.org/issue21807 closed by vinay.sajip #21809: Building Python3 on VMS - External repository http://bugs.python.org/issue21809 closed by terry.reedy #21812: turtle.shapetransform doesn't transform the turtle on the firs http://bugs.python.org/issue21812 closed by rhettinger #21814: object.__setattr__ or super(...).__setattr__? http://bugs.python.org/issue21814 closed by rhettinger #21816: OverflowError: Python int too large to convert to C long http://bugs.python.org/issue21816 closed by ned.deily #21819: Remaining buffer from socket isn't available anymore after cal http://bugs.python.org/issue21819 closed by neologix #21823: Catch turtle.Terminator exceptions in turtledemo http://bugs.python.org/issue21823 closed by terry.reedy #21824: Make turtledemo 2.7 help show file contents, not file name. 
http://bugs.python.org/issue21824 closed by terry.reedy #21828: added/corrected containment relationship for networks in lib i http://bugs.python.org/issue21828 closed by nlm #21829: Wrong test in ctypes http://bugs.python.org/issue21829 closed by zach.ware #21831: integer overflow in 'buffer' type allows reading memory http://bugs.python.org/issue21831 closed by python-dev #21832: collections.namedtuple does questionable things when passed qu http://bugs.python.org/issue21832 closed by rhettinger #21858: Enhance error handling in the sqlite module http://bugs.python.org/issue21858 closed by haypo #21870: Ctrl-C doesn't interrupt simple loop http://bugs.python.org/issue21870 closed by r.david.murray #21873: Tuple comparisons with NaNs are broken http://bugs.python.org/issue21873 closed by rhettinger #21875: Remove vestigial references to Classic Mac OS attributes in os http://bugs.python.org/issue21875 closed by ned.deily From benjamin at python.org Fri Jun 27 18:50:19 2014 From: benjamin at python.org (Benjamin Peterson) Date: Fri, 27 Jun 2014 09:50:19 -0700 Subject: [Python-Dev] buildbot.python.org down? In-Reply-To: References: Message-ID: <1403887819.13904.135300541.288F8C33@webmail.messagingengine.com> On Fri, Jun 27, 2014, at 02:14, Ned Deily wrote: > The buildbot web site seems to have been down for some hours and still > is as of 0915 UTC. I'm not sure who is watching over it but I'll ping > the infrastructure team as well. Fixed. The VM crashed, and Ernest rebooted it. From python at mrabarnett.plus.com Fri Jun 27 18:56:28 2014 From: python at mrabarnett.plus.com (MRAB) Date: Fri, 27 Jun 2014 17:56:28 +0100 Subject: [Python-Dev] LZO bug Message-ID: <53ADA23C.5000801@mrabarnett.plus.com> Is this something that we need to worry about? 
Raising Lazarus - The 20 Year Old Bug that Went to Mars http://blog.securitymouse.com/2014/06/raising-lazarus-20-year-old-bug-that.html From raymond.hettinger at gmail.com Fri Jun 27 20:13:53 2014 From: raymond.hettinger at gmail.com (Raymond Hettinger) Date: Fri, 27 Jun 2014 11:13:53 -0700 Subject: [Python-Dev] LZO bug In-Reply-To: <53ADA23C.5000801@mrabarnett.plus.com> References: <53ADA23C.5000801@mrabarnett.plus.com> Message-ID: On Jun 27, 2014, at 9:56 AM, MRAB wrote: > Is this something that we need to worry about? > > Raising Lazarus - The 20 Year Old Bug that Went to Mars > http://blog.securitymouse.com/2014/06/raising-lazarus-20-year-old-bug-that.html Debunking the LZ4 "20 years old bug" myth http://fastcompression.blogspot.com/2014/06/debunking-lz4-20-years-old-bug-myth.html Raymond -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Fri Jun 27 23:58:50 2014 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 28 Jun 2014 07:58:50 +1000 Subject: [Python-Dev] PEP 471 -- os.scandir() function -- a better and faster directory iterator In-Reply-To: <53AD4B13.8070100@sotecware.net> References: <53AD4B13.8070100@sotecware.net> Message-ID: On 28 Jun 2014 01:27, "Jonas Wielicki" wrote: > > On 27.06.2014 00:59, Ben Hoyt wrote: > > Specifics of proposal > > ===================== > > [snip] Each ``DirEntry`` object has the following > > attributes and methods: > > [snip] > > Notes on caching > > ---------------- > > > > The ``DirEntry`` objects are relatively dumb -- the ``name`` attribute > > is obviously always cached, and the ``is_X`` and ``lstat`` methods > > cache their values (immediately on Windows via ``FindNextFile``, and > > on first use on Linux / OS X via a ``stat`` call) and never refetch > > from the system. > > I find this behaviour a bit misleading: using methods and have them > return cached results. 
How much (implementation and/or performance > and/or memory) overhead would incur by using property-like access here? > I think this would underline the static nature of the data. > > This would break the semantics with respect to pathlib, but they're only > marginally equal anyways -- and as far as I understand it, pathlib won't > cache, so I think this has a fair point here. Indeed - using properties rather than methods may help emphasise the deliberate *difference* from pathlib in this case (i.e. value when the result was retrieved from the OS, rather than the value right now). The main benefit is that switching from using the DirEntry object to a pathlib Path will require touching all the places where the performance characteristics switch from "memory access" to "system call". This benefit is also the main downside, so I'd actually be OK with either decision on this one. Other comments: * +1 on the general idea * +1 on scandir() over iterdir, since it *isn't* just an iterator version of listdir * -1 on including Windows specific globbing support in the API * -0 on including cross platform globbing support in the initial iteration of the API (that could be done later as a separate RFE instead) * +1 on a new section in the PEP covering rejected design options (calling it iterdir, returning a 2-tuple instead of a dedicated DirEntry type) * regarding "why not a 2-tuple", we know from experience that operating systems evolve and we end up wanting to add additional info to this kind of API.
A dedicated DirEntry type lets us adjust the information returned over time, without breaking backwards compatibility and without resorting to ugly hacks like those in some of the time and stat APIs (or even our own codec info APIs) * it would be nice to see some relative performance numbers for NFS and CIFS network shares - the additional network round trips can make excessive stat calls absolutely brutal from a speed perspective when using a network drive (that's why the stat caching added to the import system in 3.3 dramatically sped up the case of having network drives on sys.path, and why I thought AJ had a point when he was complaining about the fact we didn't expose the dirent data from os.listdir) Regards, Nick. > > regards, > jwi > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ncoghlan%40gmail.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From victor.stinner at gmail.com Sat Jun 28 01:51:44 2014 From: victor.stinner at gmail.com (Victor Stinner) Date: Sat, 28 Jun 2014 01:51:44 +0200 Subject: [Python-Dev] Fix Unicode-disabled build of Python 2.7 In-Reply-To: References: <1403625970.6550.133062453.693ECDEA@webmail.messagingengine.com> Message-ID: 2014-06-26 13:04 GMT+02:00 Antoine Pitrou : > For the same reason, I agree with Victor that we should ditch the > threading-disabled builds. It's too much of a hassle for no actual, > practical benefit. People who want a threadless unicodeless Python can > install Python 1.5.2 for all I care. By the way, adding a buildbot for testing Python without thread support is not enough. 
The buildbot is currently broken since more than one month and nobody noticed :-p http://buildbot.python.org/all/builders/AMD64%20Fedora%20without%20threads%203.x/ Ok, I noticed, but I consider that I spent too much time on this minor use case. I prefer to leave such task to someone else :-) Victor From greg at krypto.org Sat Jun 28 08:17:55 2014 From: greg at krypto.org (Gregory P. Smith) Date: Fri, 27 Jun 2014 23:17:55 -0700 Subject: [Python-Dev] PEP 471 -- os.scandir() function -- a better and faster directory iterator In-Reply-To: References: <53AD4B13.8070100@sotecware.net> Message-ID: On Fri, Jun 27, 2014 at 2:58 PM, Nick Coghlan wrote: > > * -1 on including Windows specific globbing support in the API > * -0 on including cross platform globbing support in the initial iteration > of the API (that could be done later as a separate RFE instead) > Agreed. Globbing or filtering support should not hold this up. If that part isn't settled, just don't include it and work out what it should be as a future enhancement. > * +1 on a new section in the PEP covering rejected design options (calling > it iterdir, returning a 2-tuple instead of a dedicated DirEntry type) > +1. IMNSHO, one of the most important part of PEPs: capturing the entire decision process to document the "why nots". > * regarding "why not a 2-tuple", we know from experience that operating > systems evolve and we end up wanting to add additional info to this kind of > API. 
A dedicated DirEntry type lets us adjust the information returned over > time, without breaking backwards compatibility and without resorting to > ugly hacks like those in some of the time and stat APIs (or even our own > codec info APIs) > * it would be nice to see some relative performance numbers for NFS and > CIFS network shares - the additional network round trips can make excessive > stat calls absolutely brutal from a speed perspective when using a network > drive (that's why the stat caching added to the import system in 3.3 > dramatically sped up the case of having network drives on sys.path, and why > I thought AJ had a point when he was complaining about the fact we didn't > expose the dirent data from os.listdir) > fwiw, I wouldn't wait for benchmark numbers. A needless stat call when you've got the information from an earlier API call is already brutal. It is easy to compute from existing ballparks remote file server / cloud access: ~100ms, local spinning disk seek+read: ~10ms. fetch of stat info cached in memory on file server on the local network: ~500us. You can go down further to local system call overhead which can vary wildly but should likely be assumed to be at least 10us. You don't need a benchmark to tell you that adding needless >= 500us-100ms blocking operations to your program is bad. :) -gps -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Sat Jun 28 11:19:23 2014 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 28 Jun 2014 19:19:23 +1000 Subject: [Python-Dev] PEP 471 -- os.scandir() function -- a better and faster directory iterator In-Reply-To: References: <53AD4B13.8070100@sotecware.net> Message-ID: On 28 June 2014 19:17, Nick Coghlan wrote: > Agreed, but walking even a moderately large tree over the network can > really hammer home the point that this offers a significant > performance enhancement as the latency of access increases. 
I've found > that kind of comparison can be eye-opening for folks that are used to > only operating on local disks (even spinning disks, let alone SSDs) > and/or relatively small trees (distro build trees aren't *that* big, > but they're big enough for this kind of difference in access overhead > to start getting annoying). Oops, forgot to add - I agree this isn't a blocking issue for the PEP, it's definitely only in "nice to have" territory. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Sat Jun 28 11:17:12 2014 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 28 Jun 2014 19:17:12 +1000 Subject: [Python-Dev] PEP 471 -- os.scandir() function -- a better and faster directory iterator In-Reply-To: References: <53AD4B13.8070100@sotecware.net> Message-ID: On 28 June 2014 16:17, Gregory P. Smith wrote: > On Fri, Jun 27, 2014 at 2:58 PM, Nick Coghlan wrote: >> * it would be nice to see some relative performance numbers for NFS and >> CIFS network shares - the additional network round trips can make excessive >> stat calls absolutely brutal from a speed perspective when using a network >> drive (that's why the stat caching added to the import system in 3.3 >> dramatically sped up the case of having network drives on sys.path, and why >> I thought AJ had a point when he was complaining about the fact we didn't >> expose the dirent data from os.listdir) > > fwiw, I wouldn't wait for benchmark numbers. > > A needless stat call when you've got the information from an earlier API > call is already brutal. It is easy to compute from existing ballparks remote > file server / cloud access: ~100ms, local spinning disk seek+read: ~10ms. > fetch of stat info cached in memory on file server on the local network: > ~500us. You can go down further to local system call overhead which can > vary wildly but should likely be assumed to be at least 10us. 
> > You don't need a benchmark to tell you that adding needless >= 500us-100ms > blocking operations to your program is bad. :) Agreed, but walking even a moderately large tree over the network can really hammer home the point that this offers a significant performance enhancement as the latency of access increases. I've found that kind of comparison can be eye-opening for folks that are used to only operating on local disks (even spinning disks, let alone SSDs) and/or relatively small trees (distro build trees aren't *that* big, but they're big enough for this kind of difference in access overhead to start getting annoying). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From pmiscml at gmail.com Sat Jun 28 12:58:54 2014 From: pmiscml at gmail.com (Paul Sokolovsky) Date: Sat, 28 Jun 2014 13:58:54 +0300 Subject: [Python-Dev] Fix Unicode-disabled build of Python 2.7 In-Reply-To: References: <1403625970.6550.133062453.693ECDEA@webmail.messagingengine.com> Message-ID: <20140628135854.72f0ab28@x34f> Hello, On Thu, 26 Jun 2014 22:49:40 +1000 Chris Angelico wrote: > On Thu, Jun 26, 2014 at 9:04 PM, Antoine Pitrou > wrote: > > For the same reason, I agree with Victor that we should ditch the > > threading-disabled builds. It's too much of a hassle for no actual, > > practical benefit. People who want a threadless unicodeless Python > > can install Python 1.5.2 for all I care. > > Or some other implementation of Python. It's looking like micropython > will be permanently supporting a non-Unicode build Yes. > (although I stepped > away from the project after a strong disagreement over what would and > would not make sense, and haven't been following it since). Your patches with my further additions were finally merged. Unicode strings still cannot be enabled by default due to https://github.com/micropython/micropython/issues/726 . Any help with reviewing/testing what's currently available is welcome. 
> If someone > wants a Python that doesn't have stuff that the core CPython devs > treat as essential, s/he probably wants something like uPy anyway. I hinted it during previous discussions of MicroPython, and would like to say it again, that MicroPython already embraced a lot of ideas rejected from CPython, like GC-only operation (which alone not something to be proud of, but can you start up and do something in 2K heap?) or tagged pointers (https://mail.python.org/pipermail/python-dev/2004-July/046139.html). So, it should be good vehicle to try any unorthodox ideas(*) or implementations. * MicroPython already implements intra-module constants for example. -- Best regards, Paul mailto:pmiscml at gmail.com From 4kir4.1i at gmail.com Sat Jun 28 15:05:31 2014 From: 4kir4.1i at gmail.com (Akira Li) Date: Sat, 28 Jun 2014 17:05:31 +0400 Subject: [Python-Dev] PEP 471 -- os.scandir() function -- a better and faster directory iterator References: Message-ID: <877g412nhg.fsf@gmail.com> Ben Hoyt writes: > Hi Python dev folks, > > I've written a PEP proposing a specific os.scandir() API for a > directory iterator that returns the stat-like info from the OS, *the > main advantage of which is to speed up os.walk() and similar > operations between 4-20x, depending on your OS and file system.* > ... > http://legacy.python.org/dev/peps/pep-0471/ > ... > Specifically, this PEP proposes adding a single function to the ``os`` > module in the standard library, ``scandir``, that takes a single, > optional string as its argument:: > > scandir(path='.') -> generator of DirEntry objects > Have you considered adding support for paths relative to directory descriptors [1] via keyword only dir_fd=None parameter if it may lead to more efficient implementations on some platforms? 
[1]: https://docs.python.org/3.4/library/os.html#dir-fd -- akira From rosuav at gmail.com Sat Jun 28 17:27:44 2014 From: rosuav at gmail.com (Chris Angelico) Date: Sun, 29 Jun 2014 01:27:44 +1000 Subject: [Python-Dev] PEP 471 -- os.scandir() function -- a better and faster directory iterator In-Reply-To: <877g412nhg.fsf@gmail.com> References: <877g412nhg.fsf@gmail.com> Message-ID: On Sat, Jun 28, 2014 at 11:05 PM, Akira Li <4kir4.1i at gmail.com> wrote: > Have you considered adding support for paths relative to directory > descriptors [1] via keyword only dir_fd=None parameter if it may lead to > more efficient implementations on some platforms? > > [1]: https://docs.python.org/3.4/library/os.html#dir-fd Potentially more efficient and also potentially safer (see 'man openat')... but an enhancement that can wait, if necessary. ChrisA From benhoyt at gmail.com Sat Jun 28 21:48:03 2014 From: benhoyt at gmail.com (Ben Hoyt) Date: Sat, 28 Jun 2014 15:48:03 -0400 Subject: [Python-Dev] PEP 471 -- os.scandir() function -- a better and faster directory iterator In-Reply-To: References: Message-ID: >> But the underlying system calls -- ``FindFirstFile`` / >> ``FindNextFile`` on Windows and ``readdir`` on Linux and OS X -- > > What about FreeBSD, OpenBSD, NetBSD, Solaris, etc. They don't provide readdir? I guess it'd be better to say "Windows" and "Unix-based OSs" throughout the PEP? Because all of these (including Mac OS X) are Unix-based. > It looks like the WIN32_FIND_DATA has a dwFileAttributes field. So we > should mimic stat_result recent addition: the new > stat_result.file_attributes field. Add DirEntry.file_attributes which > would only be available on Windows. > > The Windows structure also contains > > FILETIME ftCreationTime; > FILETIME ftLastAccessTime; > FILETIME ftLastWriteTime; > DWORD nFileSizeHigh; > DWORD nFileSizeLow; > > It would be nice to expose them as well. 
I'm no more surprised that > the exact API is different depending on the OS for functions of the os > module. I think you've misunderstood how DirEntry.lstat() works on Windows -- it's basically a no-op, as Windows returns the full stat information with the original FindFirst/FindNext OS calls. This is fairly explict in the PEP, but I'm sure I could make it clearer: DirEntry.lstat(): "like os.lstat(), but requires no system calls on Windows So you can already get the dwFileAttributes for free by saying entry.lstat().st_file_attributes. You can also get all the other fields you mentioned for free via .lstat() with no additional OS calls on Windows, for example: entry.lstat().st_size. Feel free to suggest changes to the PEP or scandir docs if this isn't clear. Note that is_dir()/is_file()/is_symlink() are free on all systems, but .lstat() is only free on Windows. > Does your implementation uses a free list to avoid the cost of memory > allocation? A short free list of 10 or maybe just 1 may help. The free > list may be stored directly in the generator object. No, it doesn't. I might add this to the PEP under "possible improvements". However, I think the speed increase by removing the extra OS call and/or disk seek is going to be way more than memory allocation improvements, so I'm not sure this would be worth it. > Does it support also bytes filenames on UNIX? > Python now supports undecodable filenames thanks to the PEP 383 > (surrogateescape). I prefer to use the same type for filenames on > Linux and Windows, so Unicode is better. But some users might prefer > bytes for other reasons. I forget exactly now what my scandir module does, but for os.scandir() I think this should behave exactly like os.listdir() does for Unicode/bytes filenames. > Crazy idea: would it be possible to "convert" a DirEntry object to a > pathlib.Path object without losing the cache? I guess that > pathlib.Path expects a full stat_result object. 
The main problem is that pathlib.Path objects explicitly don't cache stat info (and Guido doesn't want them to, for good reason I think). There's a thread on python-dev about this earlier. I'll add it to a "Rejected ideas" section. > I don't understand how you can build a full lstat() result without > really calling stat. I see that WIN32_FIND_DATA contains the size, but > here you call lstat(). See above. > Do you plan to continue to maintain your module for Python < 3.5, but > upgrade your module for the final PEP? Yes, I intend to maintain the standalone scandir module for 2.6 <= Python < 3.5, at least for a good while. For integration into the Python 3.5 stdlib, the implementation will be integrated into posixmodule.c, of course. >> Should there be a way to access the full path? >> ---------------------------------------------- >> >> Should ``DirEntry``'s have a way to get the full path without using >> ``os.path.join(path, entry.name)``? This is a pretty common pattern, >> and it may be useful to add pathlib-like ``str(entry)`` functionality. >> This functionality has also been requested in `issue 13`_ on GitHub. >> >> .. _`issue 13`: https://github.com/benhoyt/scandir/issues/13 > > I think that it would be very convinient to store the directory name > in the DirEntry. It should be light, it's just a reference. > > And provide a fullname() name which would just return > os.path.join(path, entry.name) without trying to resolve path to get > an absolute path. Yeah, fair suggestion. I'm still slightly on the fence about this, but I think an explicit fullname() is a good suggestion. Ideally I think it'd be better to mimic pathlib.Path.__str__() which is kind of the equivalent of fullname(). But how does pathlib deal with unicode/bytes issues if it's the str function which has to return a str object? Or at least, it'd be very weird if __str__() returned bytes. But I think it'd need to if you passed bytes into scandir(). Do others have thoughts? 
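For concreteness, a minimal sketch of the fullname() idea being discussed here -- an entry object that keeps a reference to the directory passed to scandir() and joins it with the entry's name, with no path resolution and no extra system calls. The class name and attribute layout below are purely illustrative, not the PEP's proposed API:

```python
import os.path

class EntrySketch:
    """Illustrative stand-in for the proposed DirEntry (not the PEP's API)."""
    def __init__(self, scandir_path, name):
        self.scandir_path = scandir_path  # directory originally passed to scandir()
        self.name = name                  # bare entry name as returned by the OS

    def fullname(self):
        # A plain join -- deliberately no abspath()/realpath(), so no
        # symlink resolution and no additional system calls.
        return os.path.join(self.scandir_path, self.name)

entry = EntrySketch("photos", "cat.jpg")
print(entry.fullname())  # 'photos/cat.jpg' on POSIX, 'photos\\cat.jpg' on Windows
```

Note the bytes/str question above applies here too: os.path.join() preserves whichever type it is given, so a bytes scandir() path would naturally yield a bytes fullname(), sidestepping the __str__() awkwardness.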
> Would it be hard to implement the wildcard feature on UNIX to compare > performances of scandir('*.jpg') with and without the wildcard built > in os.scandir? It's a good idea, the problem with this is that the Windows wildcard implementation has a bunch of crazy edge cases where *.ext will catch more things than just a simple regex/glob. This was discussed on python-dev or python-ideas previously, so I'll dig it up and add to a Rejected Ideas section. In any case, this could be added later if there's a way to iron out the Windows quirks. -Ben From benhoyt at gmail.com Sat Jun 28 21:55:00 2014 From: benhoyt at gmail.com (Ben Hoyt) Date: Sat, 28 Jun 2014 15:55:00 -0400 Subject: [Python-Dev] PEP 471 -- os.scandir() function -- a better and faster directory iterator In-Reply-To: References: <53AD4B13.8070100@sotecware.net> Message-ID: Re is_dir etc being properties rather than methods: >> I find this behaviour a bit misleading: using methods and have them >> return cached results. How much (implementation and/or performance >> and/or memory) overhead would incur by using property-like access here? >> I think this would underline the static nature of the data. >> >> This would break the semantics with respect to pathlib, but they're only >> marginally equal anyways -- and as far as I understand it, pathlib won't >> cache, so I think this has a fair point here. > > Indeed - using properties rather than methods may help emphasise the > deliberate *difference* from pathlib in this case (i.e. value when the > result was retrieved from the OS, rather than the value right now). The main > benefit is that switching from using the DirEntry object to a pathlib Path > will require touching all the places where the performance characteristics > switch from "memory access" to "system call". This benefit is also the main > downside, so I'd actually be OK with either decision on this one. 
The problem with this is that properties "look free", they look just like attribute access, so you wouldn't normally handle exceptions when accessing them. But .lstat() and .is_dir() etc may do an OS call, so if you're needing to be careful with error handling, you may want to handle errors on them. Hence I think it's best practice to make them functions(). Some of us discussed this on python-dev or python-ideas a while back, and I think there was general agreement with what I've stated above and therefore they should be methods. But I'll dig up the links and add to a Rejected ideas section. > * +1 on a new section in the PEP covering rejected design options (calling > it iterdir, returning a 2-tuple instead of a dedicated DirEntry type) Great idea. I'll add a bunch of stuff, including the above, to a new section, Rejected Design Options. > * regarding "why not a 2-tuple", we know from experience that operating > systems evolve and we end up wanting to add additional info to this kind of > API. A dedicated DirEntry type lets us adjust the information returned over > time, without breaking backwards compatibility and without resorting to ugly > hacks like those in some of the time and stat APIs (or even our own codec > info APIs) Fully agreed. 
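To make the "look free" point concrete: on POSIX, is_dir() may have to do an lstat() the first time it is called, so it can raise OSError (e.g. if the entry was removed between the scan and the call), and a method call gives you an obvious place to put the try/except. A rough sketch of that lazy POSIX-side behaviour -- names are illustrative, this is not the PEP's implementation, and the real DirEntry can often answer is_dir() from the d_type field without any stat at all:

```python
import os
import stat

class LazyEntry:
    """Illustrative POSIX-style entry: stat info fetched on first use, then cached."""
    def __init__(self, path):
        self.path = path
        self._stat = None

    def lstat(self):
        if self._stat is None:
            self._stat = os.lstat(self.path)  # may raise OSError
        return self._stat  # cached: never refetched

    def is_dir(self):
        # Looks innocent, but the first call can hit the filesystem and fail
        return stat.S_ISDIR(self.lstat().st_mode)

entry = LazyEntry("does-not-exist")
try:
    entry.is_dir()
except OSError:
    print("entry vanished (or never existed)")
```

With property syntax (entry.is_dir) the same failure mode would be hidden behind what reads as a plain attribute access.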
> * it would be nice to see some relative performance numbers for NFS and CIFS > network shares - the additional network round trips can make excessive stat > calls absolutely brutal from a speed perspective when using a network drive > (that's why the stat caching added to the import system in 3.3 dramatically > sped up the case of having network drives on sys.path, and why I thought AJ > had a point when he was complaining about the fact we didn't expose the > dirent data from os.listdir) Don't know if you saw, but there are actually some benchmarks, including one over NFS, on the scandir GitHub page: https://github.com/benhoyt/scandir#benchmarks os.walk() was 23 times faster with scandir() than the current listdir() + stat() implementation on the Windows NFS file system I tried. Pretty good speedup! -Ben From ncoghlan at gmail.com Sun Jun 29 06:59:19 2014 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 29 Jun 2014 14:59:19 +1000 Subject: [Python-Dev] PEP 471 -- os.scandir() function -- a better and faster directory iterator In-Reply-To: References: Message-ID: On 29 June 2014 05:48, Ben Hoyt wrote: >>> But the underlying system calls -- ``FindFirstFile`` / >>> ``FindNextFile`` on Windows and ``readdir`` on Linux and OS X -- >> >> What about FreeBSD, OpenBSD, NetBSD, Solaris, etc. They don't provide readdir? > > I guess it'd be better to say "Windows" and "Unix-based OSs" > throughout the PEP? Because all of these (including Mac OS X) are > Unix-based. *nix and POSIX-based are the two conventions I use. >> Crazy idea: would it be possible to "convert" a DirEntry object to a >> pathlib.Path object without losing the cache? I guess that >> pathlib.Path expects a full stat_result object. > > The main problem is that pathlib.Path objects explicitly don't cache > stat info (and Guido doesn't want them to, for good reason I think). > There's a thread on python-dev about this earlier. I'll add it to a > "Rejected ideas" section. 
The key problem with caches on pathlib.Path objects is that you could end up with two separate path objects that referred to the same filesystem location but returned different answers about the filesystem state because their caches might be stale. DirEntry is different, as the content is generally *assumed* to be stale (referring to when the directory was scanned, rather than the current filesystem state). DirEntry.lstat() on POSIX systems will be an exception to that general rule (referring to the time of first lookup, rather than when the directory was scanned, so the answer from lstat() may be inconsistent with other data stored directly on the DirEntry object), but one we can probably live with. More generally, as part of the pathlib PEP review, we figured out that a *per-object* cache of filesystem state would be an inherently bad idea, but a string based *process global* cache might make sense for modules like walkdir (not part of the stdlib - it's an iterator pipeline based approach to file tree scanning I wrote a while back, that currently suffers badly from the performance impact of repeated stat calls at different stages of the pipeline). We realised this was getting into a space where application and library specific concerns are likely to start affecting the caching design, though, so the current status of standard library level stat caching is "it's not clear if there's an available approach that would be sufficiently general purpose to be appropriate for inclusion in the standard library". Cheers, Nick.
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Sun Jun 29 07:03:27 2014 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 29 Jun 2014 15:03:27 +1000 Subject: [Python-Dev] PEP 471 -- os.scandir() function -- a better and faster directory iterator In-Reply-To: References: <53AD4B13.8070100@sotecware.net> Message-ID: On 29 June 2014 05:55, Ben Hoyt wrote: > Re is_dir etc being properties rather than methods: > >>> I find this behaviour a bit misleading: using methods and have them >>> return cached results. How much (implementation and/or performance >>> and/or memory) overhead would incur by using property-like access here? >>> I think this would underline the static nature of the data. >>> >>> This would break the semantics with respect to pathlib, but they're only >>> marginally equal anyways -- and as far as I understand it, pathlib won't >>> cache, so I think this has a fair point here. >> >> Indeed - using properties rather than methods may help emphasise the >> deliberate *difference* from pathlib in this case (i.e. value when the >> result was retrieved from the OS, rather than the value right now). The main >> benefit is that switching from using the DirEntry object to a pathlib Path >> will require touching all the places where the performance characteristics >> switch from "memory access" to "system call". This benefit is also the main >> downside, so I'd actually be OK with either decision on this one. > > The problem with this is that properties "look free", they look just > like attribute access, so you wouldn't normally handle exceptions when > accessing them. But .lstat() and .is_dir() etc may do an OS call, so > if you're needing to be careful with error handling, you may want to > handle errors on them. Hence I think it's best practice to make them > functions(). 
> > Some of us discussed this on python-dev or python-ideas a while back, > and I think there was general agreement with what I've stated above > and therefore they should be methods. But I'll dig up the links and > add to a Rejected ideas section. Yes, only the stuff that *never* needs a system call (regardless of OS) would be a candidate for handling as a property rather than a method call. Consistency of access would likely trump that idea anyway, but it would still be worth ensuring that the PEP is clear on which values are guaranteed to reflect the state at the time of the directory scanning and which may imply an additional stat call. >> * it would be nice to see some relative performance numbers for NFS and CIFS >> network shares - the additional network round trips can make excessive stat >> calls absolutely brutal from a speed perspective when using a network drive >> (that's why the stat caching added to the import system in 3.3 dramatically >> sped up the case of having network drives on sys.path, and why I thought AJ >> had a point when he was complaining about the fact we didn't expose the >> dirent data from os.listdir) > > Don't know if you saw, but there are actually some benchmarks, > including one over NFS, on the scandir GitHub page: > > https://github.com/benhoyt/scandir#benchmarks No, I hadn't seen those - may be worth referencing explicitly from the PEP (and if there's already a reference... oops!) > os.walk() was 23 times faster with scandir() than the current > listdir() + stat() implementation on the Windows NFS file system I > tried. Pretty good speedup! Ah, nice! Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From greg at krypto.org Sun Jun 29 08:26:24 2014 From: greg at krypto.org (Gregory P. 
Smith) Date: Sat, 28 Jun 2014 23:26:24 -0700 Subject: [Python-Dev] PEP 471 -- os.scandir() function -- a better and faster directory iterator In-Reply-To: References: Message-ID: On Jun 28, 2014 12:49 PM, "Ben Hoyt" wrote: > > >> But the underlying system calls -- ``FindFirstFile`` / > >> ``FindNextFile`` on Windows and ``readdir`` on Linux and OS X -- > > > > What about FreeBSD, OpenBSD, NetBSD, Solaris, etc. They don't provide readdir? > > I guess it'd be better to say "Windows" and "Unix-based OSs" > throughout the PEP? Because all of these (including Mac OS X) are > Unix-based. No, Just say POSIX. > > > It looks like the WIN32_FIND_DATA has a dwFileAttributes field. So we > > should mimic stat_result recent addition: the new > > stat_result.file_attributes field. Add DirEntry.file_attributes which > > would only be available on Windows. > > > > The Windows structure also contains > > > > FILETIME ftCreationTime; > > FILETIME ftLastAccessTime; > > FILETIME ftLastWriteTime; > > DWORD nFileSizeHigh; > > DWORD nFileSizeLow; > > > > It would be nice to expose them as well. I'm no more surprised that > > the exact API is different depending on the OS for functions of the os > > module. > > I think you've misunderstood how DirEntry.lstat() works on Windows -- > it's basically a no-op, as Windows returns the full stat information > with the original FindFirst/FindNext OS calls. This is fairly explict > in the PEP, but I'm sure I could make it clearer: > > DirEntry.lstat(): "like os.lstat(), but requires no system calls on Windows > > So you can already get the dwFileAttributes for free by saying > entry.lstat().st_file_attributes. You can also get all the other > fields you mentioned for free via .lstat() with no additional OS calls > on Windows, for example: entry.lstat().st_size. > > Feel free to suggest changes to the PEP or scandir docs if this isn't > clear. Note that is_dir()/is_file()/is_symlink() are free on all > systems, but .lstat() is only free on Windows. 
> > > Does your implementation uses a free list to avoid the cost of memory > > allocation? A short free list of 10 or maybe just 1 may help. The free > > list may be stored directly in the generator object. > > No, it doesn't. I might add this to the PEP under "possible > improvements". However, I think the speed increase by removing the > extra OS call and/or disk seek is going to be way more than memory > allocation improvements, so I'm not sure this would be worth it. > > > Does it support also bytes filenames on UNIX? > > > Python now supports undecodable filenames thanks to the PEP 383 > > (surrogateescape). I prefer to use the same type for filenames on > > Linux and Windows, so Unicode is better. But some users might prefer > > bytes for other reasons. > > I forget exactly now what my scandir module does, but for os.scandir() > I think this should behave exactly like os.listdir() does for > Unicode/bytes filenames. > > > Crazy idea: would it be possible to "convert" a DirEntry object to a > > pathlib.Path object without losing the cache? I guess that > > pathlib.Path expects a full stat_result object. > > The main problem is that pathlib.Path objects explicitly don't cache > stat info (and Guido doesn't want them to, for good reason I think). > There's a thread on python-dev about this earlier. I'll add it to a > "Rejected ideas" section. > > > I don't understand how you can build a full lstat() result without > > really calling stat. I see that WIN32_FIND_DATA contains the size, but > > here you call lstat(). > > See above. > > > Do you plan to continue to maintain your module for Python < 3.5, but > > upgrade your module for the final PEP? > > Yes, I intend to maintain the standalone scandir module for 2.6 <= > Python < 3.5, at least for a good while. For integration into the > Python 3.5 stdlib, the implementation will be integrated into > posixmodule.c, of course. > > >> Should there be a way to access the full path? 
> >> ---------------------------------------------- > >> > >> Should ``DirEntry``'s have a way to get the full path without using > >> ``os.path.join(path, entry.name)``? This is a pretty common pattern, > >> and it may be useful to add pathlib-like ``str(entry)`` functionality. > >> This functionality has also been requested in `issue 13`_ on GitHub. > >> > >> .. _`issue 13`: https://github.com/benhoyt/scandir/issues/13 > > > > I think that it would be very convenient to store the directory name > in the DirEntry. It should be light, it's just a reference. > > > > And provide a fullname() method which would just return > > os.path.join(path, entry.name) without trying to resolve path to get > > an absolute path. > > Yeah, fair suggestion. I'm still slightly on the fence about this, but > I think an explicit fullname() is a good suggestion. Ideally I think > it'd be better to mimic pathlib.Path.__str__() which is kind of the > equivalent of fullname(). But how does pathlib deal with unicode/bytes > issues if it's the str function which has to return a str object? Or > at least, it'd be very weird if __str__() returned bytes. But I think > it'd need to if you passed bytes into scandir(). Do others have > thoughts? > > > Would it be hard to implement the wildcard feature on UNIX to compare > > performances of scandir('*.jpg') with and without the wildcard built > > in os.scandir? > > It's a good idea; the problem with this is that the Windows wildcard > implementation has a bunch of crazy edge cases where *.ext will catch > more things than just a simple regex/glob. This was discussed on > python-dev or python-ideas previously, so I'll dig it up and add it to a > Rejected Ideas section. In any case, this could be added later if > there's a way to iron out the Windows quirks.
> > -Ben > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/greg%40krypto.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From walter at livinglogic.de Sun Jun 29 10:23:42 2014 From: walter at livinglogic.de (Walter =?utf-8?q?D=C3=B6rwald?=) Date: Sun, 29 Jun 2014 10:23:42 +0200 Subject: [Python-Dev] PEP 471 -- os.scandir() function -- a better and faster directory iterator In-Reply-To: References: Message-ID: <17DF953A-91A9-41E8-AFF7-FC612B2FB4BE@livinglogic.de> On 28 Jun 2014, at 21:48, Ben Hoyt wrote: > [...] >> Crazy idea: would it be possible to "convert" a DirEntry object to a >> pathlib.Path object without losing the cache? I guess that >> pathlib.Path expects a full stat_result object. > > The main problem is that pathlib.Path objects explicitly don't cache > stat info (and Guido doesn't want them to, for good reason I think). > There's a thread on python-dev about this earlier. I'll add it to a > "Rejected ideas" section. However, it would be bad to have two implementations of the concept of "filename" with different attribute and method names. The best way to ensure compatible APIs would be if one class was derived from the other. > [...] Servus, Walter From steve at pearwood.info Sun Jun 29 12:52:40 2014 From: steve at pearwood.info (Steven D'Aprano) Date: Sun, 29 Jun 2014 20:52:40 +1000 Subject: [Python-Dev] PEP 471 -- os.scandir() function -- a better and faster directory iterator In-Reply-To: References: <53AD4B13.8070100@sotecware.net> Message-ID: <20140629105235.GM13014@ando> On Sat, Jun 28, 2014 at 03:55:00PM -0400, Ben Hoyt wrote: > Re is_dir etc being properties rather than methods: [...] 
> The problem with this is that properties "look free", they look just > like attribute access, so you wouldn't normally handle exceptions when > accessing them. But .lstat() and .is_dir() etc may do an OS call, so > if you're needing to be careful with error handling, you may want to > handle errors on them. Hence I think it's best practice to make them > functions(). I think this one could go either way. Methods look like they actually re-test the value each time you call them. I can easily see people not realising that the value is cached and writing code like this toy example: # Detect a file change. t = the_file.lstat().st_mtime while the_file.lstat().st_mtime == t: sleep(0.1) print("Changed!") I know that's not the best way to detect file changes, but I'm sure people will do something like that and not realise that the call to lstat is cached. Personally, I would prefer a property. If I forget to wrap a call in a try...except, it will fail hard and I will get an exception. But with a method call, the failure is silent and I keep getting the cached result. Speaking of caching, is there a way to freshen the cached values? -- Steven From ncoghlan at gmail.com Sun Jun 29 13:08:36 2014 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 29 Jun 2014 21:08:36 +1000 Subject: [Python-Dev] PEP 471 -- os.scandir() function -- a better and faster directory iterator In-Reply-To: <20140629105235.GM13014@ando> References: <53AD4B13.8070100@sotecware.net> <20140629105235.GM13014@ando> Message-ID: On 29 June 2014 20:52, Steven D'Aprano wrote: > Speaking of caching, is there a way to freshen the cached values? Switch to a full Path object instead of relying on the cached DirEntry data. This is what makes me wary of including lstat, even though Windows offers it without the extra stat call. Caching behaviour is *really* hard to make intuitive, especially when it *sometimes* returns data that looks fresh (as it does on first call on POSIX systems).
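A sketch of the explicit refresh both replies point at: re-stat the path rather than trusting the cache. The entry.path attribute used here is the full-path attribute the API eventually grew in Python 3.5; with the API as drafted in the PEP, os.path.join(scan_dir, entry.name) would serve the same purpose.

```python
import os

def fresh_mtime(entry):
    # Bypass DirEntry's cached stat data with a real lstat() call.
    return os.lstat(entry.path).st_mtime
```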
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From p.f.moore at gmail.com Sun Jun 29 13:45:49 2014 From: p.f.moore at gmail.com (Paul Moore) Date: Sun, 29 Jun 2014 12:45:49 +0100 Subject: [Python-Dev] PEP 471 -- os.scandir() function -- a better and faster directory iterator In-Reply-To: References: <53AD4B13.8070100@sotecware.net> <20140629105235.GM13014@ando> Message-ID: On 29 June 2014 12:08, Nick Coghlan wrote: > This is what makes me wary of including lstat, even though Windows > offers it without the extra stat call. Caching behaviour is *really* > hard to make intuitive, especially when it *sometimes* returns data > that looks fresh (as it on first call on POSIX systems). If it matters that much we *could* simply call it cached_lstat(). It's ugly, but I really don't like the idea of throwing the information away - after all, the fact that we currently throw data away is why there's even a need for scandir. Let's not make the same mistake again... Paul From ncoghlan at gmail.com Sun Jun 29 14:28:14 2014 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 29 Jun 2014 22:28:14 +1000 Subject: [Python-Dev] PEP 471 -- os.scandir() function -- a better and faster directory iterator In-Reply-To: References: <53AD4B13.8070100@sotecware.net> <20140629105235.GM13014@ando> Message-ID: On 29 June 2014 21:45, Paul Moore wrote: > On 29 June 2014 12:08, Nick Coghlan wrote: >> This is what makes me wary of including lstat, even though Windows >> offers it without the extra stat call. Caching behaviour is *really* >> hard to make intuitive, especially when it *sometimes* returns data >> that looks fresh (as it on first call on POSIX systems). > > If it matters that much we *could* simply call it cached_lstat(). It's > ugly, but I really don't like the idea of throwing the information > away - after all, the fact that we currently throw data away is why > there's even a need for scandir. Let's not make the same mistake > again... 
Future-proofing is the reason DirEntry is a full-fledged class in the first place, though. Effectively communicating the behavioural difference between DirEntry and pathlib.Path is the main thing that makes me nervous about adhering too closely to the Path API. To restate the problem and the alternative proposal, these are the DirEntry methods under discussion: is_dir(): like os.path.isdir(), but requires no system calls on at least POSIX and Windows is_file(): like os.path.isfile(), but requires no system calls on at least POSIX and Windows is_symlink(): like os.path.islink(), but requires no system calls on at least POSIX and Windows lstat(): like os.lstat(), but requires no system calls on Windows For the almost-certain-to-be-cached items, the suggestion is to make them properties (or just ordinary attributes): is_dir is_file is_symlink What to do with lstat() is currently less clear, since POSIX directory scanning doesn't provide that level of detail by default. The PEP also doesn't currently state whether the is_dir(), is_file() and is_symlink() results would be updated if a call to lstat() produced different answers than the original directory scanning process, which further suggests to me that allowing the stat call to be delayed on POSIX systems is a potentially problematic and inherently confusing design. We would have two options: - update them, meaning calling lstat() may change those results from being a snapshot of the setting at the time the directory was scanned - leave them alone, meaning the DirEntry object and the DirEntry.lstat() result may give different answers Those both sound ugly to me. So, here's my alternative proposal: add an "ensure_lstat" flag to scandir() itself, and don't have *any* methods on DirEntry, only attributes.
That would make the DirEntry attributes: is_dir: boolean, always populated is_file: boolean, always populated is_symlink: boolean, always populated lstat_result: stat result, may be None on POSIX systems if ensure_lstat is False (I'm not particularly sold on "lstat_result" as the name, but "lstat" reads as a verb to me, so doesn't sound right as an attribute name) What this would allow: - by default, scanning is efficient everywhere, but lstat_result may be None on POSIX systems - if you always need the lstat result, setting "ensure_lstat" will trigger the extra system call implicitly - if you only sometimes need the stat result, you can call os.lstat() explicitly when the DirEntry lstat attribute is None Most importantly, *regardless of platform*, the cached stat result (if not None) would reflect the state of the entry at the time the directory was scanned, rather than at some arbitrary later point in time when lstat() was first called on the DirEntry object. There'd still be a slight window of discrepancy (since the filesystem state may change between reading the directory entry and making the lstat() call), but this could be effectively eliminated from the perspective of the Python code by making the result of the lstat() call authoritative for the whole DirEntry object. Regards, Nick. P.S. We'd be generating quite a few of these, so we can use __slots__ to keep the memory overhead to a minimum (that's just a general comment - it's really irrelevant to the methods-or-attributes question).
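As a rough illustration only, the proposed semantics can be emulated on top of the scandir() that later shipped; the ensure_lstat flag and the lstat_result attribute below are hypothetical names from this proposal, not part of any released API:

```python
import os
from collections import namedtuple

# Hypothetical attribute-only entry, per the proposal above.
SnapshotEntry = namedtuple(
    'SnapshotEntry', 'name is_dir is_file is_symlink lstat_result')

def scandir_snapshot(path='.', ensure_lstat=False):
    for entry in os.scandir(path):
        # Snapshot the stat data at scan time, so the cached result
        # reflects the state when the directory was scanned.
        lstat_result = (entry.stat(follow_symlinks=False)
                        if ensure_lstat else None)
        yield SnapshotEntry(entry.name,
                            entry.is_dir(follow_symlinks=False),
                            entry.is_file(follow_symlinks=False),
                            entry.is_symlink(),
                            lstat_result)
```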
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From j.wielicki at sotecware.net Sun Jun 29 13:12:55 2014 From: j.wielicki at sotecware.net (Jonas Wielicki) Date: Sun, 29 Jun 2014 13:12:55 +0200 Subject: [Python-Dev] PEP 471 -- os.scandir() function -- a better and faster directory iterator In-Reply-To: References: <53AD4B13.8070100@sotecware.net> <20140629105235.GM13014@ando> Message-ID: <53AFF4B7.9030200@sotecware.net> On 29.06.2014 13:08, Nick Coghlan wrote: > On 29 June 2014 20:52, Steven D'Aprano wrote: >> Speaking of caching, is there a way to freshen the cached values? > > Switch to a full Path object instead of relying on the cached DirEntry data. > > This is what makes me wary of including lstat, even though Windows > offers it without the extra stat call. Caching behaviour is *really* > hard to make intuitive, especially when it *sometimes* returns data > that looks fresh (as it on first call on POSIX systems). This bugs me too. An idea I had was adding a keyword argument to scandir which specifies whether stat data should be added to the direntry or not. If the flag is set to True, this would implicitly call lstat on POSIX before returning the DirEntry, and use the available data on Windows. If the flag is set to False, all the fields in the DirEntry will be None, for consistency, even on Windows. This is not optimal in cases where the stat information is needed only for some of the DirEntry objects, but would also reduce the required logic in the DirEntry object. Thoughts? > > Regards, > Nick.
> > From ethan at stoneleaf.us Sun Jun 29 19:02:16 2014 From: ethan at stoneleaf.us (Ethan Furman) Date: Sun, 29 Jun 2014 10:02:16 -0700 Subject: [Python-Dev] PEP 471 -- os.scandir() function -- a better and faster directory iterator In-Reply-To: References: <53AD4B13.8070100@sotecware.net> <20140629105235.GM13014@ando> Message-ID: <53B04698.90600@stoneleaf.us> On 06/29/2014 05:28 AM, Nick Coghlan wrote: > > So, here's my alternative proposal: add an "ensure_lstat" flag to > scandir() itself, and don't have *any* methods on DirEntry, only > attributes. > > That would make the DirEntry attributes: > > is_dir: boolean, always populated > is_file: boolean, always populated > is_symlink boolean, always populated > lstat_result: stat result, may be None on POSIX systems if > ensure_lstat is False > > (I'm not particularly sold on "lstat_result" as the name, but "lstat" > reads as a verb to me, so doesn't sound right as an attribute name) +1 -- ~Ethan~ From ethan at stoneleaf.us Sun Jun 29 19:04:19 2014 From: ethan at stoneleaf.us (Ethan Furman) Date: Sun, 29 Jun 2014 10:04:19 -0700 Subject: [Python-Dev] PEP 471 -- os.scandir() function -- a better and faster directory iterator In-Reply-To: <53AFF4B7.9030200@sotecware.net> References: <53AD4B13.8070100@sotecware.net> <20140629105235.GM13014@ando> <53AFF4B7.9030200@sotecware.net> Message-ID: <53B04713.1070700@stoneleaf.us> On 06/29/2014 04:12 AM, Jonas Wielicki wrote: > > If the flag is set to False, all the fields in the DirEntry will be > None, for consistency, even on Windows. -1 This consistency is unnecessary. 
-- ~Ethan~ From 4kir4.1i at gmail.com Sun Jun 29 20:32:53 2014 From: 4kir4.1i at gmail.com (Akira Li) Date: Sun, 29 Jun 2014 22:32:53 +0400 Subject: [Python-Dev] PEP 471 -- os.scandir() function -- a better and faster directory iterator References: <877g412nhg.fsf@gmail.com> Message-ID: <87zjgv1s8a.fsf@gmail.com> Chris Angelico writes: > On Sat, Jun 28, 2014 at 11:05 PM, Akira Li <4kir4.1i at gmail.com> wrote: >> Have you considered adding support for paths relative to directory >> descriptors [1] via keyword only dir_fd=None parameter if it may lead to >> more efficient implementations on some platforms? >> >> [1]: https://docs.python.org/3.4/library/os.html#dir-fd > > Potentially more efficient and also potentially safer (see 'man > openat')... but an enhancement that can wait, if necessary. > Introducing the feature later creates unnecessary incompatibilities. It should either be explicitly rejected in PEP 471, with something like `os.scandir(os.open(relative_path, dir_fd=fd))` recommended instead (assuming `os.scandir in os.supports_fd`, like `os.listdir()`), or supported from the start. At C level it could be implemented using fdopendir/openat or scandirat. Here's the function description using Argument Clinic DSL: /*[clinic input] os.scandir path : path_t(allow_fd=True, nullable=True) = '.' *path* can be specified as either str or bytes. On some platforms, *path* may also be specified as an open file descriptor; the file descriptor must refer to a directory. If this functionality is unavailable, using it raises NotImplementedError. * dir_fd : dir_fd = None If not None, it should be a file descriptor open to a directory, and *path* should be a relative string; path will then be relative to that directory. If *dir_fd* is unavailable, using it raises NotImplementedError. Yield a DirEntry object for each file and directory in *path*. Just like os.listdir, the '.' and '..' pseudo-directories are skipped, and the entries are yielded in system-dependent order.
{parameters} It's an error to use *dir_fd* when specifying *path* as an open file descriptor. [clinic start generated code]*/ And corresponding tests (from test_posix:PosixTester), to show the compatibility with os.listdir argument parsing in detail: def test_scandir_default(self): # When scandir is called without argument, # it's the same as scandir(os.curdir). self.assertIn(support.TESTFN, [e.name for e in posix.scandir()]) def _test_scandir(self, curdir): filenames = sorted(e.name for e in posix.scandir(curdir)) self.assertIn(support.TESTFN, filenames) #NOTE: assume listdir, scandir accept the same types on the platform self.assertEqual(sorted(posix.listdir(curdir)), filenames) def test_scandir(self): self._test_scandir(os.curdir) def test_scandir_none(self): # it's the same as scandir(os.curdir). self._test_scandir(None) def test_scandir_bytes(self): # When scandir is called with a bytes object, # the returned entries names are still of type str. # Call `os.fsencode(entry.name)` to get bytes self.assertIn('a', {'a'}) self.assertNotIn(b'a', {'a'}) self._test_scandir(b'.') @unittest.skipUnless(posix.scandir in os.supports_fd, "test needs fd support for posix.scandir()") def test_scandir_fd_minus_one(self): # it's the same as scandir(os.curdir). 
self._test_scandir(-1) def test_scandir_float(self): # invalid args self.assertRaises(TypeError, posix.scandir, -1.0) @unittest.skipUnless(posix.scandir in os.supports_fd, "test needs fd support for posix.scandir()") def test_scandir_fd(self): fd = posix.open(posix.getcwd(), posix.O_RDONLY) self.addCleanup(posix.close, fd) self._test_scandir(fd) self.assertEqual( sorted(posix.scandir('.')), sorted(posix.scandir(fd))) # call 2nd time to test rewind self.assertEqual( sorted(posix.scandir('.')), sorted(posix.scandir(fd))) @unittest.skipUnless(posix.scandir in os.supports_dir_fd, "test needs dir_fd support for os.scandir()") def test_scandir_dir_fd(self): relpath = 'relative_path' with support.temp_dir() as parent: fullpath = os.path.join(parent, relpath) with support.temp_dir(path=fullpath): support.create_empty_file(os.path.join(parent, 'a')) support.create_empty_file(os.path.join(fullpath, 'b')) fd = posix.open(parent, posix.O_RDONLY) self.addCleanup(posix.close, fd) self.assertEqual( sorted(posix.scandir(relpath, dir_fd=fd)), sorted(posix.scandir(fullpath))) # check that fd is still useful self.assertEqual( sorted(posix.scandir(relpath, dir_fd=fd)), sorted(posix.scandir(fullpath))) -- Akira From j.wielicki at sotecware.net Sun Jun 29 23:04:09 2014 From: j.wielicki at sotecware.net (Jonas Wielicki) Date: Sun, 29 Jun 2014 23:04:09 +0200 Subject: [Python-Dev] PEP 471 -- os.scandir() function -- a better and faster directory iterator In-Reply-To: <53B04713.1070700@stoneleaf.us> References: <53AD4B13.8070100@sotecware.net> <20140629105235.GM13014@ando> <53AFF4B7.9030200@sotecware.net> <53B04713.1070700@stoneleaf.us> Message-ID: <53B07F49.6010300@sotecware.net> On 29.06.2014 19:04, Ethan Furman wrote: > On 06/29/2014 04:12 AM, Jonas Wielicki wrote: >> >> If the flag is set to False, all the fields in the DirEntry will be >> None, for consistency, even on Windows. > > -1 > > This consistency is unnecessary. 
I'm not sure -- similar to the windows_wildcard option this might be a temptation to write platform-dependent code, although possibly by accident (i.e. not reading the docs carefully). > > -- > ~Ethan~ > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/j.wielicki%40sotecware.net > From berker.peksag at gmail.com Mon Jun 30 02:08:24 2014 From: berker.peksag at gmail.com (=?UTF-8?Q?Berker_Peksa=C4=9F?=) Date: Mon, 30 Jun 2014 03:08:24 +0300 Subject: [Python-Dev] Fix Unicode-disabled build of Python 2.7 In-Reply-To: References: <1403625970.6550.133062453.693ECDEA@webmail.messagingengine.com> Message-ID: On Sat, Jun 28, 2014 at 2:51 AM, Victor Stinner wrote: > 2014-06-26 13:04 GMT+02:00 Antoine Pitrou : >> For the same reason, I agree with Victor that we should ditch the >> threading-disabled builds. It's too much of a hassle for no actual, >> practical benefit. People who want a threadless unicodeless Python can >> install Python 1.5.2 for all I care. > > By the way, adding a buildbot for testing Python without thread > support is not enough. The buildbot is currently broken since more > than one month and nobody noticed :-p I opened http://bugs.python.org/issue21755 a couple of weeks ago to fix the test. --Berker
I prefer to leave such task to someone else :-) > > Victor > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/berker.peksag%40gmail.com From v+python at g.nevcal.com Mon Jun 30 04:33:33 2014 From: v+python at g.nevcal.com (Glenn Linderman) Date: Sun, 29 Jun 2014 19:33:33 -0700 Subject: [Python-Dev] PEP 471 -- os.scandir() function -- a better and faster directory iterator In-Reply-To: References: <53AD4B13.8070100@sotecware.net> <20140629105235.GM13014@ando> Message-ID: <53B0CC7D.6090609@g.nevcal.com> On 6/29/2014 5:28 AM, Nick Coghlan wrote: > There'd still be a slight window of discrepancy (since the filesystem > state may change between reading the directory entry and making the > lstat() call), but this could be effectively eliminated from the > perspective of the Python code by making the result of the lstat() > call authoritative for the whole DirEntry object. +1 to this in particular, but this whole refresh of the semantics sounds better overall. Finally, for the case where someone does want to keep the DirEntry around, a .refresh() API could rerun lstat() and update all the data. And with that (initial data potentially always populated, or None, and an explicit refresh() API), the data could all be returned as properties, implying that they aren't fetching new data themselves, because they wouldn't be. Glenn -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From benhoyt at gmail.com Mon Jun 30 19:05:54 2014 From: benhoyt at gmail.com (Ben Hoyt) Date: Mon, 30 Jun 2014 13:05:54 -0400 Subject: [Python-Dev] PEP 471 -- os.scandir() function -- a better and faster directory iterator In-Reply-To: References: <53AD4B13.8070100@sotecware.net> <20140629105235.GM13014@ando> Message-ID: > So, here's my alternative proposal: add an "ensure_lstat" flag to > scandir() itself, and don't have *any* methods on DirEntry, only > attributes. > > That would make the DirEntry attributes: > > is_dir: boolean, always populated > is_file: boolean, always populated > is_symlink boolean, always populated > lstat_result: stat result, may be None on POSIX systems if > ensure_lstat is False > > (I'm not particularly sold on "lstat_result" as the name, but "lstat" > reads as a verb to me, so doesn't sound right as an attribute name) > > What this would allow: > > - by default, scanning is efficient everywhere, but lstat_result may > be None on POSIX systems > - if you always need the lstat result, setting "ensure_lstat" will > trigger the extra system call implicitly > - if you only sometimes need the stat result, you can call os.lstat() > explicitly when the DirEntry lstat attribute is None > > Most importantly, *regardless of platform*, the cached stat result (if > not None) would reflect the state of the entry at the time the > directory was scanned, rather than at some arbitrary later point in > time when lstat() was first called on the DirEntry object. > > There'd still be a slight window of discrepancy (since the filesystem > state may change between reading the directory entry and making the > lstat() call), but this could be effectively eliminated from the > perspective of the Python code by making the result of the lstat() > call authoritative for the whole DirEntry object. Yeah, I quite like this. It does make the caching more explicit and consistent. 
It's slightly annoying that it's less like pathlib.Path now, but DirEntry was never pathlib.Path anyway, so maybe it doesn't matter. The differences in naming may highlight the difference in caching, so maybe it's a good thing. Two further questions from me: 1) How does error handling work? Now os.stat() will/may be called during iteration, so in __next__. But it's hard to catch errors because you don't call __next__ explicitly. Is this a problem? How do other iterators that make system calls or raise errors handle this? 2) There's still the open question in the PEP of whether to include a way to access the full path. This is cheap to build, it has to be built anyway on POSIX systems, and it's quite useful for further operations on the file. I think the best way to handle this is a .fullname or .full_name attribute as suggested elsewhere. Thoughts? -Ben
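On the second question, the full path is indeed cheap to rebuild from the scan directory and the entry name. The full_name helper below is hypothetical (the attribute ultimately shipped in Python 3.5 as DirEntry.path):

```python
import os

def full_name(scan_dir, entry):
    # Rebuild the entry's full path without any extra system calls.
    return os.path.join(scan_dir, entry.name)
```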