From ncoghlan at gmail.com  Wed Feb  1 01:35:08 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 1 Feb 2012 10:35:08 +1000
Subject: [Python-Dev] Store timestamps as decimal.Decimal objects
In-Reply-To: <4F287228.1090403@hotpy.org>
References: <20120131131330.2349dc6b@pitrou.net> <4F287228.1090403@hotpy.org>
Message-ID:

On Wed, Feb 1, 2012 at 8:58 AM, Mark Shannon wrote:
> Why not add a new function rather than modifying time.time()? (after all, it's just a timestamp, does it really need nanosecond precision?)
>
> For those who do want super-accuracy then add a new function time.picotime() (it could be nanotime but why not future proof it :) ) which returns an int representing the number of picoseconds since the epoch. ints never lose precision and never overflow.

Because the problem is broader than that - it affects os.stat(), too, along with a number of the other time module APIs that produce timestamp values.

That's where Alexander's suggestion of a separate "hirestime" module comes in - it would be based on the concept of *always* using a high precision type in the API (probably decimal.Decimal()). Conceptually, it's a very clean approach, and obviously has zero performance impact on existing APIs, but the idea of adding yet-another-time-related-module to the standard library is rather questionable. Such an approach is also likely to lead to a lot of duplicated code.

Victor's current approach, unfortunately, is a bit of a "worst-of-both-worlds" approach. It couples the time and os modules to various other currently unrelated modules (such as datetime and decimal), but still doesn't provide a particularly extensible API (whether indicated by flags or strings, each new supported output type must be special cased in time and os).

Perhaps more fruitful would be to revisit the original idea from the tracker of defining a conversion function protocol for timestamps using some basic fixed point arithmetic. The objection to using a conversion function that accepts a POSIX-style seconds+nanoseconds timespec is that it isn't future-proof - what if, at some point in the future, nanosecond resolution is considered inadequate?

The secret to future-proofing such an API while only using integers lies in making the decimal exponent part of the conversion function signature:

    def from_components(integer, fraction=0, exponent=-9):
        return Decimal(integer) + Decimal(fraction) * Decimal((0, (1,), exponent))

    >>> from_components(100)
    Decimal('100.000000000')
    >>> from_components(100, 100)
    Decimal('100.000000100')
    >>> from_components(100, 100, -12)
    Decimal('100.000000000100')

Such a protocol can easily be extended to any other type - the time module could provide conversion functions for integers and float objects (meaning results may have lower precision than the underlying system calls), while the existing "fromtimestamp" APIs in datetime can be updated to accept the new optional arguments (and perhaps an appropriate class method added to timedelta, too). A class method could also be added to the decimal module to construct instances from integer components (as shown above), since that method of construction isn't actually specific to timestamps.
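The stdlib's own conversion functions for the status quo types could stay tiny. A rough sketch, assuming the (integer, fraction, exponent) signature above (the as_* names match the hypothetical examples below; none of this is an existing API):

    def as_float(integer, fraction=0, exponent=-9):
        # May lose precision: binary floats can't represent all decimal fractions
        return integer + fraction * 10.0 ** exponent

    def as_int(integer, fraction=0, exponent=-9):
        # Truncates the fractional part, like int(time.time()) does today
        return integer

    def as_tuple(integer, fraction=0, exponent=-9):
        # Passes the fixed point components through unmodified
        return (integer, fraction, exponent)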
With this approach, API usage might end up looking something like:

    >>> time.time()
    1328006975.681211
    >>> time.time(convert=time.as_float)
    1328006975.681211
    >>> time.time(convert=time.as_int)
    1328006979
    >>> time.time(convert=time.as_tuple)
    (1328006975, 681211, -9)
    >>> time.time(convert=decimal.Decimal.from_components)
    Decimal('1328006983.761119000')
    >>> time.time(convert=datetime.datetime.fromtimestamp)
    datetime.datetime(2012, 1, 31, 11, 49, 49, 409831)
    >>> time.time(convert=datetime.datetime.utcfromtimestamp)
    datetime.datetime(2012, 1, 31, 11, 49, 49, 409831)
    >>> time.time(convert=datetime.date.fromtimestamp)
    datetime.date(2012, 1, 31)
    >>> print(time.time(convert=datetime.timedelta.fromtimestamp))
    15370 days, 10:49:52.842116

This strategy would have negligible performance impact in already supported cases (just an extra check to determine that no callback was provided), and offer a very simple, yet fully general and future-proof, integer based callback protocol when you want your timestamps in a different format.

Regards,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From solipsis at pitrou.net  Wed Feb  1 03:35:14 2012
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Wed, 1 Feb 2012 03:35:14 +0100
Subject: [Python-Dev] Store timestamps as decimal.Decimal objects
References: <20120131131330.2349dc6b@pitrou.net> <4F287228.1090403@hotpy.org>
Message-ID: <20120201033514.6d8f3460@pitrou.net>

On Wed, 1 Feb 2012 10:35:08 +1000 Nick Coghlan wrote:
> With this approach, API usage might end up looking something like:
>
>     >>> time.time()
>     1328006975.681211
>     >>> time.time(convert=time.as_float)
>     1328006975.681211
>     >>> time.time(convert=time.as_int)
>     1328006979
>     >>> time.time(convert=time.as_tuple)
>     (1328006975, 681211, -9)
>     >>> time.time(convert=decimal.Decimal.from_components)
>     Decimal('1328006983.761119000')

It strikes me as inelegant to have to do so much typing for something as simple as getting the current time. We should approach the simplicity of ``time.time(format='decimal')`` or ``time.decimal_time()``. (and I think the callback thing is overkill)

Regards

Antoine.

From pje at telecommunity.com  Wed Feb  1 03:40:02 2012
From: pje at telecommunity.com (PJ Eby)
Date: Tue, 31 Jan 2012 21:40:02 -0500
Subject: [Python-Dev] Store timestamps as decimal.Decimal objects
In-Reply-To:
References: <20120131131330.2349dc6b@pitrou.net> <4F287228.1090403@hotpy.org>
Message-ID:

On Tue, Jan 31, 2012 at 7:35 PM, Nick Coghlan wrote:
> Such a protocol can easily be extended to any other type - the time module could provide conversion functions for integers and float objects (meaning results may have lower precision than the underlying system calls), while the existing "fromtimestamp" APIs in datetime can be updated to accept the new optional arguments (and perhaps an appropriate class method added to timedelta, too). A class method could also be added to the decimal module to construct instances from integer components (as shown above), since that method of construction isn't actually specific to timestamps.

Why not just make it something like __fromfixed__() and make it a standard protocol, implemented on floats, ints, decimals, etc. Then the API is just "time.time(type)", where type is any object providing a __fromfixed__ method. ;-)
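A minimal sketch of how that named protocol might fit together - the __fromfixed__ hook is the suggestion above, and _read_clock() is an assumed stand-in for the real low-level clock source, neither being an existing API:

    import decimal

    def _read_clock():
        # Stand-in for the C-level clock: integer seconds, fractional
        # part, and the decimal exponent of that fraction (assumption).
        return 1328006975, 681211000, -9

    def time(result_type):
        # time.time(type) would just dispatch to the requested type's hook
        integer, fraction, exponent = _read_clock()
        return result_type.__fromfixed__(integer, fraction, exponent)

    class HiResDecimal(decimal.Decimal):
        @classmethod
        def __fromfixed__(cls, integer, fraction=0, exponent=-9):
            return cls(integer) + cls(fraction) * decimal.Decimal(10) ** exponent

    print(time(HiResDecimal))  # 1328006975.681211000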
From anacrolix at gmail.com  Wed Feb  1 04:02:26 2012
From: anacrolix at gmail.com (Matt Joiner)
Date: Wed, 1 Feb 2012 14:02:26 +1100
Subject: [Python-Dev] Store timestamps as decimal.Decimal objects
In-Reply-To:
References: <20120131131330.2349dc6b@pitrou.net> <4F287228.1090403@hotpy.org>
Message-ID:

Analysis paralysis commence.

+1 for separate module using decimal.

On Feb 1, 2012 1:44 PM, "PJ Eby" wrote:
> On Tue, Jan 31, 2012 at 7:35 PM, Nick Coghlan wrote:
>> Such a protocol can easily be extended to any other type - the time module could provide conversion functions for integers and float objects (meaning results may have lower precision than the underlying system calls), while the existing "fromtimestamp" APIs in datetime can be updated to accept the new optional arguments (and perhaps an appropriate class method added to timedelta, too). A class method could also be added to the decimal module to construct instances from integer components (as shown above), since that method of construction isn't actually specific to timestamps.
>
> Why not just make it something like __fromfixed__() and make it a standard protocol, implemented on floats, ints, decimals, etc. Then the API is just "time.time(type)", where type is any object providing a __fromfixed__ method. ;-)

From ncoghlan at gmail.com  Wed Feb  1 05:08:34 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 1 Feb 2012 14:08:34 +1000
Subject: [Python-Dev] Store timestamps as decimal.Decimal objects
In-Reply-To: <20120201033514.6d8f3460@pitrou.net>
References: <20120131131330.2349dc6b@pitrou.net> <4F287228.1090403@hotpy.org> <20120201033514.6d8f3460@pitrou.net>
Message-ID:

On Wed, Feb 1, 2012 at 12:35 PM, Antoine Pitrou wrote:
> It strikes me as inelegant to have to do so much typing for something as simple as getting the current time. We should approach the simplicity of ``time.time(format='decimal')`` or ``time.decimal_time()``.

Getting the current time is simple (you can already do it); getting access to high precision time without performance regressions, backwards incompatibilities or excessive code duplication is hard.

There's a very simple rule in large scale software development: coupling is bad and you should do everything you can to minimise it. Victor's approach throws that out the window by requiring that time and os know about every possible output format for time values. That's why protocols are so valuable: instead of having MxN points of interconnection, you just define a standard protocol as the basis for interaction, and the consumer of the protocol doesn't need to care about the details of the provider; they just care about the protocol itself.

So, the question becomes how to solve the problem of exposing high resolution timestamps to Python code in a way that:

- is applicable not just to time.time(), but also to os.stat(), time.clock(), time.wall_clock() and any other timestamp sources I've forgotten
- is backwards compatible for all those use cases
- doesn't cause a significant performance regression for any of those use cases
- doesn't cause excessive coupling between the time and os modules and other parts of Python
- doesn't excessively duplicate code
- doesn't add too much machinery for a relatively minor problem

The one key aspect that I think Victor's suggestion gets right is that we want a way to request high precision time from the *existing* APIs, and that this needs to be selected on a per call basis rather than globally for the whole application.

The big advantage of going with a callback based approach is that it gives you flexibility and low coupling without any additional supporting infrastructure, and you have the full suite of Python tools available to deal with any resulting verbosity issues. For example, it would become *trivial* to write Alexander's suggested "hirestime" module that always returned decimal.Decimal objects:

    import decimal
    import os as _os
    import time as _time

    _hires = decimal.Decimal.from_components

    def time():
        return _time.time(convert=_hires)

    def clock():
        return _time.clock(convert=_hires)

    def stat(path):
        return _os.stat(path, timestamps=_hires)

    # etc...

PJE is quite right that using a new named protocol rather than a callback with a particular signature could also work, but I don't see a lot of advantages in doing so.

On the other hand, if you go with the "named output format", "hires=True" or new API approaches, you end up having to decide what additional coupling you're going to introduce to time and os. Now, in this case, I actually think there *is* a reasonable option available if we decide to go down that path:

- incorporate Stefan Krah's cdecimal work into the standard library
- add a "hires=False" flag to affected APIs
- return a Decimal instance with full available precision if "hires=True" is passed in
- make time and os explicitly depend on the ability to create decimal.Decimal instances

A hirestime module is even easier to implement in that case:

    import os as _os
    import time as _time

    def time():
        return _time.time(hires=True)

    def clock():
        return _time.clock(hires=True)

    def stat(path):
        return _os.stat(path, hires=True)

    # etc...

All of the other APIs (datetime, timedelta, etc) can then just be updated to also accept a Decimal object as input, rather than handling the (integer, fraction, exponent) callback signature I suggested.

Either extreme (full flexibility via a callback API or protocol, or else settling specifically on decimal.Decimal and explicitly making time and os dependent on that type) makes sense to me. A wishy-washy middle ground that introduces a dependency from time and os onto multiple other modules *without* making the API user extensible doesn't seem reasonable at all.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From ethan at stoneleaf.us  Wed Feb  1 04:57:15 2012
From: ethan at stoneleaf.us (Ethan Furman)
Date: Tue, 31 Jan 2012 19:57:15 -0800
Subject: [Python-Dev] PEP 409 - final?
Message-ID: <4F28B81B.20801@stoneleaf.us>

I haven't seen any further discussion here or in the bug tracker. Below is the latest version of this PEP, now with a section on Language Details.

Who makes the final call on this? Any idea how long that will take? (Not that I'm antsy, or anything...
;)

PEP: 409
Title: Suppressing exception context
Version: $Revision$
Last-Modified: $Date$
Author: Ethan Furman
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 26-Jan-2012
Post-History: 30-Aug-2002, 01-Feb-2012

Abstract
========

One of the open issues from PEP 3134 is suppressing context: currently there is no way to do it. This PEP proposes one.

Rationale
=========

There are two basic ways to generate exceptions:

1) Python does it (buggy code, missing resources, ending loops, etc.)

2) manually (with a raise statement)

When writing libraries, or even just custom classes, it can become necessary to raise exceptions; moreover it can be useful, even necessary, to change from one exception to another. To take an example from my dbf module::

    try:
        value = int(value)
    except Exception:
        raise DbfError(...)

Whatever the original exception was (``ValueError``, ``TypeError``, or something else) is irrelevant. The exception from this point on is a ``DbfError``, and the original exception is of no value. However, if this exception is printed, we would currently see both.

Alternatives
============

Several possibilities have been put forth:

* ``raise as NewException()``

  Reuses the ``as`` keyword; can be confusing since we are not really reraising the originating exception

* ``raise NewException() from None``

  Follows existing syntax of explicitly declaring the originating exception

* ``exc = NewException(); exc.__context__ = None; raise exc``

  Very verbose way of the previous method

* ``raise NewException.no_context(...)``

  Make context suppression a class method.

All of the above options will require changes to the core.

Proposal
========

I propose going with the second option::

    raise NewException from None

It has the advantage of using the existing pattern of explicitly setting the cause::

    raise KeyError() from NameError()

but because the 'cause' is ``None`` the previous context, while retained, is not displayed by the default exception printing routines.

Language Details
================

Currently, ``__context__`` and ``__cause__`` start out as None, and then get set as exceptions occur. To support ``from None``, ``__context__`` will stay as it is, but ``__cause__`` will start out as ``False``, and will change to ``None`` when the ``raise ... from None`` method is used. The default exception printing routine will then:

* If ``__cause__`` is ``False`` the ``__context__`` (if any) will be printed.

* If ``__cause__`` is ``None`` the ``__context__`` will not be printed.

* If ``__cause__`` is anything else, ``__cause__`` will be printed.

This has the benefit of leaving the ``__context__`` intact for future logging, querying, etc., while suppressing its display if it is not caught. This is important for those times when trying to debug poorly written libraries with `bad error messages`_.

Patches
=======

There is a patch for CPython implementing this attached to `Issue 6210`_.

References
==========

Discussion and refinements in this `thread on python-dev`_.

.. _bad error messages: http://bugs.python.org/msg152294
.. _Issue 6210: http://bugs.python.org/issue6210
.. _thread on python-dev: http://mail.python.org/pipermail/python-dev/2012-January/115838.html

Copyright
=========

This document has been placed in the public domain.

..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End:

From ethan at stoneleaf.us  Wed Feb  1 04:58:30 2012
From: ethan at stoneleaf.us (Ethan Furman)
Date: Tue, 31 Jan 2012 19:58:30 -0800
Subject: [Python-Dev] PEP 409 - now properly formatted (sorry for the noise)
Message-ID: <4F28B866.5030905@stoneleaf.us>

PEP: 409
Title: Suppressing exception context
Version: $Revision$
Last-Modified: $Date$
Author: Ethan Furman
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 26-Jan-2012
Post-History: 30-Aug-2002, 01-Feb-2012

Abstract
========

One of the open issues from PEP 3134 is suppressing context: currently there is no way to do it. This PEP proposes one.

Rationale
=========

There are two basic ways to generate exceptions:

1) Python does it (buggy code, missing resources, ending loops, etc.)

2) manually (with a raise statement)

When writing libraries, or even just custom classes, it can become necessary to raise exceptions; moreover it can be useful, even necessary, to change from one exception to another. To take an example from my dbf module::

    try:
        value = int(value)
    except Exception:
        raise DbfError(...)

Whatever the original exception was (``ValueError``, ``TypeError``, or something else) is irrelevant. The exception from this point on is a ``DbfError``, and the original exception is of no value. However, if this exception is printed, we would currently see both.

Alternatives
============

Several possibilities have been put forth:

* ``raise as NewException()``

  Reuses the ``as`` keyword; can be confusing since we are not really reraising the originating exception

* ``raise NewException() from None``

  Follows existing syntax of explicitly declaring the originating exception

* ``exc = NewException(); exc.__context__ = None; raise exc``

  Very verbose way of the previous method

* ``raise NewException.no_context(...)``

  Make context suppression a class method.

All of the above options will require changes to the core.

Proposal
========

I propose going with the second option::

    raise NewException from None

It has the advantage of using the existing pattern of explicitly setting the cause::

    raise KeyError() from NameError()

but because the 'cause' is ``None`` the previous context, while retained, is not displayed by the default exception printing routines.

Language Details
================

Currently, ``__context__`` and ``__cause__`` start out as None, and then get set as exceptions occur. To support ``from None``, ``__context__`` will stay as it is, but ``__cause__`` will start out as ``False``, and will change to ``None`` when the ``raise ... from None`` method is used. The default exception printing routine will then:

* If ``__cause__`` is ``False`` the ``__context__`` (if any) will be printed.

* If ``__cause__`` is ``None`` the ``__context__`` will not be printed.

* If ``__cause__`` is anything else, ``__cause__`` will be printed.

This has the benefit of leaving the ``__context__`` intact for future logging, querying, etc., while suppressing its display if it is not caught. This is important for those times when trying to debug poorly written libraries with `bad error messages`_.

Patches
=======

There is a patch for CPython implementing this attached to `Issue 6210`_.

References
==========

Discussion and refinements in this `thread on python-dev`_.

.. _bad error messages: http://bugs.python.org/msg152294
.. _Issue 6210: http://bugs.python.org/issue6210
.. _thread on python-dev: http://mail.python.org/pipermail/python-dev/2012-January/115838.html

Copyright
=========

This document has been placed in the public domain.

..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End:

From ncoghlan at gmail.com  Wed Feb  1 06:14:31 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 1 Feb 2012 15:14:31 +1000
Subject: [Python-Dev] PEP 409 - final?
In-Reply-To: <4F28B81B.20801@stoneleaf.us>
References: <4F28B81B.20801@stoneleaf.us>
Message-ID:

On Wed, Feb 1, 2012 at 1:57 PM, Ethan Furman wrote:
> I haven't seen any further discussion here or in the bug tracker. Below is the latest version of this PEP, now with a section on Language Details.
>
> Who makes the final call on this? Any idea how long that will take? (Not that I'm antsy, or anything... ;)

Guido still has the final say on PEP approvals as BDFL - it's just that sometimes he'll tap someone else and say "Your call!" (thus making them a BDFOP - Benevolent Dictator for One PEP).

FWIW, I'm personally +1 on the latest version of this.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From ethan at stoneleaf.us  Wed Feb  1 06:07:12 2012
From: ethan at stoneleaf.us (Ethan Furman)
Date: Tue, 31 Jan 2012 21:07:12 -0800
Subject: [Python-Dev] docs fixes and PEP 409
Message-ID: <4F28C880.6060101@stoneleaf.us>

I'm looking at the docs to make the relevant changes due to PEP 409, and I'm noticing some problems.

E.g. The PyException_Get|Set_Context|Cause all talk about using NULL to clear the related attribute, when actually it should be Py_None.

Only PyException_GetCause is directly related to PEP 409 -- should I only fix that one, and open up a new issue on the tracker for the other three, or should I fix all four now?

~Ethan~

From ncoghlan at gmail.com  Wed Feb  1 06:57:44 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 1 Feb 2012 15:57:44 +1000
Subject: [Python-Dev] docs fixes and PEP 409
In-Reply-To: <4F28C880.6060101@stoneleaf.us>
References: <4F28C880.6060101@stoneleaf.us>
Message-ID:

On Wed, Feb 1, 2012 at 3:07 PM, Ethan Furman wrote:
> I'm looking at the docs to make the relevant changes due to PEP 409, and I'm noticing some problems.
>
> E.g. The PyException_Get|Set_Context|Cause all talk about using NULL to clear the related attribute, when actually it should be Py_None.
>
> Only PyException_GetCause is directly related to PEP 409 -- should I only fix that one, and open up a new issue on the tracker for the other three, or should I fix all four now?

Passing in NULL is the right way to clear them using those APIs - the descriptors in exceptions.c then control how "not set" is exposed at the Python layer. So only Get/SetCause should need updating for PEP 409, to say that passing in NULL clears the cause and falls back on displaying the context, while Py_None suppresses the context in the default display.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia
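Expressed as Python, the selection logic the PEP proposes for the default printing routine comes down to a few lines. This is a sketch of the rules only, not the actual CPython implementation (which lives in C):

    def chained_exception_to_display(exc):
        # PEP 409 proposal: __cause__ is False -> fall back to __context__;
        # __cause__ is None -> "raise ... from None", suppress chaining;
        # anything else -> an explicitly set cause, so display it.
        if exc.__cause__ is False:
            return exc.__context__
        if exc.__cause__ is None:
            return None
        return exc.__cause__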
From ethan at stoneleaf.us  Wed Feb  1 06:56:33 2012
From: ethan at stoneleaf.us (Ethan Furman)
Date: Tue, 31 Jan 2012 21:56:33 -0800
Subject: [Python-Dev] docs fixes and PEP 409
In-Reply-To: <4F28C880.6060101@stoneleaf.us>
References: <4F28C880.6060101@stoneleaf.us>
Message-ID: <4F28D411.2060106@stoneleaf.us>

Ethan Furman wrote:
> Only PyException_GetCause is directly related to PEP 409 -- should I only fix that one, and open up a new issue on the tracker for the other three, or should I fix all four now?

The specific question is now irrelevant (still learning the differences between the C code and the Python code ;) -- but the general question remains...

~Ethan~

From ethan at stoneleaf.us  Wed Feb  1 06:36:48 2012
From: ethan at stoneleaf.us (Ethan Furman)
Date: Tue, 31 Jan 2012 21:36:48 -0800
Subject: [Python-Dev] PEP 409 - now properly formatted (sorry for the noise)
In-Reply-To: <4F28B866.5030905@stoneleaf.us>
References: <4F28B866.5030905@stoneleaf.us>
Message-ID: <4F28CF70.6000708@stoneleaf.us>

What an appropriate title since I sent it to the wrong place. :(

~Ethan~

From stefan_ml at behnel.de  Wed Feb  1 08:45:35 2012
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Wed, 01 Feb 2012 08:45:35 +0100
Subject: [Python-Dev] Python 3 optimizations, continued, continued again...
In-Reply-To:
References: <4F23C657.9050501@hotpy.org>
Message-ID:

stefan brunthaler, 31.01.2012 22:17:
>> Well, nobody wants to review generated code.
>
> I agree. The code generator basically uses templates that contain the information and a dump of the C structure of several types to traverse and see which one of them implements which functions. There is really no magic there; the most "complex" thing is to get the inline-cache miss checks for function calls right. But I tried to make the generated code look pretty, so that working with it is not too much of a hassle. The code generator itself is a little bit more complicated, so I am not sure it would help a lot...

How many times did you regenerate this code until you got it right? And how do you know that you really got it so right that it was the last time ever that you needed your generator for it? What if the C structure of any of those "several types" ever changes?

Stefan

From victor.stinner at haypocalc.com  Wed Feb  1 09:03:35 2012
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Wed, 1 Feb 2012 09:03:35 +0100
Subject: [Python-Dev] Store timestamps as decimal.Decimal objects
In-Reply-To:
References: <20120131131330.2349dc6b@pitrou.net> <4F287228.1090403@hotpy.org>
Message-ID:

2012/2/1 Nick Coghlan :
> The secret to future-proofing such an API while only using integers lies in making the decimal exponent part of the conversion function signature:
>
>     def from_components(integer, fraction=0, exponent=-9):
>         return Decimal(integer) + Decimal(fraction) * Decimal((0, (1,), exponent))

The fractional part is not necessarily related to a power of 10. An earlier version of my patch also used powers of 10, but it didn't work (it lost precision) for QueryPerformanceCounter() and was more complex than the new version. NTP timestamps use a fraction of 2**32. QueryPerformanceCounter() (used by time.clock() on Windows) uses the CPU frequency.

We may need more information when adding new timestamp formats later. If we expose the "internal structure" used to compute any timestamp format, we cannot change the internal structure later without breaking (one more time) the API.
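Concretely: even a fraction with an NTP-style divisor of 2**32 converts to Decimal exactly, provided the conversion is free to choose enough significant digits. A small illustration (not code from the patch; the helper below is hypothetical):

    from decimal import Decimal, getcontext

    def ntp_fraction_to_decimal(fraction, divisor=2**32):
        # 1/2**32 == 5**32/10**32, so the quotient is exactly representable
        # in decimal, but only with enough digits of context precision.
        getcontext().prec = 50  # generous, arbitrary choice for the sketch
        return Decimal(fraction) / Decimal(divisor)

    print(ntp_fraction_to_decimal(1))
    # Decimal('2.3283064365386962890625E-10')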
My patch uses the format (seconds: int, floatpart: int, divisor: int). For example, I hesitate to add a field to specify the start of the timestamp: undefined for time.wallclock(), time.clock(), and time.clock_gettime(time.CLOCK_MONOTONIC), Epoch for other timestamps.

My patch is similar to your idea except that everything is done internally, so internal structures are not exposed, and it doesn't touch the decimal or datetime modules. It would be surprising to add a method related to timestamps to the Decimal class.

> This strategy would have negligible performance impact

There is no such performance issue: time.time() performance is exactly the same using my patch. Depending on the requested format, the performance may be better or worse. But even for Decimal, I think that the creation of a Decimal is really "fast" (I should provide numbers :-)).

Victor

From ncoghlan at gmail.com  Wed Feb  1 11:43:24 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 1 Feb 2012 20:43:24 +1000
Subject: [Python-Dev] Store timestamps as decimal.Decimal objects
In-Reply-To:
References: <20120131131330.2349dc6b@pitrou.net> <4F287228.1090403@hotpy.org>
Message-ID:

On Wed, Feb 1, 2012 at 6:03 PM, Victor Stinner wrote:
> 2012/2/1 Nick Coghlan :
>> The secret to future-proofing such an API while only using integers lies in making the decimal exponent part of the conversion function signature:
>>
>>     def from_components(integer, fraction=0, exponent=-9):
>>         return Decimal(integer) + Decimal(fraction) * Decimal((0, (1,), exponent))
>
> The fractional part is not necessarily related to a power of 10. An earlier version of my patch also used powers of 10, but it didn't work (it lost precision) for QueryPerformanceCounter() and was more complex than the new version. NTP timestamps use a fraction of 2**32. QueryPerformanceCounter() (used by time.clock() on Windows) uses the CPU frequency.

If a callback protocol is used at all, there's no reason those details need to be exposed to the callbacks. Just choose an appropriate exponent based on the precision of the underlying API call.

> We may need more information when adding new timestamp formats later. If we expose the "internal structure" used to compute any timestamp format, we cannot change the internal structure later without breaking (one more time) the API.

You're assuming we're ever going to want timestamps that are something more than just a number. That's a *huge* leap (much bigger than increasing the precision, which is the problem we're dealing with now). With arbitrary length integers available, "integer, fraction, exponent" lets you express numbers to whatever precision you like, just as decimal.Decimal does (more on that below).

> My patch is similar to your idea except that everything is done internally, so internal structures are not exposed, and it doesn't touch the decimal or datetime modules. It would be surprising to add a method related to timestamps to the Decimal class.

No, you wouldn't add a timestamp specific method to the Decimal class - you'd add one that let you easily construct a decimal from a fixed point representation (i.e. integer + fraction*10**exponent)

>> This strategy would have negligible performance impact
>
> There is no such performance issue: time.time() performance is exactly the same using my patch. Depending on the requested format, the performance may be better or worse. But even for Decimal, I think that the creation of a Decimal is really "fast" (I should provide numbers :-)).
But this gets us to my final question. Given that Decimal supports arbitrary precision, *why* increase the complexity of the underlying API by supporting *other* output types? If you're not going to support arbitrary callbacks, why not just have a "high precision" flag to request Decimal instances and be done with it? datetime, timedelta and so forth would be able to get everything they needed from the Decimal value.

As I said in my last message, both a 3-tuple (integer, fraction, exponent) based callback protocol effectively supporting arbitrary output types and a boolean flag to request Decimal values make sense to me, and I could argue in favour of either of them. However, I don't understand the value you see in this odd middle ground of "instead of picking 1 arbitrary precision timestamp representation, whether an integer triple or decimal.Decimal, we're going to offer a few different ones and make you decide which one of them you actually want every time you call the API". That's seriously ducking our responsibilities as language developers - it's our job to make that call, not each user's.

Given the way the discussion has gone, my preference is actually shifting strongly towards just returning decimal.Decimal instances when high precision timestamps are requested via a boolean flag. The flag isn't pretty, but it works, and the extra flexibility of a "type" parameter or a callback protocol doesn't really buy us anything once we have an output type that supports arbitrary precision.

FWIW, I did a quick survey of what other languages seem to offer in terms of high resolution time interfaces:

- Perl appears to have Time::HiRes (it seems to use floats in the API though, so I'm not sure how that works in practice)
- C# (and the CLR) don't appear to care about POSIX and just offer 100 nanosecond resolution in their DateTime libraries
- Java appears to have System.nanoTime(), no idea what they do for filesystem times

However, I don't know enough about how the APIs in those languages work to do sensible searches. It doesn't appear to be a cleanly solved problem anywhere, though.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From solipsis at pitrou.net  Wed Feb  1 12:08:42 2012
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Wed, 1 Feb 2012 12:08:42 +0100
Subject: [Python-Dev] Store timestamps as decimal.Decimal objects
In-Reply-To:
References: <20120131131330.2349dc6b@pitrou.net> <4F287228.1090403@hotpy.org> <20120201033514.6d8f3460@pitrou.net>
Message-ID: <20120201120842.42c91b5c@pitrou.net>

On Wed, 1 Feb 2012 14:08:34 +1000 Nick Coghlan wrote:
> On Wed, Feb 1, 2012 at 12:35 PM, Antoine Pitrou wrote:
> > It strikes me as inelegant to have to do so much typing for something as simple as getting the current time. We should approach the simplicity of ``time.time(format='decimal')`` or ``time.decimal_time()``.
>
> Getting the current time is simple (you can already do it), getting access to high precision time without performance regressions or backwards incompatibilities or excessive code duplication is hard.

The implementation of it might be hard, the API doesn't have to be. You can even use a callback system under the hood, you just don't have to *expose* that complication to the user.

> There's a very simple rule in large scale software development: coupling is bad and you should do everything you can to minimise it.

The question is: is coupling worse than exposing horrible APIs? ;)
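For illustration, a ``decimal_time()`` could indeed stay a thin facade over whatever machinery is chosen internally. A rough sketch, where _clock_components() is an assumed stand-in for the real C-level clock source:

    import decimal

    def _clock_components():
        # Assumed internal clock source for this sketch: whole seconds,
        # fractional part, and the decimal exponent of that fraction.
        return 1328006975, 681211000, -9

    def decimal_time():
        integer, fraction, exponent = _clock_components()
        return decimal.Decimal(integer) + decimal.Decimal(fraction).scaleb(exponent)

    print(decimal_time())  # 1328006975.681211000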
If Decimal were a core object as float is, we wouldn't have this discussion because returning a Decimal would be considered "natural".

> Victor's approach throws that out the window by requiring that time and os know about every possible output format for time values.

Victor's proposal is maximalist in that it proposes several different output formats. Decimal is probably enough for real use cases, though.

> For example, it would become *trivial* to write Alexander's suggested "hirestime" module that always returned decimal.Decimal objects:

Right, but that's not even a plausible request. Nobody wants to write a separate time module just to have a different return type.

Regards

Antoine.

From ncoghlan at gmail.com  Wed Feb  1 12:26:08 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 1 Feb 2012 21:26:08 +1000
Subject: [Python-Dev] Store timestamps as decimal.Decimal objects
In-Reply-To: <20120201120842.42c91b5c@pitrou.net>
References: <20120131131330.2349dc6b@pitrou.net> <4F287228.1090403@hotpy.org> <20120201033514.6d8f3460@pitrou.net> <20120201120842.42c91b5c@pitrou.net>
Message-ID:

On Wed, Feb 1, 2012 at 9:08 PM, Antoine Pitrou wrote:
> Right, but that's not even a plausible request. Nobody wants to write a separate time module just to have a different return type.

I can definitely see someone doing "import hirestime as time" to avoid having to pass a flag everywhere, though. I don't think that should be the way *we* expose the functionality - I just think it's a possible end user technique we should keep in mind when assessing the alternatives.

As I said in my last reply to Victor though, I'm definitely coming around to the point of view that supporting more than just Decimal is overgeneralising to the detriment of the API design. As you say, if decimal objects were a builtin type, we wouldn't even be considering alternative high precision representations - the only discussion would be about the details of the API for *requesting* high resolution timestamps (and while boolean flags are ugly, I'm not sure there's anything else that will satisfy backwards compatibility constraints).

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From victor.stinner at haypocalc.com  Wed Feb  1 12:40:08 2012
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Wed, 1 Feb 2012 12:40:08 +0100
Subject: [Python-Dev] Store timestamps as decimal.Decimal objects
In-Reply-To:
References: <20120131131330.2349dc6b@pitrou.net> <4F287228.1090403@hotpy.org>
Message-ID:

> If a callback protocol is used at all, there's no reason those details need to be exposed to the callbacks. Just choose an appropriate exponent based on the precision of the underlying API call.

If the clock divisor cannot be written as a power of 10, you lose precision, just because your format requires a power of 10. Using (seconds, floatpart, divisor) you don't lose any bits. The conversion function using this tuple can choose how to use these numbers and do its best to optimize the precision (e.g. choose how to round the division).

By the way, my patch uses a dummy integer division (floatpart / divisor). I hesitate to round to the closest integer. For example, 19//10=1, whereas 2 would be a better answer. A possibility is to use (floatpart + (divisor/2)) / divisor.

>> We may need more information when adding new timestamp formats later.
>> If we expose the "internal structure" used to compute any timestamp format, we cannot change the internal structure later without breaking (one more time) the API.
>
> You're assuming we're ever going to want timestamps that are something more than just a number. That's a *huge* leap (much bigger than increasing the precision, which is the problem we're dealing with now).

I tried to design an API supporting future timestamp formats. For time methods, it is maybe not useful to produce a datetime object directly. But for os.stat(), it is just *practical* to get a high-level object directly.

We may add a new float128 type later, and it would be nice to be able to get a timestamp directly as a float128, without having to break the API one more time. Getting a timestamp as a Decimal to convert it to float128 is not optimal. That's why I don't like adding a boolean flag.

It doesn't mean that we should add datetime.datetime or datetime.timedelta right now. It can be done later, or never :-)

> No, you wouldn't add a timestamp specific method to the Decimal class - you'd add one that let you easily construct a decimal from a fixed point representation (i.e. integer + fraction*10**exponent)

Only if you use (intpart, floatpart, exponent). Would this function be useful for something else than timestamps?

> But this gets us to my final question. Given that Decimal supports arbitrary precision, *why* increase the complexity of the underlying API by supporting *other* output types?

We need to support at least 3 formats: int, float and <a new format> (e.g. Decimal), to keep backward compatibility.

> datetime, timedelta and so forth would be able to get everything they needed from the Decimal value.

Yes. Getting timestamps directly as datetime or timedelta is maybe overkill.

datetime gives more information than a raw number (int, float or Decimal): you don't have to care about the start date of the timestamp. Internally, it would help to support Windows timestamps (the number of 100 ns intervals since 1601-01-01), even if we may have to convert the Windows timestamp to an Epoch timestamp if the user requests a number instead of a datetime object (for backward compatibility?).

Victor

From ncoghlan at gmail.com  Wed Feb  1 12:59:58 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 1 Feb 2012 21:59:58 +1000
Subject: [Python-Dev] Store timestamps as decimal.Decimal objects
In-Reply-To:
References: <20120131131330.2349dc6b@pitrou.net> <4F287228.1090403@hotpy.org>
Message-ID:

On Wed, Feb 1, 2012 at 9:40 PM, Victor Stinner wrote:
>> If a callback protocol is used at all, there's no reason those details need to be exposed to the callbacks. Just choose an appropriate exponent based on the precision of the underlying API call.
>
> If the clock divisor cannot be written as a power of 10, you lose precision, just because your format requires a power of 10. Using (seconds, floatpart, divisor) you don't lose any bits. The conversion function using this tuple can choose how to use these numbers and do its best to optimize the precision (e.g. choose how to round the division).
>
> By the way, my patch uses a dummy integer division (floatpart / divisor). I hesitate to round to the closest integer. For example, 19//10=1, whereas 2 would be a better answer. A possibility is to use (floatpart + (divisor/2)) / divisor.

If you would lose precision, make the decimal exponent (and hence fractional part) larger. You have exactly the same problem when converting to decimal, and the solution is the same (i.e.
use as many significant digits as you need to preserve the underlying precision).

> I tried to design an API supporting future timestamp formats. For time methods, it is maybe not useful to produce a datetime object directly. But for os.stat(), it is just *practical* to get a high-level object directly.
>
> We may add a new float128 type later, and it would be nice to be able to get a timestamp directly as a float128, without having to break the API one more time. Getting a timestamp as a Decimal to convert it to float128 is not optimal. That's why I don't like adding a boolean flag.

Introducing API complexity now for entirely theoretical future needs is a classic case of YAGNI (You Ain't Gonna Need It). Besides, float128 is a bad example - such a type could just be returned directly where we return float64 now. (The only reason we can't do that with Decimal is because we deliberately don't allow implicit conversion of float values to Decimal values in binary operations).

>> But this gets us to my final question. Given that Decimal supports arbitrary precision, *why* increase the complexity of the underlying API by supporting *other* output types?
>
> We need to support at least 3 formats: int, float and <a new format> (e.g. Decimal), to keep backward compatibility.

int and float are already supported today, and a process global switch works for that (since they're numerically interoperable). A per-call setting is only needed for Decimal due to its deliberate lack of implicit interoperability with binary floats.

>> datetime, timedelta and so forth would be able to get everything they needed from the Decimal value.
>
> Yes. Getting timestamps directly as datetime or timedelta is maybe overkill.
>
> datetime gives more information than a raw number (int, float or Decimal): you don't have to care about the start date of the timestamp.

That's a higher level concern though - not something the timestamp APIs themselves should be worrying about.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From s.brunthaler at uci.edu  Wed Feb  1 16:55:20 2012
From: s.brunthaler at uci.edu (stefan brunthaler)
Date: Wed, 1 Feb 2012 07:55:20 -0800
Subject: [Python-Dev] Python 3 optimizations, continued, continued again...
In-Reply-To:
References: <4F23C657.9050501@hotpy.org>
Message-ID:

> How many times did you regenerate this code until you got it right?

Well, honestly, I changed the code generator to "pack" the new optimized instruction derivatives densely into the available opcodes, so that I can make optimal use of what's there. Thus I only generated the code twice for this patch.

> And how do you know that you really got it so right that it was the last time ever that you needed your generator for it?

I am positive that I am going to need my code generator in the future, as I have several ideas to increase performance even more. As I have mentioned before, my quickening based inline caching technique is very simple, and if it were to crash, chances are that some of the inline-cache miss guards don't capture all scenarios, i.e., are non-exhaustive. The regression tests run, as do the official benchmarks plus the computer language benchmarks game. In addition, this has been my line of research since 2009, so I have extensive experience with it, too.

> What if the C structure of any of those "several types" ever changes?

Since I optimize interpreter instructions, any change that affects their implementation requires changing the optimized instructions, too.
Having the code generator ready for such things would certainly be a good idea (probably also for generating the default interpreter dispatch loop), since you could also add your own "profile" for your application/domain to re-use the remaining 30+ instruction opcodes. The direct answer is that I would need to re-generate the driver file, which is basically a gdb dump plus an Emacs macro (please note that I have not needed to do that since working with ~3.0b1).

I will add a list of the types I use for specializing to the patch section on the "additional resources" page of my homepage (including a fixed patch for the issue Georg brought to my attention.)

--stefan

From lukasz at langa.pl  Wed Feb  1 17:12:33 2012
From: lukasz at langa.pl (Łukasz Langa)
Date: Wed, 1 Feb 2012 17:12:33 +0100
Subject: [Python-Dev] Python 3 optimizations, continued, continued again...
In-Reply-To:
References: <4F23C657.9050501@hotpy.org>
Message-ID:

Message from stefan brunthaler, 1 Feb 2012, at 16:55:

>> And how do you know that you really got it so right that it was the last time ever that you needed your generator for it?
>
> I am positive that I am going to need my code generator in the future, as I have several ideas to increase performance even more.

Hello, Stefan. First let me thank you for your interest in improving the interpreter. We appreciate and encourage efforts to make it perform better.

But let me put this straight: as an open-source project, we are hesitant to accept changes which depend on closed software. Even if your optimization techniques would result in performance a hundred times better than what is currently achieved, we would still be wary to accept them.

Please note that this is not because of lack of trust, or better yet greed for your code. We need to make sure that under no circumstances our codebase is in danger because something important was left out along the way. Maintenance of generated code is yet another nuisance that should better be strongly justified.

--
Best regards,
Łukasz Langa
Senior Systems Architecture Engineer

IT Infrastructure Department
Grupa Allegro Sp. z o.o.

From s.brunthaler at uci.edu  Wed Feb  1 17:36:19 2012
From: s.brunthaler at uci.edu (stefan brunthaler)
Date: Wed, 1 Feb 2012 08:36:19 -0800
Subject: [Python-Dev] Python 3 optimizations, continued, continued again...
In-Reply-To:
References: <4F23C657.9050501@hotpy.org>
Message-ID:

> But let me put this straight: as an open-source project, we are hesitant to accept changes which depend on closed software. Even if your optimization techniques would result in performance a hundred times better than what is currently achieved, we would still be wary to accept them.
>
> Please note that this is not because of lack of trust, or better yet greed for your code. We need to make sure that under no circumstances our codebase is in danger because something important was left out along the way.

I am positive that the code generator does not depend on any closed source components; I just use mako for storing the C code templates that I generate -- everything else I wrote myself. Of course, I'll give the code generator to python-dev, too, if necessary. However, I need to strip it down, so that it does not do all the other stuff that you don't need. I just wanted to give you the implementation now, since Benjamin said that he wants to see real code and results first.
If you want to integrate the inca-optimization, I am going to start working on this asap.

> Maintenance of generated code is yet another nuisance that should better be strongly justified.

I agree, but the nice thing is that the technique is very simple: only if you changed a significant part of the interpreter's implementation would you need to change the optimized derivatives, too. If one generates the default interpreter implementation, too, then one gets the optimizations almost for free. For maintenance reasons I chose to use a template-based system, too, since this gives you a direct correspondence between the actual code and what's generated, without interfering with the code generator at all.

--stefan

From hansmu at xs4all.nl  Wed Feb  1 18:13:19 2012
From: hansmu at xs4all.nl (Hans Mulder)
Date: Wed, 01 Feb 2012 18:13:19 +0100
Subject: [Python-Dev] A new dictionary implementation
In-Reply-To: <4F25D686.9070907@pearwood.info>
References: <4F252014.3080900@hotpy.org> <20120129160841.2343b62f@pitrou.net> <4F256EDC.70707@hotpy.org> <4F25D686.9070907@pearwood.info>
Message-ID:

On 30/01/12 00:30:14, Steven D'Aprano wrote:
> Mark Shannon wrote:
>> Antoine Pitrou wrote:
[......]
>> Antoine is right. It is a reorganisation of the dict, plus a couple of changes to typeobject.c and object.c to ensure that instance dictionaries do indeed share keys arrays.
>
> I don't quite follow how that could work.
>
> If I have this:
>
>     class C:
>         pass
>
>     a = C()
>     b = C()
>
>     a.spam = 1
>     b.ham = 2
>
> how can a.__dict__ and b.__dict__ share key arrays? I've tried reading the source, but I'm afraid I don't understand it well enough to make sense of it.

They can't.

But then, your class is atypical. Usually, classes initialize all the attributes of their instances in the __init__ method, perhaps like so:

    class D:
        def __init__(self, ham=None, spam=None):
            self.ham = ham
            self.spam = spam

As long as you follow the common practice of not adding any attributes after the object has been initialized, your instances can share their keys array. Mark's patch will do that.

You'll still be allowed to have different attributes per instance, but if you do that, then the patch doesn't buy you much.

-- HansM

From ownerscircle at gmail.com  Wed Feb  1 18:27:15 2012
From: ownerscircle at gmail.com (PJ Eby)
Date: Wed, 1 Feb 2012 12:27:15 -0500
Subject: [Python-Dev] Store timestamps as decimal.Decimal objects
In-Reply-To:
References: <20120131131330.2349dc6b@pitrou.net> <4F287228.1090403@hotpy.org> <20120201033514.6d8f3460@pitrou.net>
Message-ID:

On Jan 31, 2012 11:08 PM, "Nick Coghlan" wrote:
> PJE is quite right that using a new named protocol rather than a callback with a particular signature could also work, but I don't see a lot of advantages in doing so.

The advantage is that it fits your brain better. That is, you don't have to remember another symbol besides the type you wanted. (There's probably fewer keystrokes involved, too.)

From guido at python.org  Wed Feb  1 18:46:56 2012
From: guido at python.org (Guido van Rossum)
Date: Wed, 1 Feb 2012 09:46:56 -0800
Subject: [Python-Dev] Python 3 optimizations, continued, continued again...
In-Reply-To:
References: <4F23C657.9050501@hotpy.org>
Message-ID:

Let's make one thing clear. The Python core developers need to be able to reproduce your results from scratch, and that means access to the templates, code generators, inputs, and everything else you used.
(Of course for stuff you didn't write that's already open source, all we need is a pointer to the open source project and the exact version/configuration you used, plus any local mods you made.)

I understand that you're hesitant to just dump your current mess, and you want to clean it up before you show it to us. That's fine. But until you're ready to show it, we're not going to integrate any of your work into CPython, even though some of us (maybe Benjamin) may be interested in kicking its tires. And remember, it doesn't need to be perfect (in fact perfectionism is probably a bad idea here). But it does need to be open source. Every single bit of it. (And no GPL, please.)

--Guido

2012/2/1 stefan brunthaler :
>> But let me put this straight: as an open-source project, we are hesitant to accept changes which depend on closed software. Even if your optimization techniques would result in performance a hundred times better than what is currently achieved, we would still be wary to accept them.
>>
>> Please note that this is not because of lack of trust, or better yet greed for your code. We need to make sure that under no circumstances our codebase is in danger because something important was left out along the way.
>
> I am positive that the code generator does not depend on any closed source components; I just use mako for storing the C code templates that I generate -- everything else I wrote myself. Of course, I'll give the code generator to python-dev, too, if necessary. However, I need to strip it down, so that it does not do all the other stuff that you don't need. I just wanted to give you the implementation now, since Benjamin said that he wants to see real code and results first. If you want to integrate the inca-optimization, I am going to start working on this asap.
>
>> Maintenance of generated code is yet another nuisance that should better be strongly justified.
>
> I agree, but the nice thing is that the technique is very simple: only if you changed a significant part of the interpreter's implementation would you need to change the optimized derivatives, too. If one generates the default interpreter implementation, too, then one gets the optimizations almost for free. For maintenance reasons I chose to use a template-based system, too, since this gives you a direct correspondence between the actual code and what's generated, without interfering with the code generator at all.
>
> --stefan

--
--Guido van Rossum (python.org/~guido)

From guido at python.org  Wed Feb  1 18:50:55 2012
From: guido at python.org (Guido van Rossum)
Date: Wed, 1 Feb 2012 09:50:55 -0800
Subject: [Python-Dev] A new dictionary implementation
In-Reply-To:
References: <4F252014.3080900@hotpy.org> <20120129160841.2343b62f@pitrou.net> <4F256EDC.70707@hotpy.org> <4F25D686.9070907@pearwood.info>
Message-ID:

On Wed, Feb 1, 2012 at 9:13 AM, Hans Mulder wrote:
> On 30/01/12 00:30:14, Steven D'Aprano wrote:
>> Mark Shannon wrote:
>>> Antoine Pitrou wrote:
> [......]
>>> Antoine is right. It is a reorganisation of the dict, plus a couple of changes to typeobject.c and object.c to ensure that instance dictionaries do indeed share keys arrays.
>>
>> I don't quite follow how that could work.
>>
>> If I have this:
>>
>>     class C:
>>         pass
>>
>>     a = C()
>>     b = C()
>>
>>     a.spam = 1
>>     b.ham = 2
>>
>> how can a.__dict__ and b.__dict__ share key arrays? I've tried reading the source, but I'm afraid I don't understand it well enough to make sense of it.
>
> They can't.
>
> But then, your class is atypical. Usually, classes initialize all the attributes of their instances in the __init__ method, perhaps like so:
>
>     class D:
>         def __init__(self, ham=None, spam=None):
>             self.ham = ham
>             self.spam = spam
>
> As long as you follow the common practice of not adding any attributes after the object has been initialized, your instances can share their keys array. Mark's patch will do that.
>
> You'll still be allowed to have different attributes per instance, but if you do that, then the patch doesn't buy you much.

Hey, I like this! It's a subtle encouragement for developers to initialize all their instance variables in their __init__ or __new__ method, with a (modest) performance improvement for a carrot. (Though I have to admit I have no idea how you do it. Wouldn't the set of dict keys be different while __init__ is in the middle of setting the instance variables?)

Another question: a common pattern is to use (immutable) class variables as default values for instance variables, and only set the instance variables once they need to be different. Does such a class benefit from your improvement?

> -- HansM

--
--Guido van Rossum (python.org/~guido)

From guido at python.org  Wed Feb  1 19:01:29 2012
From: guido at python.org (Guido van Rossum)
Date: Wed, 1 Feb 2012 10:01:29 -0800
Subject: [Python-Dev] PEP 409 - final?
In-Reply-To:
References: <4F28B81B.20801@stoneleaf.us>
Message-ID:

Hm... Reading this draft, I like the idea of using "raise X from None", but I still have one quibble. It seems the from clause sets __cause__, and __cause__ can indicate three things: (1) print __cause__ (explicitly set), (2) print __context__ (default), (3) print neither (raise X from None). For (1), __cause__ must of course be a traceback object. The PEP currently proposes to use two special values: False for (2), None for (3). To me, this has a pretty strong code smell, and I don't want this pattern to be enshrined in a PEP as an example for all to follow. (And I also don't like "do as I say, don't do as I do." :-)

Can we think of a different special value to distinguish between (2) and (3)? Ideally one that doesn't change the nice "from None" idiom, which I actually like as a way to spell this.

Sorry that life isn't easier,

--Guido

On Tue, Jan 31, 2012 at 9:14 PM, Nick Coghlan wrote:
> On Wed, Feb 1, 2012 at 1:57 PM, Ethan Furman wrote:
>> I haven't seen any further discussion here or in the bug tracker. Below is the latest version of this PEP, now with a section on Language Details.
>>
>> Who makes the final call on this? Any idea how long that will take? (Not that I'm antsy, or anything... ;)
>
> Guido still has the final say on PEP approvals as BDFL - it's just that sometimes he'll tap someone else and say "Your call!" (thus making them a BDFOP - Benevolent Dictator for One PEP).
>
> FWIW, I'm personally +1 on the latest version of this.
>
> Cheers,
> Nick.
>
> --
> Nick Coghlan | ncoghlan at gmail.com |
> Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

--
--Guido van Rossum (python.org/~guido)

From alex.gaynor at gmail.com Wed Feb 1 19:19:28 2012
From: alex.gaynor at gmail.com (Alex)
Date: Wed, 1 Feb 2012 18:19:28 +0000 (UTC)
Subject: [Python-Dev] A new dictionary implementation
References: <4F252014.3080900@hotpy.org> <20120129160841.2343b62f@pitrou.net> <4F256EDC.70707@hotpy.org> <4F25D686.9070907@pearwood.info>
Message-ID:

Guido van Rossum python.org> writes:

> Hey, I like this! It's a subtle encouragement for developers to
> initialize all their instance variables in their __init__ or __new__
> method, with a (modest) performance improvement as a carrot. (Though
> I have to admit I have no idea how you do it. Wouldn't the set of dict
> keys be different while __init__ is in the middle of setting the
> instance variables?)
>
> Another question: a common pattern is to use (immutable) class
> variables as default values for instance variables, and only set the
> instance variables once they need to be different. Does such a class
> benefit from your improvement?
>
> > -- HansM

While I absolutely cannot speak to this implementation, traditionally this type of approach is referred to as maps, and was pioneered in SELF, originally presented at OOPSLA '89: http://dl.acm.org/citation.cfm?id=74884 . PyPy also uses these maps to back its objects, although from what I've read the implementation looks nothing like the proposed one for CPython; you can read about that here: http://bit.ly/zwlOkV , and if you're really excited about this you can read our implementation here: https://bitbucket.org/pypy/pypy/src/default/pypy/objspace/std/mapdict.py .

Alex

From alex.gaynor at gmail.com Wed Feb 1 19:36:44 2012
From: alex.gaynor at gmail.com (Alex Gaynor)
Date: Wed, 1 Feb 2012 18:36:44 +0000 (UTC)
Subject: [Python-Dev] A new dictionary implementation
References: <4F252014.3080900@hotpy.org> <20120129160841.2343b62f@pitrou.net> <4F256EDC.70707@hotpy.org> <4F25D686.9070907@pearwood.info>
Message-ID:

Alex gmail.com> writes:

> about that here: http://bit.ly/zwlOkV , and if you're really excited about this

Err, here's a working version of the bit.ly link: http://bit.ly/a05h6r

Alex

From s.brunthaler at uci.edu Wed Feb 1 20:08:44 2012
From: s.brunthaler at uci.edu (stefan brunthaler)
Date: Wed, 1 Feb 2012 11:08:44 -0800
Subject: [Python-Dev] Python 3 optimizations, continued, continued again...
In-Reply-To:
References: <4F23C657.9050501@hotpy.org>
Message-ID:

On Wed, Feb 1, 2012 at 09:46, Guido van Rossum wrote:
> Let's make one thing clear. The Python core developers need to be able
> to reproduce your results from scratch, and that means access to the
> templates, code generators, inputs, and everything else you used. (Of
> course for stuff you didn't write that's already open source, all we
> need is a pointer to the open source project and the exact
> version/configuration you used, plus any local mods you made.)
>
> I understand that you're hesitant to just dump your current mess, and
> you want to clean it up before you show it to us. That's fine. But
> until you're ready to show it, we're not going to integrate any of
> your work into CPython, even though some of us (maybe Benjamin) may be
> interested in kicking its tires.
> And remember, it doesn't need to be
> perfect (in fact perfectionism is probably a bad idea here). But it
> does need to be open source. Every single bit of it. (And no GPL,
> please.)
>
I understand all of these issues. Currently, it's not really a mess, but much more complicated than it needs to be for only supporting the inca optimization. I don't know what the time frame for a possible integration is (my guess is that it'd be safe anyway to disable it, like the threaded code support was handled.)

As for the license: I really don't care about that at all; the only thing nice to have would be a pointer to my home page and/or the corresponding research, but that's about all on my wish list.

--stefan

From glyph at twistedmatrix.com Wed Feb 1 20:10:29 2012
From: glyph at twistedmatrix.com (Glyph Lefkowitz)
Date: Wed, 1 Feb 2012 14:10:29 -0500
Subject: [Python-Dev] Python 3 optimizations, continued, continued again...
In-Reply-To:
References: <4F23C657.9050501@hotpy.org>
Message-ID: <94289846-4A36-426C-B682-06A891715D3B@twistedmatrix.com>

On Feb 1, 2012, at 12:46 PM, Guido van Rossum wrote:

> I understand that you're hesitant to just dump your current mess, and
> you want to clean it up before you show it to us. That's fine. (...) And
> remember, it doesn't need to be perfect (in fact perfectionism is
> probably a bad idea here).

Just as a general point of advice to open source contributors, I'd suggest erring on the side of the latter rather than the former suggestion here: dump your current mess, along with the relevant caveats ("it's a mess, much of it is irrelevant"), so that other developers can help you clean it up, rather than putting the entire burden of the cleanup on yourself. Experience has taught me that most people who hold back work because it needs cleanup eventually run out of steam, and their work never gets integrated and maintained.

-glyph

From ethan at stoneleaf.us Wed Feb 1 19:48:33 2012
From: ethan at stoneleaf.us (Ethan Furman)
Date: Wed, 01 Feb 2012 10:48:33 -0800
Subject: [Python-Dev] PEP 409 - final?
In-Reply-To:
References: <4F28B81B.20801@stoneleaf.us>
Message-ID: <4F298901.9090100@stoneleaf.us>

Guido van Rossum wrote:
> Hm... Reading this draft, I like the idea of using "raise X from
> None", but I still have one quibble. It seems the from clause sets
> __cause__, and __cause__ can indicate three things: (1) print
> __cause__ (explicitly set), (2) print __context__ (default), (3) print
> neither (raise X from None). For (1), __cause__ must of course be a
> traceback object.

Actually, for (1) __cause__ is an exception object, not a traceback.

> The PEP currently proposes to use two special
> values: False for (2), None for (3). To me, this has a pretty strong
> code smell, and I don't want this pattern to be enshrined in a PEP as
> an example for all to follow. (And I also don't like "do as I say,
> don't do as I do." :-)

My apologies for my ignorance, but is the code smell because both False and None evaluate to bool(False)? I suppose we could use True for (2) to indicate that __context__ should be printed, leaving None for (3)... but having __context__ at None and __cause__ at True could certainly be confusing (the default case when no chaining is in effect).

> Can we think of a different special value to distinguish between (2)
> and (3)? Ideally one that doesn't change the nice "from None" idiom,
> which I actually like as a way to spell this.
How about this:

Exception Life Cycle
====================

Stage 1 - brand new exception
-----------------------------

raise ValueError()

* __context__ is None
* __cause__ is None

Stage 2 - exception caught, exception raised
--------------------------------------------

try:
    raise ValueError()
except Exception:
    raise CustomError()

* __context__ is previous exception
* __cause__ is True

Stage 3 - exception raised from [exception | None]
--------------------------------------------------

try:
    raise ValueError()
except Exception:
    raise CustomError() from [OtherException | None]

* __context__ is previous exception
* __cause__ is [OtherException | None]

> Sorry that life isn't easier,

Where would be the fun without the challenge?

~Ethan~

From guido at python.org Wed Feb 1 21:07:06 2012
From: guido at python.org (Guido van Rossum)
Date: Wed, 1 Feb 2012 12:07:06 -0800
Subject: [Python-Dev] PEP 409 - final?
In-Reply-To: <4F298901.9090100@stoneleaf.us>
References: <4F28B81B.20801@stoneleaf.us> <4F298901.9090100@stoneleaf.us>
Message-ID:

On Wed, Feb 1, 2012 at 10:48 AM, Ethan Furman wrote:
> Guido van Rossum wrote:
>> Hm... Reading this draft, I like the idea of using "raise X from
>> None", but I still have one quibble. It seems the from clause sets
>> __cause__, and __cause__ can indicate three things: (1) print
>> __cause__ (explicitly set), (2) print __context__ (default), (3) print
>> neither (raise X from None). For (1), __cause__ must of course be a
>> traceback object.
>
> Actually, for (1) __cause__ is an exception object, not a traceback.

Ah, sorry. I'm not as detail-oriented as I was. :-)

>> The PEP currently proposes to use two special
>> values: False for (2), None for (3). To me, this has a pretty strong
>> code smell, and I don't want this pattern to be enshrined in a PEP as
>> an example for all to follow. (And I also don't like "do as I say,
>> don't do as I do." :-)
>
> My apologies for my ignorance, but is the code smell because both False and
> None evaluate to bool(False)?

That's part of it, but the other part is that the type of __context__ is now truly dynamic. I often *think* of variables as having some static type, e.g. "integer" or "Foo instance", and for most Foo instances I consider None an acceptable value (since that's how pointer types work in most static languages). But the type of __context__ you're proposing is now a union of exception and bool, except that the bool can only be False.

> I suppose we could use True for (2) to
> indicate that __context__ should be printed, leaving None for (3)... but
> having __context__ at None and __cause__ at True could certainly be
> confusing (the default case when no chaining is in effect).

It seems you really need a marker object. I'd be fine with using some other opaque marker -- IMO that's much better than using False but disallowing True.

>> Can we think of a different special value to distinguish between (2)
>> and (3)? Ideally one that doesn't change the nice "from None" idiom,
>> which I actually like as a way to spell this.
>
> How about this:
>
> Exception Life Cycle
> ====================
>
> Stage 1 - brand new exception
> -----------------------------
>
> raise ValueError()
>
> * __context__ is None
> * __cause__ is None
>
> Stage 2 - exception caught, exception raised
> --------------------------------------------
>
> try:
>     raise ValueError()
> except Exception:
>     raise CustomError()
>
> * __context__ is previous exception
> * __cause__ is True
>
> Stage 3 - exception raised from [exception | None]
> --------------------------------------------------
>
> try:
>     raise ValueError()
> except Exception:
>     raise CustomError() from [OtherException | None]
>
> * __context__ is previous exception
> * __cause__ is [OtherException | None]

No, this has the same code smell for me. See above.

>> Sorry that life isn't easier,
>
> Where would be the fun without the challenge?

+1 :-)

--
--Guido van Rossum (python.org/~guido)

From martin at v.loewis.de Wed Feb 1 21:32:36 2012
From: martin at v.loewis.de (martin at v.loewis.de)
Date: Wed, 01 Feb 2012 21:32:36 +0100
Subject: [Python-Dev] A new dictionary implementation
In-Reply-To:
References: <4F252014.3080900@hotpy.org> <20120129160841.2343b62f@pitrou.net> <4F256EDC.70707@hotpy.org> <4F25D686.9070907@pearwood.info>
Message-ID: <20120201213236.Horde.y3xrE1NNcXdPKaFkoMeRHaA@webmail.df.eu>

> Hey, I like this! It's a subtle encouragement for developers to
> initialize all their instance variables in their __init__ or __new__
> method, with a (modest) performance improvement as a carrot. (Though
> I have to admit I have no idea how you do it. Wouldn't the set of dict
> keys be different while __init__ is in the middle of setting the
> instance variables?)

The "type's attribute set" will be a superset of the instance's, for a shared key set. Initializing the first instance grows the key set, which is put into the type. Subsequent instances start out with the key set as a candidate, and have all values set to NULL in the dict values set. As long as you are only setting attributes that are in the shared key set, the values just get set. When it encounters a key not in the shared key set, the dict dissociates itself from the shared key set.

> Another question: a common pattern is to use (immutable) class
> variables as default values for instance variables, and only set the
> instance variables once they need to be different. Does such a class
> benefit from your improvement?

It depends. IIUC, if the first instance happens to get this attribute set, it ends up in the shared key set, and subsequent instances may have a NULL value for the key. I'm unsure how *exactly* the key set gets frozen. You cannot allow resizing the key set once it is shared, as you would have to find all instances with the same key set and resize their values. It would be possible (IIUC) to add more keys to the shared key set if that doesn't cause a resize, but I'm not sure whether the patch does that.

Regards,
Martin

From brian at python.org Wed Feb 1 21:46:29 2012
From: brian at python.org (Brian Curtin)
Date: Wed, 1 Feb 2012 14:46:29 -0600
Subject: [Python-Dev] Switching to Visual Studio 2010
In-Reply-To: <20120129202309.GA21774@snakebite.org>
References: <4F15DD85.6000905@v.loewis.de> <4F15E1A1.6090303@v.loewis.de> <20120126215431.Horde.dSI3OML8999PIb2HJXHnfeA@webmail.df.eu> <20120129202309.GA21774@snakebite.org>
Message-ID:

On Sun, Jan 29, 2012 at 14:23, Trent Nelson wrote:
>     Brian, what are your plans?  Are you going to continue working in
>     hg.python.org/sandbox/vs2010port then merge everything over when
>     ready?  I have some time available to work on this for the next
>     three weeks or so and would like to help out.

Yep, I'm working out of that repo, and any help you can provide would be great.
I need to go back over Martin's checklist to find out what I've actually done in terms of moving old stuff around and whatnot, but the basic gist is that it builds and passes most of the test suite, save for 5-6 modules IIRC.

From guido at python.org Wed Feb 1 21:50:42 2012
From: guido at python.org (Guido van Rossum)
Date: Wed, 1 Feb 2012 12:50:42 -0800
Subject: [Python-Dev] Python 3 optimizations, continued, continued again...
In-Reply-To:
References: <4F23C657.9050501@hotpy.org>
Message-ID:

On Wed, Feb 1, 2012 at 11:08 AM, stefan brunthaler wrote:
> On Wed, Feb 1, 2012 at 09:46, Guido van Rossum wrote:
>> Let's make one thing clear. The Python core developers need to be able
>> to reproduce your results from scratch, and that means access to the
>> templates, code generators, inputs, and everything else you used. (Of
>> course for stuff you didn't write that's already open source, all we
>> need is a pointer to the open source project and the exact
>> version/configuration you used, plus any local mods you made.)
>>
>> I understand that you're hesitant to just dump your current mess, and
>> you want to clean it up before you show it to us. That's fine. But
>> until you're ready to show it, we're not going to integrate any of
>> your work into CPython, even though some of us (maybe Benjamin) may be
>> interested in kicking its tires. And remember, it doesn't need to be
>> perfect (in fact perfectionism is probably a bad idea here). But it
>> does need to be open source. Every single bit of it. (And no GPL,
>> please.)
>>
> I understand all of these issues. Currently, it's not really a mess,
> but much more complicated than it needs to be for only supporting the
> inca optimization. I don't know what the time frame for a possible
> integration is (my guess is that it'd be safe anyway to disable it,
> like the threaded code support was handled.)

It won't be integrated until you have published your mess.

> As for the license: I really don't care about that at all, the only
> thing nice to have would be to have a pointer to my home page and/or
> the corresponding research, but that's about all on my wish list.

Please don't try to enforce that in the license. That usually backfires. Use Apache 2, which is what the PSF prefers.

--
--Guido van Rossum (python.org/~guido)

From tjreedy at udel.edu Wed Feb 1 21:53:06 2012
From: tjreedy at udel.edu (Terry Reedy)
Date: Wed, 01 Feb 2012 15:53:06 -0500
Subject: [Python-Dev] PEP 409 - final?
In-Reply-To:
References: <4F28B81B.20801@stoneleaf.us> <4F298901.9090100@stoneleaf.us>
Message-ID:

On 2/1/2012 3:07 PM, Guido van Rossum wrote:
> On Wed, Feb 1, 2012 at 10:48 AM, Ethan Furman wrote:
>> Guido van Rossum wrote:
>>> Hm... Reading this draft, I like the idea of using "raise X from
>>> None", but I still have one quibble. It seems the from clause sets
>>> __cause__, and __cause__ can indicate three things: (1) print
>>> __cause__ (explicitly set), (2) print __context__ (default), (3) print
>>> neither (raise X from None). For (1), __cause__ must of course be a
>>> traceback object.
>>
>> Actually, for (1) __cause__ is an exception object, not a traceback.
>
> Ah, sorry. I'm not as detail-oriented as I was. :-)
>
>>> The PEP currently proposes to use two special
>>> values: False for (2), None for (3). To me, this has a pretty strong
>>> code smell, and I don't want this pattern to be enshrined in a PEP as
>>> an example for all to follow. (And I also don't like "do as I say,
>>> don't do as I do."
>>> :-)
>>
>> My apologies for my ignorance, but is the code smell because both False and
>> None evaluate to bool(False)?
>
> That's part of it, but the other part is that the type of __context__
> is now truly dynamic. I often *think* of variables as having some
> static type, e.g. "integer" or "Foo instance", and for most Foo
> instances I consider None an acceptable value (since that's how
> pointer types work in most static languages). But the type of
> __context__ you're proposing is now a union of exception and bool,
> except that the bool can only be False.

It sounds like you are asking for a special class __NoException__(BaseException) to use as the marker.

--
Terry Jan Reedy

From guido at python.org Wed Feb 1 22:00:29 2012
From: guido at python.org (Guido van Rossum)
Date: Wed, 1 Feb 2012 13:00:29 -0800
Subject: [Python-Dev] PEP 409 - final?
In-Reply-To:
References: <4F28B81B.20801@stoneleaf.us> <4F298901.9090100@stoneleaf.us>
Message-ID:

Not a bad idea.

On Wed, Feb 1, 2012 at 12:53 PM, Terry Reedy wrote:
> On 2/1/2012 3:07 PM, Guido van Rossum wrote:
>> On Wed, Feb 1, 2012 at 10:48 AM, Ethan Furman wrote:
>>> Guido van Rossum wrote:
>>>> Hm... Reading this draft, I like the idea of using "raise X from
>>>> None", but I still have one quibble. It seems the from clause sets
>>>> __cause__, and __cause__ can indicate three things: (1) print
>>>> __cause__ (explicitly set), (2) print __context__ (default), (3) print
>>>> neither (raise X from None). For (1), __cause__ must of course be a
>>>> traceback object.
>>>
>>> Actually, for (1) __cause__ is an exception object, not a traceback.
>>
>> Ah, sorry. I'm not as detail-oriented as I was. :-)
>>
>>>> The PEP currently proposes to use two special
>>>> values: False for (2), None for (3). To me, this has a pretty strong
>>>> code smell, and I don't want this pattern to be enshrined in a PEP as
>>>> an example for all to follow. (And I also don't like "do as I say,
>>>> don't do as I do." :-)
>>>
>>> My apologies for my ignorance, but is the code smell because both False
>>> and None evaluate to bool(False)?
>>
>> That's part of it, but the other part is that the type of __context__
>> is now truly dynamic. I often *think* of variables as having some
>> static type, e.g. "integer" or "Foo instance", and for most Foo
>> instances I consider None an acceptable value (since that's how
>> pointer types work in most static languages). But the type of
>> __context__ you're proposing is now a union of exception and bool,
>> except that the bool can only be False.
>
> It sounds like you are asking for a special class
> __NoException__(BaseException) to use as the marker.
>
> --
> Terry Jan Reedy

--
--Guido van Rossum (python.org/~guido)

From ethan at stoneleaf.us Wed Feb 1 21:55:49 2012
From: ethan at stoneleaf.us (Ethan Furman)
Date: Wed, 01 Feb 2012 12:55:49 -0800
Subject: [Python-Dev] PEP 409 - final?
In-Reply-To:
References: <4F28B81B.20801@stoneleaf.us> <4F298901.9090100@stoneleaf.us>
Message-ID: <4F29A6D5.4060108@stoneleaf.us>

Guido van Rossum wrote:
> On Wed, Feb 1, 2012 at 10:48 AM, Ethan Furman wrote:
>> My apologies for my ignorance, but is the code smell because both False and
>> None evaluate to bool(False)?
>
> That's part of it, but the other part is that the type of __context__
> is now truly dynamic. I often *think* of variables as having some
> static type, e.g. "integer" or "Foo instance", and for most Foo
> instances I consider None an acceptable value (since that's how
> pointer types work in most static languages). But the type of
> __context__ you're proposing is now a union of exception and bool,
> except that the bool can only be False.
>
> It seems you really need a marker object. I'd be fine with using some
> other opaque marker -- IMO that's much better than using False but
> disallowing True.

So for __cause__ we need three values:

 1) Not set special value (prints __context__ if present)

 2) Some exception (print instead of __context__)

 3) Ignore __context__ special value (and stop following the
    __context__ chain)

For (3) we're hoping for None, for (2) we have an actual exception, and for (1) -- hmmm.

It seems like a stretch, but we could do (looking at both __context__ and __cause__):

                   __context__          __cause__

raise              None                 False [1]

reraise            previous             True  [2]

reraise from       previous             None [3] | exception

[1] False means non-chained exception
[2] True means chained exception
[3] None means chained exception, but by default we do not print
    nor follow the chain

The downside to this is that effectively either False and True mean the same thing, i.e. try to follow the __context__ chain, or False and None mean the same thing, i.e. don't bother trying to follow the __context__ chain because it either doesn't exist or is being suppressed.

Feels like a bunch of complexity for marginal value. As you were saying, some other object to replace both False and True in the above table would be ideal.

~Ethan~

From guido at python.org Wed Feb 1 22:30:44 2012
From: guido at python.org (Guido van Rossum)
Date: Wed, 1 Feb 2012 13:30:44 -0800
Subject: [Python-Dev] PEP 409 - final?
In-Reply-To: <4F29A6D5.4060108@stoneleaf.us>
References: <4F28B81B.20801@stoneleaf.us> <4F298901.9090100@stoneleaf.us> <4F29A6D5.4060108@stoneleaf.us>
Message-ID:

On Wed, Feb 1, 2012 at 12:55 PM, Ethan Furman wrote:
> Guido van Rossum wrote:
>> On Wed, Feb 1, 2012 at 10:48 AM, Ethan Furman wrote:
>>> My apologies for my ignorance, but is the code smell because both False
>>> and None evaluate to bool(False)?
>>
>> That's part of it, but the other part is that the type of __context__
>> is now truly dynamic. I often *think* of variables as having some
>> static type, e.g. "integer" or "Foo instance", and for most Foo
>> instances I consider None an acceptable value (since that's how
>> pointer types work in most static languages). But the type of
>> __context__ you're proposing is now a union of exception and bool,
>> except that the bool can only be False.
>>
>> It seems you really need a marker object. I'd be fine with using some
>> other opaque marker -- IMO that's much better than using False but
>> disallowing True.
>
> So for __cause__ we need three values:
>
>  1) Not set special value (prints __context__ if present)
>
>  2) Some exception (print instead of __context__)
>
>  3) Ignore __context__ special value (and stop following the
>     __context__ chain)
>
> For (3) we're hoping for None, for (2) we have an actual exception, and for
> (1) -- hmmm.
>
> It seems like a stretch, but we could do (looking at both __context__ and
> __cause__):
>
>                    __context__          __cause__
>
> raise              None                 False [1]
>
> reraise            previous
> True  [2]
>
> reraise from       previous             None [3] | exception
>
> [1] False means non-chained exception
> [2] True means chained exception
> [3] None means chained exception, but by default we do not print
>     nor follow the chain
>
> The downside to this is that effectively either False and True mean the same
> thing, i.e. try to follow the __context__ chain, or False and None mean the
> same thing, i.e. don't bother trying to follow the __context__ chain because
> it either doesn't exist or is being suppressed.
>
> Feels like a bunch of complexity for marginal value. As you were saying,
> some other object to replace both False and True in the above table would be
> ideal.

So what did you think of Terry Reedy's idea of using a special exception class?

--
--Guido van Rossum (python.org/~guido)

From iacobcatalin at gmail.com Wed Feb 1 22:37:46 2012
From: iacobcatalin at gmail.com (Catalin Iacob)
Date: Wed, 1 Feb 2012 22:37:46 +0100
Subject: [Python-Dev] Switching to Visual Studio 2010
In-Reply-To: <4F15DD85.6000905@v.loewis.de>
References: <4F15DD85.6000905@v.loewis.de>
Message-ID:

On Tue, Jan 17, 2012 at 9:43 PM, "Martin v. Löwis" wrote:
...
> P.S. Here is my personal list of requirements and non-requirements:
...
> - must generate binaries that run on Windows XP

I recently read about Firefox switching to VS2010 and therefore needing to drop support for Windows 2000, XP RTM (no service pack) and XP SP1. Indeed, [1] confirms that the VS2010 runtime (it's not clear if the C one, the C++ one or both) needs XP SP2 or higher.

Just thought I'd share this so that an informed decision can be made; in my opinion it would be ok for Python 3.3 to drop everything prior to XP SP2.

Maybe not very relevant, but [2] has some mention of statistics for Firefox usage on systems prior to XP SP2.

[1] http://connect.microsoft.com/VisualStudio/feedback/details/526821/executables-built-with-visual-c-2010-do-not-run-on-windows-xp-prior-to-sp2
[2] http://weblogs.mozillazine.org/asa/archives/2012/01/end_of_firefox_win2k.html

From brian at python.org Wed Feb 1 22:41:48 2012
From: brian at python.org (Brian Curtin)
Date: Wed, 1 Feb 2012 15:41:48 -0600
Subject: [Python-Dev] Switching to Visual Studio 2010
In-Reply-To:
References: <4F15DD85.6000905@v.loewis.de>
Message-ID:

On Wed, Feb 1, 2012 at 15:37, Catalin Iacob wrote:
> On Tue, Jan 17, 2012 at 9:43 PM, "Martin v. Löwis" wrote:
> ...
>> P.S. Here is my personal list of requirements and non-requirements:
> ...
>> - must generate binaries that run on Windows XP
>
> I recently read about Firefox switching to VS2010 and therefore
> needing to drop support for Windows 2000, XP RTM (no service pack) and
> XP SP1. Indeed, [1] confirms that the VS2010 runtime (it's not clear
> if the C one, the C++ one or both) needs XP SP2 or higher.
>
> Just thought I'd share this so that an informed decision can be made;
> in my opinion it would be ok for Python 3.3 to drop everything prior
> to XP SP2.
>
> Maybe not very relevant, but [2] has some mention of statistics for
> Firefox usage on systems prior to XP SP2.
>
> [1] http://connect.microsoft.com/VisualStudio/feedback/details/526821/executables-built-with-visual-c-2010-do-not-run-on-windows-xp-prior-to-sp2
> [2] http://weblogs.mozillazine.org/asa/archives/2012/01/end_of_firefox_win2k.html

We already started moving forward with dropping Windows 2000 prior to this coming up.
http://mail.python.org/pipermail/python-dev/2011-May/111159.html was the discussion (which links an older discussion) and PEP-11 (http://www.python.org/dev/peps/pep-0011/) was updated accordingly.

From brian at python.org Wed Feb 1 22:59:33 2012
From: brian at python.org (Brian Curtin)
Date: Wed, 1 Feb 2012 15:59:33 -0600
Subject: [Python-Dev] Switching to Visual Studio 2010
In-Reply-To:
References: <4F15DD85.6000905@v.loewis.de>
Message-ID:

On Wed, Feb 1, 2012 at 15:41, Brian Curtin wrote:
> On Wed, Feb 1, 2012 at 15:37, Catalin Iacob wrote:
>> On Tue, Jan 17, 2012 at 9:43 PM, "Martin v. Löwis" wrote:
>> ...
>>> P.S. Here is my personal list of requirements and non-requirements:
>> ...
>>> - must generate binaries that run on Windows XP
>>
>> I recently read about Firefox switching to VS2010 and therefore
>> needing to drop support for Windows 2000, XP RTM (no service pack) and
>> XP SP1. Indeed, [1] confirms that the VS2010 runtime (it's not clear
>> if the C one, the C++ one or both) needs XP SP2 or higher.
>>
>> Just thought I'd share this so that an informed decision can be made;
>> in my opinion it would be ok for Python 3.3 to drop everything prior
>> to XP SP2.
>>
>> Maybe not very relevant, but [2] has some mention of statistics for
>> Firefox usage on systems prior to XP SP2.
>>
>> [1] http://connect.microsoft.com/VisualStudio/feedback/details/526821/executables-built-with-visual-c-2010-do-not-run-on-windows-xp-prior-to-sp2
>> [2] http://weblogs.mozillazine.org/asa/archives/2012/01/end_of_firefox_win2k.html
>
> We already started moving forward with dropping Windows 2000 prior to
> this coming up.
> http://mail.python.org/pipermail/python-dev/2011-May/111159.html was
> the discussion (which links an older discussion) and PEP-11
> (http://www.python.org/dev/peps/pep-0011/) was updated accordingly.

Sorry, hit send too soon...

Anyway, I can't imagine many of our users (and their users) are still using pre-SP2. It was released in 2004 and was superseded by SP3 and two entire OS releases. I don't know of a reliable way of figuring out whether or not pre-SP2 is a measurable demographic for us, but I can't imagine it's enough to make us hold up the move for another ~2 years.

From anacrolix at gmail.com Wed Feb 1 23:31:33 2012
From: anacrolix at gmail.com (Matt Joiner)
Date: Thu, 2 Feb 2012 09:31:33 +1100
Subject: [Python-Dev] PEP 409 - final?
In-Reply-To:
References: <4F28B81B.20801@stoneleaf.us> <4F298901.9090100@stoneleaf.us> <4F29A6D5.4060108@stoneleaf.us>
Message-ID:

raise from None seems pretty "in band". A NoException class could have many other uses and leaves no confusion about intent.

From trent at snakebite.org Wed Feb 1 23:49:03 2012
From: trent at snakebite.org (Trent Nelson)
Date: Wed, 1 Feb 2012 17:49:03 -0500
Subject: [Python-Dev] Switching to Visual Studio 2010
In-Reply-To: <20120129202309.GA21774@snakebite.org>
References: <4F15DD85.6000905@v.loewis.de> <4F15E1A1.6090303@v.loewis.de> <20120126215431.Horde.dSI3OML8999PIb2HJXHnfeA@webmail.df.eu> <20120129202309.GA21774@snakebite.org>
Message-ID: <20120201224900.GA28491@snakebite.org>

On Sun, Jan 29, 2012 at 12:23:14PM -0800, Trent Nelson wrote:
>     * Updates to externals/(tcl|tk)-8.5.9.x so that they both build with
>       VS2010.

    Before I go updating tcl/tk, any thoughts on bumping our support
    to the latest revision, 8.5.11?  I guess the same question applies
    to all the externals, actually (zlib, openssl, sqlite, bsddb, etc).
    In the past we've typically bumped up our support to the latest
    version prior to beta, then stuck with that for the release's life,
    right?

    Semi-related note: is svn.python.org/externals still the primary
    repo for externals?  (I can't see a similarly named hg repo.)

        Trent.

From ethan at stoneleaf.us Thu Feb 2 01:18:08 2012
From: ethan at stoneleaf.us (Ethan Furman)
Date: Wed, 01 Feb 2012 16:18:08 -0800
Subject: [Python-Dev] PEP 409 - final?
In-Reply-To:
References: <4F28B81B.20801@stoneleaf.us> <4F298901.9090100@stoneleaf.us> <4F29A6D5.4060108@stoneleaf.us>
Message-ID: <4F29D640.30306@stoneleaf.us>

Terry Reedy wrote:
> It sounds like you are asking for a special class
> __NoException__(BaseException) to use as the marker.

Guido van Rossum wrote:
> So what did you think of Terry Reedy's idea of using a special exception class?

Our table would then look like:

                   __context__         __cause__

raise              None                __NoException__

reraise            previous            __NoException__

reraise from       previous            None | exception

It is certainly simpler than trying to force the use of both True and False. :)

The purist side of me thinks it's still slightly awkward; the practical side recognizes that there probably is not a perfect solution, thinks this is workable, and is willing to deal with the slight awkwardness to get 'from None' up and running. :)

The main reason for the effort in keeping the previous exception in __context__ instead of just clobbering it is for custom error handlers, yes? So here is a brief comparison of the two:

def complete_traceback():
    ....

Actually, I got about three lines into that and realized that whatever __cause__ is set to is completely irrelevant for that function: if *__context__* is not None, follow the chain; the only relevance __cause__ has is when it would print -- if it is a (valid) exception. And how do we know if it's valid?

# True, False, None
if isinstance(exc.__cause__, BaseException):
    print_exc(exc)
or
if exc.__cause__ not in (True, False, None):
    print_exc(exc)

vs

# None, __NoException__ (forced to be an instance)
if (exc.__cause__ is not None
        and not isinstance(exc.__cause__, __NoException__)):
    print_exc(exc)
or
# None, __NoException__ (forced to stay a class)
if exc.__cause__ not in (None, __NoException__):
    print_exc(exc)

Having gone through all that, I'm equally willing to go either way (True/False/None or __NoException__).

Implementation questions for the __NoException__ route:

1) Do we want double underscores, or just a single one?

   I'm thinking double to mark it as special as opposed
   to private.

2) This is a new exception class -- do we want to store the
   class itself in __cause__, or its instance? If its
   class, should we somehow disallow instantiation of it?

3) Should it be an exception, or just inherit from object?
   Is it worth worrying about somebody trying to raise it, or
   raise from it?

4) Is the name '__NoException__' confusing?

~Ethan~

From timothy.c.delaney at gmail.com Thu Feb 2 01:44:24 2012
From: timothy.c.delaney at gmail.com (Tim Delaney)
Date: Thu, 2 Feb 2012 11:44:24 +1100
Subject: [Python-Dev] PEP 409 - final?
In-Reply-To: <4F29D640.30306@stoneleaf.us>
References: <4F28B81B.20801@stoneleaf.us> <4F298901.9090100@stoneleaf.us> <4F29A6D5.4060108@stoneleaf.us> <4F29D640.30306@stoneleaf.us>
Message-ID:

On 2 February 2012 11:18, Ethan Furman wrote:
> Implementation questions for the __NoException__ route:
>
> 1) Do we want double underscores, or just a single one?
>
>    I'm thinking double to mark it as special as opposed
>    to private.
Double and exposed allows someone to explicitly set __cause__ to __NoException__ on an existing exception.

> 2) This is a new exception class -- do we want to store the
>    class itself in __cause__, or its instance? If its
>    class, should we somehow disallow instantiation of it?
>
> 3) Should it be an exception, or just inherit from object?
>    Is it worth worrying about somebody trying to raise it, or
>    raise from it?

If it's not actually an exception, we get prevention of instantiation for free. My feeling is just make it a singleton object.

> 4) Is the name '__NoException__' confusing?

Seems perfectly expressive to me so long as it can't itself be raised.

Tim Delaney

From ncoghlan at gmail.com Thu Feb 2 01:49:32 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 2 Feb 2012 10:49:32 +1000
Subject: [Python-Dev] PEP 409 - final?
In-Reply-To:
References: <4F28B81B.20801@stoneleaf.us> <4F298901.9090100@stoneleaf.us> <4F29A6D5.4060108@stoneleaf.us> <4F29D640.30306@stoneleaf.us>
Message-ID:

On Thu, Feb 2, 2012 at 10:44 AM, Tim Delaney wrote:
>> 3) Should it be an exception, or just inherit from object?
>>    Is it worth worrying about somebody trying to raise it, or
>>    raise from it?
>
> If it's not actually an exception, we get prevention of instantiation for
> free. My feeling is just make it a singleton object.

Yeah, a new Ellipsis/None style singleton probably makes more sense than an exception instance.

Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From eric at trueblade.com Thu Feb 2 02:01:03 2012
From: eric at trueblade.com (Eric V. Smith)
Date: Wed, 01 Feb 2012 20:01:03 -0500
Subject: [Python-Dev] PEP 409 - final?
In-Reply-To:
References: <4F28B81B.20801@stoneleaf.us> <4F298901.9090100@stoneleaf.us> <4F29A6D5.4060108@stoneleaf.us> <4F29D640.30306@stoneleaf.us>
Message-ID: <4F29E04F.7060602@trueblade.com>

On 2/1/2012 7:49 PM, Nick Coghlan wrote:
> On Thu, Feb 2, 2012 at 10:44 AM, Tim Delaney wrote:
>>> 3) Should it be an exception, or just inherit from object?
>>>    Is it worth worrying about somebody trying to raise it, or
>>>    raise from it?
>>
>> If it's not actually an exception, we get prevention of instantiation for
>> free. My feeling is just make it a singleton object.
>
> Yeah, a new Ellipsis/None style singleton probably makes more sense
> than an exception instance.

But now we're adding a new singleton, unrelated to exceptions (other than its name), because we don't want to use an existing singleton (False).

Maybe the name difference is good enough justification.

Eric.

From victor.stinner at haypocalc.com Thu Feb 2 02:03:15 2012
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Thu, 2 Feb 2012 02:03:15 +0100
Subject: [Python-Dev] PEP: New timestamp formats
Message-ID:

Even if I am not really convinced that a PEP helps to design an API, here is a draft of a PEP to add new timestamp formats to Python 3.3. Don't see the draft as a final proposition, it is just a document supposed to help the discussion :-)

---

PEP: xxx
Title: New timestamp formats
Version: $Revision$
Last-Modified: $Date$
Author: Victor Stinner
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 01-February-2012
Python-Version: 3.3


Abstract
========

Python 3.3 introduced functions supporting nanosecond resolutions. Python 3.3 only supports int or float to store timestamps, but these types cannot be used to store a timestamp with nanosecond resolution.
Motivation
==========

Python 2.3 introduced float timestamps to support subsecond resolutions; os.stat() uses float timestamps by default since Python 2.5. Python 3.3 introduced functions supporting nanosecond resolutions:

 * os.stat()
 * os.utimensat()
 * os.futimens()
 * time.clock_gettime()
 * time.clock_getres()
 * time.wallclock() (reuse time.clock_gettime(time.CLOCK_MONOTONIC))

The problem is that 64-bit floats are unable to store nanoseconds (10^-9) for timestamps bigger than 2^24 seconds (194 days 4 hours: 1970-07-14 for an Epoch timestamp) without losing precision.

.. note::
   A 64-bit float starts to lose precision at microsecond (10^-6) resolution
   for timestamps bigger than 2^33 seconds (272 years: 2242-03-16 for an
   Epoch timestamp).


Timestamp formats
=================

Choose a new format for nanosecond resolution
---------------------------------------------

To support nanosecond resolution, four formats were considered:

 * 128-bit float
 * decimal.Decimal
 * datetime.datetime
 * tuple of integers

Criteria
--------

It should be possible to do arithmetic, for example::

    t1 = time.time()
    # ...
    t2 = time.time()
    dt = t2 - t1

Two timestamps should be comparable (t2 > t1).

The format should have a resolution of at least 1 nanosecond (without losing precision). It is better if the format can have an arbitrary resolution.

128-bit float
-------------

Add a new IEEE 754-2008 quad-precision float type. The IEEE 754-2008 quad-precision float has 1 sign bit, 15 bits of exponent and 112 bits of mantissa. 128-bit floats are supported by GCC (4.3), Clang and ICC. The problem is that Visual C++ 2008 doesn't support them. Python must be portable and so cannot rely on a type only available on some platforms. Another example: GCC 4.3 does not support __float128 in 32-bit mode on x86 (but GCC 4.4 does). Intel CPUs have an FPU supporting 80-bit floats, but not when using SSE instructions. Other CPU vendors don't support this float size.

There is also a license issue: GCC uses the MPFR library, which is distributed under the GNU LGPL license. This license is incompatible with the Python Software License.

datetime.datetime
-----------------

datetime.datetime only supports microsecond resolution, but can be enhanced to support nanoseconds. datetime.datetime has issues:

- there is no easy way to convert it into "seconds since the epoch"
- any broken-down time has issues of time stamp ordering in the duplicate
  hour when switching from DST to normal time
- time zone support is flaky-to-nonexistent in the datetime module

decimal.Decimal
---------------

The decimal module is implemented in Python and is not really fast. Using Decimal by default would cause a bootstrap issue because the module is implemented in Python. Decimal can store a timestamp with any resolution, not only nanosecond; the resolution is configurable at runtime. Decimal objects support all arithmetic operations and are compatible with int and float. The decimal module is slow, but there is a C reimplementation of the decimal module which is almost ready for inclusion.

tuple
-----

Various kinds of tuples have been proposed. All propositions only use integers:

 * a) (sec, nsec): C timespec structure, useful for os.futimens() for example
 * b) (sec, floatpart, exponent): value = sec + floatpart * 10**exponent
 * c) (sec, floatpart, divisor): value = sec + floatpart / divisor

Format (a) only supports nanosecond resolution.

Formats (a) and (b) may lose precision if the clock divisor is not a power of 10.

Format (c) should be enough for most cases.
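To make the three propositions concrete, here is a minimal sketch (purely illustrative; the helper names are made up) converting each tuple into a ``decimal.Decimal``::

    from decimal import Decimal

    def decimal_from_timespec(sec, nsec):
        # format (a): fixed nanosecond resolution
        return Decimal(sec) + Decimal(nsec) * Decimal(10) ** -9

    def decimal_from_exponent(sec, floatpart, exponent):
        # format (b): value = sec + floatpart * 10**exponent
        return Decimal(sec) + Decimal(floatpart) * Decimal(10) ** exponent

    def decimal_from_divisor(sec, floatpart, divisor):
        # format (c): value = sec + floatpart / divisor; the division is
        # only exact if the decimal context carries enough digits for the
        # divisor (e.g. 2**32)
        return Decimal(sec) + Decimal(floatpart) / Decimal(divisor)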
Creating a tuple of integers is fast. Arithmetic operations cannot be done directly on tuples: t2 - t1 doesn't work, for example.

Final formats
-------------

The PEP proposes to provide 5 different timestamp formats:

 * numbers:

   * int
   * float
   * decimal.Decimal
   * datetime.timedelta

 * broken-down time:

   * datetime.datetime


API design
==========

Change the default result type
------------------------------

Python 2.3 introduced os.stat_float_times(). The problem is that this flag is global, and so may break libraries if the application changes the type. Changing the default result type would break backward compatibility.

Callback and creating a new module to convert timestamps
--------------------------------------------------------

Use a callback taking integers to create a timestamp. Example with float::

    def timestamp_to_float(seconds, floatpart, divisor):
        return seconds + floatpart / divisor

The time module can provide some builtin converters, and other modules, like datetime, can provide their own converters. Users can define their own types.

An alternative is to add a new module for all functions converting timestamps.

The problem is that we have to design the API of the callback and we cannot change it later. We may need more information for future needs later.

os.stat: add new fields
-----------------------

It was proposed to add 3 fields to the os.stat() structure to get nanoseconds of timestamps.

Add an argument to change the result type
-----------------------------------------

Add an argument to all functions creating timestamps, like time.time(), to change their result type. It was first proposed to use a string argument, e.g. time.time(format="decimal"). The problem is that the function has to import a module internally. Then it was decided to pass the type directly, e.g. time.time(format=decimal.Decimal). Using a type, the user has first to import the module. There is no direct link between a type and the function used to create the timestamp.

By default, the float type is used to keep backward compatibility. For stat functions like os.stat(), the default type depends on os.stat_float_times().

Add new functions
-----------------

Add new functions for each type, examples:

 * time.time_decimal()
 * os.stat_decimal()
 * os.stat_datetime()
 * etc.


Changes
=======

* Add a *format* optional argument to time.clock(), time.clock_gettime(),
  time.clock_getres(), time.time() and time.wallclock().
* Add a *timestamp* optional argument to os.fstat(), os.fstatat(),
  os.lstat() and os.stat().

Functions accepting timestamps as input should support decimal.Decimal objects without an internal conversion to float which may lose precision:

 * datetime.datetime.fromtimestamp()
 * time.localtime()
 * time.gmtime()

TODO:

 * Change os.utimensat() and os.futimens() to accept Decimal
 * Change os.utimensat() and os.futimens() to not accept tuples anymore
 * Drop os.utimensat() and os.futimens() and patch os.utimeat() instead?
 * Should datetime maybe support nanoseconds?


Backwards Compatibility
=======================

The changes only add a new optional argument. The default type is unchanged and there is no impact on performance.


Links
=====

* `Issue #11457: os.stat(): add new fields to get timestamps as Decimal objects with nanosecond resolution `_
* `Issue #13882: Add format argument for time.time(), time.clock(), ... to get a timestamp as a Decimal object `_
* `[Python-Dev] Store timestamps as decimal.Decimal objects `_


Copyright
=========

This document has been placed in the public domain.
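PS: the precision claim in the Motivation section is easy to check interactively (a quick illustration, nothing more):

    >>> t = float(2**24)   # about 194 days, in seconds
    >>> t + 1e-9 == t      # the nanosecond increment is lost
    True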
From ncoghlan at gmail.com Thu Feb 2 02:43:01 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 2 Feb 2012 11:43:01 +1000
Subject: [Python-Dev] PEP 409 - final?
In-Reply-To: <4F29E04F.7060602@trueblade.com>
References: <4F28B81B.20801@stoneleaf.us> <4F298901.9090100@stoneleaf.us> <4F29A6D5.4060108@stoneleaf.us> <4F29D640.30306@stoneleaf.us> <4F29E04F.7060602@trueblade.com>
Message-ID:

On Thu, Feb 2, 2012 at 11:01 AM, Eric V. Smith wrote:
> On 2/1/2012 7:49 PM, Nick Coghlan wrote:
>> On Thu, Feb 2, 2012 at 10:44 AM, Tim Delaney wrote:
>>>> 3) Should it be an exception, or just inherit from object?
>>>>    Is it worth worrying about somebody trying to raise it, or
>>>>    raise from it?
>>>
>>> If it's not actually an exception, we get prevention of instantiation for
>>> free. My feeling is just make it a singleton object.
>>
>> Yeah, a new Ellipsis/None style singleton probably makes more sense
>> than an exception instance.
>
> But now we're adding a new singleton, unrelated to exceptions (other
> than its name), because we don't want to use an existing singleton (False).
>
> Maybe the name difference is good enough justification.

That's exactly the thought process that led me to endorse the idea of using False as the "not set" marker in the first place. With None being stolen to mean "no cause and don't print the context either", the choices become:

- set some *other* exception attribute to indicate whether or not to print the context
- use an existing singleton like False to mean "not set, use the context"
- add a new singleton specifically to mean "not set, use the context"
- use a new exception type to mean "not set, use the context"

Hmm, after writing up that list, the idea of using "__cause__ is Ellipsis" (or even "__cause__ is ...") to mean "use __context__ instead" occurs to me. After all, "..." has the right connotations of "fill this in from somewhere else", and since we really just want a known sentinel object that isn't None and isn't a meaningful type like the boolean singletons...

Regards,
Nick.

--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From timothy.c.delaney at gmail.com Thu Feb 2 03:18:32 2012
From: timothy.c.delaney at gmail.com (Tim Delaney)
Date: Thu, 2 Feb 2012 13:18:32 +1100
Subject: [Python-Dev] PEP 409 - final?
In-Reply-To:
References: <4F28B81B.20801@stoneleaf.us> <4F298901.9090100@stoneleaf.us> <4F29A6D5.4060108@stoneleaf.us> <4F29D640.30306@stoneleaf.us> <4F29E04F.7060602@trueblade.com>
Message-ID:

On 2 February 2012 12:43, Nick Coghlan wrote:
> Hmm, after writing up that list, the idea of using "__cause__ is
> Ellipsis" (or even "__cause__ is ...") to mean "use __context__
> instead" occurs to me. After all, "..." has the right connotations of
> "fill this in from somewhere else", and since we really just want a
> known sentinel object that isn't None and isn't a meaningful type like
> the boolean singletons...

It's cute yet seems appropriate ... I quite like it.

Tim Delaney

From ncoghlan at gmail.com Thu Feb 2 04:47:07 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 2 Feb 2012 13:47:07 +1000
Subject: [Python-Dev] PEP: New timestamp formats
In-Reply-To:
References:
Message-ID:

On Thu, Feb 2, 2012 at 11:03 AM, Victor Stinner wrote:
> Even if I am not really convinced that a PEP helps to design an API,
> here is a draft of a PEP to add new timestamp formats to Python 3.3.
> Don't see the draft as a final proposition, it is just a document
> supposed to help the discussion :-)

Helping keep a discussion on track (and avoiding rehashing old ground) is precisely why the PEP process exists. Thanks for writing this up :)

> ---
>
> PEP: xxx
> Title: New timestamp formats
> Version: $Revision$
> Last-Modified: $Date$
> Author: Victor Stinner
> Status: Draft
> Type: Standards Track
> Content-Type: text/x-rst
> Created: 01-February-2012
> Python-Version: 3.3
>
>
> Abstract
> ========
>
> Python 3.3 introduced functions supporting nanosecond resolutions. Python 3.3
> only supports int or float to store timestamps, but these types cannot be
> used to store a timestamp with nanosecond resolution.
>
>
> Motivation
> ==========
>
> Python 2.3 introduced float timestamps to support subsecond resolutions;
> os.stat() uses float timestamps by default since Python 2.5. Python 3.3
> introduced functions supporting nanosecond resolutions:
>
>  * os.stat()
>  * os.utimensat()
>  * os.futimens()
>  * time.clock_gettime()
>  * time.clock_getres()
>  * time.wallclock() (reuse time.clock_gettime(time.CLOCK_MONOTONIC))
>
> The problem is that 64-bit floats are unable to store nanoseconds (10^-9)
> for timestamps bigger than 2^24 seconds (194 days 4 hours: 1970-07-14 for an
> Epoch timestamp) without losing precision.
>
> .. note::
>    A 64-bit float starts to lose precision at microsecond (10^-6) resolution
>    for timestamps bigger than 2^33 seconds (272 years: 2242-03-16 for an
>    Epoch timestamp).
>
>
> Timestamp formats
> =================
>
> Choose a new format for nanosecond resolution
> ---------------------------------------------
>
> To support nanosecond resolution, four formats were considered:
>
>  * 128-bit float
>  * decimal.Decimal
>  * datetime.datetime
>  * tuple of integers

I'd add datetime.timedelta to this list. It's exactly what timestamps are, after all - the difference between the current time and the relevant epoch value.

> Various kinds of tuples have been proposed. All propositions only use integers:
>
>  * a) (sec, nsec): C timespec structure, useful for os.futimens() for example
>  * b) (sec, floatpart, exponent): value = sec + floatpart * 10**exponent
>  * c) (sec, floatpart, divisor): value = sec + floatpart / divisor
>
> Format (a) only supports nanosecond resolution.
>
> Formats (a) and (b) may lose precision if the clock divisor is not a
> power of 10.
>
> Format (c) should be enough for most cases.

Format (b) only loses precision if the exponent chosen for a given value is too small relative to the precision of the underlying timer (it's the same as using decimal.Decimal in that respect). The problem with (a) is that it simply cannot represent times with greater than nanosecond precision. Since we have the opportunity, we may as well deal with the precision question once and for all. Alternatively, you could return a 4-tuple that specifies the base in addition to the exponent.

> Callback and creating a new module to convert timestamps
> --------------------------------------------------------
>
> Use a callback taking integers to create a timestamp. Example with float::
>
>     def timestamp_to_float(seconds, floatpart, divisor):
>         return seconds + floatpart / divisor
>
> The time module can provide some builtin converters, and other modules, like
> datetime, can provide their own converters. Users can define their own types.
>
> An alternative is to add a new module for all functions converting timestamps.
>
> The problem is that we have to design the API of the callback and we cannot
> change it later. We may need more information for future needs later.
> > The problem is that we have to design the API of the callback and we cannot > change it later. We may need more information for future needs later. I'd be more specific here - either of the 3-tuple options already presented in the PEP, or the 4-tuple option I mentioned above, would be suitable as the signature of an arbitrary precision callback API that assumes timestamps are always expressed as "seconds since a particular epoch value". Such an API could only become limiting if timestamps ever become something other than "the difference in time between right now and the relevant epoch value", and that's a sufficiently esoteric possibility that it really doesn't seem worthwhile to take it into account. The past problems with timestamp APIs have all related to increases in precision, not timestamps being redefined as something radically different. The PEP should also mention PJE's suggestion of creating a new named protocol specifically for the purpose (with a signature based on one of the proposed tuple formats), such that you could simply write: time.time() # output=float by default time.time(output=float) time.time(output=int) time.time(output=fractions.Fraction) time.time(output=decimal.Decimal) time.time(output=datetime.timedelta) time.time(output=datetime.datetime) # (and similarly for os.stat with a timestamp=type parameter) Rather than being timestamp specific, such a protocol would be a general numeric protocol. If (integer, numerator, denominator) is used (i.e. a "mixed number" in mathematical terms), then "__from_mixed__" would be an appropriate name. If (integer, fractional, exponent) is used (i.e. a fixed point notation), then "__from_fixed__" would work. # Algorithm for a "from mixed numbers" protocol, assuming division doesn't lose precision... def __from_mixed__(cls, integer, numerator, denominator): return cls(integer) + cls(numerator) / cls(denominator) # Algorithm for a "from fixed point" protocol, assuming negative exponents don't lose precision... def __from_fixed__(cls, integer, mantissa, base, exponent): return cls(integer) + cls(mantissa) * cls(base) ** cls(exponent) >From a *usage* point of view, this idea is actually the same as the proposal currently in the PEP. The difference is that instead of adding custom support for a few particular types directly to time and os, it instead defines a more general purpose protocol that covers not only this use case, but also any other situation where high precision fractions are relevant. One interesting question with a named protocol approach is whether such a protocol should *require* explicit support, or if it should fall back to the underlying mathematical operations. 
Since the conversions to float and int in the timestamp case are already known to be lossy, permitting lossy conversion via the mathematical equivalents seems reasonable, suggesting possible protocol definitions as follows:

    # Algorithm for a potentially precision-losing "from mixed numbers" protocol
    def from_mixed(cls, integer, numerator, denominator):
        try:
            factory = cls.__from_mixed__
        except AttributeError:
            return cls(integer) + cls(numerator) / cls(denominator)
        return factory(integer, numerator, denominator)

    # Algorithm for a potentially lossy "from fixed point" protocol
    def from_fixed(cls, integer, mantissa, base, exponent):
        try:
            factory = cls.__from_fixed__
        except AttributeError:
            return cls(integer) + cls(mantissa) * cls(base) ** cls(exponent)
        return factory(integer, mantissa, base, exponent)

> os.stat: add new fields
> -----------------------
>
> It was proposed to add 3 fields to the os.stat() structure to get
> nanoseconds of timestamps.

It's worth noting that the challenge with this is that it's potentially time consuming to populate the extra fields, and that this approach doesn't help with the time APIs that return timestamps directly.

> Add an argument to change the result type
> -----------------------------------------
>
> Add an argument to all functions creating timestamps, like time.time(), to
> change their result type. It was first proposed to use a string argument,
> e.g. time.time(format="decimal"). The problem is that the function has
> to import a module internally. Then it was decided to pass the type
> directly, e.g. time.time(format=decimal.Decimal). Using a type, the user
> has first to import the module. There is no direct link between a type and
> the function used to create the timestamp.
>
> By default, the float type is used to keep backward compatibility. For stat
> functions like os.stat(), the default type depends on os.stat_float_times().

There should also be a description of the "set a boolean flag to request high precision output" approach.

Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ethan at stoneleaf.us Thu Feb 2 08:18:23 2012
From: ethan at stoneleaf.us (Ethan Furman)
Date: Wed, 01 Feb 2012 23:18:23 -0800
Subject: [Python-Dev] PEP 409 - final?
In-Reply-To:
References: <4F28B81B.20801@stoneleaf.us> <4F298901.9090100@stoneleaf.us> <4F29A6D5.4060108@stoneleaf.us> <4F29D640.30306@stoneleaf.us> <4F29E04F.7060602@trueblade.com>
Message-ID: <4F2A38BF.7080701@stoneleaf.us>

Tim Delaney wrote:
> On 2 February 2012 12:43, Nick Coghlan wrote:
>
>     Hmm, after writing up that list, the idea of using "__cause__ is
>     Ellipsis" (or even "__cause__ is ...") to mean "use __context__
>     instead" occurs to me. After all, "..." has the right connotations of
>     "fill this in from somewhere else", and since we really just want a
>     known sentinel object that isn't None and isn't a meaningful type like
>     the boolean singletons...
>
> It's cute yet seems appropriate ... I quite like it.

I find it very amusing, yet also appropriate -- I'm happy with it.

~Ethan~

From p.f.moore at gmail.com Thu Feb 2 11:53:49 2012
From: p.f.moore at gmail.com (Paul Moore)
Date: Thu, 2 Feb 2012 10:53:49 +0000
Subject: [Python-Dev] PEP: New timestamp formats
In-Reply-To:
References:
Message-ID:

On 2 February 2012 03:47, Nick Coghlan wrote:
> Rather than being timestamp specific, such a protocol would be a
> general numeric protocol. If (integer, numerator, denominator) is used
a "mixed number" in mathematical terms), then "__from_mixed__" > would be an appropriate name. If (integer, fractional, exponent) is > used (i.e. a fixed point notation), then "__from_fixed__" would work. > > ? ?# Algorithm for a "from mixed numbers" protocol, assuming division > doesn't lose precision... > ? ?def __from_mixed__(cls, integer, numerator, denominator): > ? ? ? ?return cls(integer) + cls(numerator) / cls(denominator) > > ? ?# Algorithm for a "from fixed point" protocol, assuming negative > exponents don't lose precision... > ? ?def __from_fixed__(cls, integer, mantissa, base, exponent): > ? ? ? ?return cls(integer) + cls(mantissa) * cls(base) ** cls(exponent) > > >From a *usage* point of view, this idea is actually the same as the > proposal currently in the PEP. The difference is that instead of > adding custom support for a few particular types directly to time and > os, it instead defines a more general purpose protocol that covers not > only this use case, but also any other situation where high precision > fractions are relevant. > > One interesting question with a named protocol approach is whether > such a protocol should *require* explicit support, or if it should > fall back to the underlying mathematical operations. Since the > conversions to float and int in the timestamp case are already known > to be lossy, permitting lossy conversion via the mathematical > equivalents seems reasonable, suggesting possible protocol definitions > as follows: > > ? ?# Algorithm for a potentially precision-losing "from mixed numbers" protocol > ? ?def from_mixed(cls, integer, numerator, denominator): > ? ? ? ?try: > ? ? ? ? ? ?factory = cls.__from_mixed__ > ? ? ? ?except AttributeError: > ? ? ? ? ? ?return cls(integer) + cls(numerator) / cls(denominator) > ? ? ? ?return factory(integer, numerator, denominator) > > ? ?# Algorithm for a potentially lossy "from fixed point" protocol > ? ?def from_fixed(cls, integer, mantissa, base, exponent): > ? ? ? ?try: > ? ? ? ? ? ?factory = cls.__from_fixed__ > ? ? ? ?except AttributeError: > ? ? ? ? ? ?return cls(integer) + cls(mantissa) * cls(base) ** cls(exponent) > ? ? ? ?return factory(integer, mantissa, base, exponent) The key problem with a protocol is that the implementer has to make these decisions. The callback approach defers that decision to the end user. After all, the end user is the one who knows for his app whether precision loss is acceptable. You could probably also have a standard named protocol which can be used as a callback in straightforward cases time.time(callback=timedelta.__from_mixed__) That's wordy, and a bit ugly, though. The callback code could special-case types and look for __from_mixed__, I guess. Or use an ABC, and have the code that uses the callback do if issubclass(cb, MixedNumberABC): return cb.__from_mixed__(whole, num, den) else: return cb(whole, num, den) (The second branch is the one that allows the user to override the predefined types that work - if you omit that, you're back to a named protocol and ABCs don't gain you much beyond documentation). Part of me feels that there's a use case for generic functions in here, but maybe not (as it's overloading on the return type). Let's not open that discussion again, though. Paul. 
From chris at simplistix.co.uk Thu Feb 2 12:30:17 2012
From: chris at simplistix.co.uk (Chris Withers)
Date: Thu, 02 Feb 2012 11:30:17 +0000
Subject: [Python-Dev] A new dictionary implementation
In-Reply-To: 
References: <4F252014.3080900@hotpy.org> <20120129160841.2343b62f@pitrou.net> <4F256EDC.70707@hotpy.org> <4F25D686.9070907@pearwood.info>
Message-ID: <4F2A73C9.1090900@simplistix.co.uk>

On 01/02/2012 17:50, Guido van Rossum wrote:
> Another question: a common pattern is to use (immutable) class
> variables as default values for instance variables, and only set the
> instance variables once they need to be different. Does such a class
> benefit from your improvement?

A less common pattern, but one which still needs to work, is where a mutable class variable is deliberately used to store state across all instances of a class...

Chris

--
Simplistix - Content Management, Batch Processing & Python Consulting
- http://www.simplistix.co.uk

From victor.stinner at haypocalc.com Thu Feb 2 13:16:33 2012
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Thu, 2 Feb 2012 13:16:33 +0100
Subject: [Python-Dev] PEP: New timestamp formats
In-Reply-To: 
References: 
Message-ID: 

> I'd add datetime.timedelta to this list. It's exactly what timestamps
> are, after all - the difference between the current time and the
> relevant epoch value.

Ah yes, I forgot to mention it, even though it is listed in the "final timestamp formats list" :-)

>>  * a) (sec, nsec): C timespec structure, useful for os.futimens() for example
>>  * b) (sec, floatpart, exponent): value = sec + floatpart * 10**exponent
>>  * c) (sec, floatpart, divisor): value = sec + floatpart / divisor
>>
>> The formats (a) and (b) may lose precision if the clock divisor is not a
>> power of 10.

> Format (b) only loses precision if the exponent chosen for a given
> value is too small relative to the precision of the underlying timer
> (it's the same as using decimal.Decimal in that respect).

Let's take an NTP timestamp in format (c): (sec=0, floatpart=100000000, divisor=2**32):

>>> Decimal(100000000) * Decimal(10)**-10
Decimal('0.0100000000')
>>> Decimal(100000000) / Decimal(2)**32
Decimal('0.023283064365386962890625')

You have an error of 57%. Or do you mean that not only 2**32 should be modified, but also 100000000? How do you adapt 100000000 (floatpart) when changing the divisor (2**32 => 10**-10)? The format (c) avoids an operation (base**exponent) and avoids losing precision.

There is the same issue with QueryPerformanceFrequency and QueryPerformanceCounter, used by time.clock(): the frequency is not a power of 2 or of 10.

I forgot to mention another advantage of (c), used by my patch for the Decimal format: you can get the exact resolution of the clock directly, as 1/divisor. It works for any divisor (not only base**exponent).

By the way, the format (c) can be simplified as a fraction (numerator, denominator) using (seconds * divisor + floatpart, divisor). But this format is less practical for implementing a function that creates a timestamp.

>> Callback and creating a new module to convert timestamps
> (...)
> Such an API could only become limiting if
> timestamps ever become something other than "the difference in time
> between right now and the relevant epoch value", and that's a
> sufficiently esoteric possibility that it really doesn't seem
> worthwhile to take it into account.

It may be interesting to support a different start date (other than 1970-01-01), if we choose to support broken-down timestamps (e.g. datetime.datetime).
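Coming back to format (c) for a moment, here is a quick, untested sketch of an exact conversion, just to make the point concrete (the helper name is invented):

    from fractions import Fraction

    def timestamp_from_c(sec, floatpart, divisor):
        # Exact for *any* divisor, whether or not it is a power of 2 or 10
        return sec + Fraction(floatpart, divisor)

For the NTP example above, timestamp_from_c(0, 100000000, 2**32) gives Fraction(390625, 16777216), with no rounding anywhere.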
> The PEP should also mention PJE's suggestion of creating a new named
> protocol specifically for the purpose (with a signature based on one
> of the proposed tuple formats) (...)

Ok, I will add it.

> Rather than being timestamp specific, such a protocol would be a
> general numeric protocol. If (integer, numerator, denominator) is used
> (i.e. a "mixed number" in mathematical terms), then "__from_mixed__"
> would be an appropriate name. If (integer, fractional, exponent) is
> used (i.e. a fixed point notation), then "__from_fixed__" would work.
>
>     # Algorithm for a "from mixed numbers" protocol, assuming division
>     # doesn't lose precision...
>     def __from_mixed__(cls, integer, numerator, denominator):
>         return cls(integer) + cls(numerator) / cls(denominator)

Even if I like the idea, I don't think that we need all this machinery to support nanosecond resolution. I should maybe forget my idea of using datetime.datetime or datetime.timedelta, and only support int, float and decimal.Decimal.

datetime.datetime and datetime.timedelta are already compatible with Decimal (except that they may lose precision because of an internal conversion to float): datetime.datetime.fromtimestamp(t) and datetime.timedelta(seconds=t).

If we only support int, float and Decimal, we don't need to add a new protocol; hardcoded functions are enough :-)

>> os.stat: add new fields
>> -----------------------
>>
>> It was proposed to add 3 fields to the os.stat() structure to get nanoseconds of
>> timestamps.

> It's worth noting that the challenge with this is that it's
> potentially time-consuming to populate the extra fields, and that
> this approach doesn't help with the time APIs that return timestamps
> directly.

New fields can be optional (add a flag to get them), but I don't like the idea of a structure with a variable number of fields, especially because the os.stat() structure can be used as a tuple (get a field by its index).

Patching os.stat() doesn't solve the problem for the time module anyway.

>> Add an argument to change the result type
>> -----------------------------------------

> There should also be a description of the "set a boolean flag to
> request high precision output" approach.

You mean something like: time.time(hires=True)? Or time.time(decimal=True)?

Victor

From p.f.moore at gmail.com Thu Feb 2 13:45:34 2012
From: p.f.moore at gmail.com (Paul Moore)
Date: Thu, 2 Feb 2012 12:45:34 +0000
Subject: [Python-Dev] PEP: New timestamp formats
In-Reply-To: 
References: 
Message-ID: 

On 2 February 2012 12:16, Victor Stinner wrote:
> Let's take an NTP timestamp in format (c): (sec=0,
> floatpart=100000000, divisor=2**32):
>
>>>> Decimal(100000000) * Decimal(10)**-10
> Decimal('0.0100000000')
>>>> Decimal(100000000) / Decimal(2)**32
> Decimal('0.023283064365386962890625')
>
> You have an error of 57%. Or do you mean that not only 2**32 should be
> modified, but also 100000000? How do you adapt 100000000 (floatpart)
> when changing the divisor (2**32 => 10**-10)? The format (c) avoids an
> operation (base**exponent) and avoids losing precision.

Am I missing something? If you're using the fixed point form (fraction, exponent) then 0.023283064365386962890625 would be written as (23283064365386962890625, -24). Same precision as the (100000000, base=2, exponent=32) format.
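A quick interpreter check, for the avoidance of doubt:

    >>> from decimal import Decimal
    >>> Decimal(23283064365386962890625) * Decimal(10)**-24
    Decimal('0.023283064365386962890625')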
Confused,
Paul

From ncoghlan at gmail.com Thu Feb 2 14:07:28 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 2 Feb 2012 23:07:28 +1000
Subject: [Python-Dev] PEP: New timestamp formats
In-Reply-To: 
References: 
Message-ID: 

On Thu, Feb 2, 2012 at 10:16 PM, Victor Stinner wrote:
> If we only support int, float and Decimal, we don't need to add a new
> protocol; hardcoded functions are enough :-)

Yup, that's why your middle-ground approach didn't make any sense to me. Returning Decimal when a flag is set to request high precision values actually handles everything (since any epoch related questions only arise later, when converting the decimal timestamp to an absolute time value).

I think a protocol based approach would be *feasible*, but also overkill for the specific problem we're trying to handle (i.e. arbitrary precision timestamps).

If a dependency from time and os on the decimal module means we decide to finally incorporate Stefan's cdecimal branch, I consider that a win in its own right (there are some speed hacks in decimal that didn't fare well in the Py3k transition because they went from being 8-bit str based to Unicode str based. They didn't *break* from a correctness point of view, but my money would be on their being pessimisations now instead of optimisations).

>>> os.stat: add new fields
>>> -----------------------

> New fields can be optional (add a flag to get them), but I don't like
> the idea of a structure with a variable number of fields, especially
> because the os.stat() structure can be used as a tuple (get a field by its
> index).
>
> Patching os.stat() doesn't solve the problem for the time module anyway.

We can't add new fields to the stat tuple anyway - it breaks tuple unpacking. Any new fields would have been accessible by name only (which poses its own problems, but is a solution we've used before - in the codecs module, for example). As you say though, this was never going to be adequate, since it doesn't help with the time APIs.

>>> Add an argument to change the result type
>>> -----------------------------------------
>>
>> There should also be a description of the "set a boolean flag to
>> request high precision output" approach.
>
> You mean something like: time.time(hires=True)? Or time.time(decimal=True)?

Yeah, I was thinking "hires" as the short form of "high resolution", but it's a little confusing since it also parses as the word "hires" (i.e. "hire"+"s"). "hi_res", "hi_prec" (for "high precision") or "full_prec" (for "full precision") might be better.

I don't really like "decimal" as the flag name, since it confuses an implementation detail (using decimal.Decimal) with the design intent (preserving the full precision of the underlying timestamp).

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From victor.stinner at haypocalc.com Thu Feb 2 14:10:14 2012
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Thu, 2 Feb 2012 14:10:14 +0100
Subject: [Python-Dev] PEP: New timestamp formats
In-Reply-To: 
References: 
Message-ID: 

> Even if I like the idea, I don't think that we need all this machinery
> to support nanosecond resolution. I should maybe forget my idea of
> using datetime.datetime or datetime.timedelta, and only support
> int, float and decimal.Decimal.

I updated my patch (issue #13882) to only support the int, float and decimal.Decimal types. I suppose that it is just enough.
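Roughly, the idea looks like this - a very simplified, untested sketch, not the actual implementation; _now_ns() is just a made-up placeholder for the raw clock read:

    import decimal

    def time(timestamp=float):
        sec, nsec = _now_ns()   # hypothetical: (seconds, nanoseconds) from the system clock
        if timestamp is float:
            return sec + nsec * 1e-9
        if timestamp is int:
            return sec
        if timestamp is decimal.Decimal:
            # exact: no binary float is involved anywhere
            return decimal.Decimal(sec) + decimal.Decimal(nsec) * decimal.Decimal("1E-9")
        raise ValueError("unsupported timestamp type: %r" % (timestamp,))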
Only adding the decimal.Decimal type avoids many questions:

 - which API / protocol should be used to support other types
 - what is the start of a timestamp?
 - etc.

As we've seen, using the time.time(timestamp=type) API, it will be easy to support new types later (using a new protocol, a registry like Unicode codecs, or anything else). Let's start with decimal.Decimal and support it correctly (e.g. patch datetime.datetime.fromtimestamp() and the os.*utime*() functions).

Victor

From ncoghlan at gmail.com Thu Feb 2 14:13:49 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 2 Feb 2012 23:13:49 +1000
Subject: [Python-Dev] PEP: New timestamp formats
In-Reply-To: 
References: 
Message-ID: 

On Thu, Feb 2, 2012 at 10:45 PM, Paul Moore wrote:
> On 2 February 2012 12:16, Victor Stinner wrote:
>> Let's take an NTP timestamp in format (c): (sec=0,
>> floatpart=100000000, divisor=2**32):
>>
>>>>> Decimal(100000000) * Decimal(10)**-10
>> Decimal('0.0100000000')
>>>>> Decimal(100000000) / Decimal(2)**32
>> Decimal('0.023283064365386962890625')
>>
>> You have an error of 57%. Or do you mean that not only 2**32 should be
>> modified, but also 100000000? How do you adapt 100000000 (floatpart)
>> when changing the divisor (2**32 => 10**-10)? The format (c) avoids an
>> operation (base**exponent) and avoids losing precision.
>
> Am I missing something? If you're using the fixed point form
> (fraction, exponent) then 0.023283064365386962890625 would be written
> as (23283064365386962890625, -24). Same precision as the (100000000,
> base=2, exponent=32) format.

Yeah, Victor's persuaded me that the only two integer based formats that would be sufficiently flexible are (integer, numerator, divisor) and (integer, mantissa, base, exponent). The latter allows for a few more optimised conversions in particular cases. Assuming a base of 10 would just make things unnecessarily awkward when the underlying base is 2, though.

However, I think it's even more right to not have a protocol at all and just use decimal.Decimal for arbitrary precision timestamps (explicitly requested via a flag to preserve backwards compatibility).

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From ncoghlan at gmail.com Thu Feb 2 14:18:55 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 2 Feb 2012 23:18:55 +1000
Subject: [Python-Dev] PEP: New timestamp formats
In-Reply-To: 
References: 
Message-ID: 

On Thu, Feb 2, 2012 at 11:10 PM, Victor Stinner wrote:
>> Even if I like the idea, I don't think that we need all this machinery
>> to support nanosecond resolution. I should maybe forget my idea of
>> using datetime.datetime or datetime.timedelta, and only support
>> int, float and decimal.Decimal.
>
> I updated my patch (issue #13882) to only support the int, float and
> decimal.Decimal types. I suppose that it is just enough.
>
> Only adding the decimal.Decimal type avoids many questions:
>
>  - which API / protocol should be used to support other types
>  - what is the start of a timestamp?
>  - etc.
>
> As we've seen, using the time.time(timestamp=type) API, it will be easy to
> support new types later (using a new protocol, a registry like Unicode
> codecs, or anything else).

Yeah, I can definitely live with the type-based API if we restrict it to those 3 types.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia
From solipsis at pitrou.net Thu Feb 2 14:20:07 2012
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Thu, 2 Feb 2012 14:20:07 +0100
Subject: [Python-Dev] PEP: New timestamp formats
References: 
Message-ID: <20120202142007.785a29f5@pitrou.net>

On Thu, 2 Feb 2012 23:07:28 +1000
Nick Coghlan wrote:
>
> We can't add new fields to the stat tuple anyway - it breaks tuple
> unpacking.

I don't think that's true. The stat tuple already has a varying number of fields:
http://docs.python.org/dev/library/os.html#os.stat

"For backward compatibility, the return value of stat() is also accessible as a tuple of *at least* 10 integers [...] More items may be added at the end by some implementations."

(emphasis mine)

So at most you could tuple-unpack os.stat(...)[:10]. (I've never seen code tuple-unpacking a stat tuple, myself. It sounds quite cumbersome to do so.)

> >>> Add an argument to change the result type
> >>> -----------------------------------------
> >>
> >> There should also be a description of the "set a boolean flag to
> >> request high precision output" approach.
> >
> > You mean something like: time.time(hires=True)? Or time.time(decimal=True)?
>
> Yeah, I was thinking "hires" as the short form of "high resolution",
> but it's a little confusing since it also parses as the word "hires"
> (i.e. "hire"+"s"). "hi_res", "hi_prec" (for "high precision") or
> "full_prec" (for "full precision") might be better.
>
> I don't really like "decimal" as the flag name, since it confuses an
> implementation detail (using decimal.Decimal) with the design intent
> (preserving the full precision of the underlying timestamp).

But that implementation detail will be visible to the user, including when combining the result with other numbers (as Decimal "wins" over float and int). IMHO it wouldn't be silly to make it explicit.

I think "hires" may confuse people into thinking the time source has a higher resolution, whereas it's only the return type. Perhaps it's just a documentation issue, though.

Regards

Antoine.

From solipsis at pitrou.net Thu Feb 2 14:21:25 2012
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Thu, 2 Feb 2012 14:21:25 +0100
Subject: [Python-Dev] PEP: New timestamp formats
References: 
Message-ID: <20120202142125.0abb3950@pitrou.net>

On Thu, 2 Feb 2012 14:10:14 +0100
Victor Stinner wrote:
> > Even if I like the idea, I don't think that we need all this machinery
> > to support nanosecond resolution. I should maybe forget my idea of
> > using datetime.datetime or datetime.timedelta, and only support
> > int, float and decimal.Decimal.
>
> I updated my patch (issue #13882) to only support the int, float and
> decimal.Decimal types. I suppose that it is just enough.

Why int? That doesn't seem to bring anything.

Regards

Antoine.

From mal at egenix.com Thu Feb 2 14:31:50 2012
From: mal at egenix.com (M.-A. Lemburg)
Date: Thu, 02 Feb 2012 14:31:50 +0100
Subject: [Python-Dev] PEP: New timestamp formats
In-Reply-To: 
References: 
Message-ID: <4F2A9046.1020106@egenix.com>

Nick Coghlan wrote:
> On Thu, Feb 2, 2012 at 10:16 PM, Victor Stinner
>>>> Add an argument to change the result type
>>>> -----------------------------------------
>>>
>>> There should also be a description of the "set a boolean flag to
>>> request high precision output" approach.
>>
>> You mean something like: time.time(hires=True)? Or time.time(decimal=True)?
>
> Yeah, I was thinking "hires" as the short form of "high resolution",
> but it's a little confusing since it also parses as the word "hires"
"hire"+"s"). "hi_res", "hi_prec" (for "high precision") or > "full_prec" (for "full precision") might be better. Isn't the above (having the return type depend on an argument setting) something we generally try to avoid ? I think it's better to settle on one type for high-res timers and add a new API(s) for it. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Feb 02 2012) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try our new mxODBC.Connect Python Database Interface for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From ncoghlan at gmail.com Thu Feb 2 14:43:02 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 2 Feb 2012 23:43:02 +1000 Subject: [Python-Dev] PEP: New timestamp formats In-Reply-To: <4F2A9046.1020106@egenix.com> References: <4F2A9046.1020106@egenix.com> Message-ID: On Thu, Feb 2, 2012 at 11:31 PM, M.-A. Lemburg wrote: > Isn't the above (having the return type depend on an argument > setting) something we generally try to avoid ? In Victor's actual patch, the returned object is an instance of the type you pass in, so it actually avoids that issue. > I think it's better to settle on one type for high-res timers and > add a new API(s) for it. We've basically settled on decimal.Decimal now, so yeah, the decision becomes one of spelling - either new APIs that always return Decimal instances, or a way to ask the existing APIs to return Decimal instead of floats. The way I see it, the latter should be significantly less hassle to maintain (since the code remains almost entirely shared), and it becomes trivial for someone to layer a convenience wrapper over the top that *always* requests the high precision output. Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From victor.stinner at haypocalc.com Thu Feb 2 15:09:41 2012 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Thu, 2 Feb 2012 15:09:41 +0100 Subject: [Python-Dev] PEP: New timestamp formats In-Reply-To: <20120202142125.0abb3950@pitrou.net> References: <20120202142125.0abb3950@pitrou.net> Message-ID: > Why int? That doesn't seem to bring anything. It helps to deprecate/replace os.stat_float_times(), which may be used for backward compatibility (with Python 2.2 ? :-)). From solipsis at pitrou.net Thu Feb 2 15:28:31 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 2 Feb 2012 15:28:31 +0100 Subject: [Python-Dev] PEP: New timestamp formats References: <20120202142125.0abb3950@pitrou.net> Message-ID: <20120202152831.5b85cad6@pitrou.net> On Thu, 2 Feb 2012 15:09:41 +0100 Victor Stinner wrote: > > Why int? That doesn't seem to bring anything. > > It helps to deprecate/replace os.stat_float_times(), which may be used > for backward compatibility (with Python 2.2 ? :-)). I must admit I don't understand the stat_float_times documentation: ?For compatibility with older Python versions, accessing stat_result as a tuple always returns integers. Python now returns float values by default. Applications which do not work correctly with floating point time stamps can use this function to restore the old behaviour.? 
These two paragraphs seem to contradict each other.

That said, I don't understand why we couldn't simply deprecate stat_float_times() right now. Having an option for integer timestamps is pointless: you can just call int() on the result if you want.

Regards

Antoine.

From victor.stinner at haypocalc.com Thu Feb 2 16:25:25 2012
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Thu, 2 Feb 2012 16:25:25 +0100
Subject: [Python-Dev] PEP: New timestamp formats
In-Reply-To: <20120202152831.5b85cad6@pitrou.net>
References: <20120202142125.0abb3950@pitrou.net> <20120202152831.5b85cad6@pitrou.net>
Message-ID: 

> That said, I don't understand why we couldn't simply deprecate
> stat_float_times() right now. Having an option for integer timestamps
> is pointless: you can just call int() on the result if you want.

So which API do you propose for time.time() to get a Decimal object?

    time.time(timestamp=decimal.Decimal)
    time.time(decimal=True)
    time.time(hires=True)

or something else?

Victor

From barry at python.org Thu Feb 2 17:56:49 2012
From: barry at python.org (Barry Warsaw)
Date: Thu, 2 Feb 2012 11:56:49 -0500
Subject: [Python-Dev] PEP: New timestamp formats
In-Reply-To: 
References: 
Message-ID: <20120202115649.3833d4fc@resist.wooz.org>

On Feb 02, 2012, at 11:07 PM, Nick Coghlan wrote:

> Yup, that's why your middle-ground approach didn't make any sense to
> me. Returning Decimal when a flag is set to request high precision
> values actually handles everything (since any epoch related questions
> only arise later when converting the decimal timestamp to an absolute
> time value).

Guido really dislikes APIs where a flag changes the return type, and I agree with him. It's because this is highly unreadable:

    results = blah.whatever(True)

What the heck does that `True` do? It can be marginally better with a keyword-only argument, but not much.

I haven't read the whole thread so maybe this is a stupid question, but why can't we add a datetime-compatible higher precision type that hides all the implementation details?

-Barry

From fuzzyman at voidspace.org.uk Thu Feb 2 18:03:32 2012
From: fuzzyman at voidspace.org.uk (Michael Foord)
Date: Thu, 02 Feb 2012 17:03:32 +0000
Subject: [Python-Dev] A new dictionary implementation
In-Reply-To: <4F2A73C9.1090900@simplistix.co.uk>
References: <4F252014.3080900@hotpy.org> <20120129160841.2343b62f@pitrou.net> <4F256EDC.70707@hotpy.org> <4F25D686.9070907@pearwood.info> <4F2A73C9.1090900@simplistix.co.uk>
Message-ID: <4F2AC1E4.8030605@voidspace.org.uk>

On 02/02/2012 11:30, Chris Withers wrote:
> On 01/02/2012 17:50, Guido van Rossum wrote:
>> Another question: a common pattern is to use (immutable) class
>> variables as default values for instance variables, and only set the
>> instance variables once they need to be different. Does such a class
>> benefit from your improvement?
>
> A less common pattern, but one which still needs to work, is where a
> mutable class variable is deliberately used to store state across all
> instances of a class...

Given that Mark's patch passes the Python test suite, I'm sure basic patterns like this *work*; the question is which of them take advantage of the improved memory efficiency.

In the case you mention I don't think it's an issue at all, because the class level attribute doesn't (generally) appear in instance dicts. What's also common is where the class holds a *default* value for instances, which may be overridden by an instance attribute on *some* instances.
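A toy illustration of the difference (class and attribute names invented):

    class Widget:
        colour = "blue"              # class-level default, stored once on the class

    w = Widget()
    assert "colour" not in vars(w)   # reads fall back to the class attribute
    w.colour = "red"                 # only now does an entry appear in the instance dict
    assert "colour" in vars(w)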
All the best,

Michael Foord

> Chris
>
--
http://www.voidspace.org.uk/

May you do good and not evil
May you find forgiveness for yourself and forgive others
May you share freely, never taking more than you give.
-- the sqlite blessing
http://www.sqlite.org/different.html

From solipsis at pitrou.net Thu Feb 2 18:22:46 2012
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Thu, 2 Feb 2012 18:22:46 +0100
Subject: [Python-Dev] A new dictionary implementation
References: <4F252014.3080900@hotpy.org> <20120129160841.2343b62f@pitrou.net> <4F256EDC.70707@hotpy.org> <4F25D686.9070907@pearwood.info>
Message-ID: <20120202182246.2e48f19e@pitrou.net>

On Wed, 1 Feb 2012 09:50:55 -0800
Guido van Rossum wrote:
> On Wed, Feb 1, 2012 at 9:13 AM, Hans Mulder wrote:
> > On 30/01/12 00:30:14, Steven D'Aprano wrote:
> >>
> >> Mark Shannon wrote:
> >>>
> >>> Antoine Pitrou wrote:
> >
> > [......]
> >
> >>> Antoine is right. It is a reorganisation of the dict, plus a couple of
> >>> changes to typeobject.c and object.c to ensure that instance
> >>> dictionaries do indeed share keys arrays.
> >>
> >> I don't quite follow how that could work.
> >>
> >> If I have this:
> >>
> >> class C:
> >>     pass
> >>
> >> a = C()
> >> b = C()
> >>
> >> a.spam = 1
> >> b.ham = 2
> >>
> >> how can a.__dict__ and b.__dict__ share key arrays? I've tried reading
> >> the source, but I'm afraid I don't understand it well enough to make
> >> sense of it.
> >
> > They can't.
> >
> > But then, your class is atypical. Usually, classes initialize all the
> > attributes of their instances in the __init__ method, perhaps like so:
> >
> > class D:
> >     def __init__(self, ham=None, spam=None):
> >         self.ham = ham
> >         self.spam = spam
> >
> > As long as you follow the common practice of not adding any attributes
> > after the object has been initialized, your instances can share their
> > keys array. Mark's patch will do that.
> >
> > You'll still be allowed to have different attributes per instance, but
> > if you do that, then the patch doesn't buy you much.
>
> Hey, I like this! It's a subtle encouragement for developers to
> initialize all their instance variables in their __init__ or __new__
> method, with a (modest) performance improvement for a carrot. (Though
> I have to admit I have no idea how you do it. Wouldn't the set of dict
> keys be different while __init__ is in the middle of setting the
> instance variables?)
>
> Another question: a common pattern is to use (immutable) class
> variables as default values for instance variables, and only set the
> instance variables once they need to be different. Does such a class
> benefit from your improvement?

I'm not sure who "you" is in your e-mail, but AFAICT Mark's patch doesn't special-case __init__ or __new__. Any attribute setting on an instance uses the shared keys array on the instance's type. "Missing" attributes on an instance are simply NULL pointers in the instance's values array.

(I've suggested that the keys array be bounded in size, to avoid pathological cases where someone (ab)uses instances as fancy dicts and puts lots of random data in them.)

Regards

Antoine.
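P.S. For anyone having trouble visualising that, a toy pure-Python model of the idea (illustrative only - the real implementation is in C and differs in many details):

    class SharedKeys:
        """Per-class mapping from attribute name to slot index."""
        def __init__(self):
            self.slots = {}

    class Namespace:
        """Per-instance values array, parallel to the shared keys."""
        def __init__(self, shared):
            self.shared = shared
            self.values = []

        def set(self, key, value):
            index = self.shared.slots.get(key)
            if index is None:
                # First instance to use this key extends the shared layout
                index = self.shared.slots[key] = len(self.shared.slots)
            while len(self.values) <= index:
                self.values.append(None)   # None plays the role of NULL here
            self.values[index] = value

        def get(self, key):
            index = self.shared.slots.get(key)
            if index is None or index >= len(self.values) or self.values[index] is None:
                raise KeyError(key)
            return self.values[index]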
From martin at v.loewis.de Thu Feb 2 19:49:53 2012
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu, 02 Feb 2012 19:49:53 +0100
Subject: [Python-Dev] A new dictionary implementation
In-Reply-To: <4F2A73C9.1090900@simplistix.co.uk>
References: <4F252014.3080900@hotpy.org> <20120129160841.2343b62f@pitrou.net> <4F256EDC.70707@hotpy.org> <4F25D686.9070907@pearwood.info> <4F2A73C9.1090900@simplistix.co.uk>
Message-ID: <4F2ADAD1.2000002@v.loewis.de>

Am 02.02.2012 12:30, schrieb Chris Withers:
> On 01/02/2012 17:50, Guido van Rossum wrote:
>> Another question: a common pattern is to use (immutable) class
>> variables as default values for instance variables, and only set the
>> instance variables once they need to be different. Does such a class
>> benefit from your improvement?
>
> A less common pattern, but one which still needs to work, is where a
> mutable class variable is deliberately used to store state across all
> instances of a class...

This is really *just* a dictionary implementation. It doesn't affect any of the lookup procedures. If you trust that the dictionary semantics themselves aren't changed (which I believe is the case, except for key order), none of the dict applications will change.

Regards,
Martin

From mark at hotpy.org Thu Feb 2 20:17:16 2012
From: mark at hotpy.org (Mark Shannon)
Date: Thu, 02 Feb 2012 19:17:16 +0000
Subject: [Python-Dev] A new dictionary implementation
In-Reply-To: 
References: <4F252014.3080900@hotpy.org> <20120129160841.2343b62f@pitrou.net> <4F256EDC.70707@hotpy.org> <4F25D686.9070907@pearwood.info>
Message-ID: <4F2AE13C.6010900@hotpy.org>

Just a quick update.

I've been analysing and profiling the behaviour of my new dict and messing about with various implementation options. I've settled on a new implementation. It's the same basic idea, but with better locality of reference for unshared keys.

Guido asked:
> Another question: a common pattern is to use (immutable) class
> variables as default values for instance variables, and only set the
> instance variables once they need to be different. Does such a class
> benefit from your improvement?

For those instances which keep the default, yes. Otherwise the answer is, as Martin pointed out, that it could be yes, provided that adding a new key does not force a resize (although it is a bit arbitrary when a resize occurs). The new version will incorporate this behaviour.

Expect version 2 soon.

Cheers,
Mark.

From ncoghlan at gmail.com Thu Feb 2 21:48:44 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Fri, 3 Feb 2012 06:48:44 +1000
Subject: [Python-Dev] PEP: New timestamp formats
In-Reply-To: <20120202115649.3833d4fc@resist.wooz.org>
References: <20120202115649.3833d4fc@resist.wooz.org>
Message-ID: 

On Feb 3, 2012 2:59 AM, "Barry Warsaw" wrote:
>
> On Feb 02, 2012, at 11:07 PM, Nick Coghlan wrote:
>
> > Yup, that's why your middle-ground approach didn't make any sense to
> > me. Returning Decimal when a flag is set to request high precision
> > values actually handles everything (since any epoch related questions
> > only arise later when converting the decimal timestamp to an absolute
> > time value).
>
> Guido really dislikes APIs where a flag changes the return type, and I agree
> with him. It's because this is highly unreadable:
>
>     results = blah.whatever(True)
>
> What the heck does that `True` do? It can be marginally better with a
> keyword-only argument, but not much.

Victor's patch passes in the return type rather than a binary flag, thus avoiding this particular problem.
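That is, roughly (a sketch of the difference, using the API spelling from Victor's patch):

    t = time.time(True)                        # what does True mean here?
    t = time.time(timestamp=decimal.Decimal)   # the intent is self-evident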
> I haven't read the whole thread so maybe this is a stupid question, but why
> can't we add a datetime-compatible higher precision type that hides all the
> implementation details?
>
> -Barry

It's not a stupid question, but for backwards compatibility, what we would actually need is a version of Decimal that implicitly interoperates with binary floats. That's... not trivial.

Cheers,
Nick

--
Sent from my phone, thus the relative brevity :)

From jyasskin at gmail.com Thu Feb 2 23:18:22 2012
From: jyasskin at gmail.com (Jeffrey Yasskin)
Date: Thu, 2 Feb 2012 14:18:22 -0800
Subject: [Python-Dev] PEP: New timestamp formats
In-Reply-To: 
References: 
Message-ID: 

On Wed, Feb 1, 2012 at 5:03 PM, Victor Stinner wrote:
> datetime.datetime
> -----------------
>
> datetime.datetime only supports microsecond resolution, but can be enhanced
> to support nanoseconds.
>
> datetime.datetime has issues:
>
> - there is no easy way to convert it into "seconds since the epoch"

Not true:

>>> import datetime, time
>>> epoch = datetime.datetime(1970, 1, 1, 0, 0, 0)
>>> (datetime.datetime.utcnow() - epoch).total_seconds()
1328219742.385039
>>> time.time()
1328219747.640937
>>>

> - any broken-down time has issues of time stamp ordering in the
>   duplicate hour of switching from DST to normal time

Only if you insist on putting it in a timezone. Use UTC, and you should be fine.

> - time zone support is flaky-to-nonexistent in the datetime module

Why do you need time zone support for system interfaces that return times in UTC?

I think I saw another objection that datetime represented points in time, while functions like time.time() and os.stat() return an offset from the epoch. This objection seems silly to me: the return value of the system interfaces is intended to represent a point in time, even though it has to be implemented as an offset since an epoch because of limitations in C, and datetime is also implemented as an offset from an epoch (year 1).

On the other hand, the return value of functions like time.clock() is _not_ intended to represent an exact point in time, and so should be either a timedelta or Decimal.

Jeffrey

From regebro at gmail.com Thu Feb 2 23:25:17 2012
From: regebro at gmail.com (Lennart Regebro)
Date: Thu, 2 Feb 2012 23:25:17 +0100
Subject: [Python-Dev] Python 3 optimizations, continued, continued again...
In-Reply-To: 
References: <4F23C657.9050501@hotpy.org>
Message-ID: 

On Wed, Feb 1, 2012 at 20:08, stefan brunthaler wrote:
> I understand all of these issues. Currently, it's not really a mess,
> but much more complicated than it needs to be for only supporting the
> inca optimization.

I really don't think that is a problem. The core contributors can deal well with complexity in my experience. :-)

//Lennart

From ethan at stoneleaf.us Thu Feb 2 23:10:31 2012
From: ethan at stoneleaf.us (Ethan Furman)
Date: Thu, 02 Feb 2012 14:10:31 -0800
Subject: [Python-Dev] PEP 409 update [was: PEP 409 - final?]
In-Reply-To: <4F28B81B.20801@stoneleaf.us>
References: <4F28B81B.20801@stoneleaf.us>
Message-ID: <4F2B09D7.3020704@stoneleaf.us>

PEP: 409
Title: Suppressing exception context
Version: $Revision$
Last-Modified: $Date$
Author: Ethan Furman
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 26-Jan-2012
Post-History: 30-Aug-2002, 01-Feb-2012, 03-Feb-2012


Abstract
========

One of the open issues from PEP 3134 is suppressing context: currently
there is no way to do it.  This PEP proposes one.
Rationale
=========

There are two basic ways to generate exceptions:

1) Python does it (buggy code, missing resources, ending loops, etc.)

2) manually (with a raise statement)

When writing libraries, or even just custom classes, it can become
necessary to raise exceptions; moreover it can be useful, even
necessary, to change from one exception to another.  To take an example
from my dbf module:

    try:
        value = int(value)
    except Exception:
        raise DbfError(...)

Whatever the original exception was (/ValueError/, /TypeError/, or
something else) is irrelevant.  The exception from this point on is a
/DbfError/, and the original exception is of no value.  However, if
this exception is printed, we would currently see both.


Alternatives
============

Several possibilities have been put forth:

* /raise as NewException()/

  Reuses the /as/ keyword; can be confusing since we are not really
  reraising the originating exception

* /raise NewException() from None/

  Follows existing syntax of explicitly declaring the originating
  exception

* /exc = NewException(); exc.__context__ = None; raise exc/

  Very verbose way of the previous method

* /raise NewException.no_context(...)/

  Make context suppression a class method.

All of the above options will require changes to the core.


Proposal
========

I propose going with the second option:

    raise NewException from None

It has the advantage of using the existing pattern of explicitly setting
the cause:

    raise KeyError() from NameError()

but because the cause is /None/ the previous context is not displayed
by the default exception printing routines.


Implementation Discussion
=========================

Currently, /None/ is the default for both /__context__/ and /__cause__/.
In order to support /raise ... from None/ (which would set /__cause__/
to /None/) we need a different default value for /__cause__/.  Several
ideas were put forth on how to implement this at the language level:

* Overwrite the previous exception information (side-stepping the
  issue and leaving /__cause__/ at /None/).

  Rejected as this can seriously hinder debugging due to
  `poor error messages`_.

* Use one of the boolean values in /__cause__/: /False/ would be the
  default value, and would be replaced when /from .../ was used with
  the explicitly chained exception or /None/.

  Rejected as this encourages the use of two different object types for
  /__cause__/ with one of them (boolean) not allowed to have the full
  range of possible values (/True/ would never be used).

* Create a special exception class, /__NoException__/.

  Rejected as possibly confusing, possibly being mistakenly raised by
  users, and not being a truly unique value as /None/, /True/, and
  /False/ are.

* Use /Ellipsis/ as the default value (the /.../ singleton).

  Accepted.  There are no other possible values; it cannot be raised as
  it is not an acception; it has the connotation of 'fill in the
  rest...' as in /__cause__/ is not set, look in /__context__/ for it.


Language Details
================

To support /from None/, /__context__/ will stay as it is, but
/__cause__/ will start out as /Ellipsis/ and will change to /None/
when the /raise ... from None/ syntax is used.
================================  ==================  ===================
form                              __context__         __cause__
================================  ==================  ===================
raise                             /None/              /Ellipsis/
reraise                           previous exception  /Ellipsis/
reraise from /None/ |             previous exception  /None/ | explicitly
/ChainedException/                                    chained exception
================================  ==================  ===================

The default exception printing routine will then:

* If /__cause__/ is /Ellipsis/ the /__context__/ (if any) will be
  printed.

* If /__cause__/ is /None/ the /__context__/ will not be printed.

* If /__cause__/ is anything else, /__cause__/ will be printed.


Patches
=======

There is a patch for CPython implementing this attached to `Issue 6210`_.


References
==========

Discussion and refinements in this `thread on python-dev`_.

.. _poor error messages:
   http://bugs.python.org/msg152294
.. _issue 6210:
   http://bugs.python.org/issue6210
.. _Thread on python-dev:
   http://mail.python.org/pipermail/python-dev/2012-January/115838.html


Copyright
=========

This document has been placed in the public domain.

From v+python at g.nevcal.com Thu Feb 2 23:40:05 2012
From: v+python at g.nevcal.com (Glenn Linderman)
Date: Thu, 02 Feb 2012 14:40:05 -0800
Subject: [Python-Dev] PEP 409 update [was: PEP 409 - final?]
In-Reply-To: <4F2B09D7.3020704@stoneleaf.us>
References: <4F28B81B.20801@stoneleaf.us> <4F2B09D7.3020704@stoneleaf.us>
Message-ID: <4F2B10C5.20000@g.nevcal.com>

On 2/2/2012 2:10 PM, Ethan Furman wrote:
>
> * Use /Ellipsis/ as the default value (the /.../ singleton).
>
>   Accepted.  There are no other possible values; it cannot be raised as
>   it is not an acception; it has the connotation of 'fill in the
>   rest...' as in /__cause__/ is not set, look in /__context__/ for it.

"exception" rather than "acception" (whatever that means)

From v+python at g.nevcal.com Thu Feb 2 23:37:28 2012
From: v+python at g.nevcal.com (Glenn Linderman)
Date: Thu, 02 Feb 2012 14:37:28 -0800
Subject: [Python-Dev] PEP: New timestamp formats
In-Reply-To: <20120202152831.5b85cad6@pitrou.net>
References: <20120202142125.0abb3950@pitrou.net> <20120202152831.5b85cad6@pitrou.net>
Message-ID: <4F2B1028.3000201@g.nevcal.com>

On 2/2/2012 6:28 AM, Antoine Pitrou wrote:
> On Thu, 2 Feb 2012 15:09:41 +0100
> Victor Stinner wrote:
>
>>> Why int? That doesn't seem to bring anything.
>> It helps to deprecate/replace os.stat_float_times(), which may be used
>> for backward compatibility (with Python 2.2? :-)).
>
> I must admit I don't understand the stat_float_times documentation:
>
> "For compatibility with older Python versions, accessing stat_result as
> a tuple always returns integers.
>
> Python now returns float values by default. Applications which do not
> work correctly with floating point time stamps can use this function to
> restore the old behaviour."
>
> These two paragraphs seem to contradict each other.
>
> That said, I don't understand why we couldn't simply deprecate
> stat_float_times() right now. Having an option for integer timestamps
> is pointless: you can just call int() on the result if you want.
>
> Regards
>
> Antoine.

Sorry to bring this up, but the PEP should probably consider another option: introducing a precedent-following os.stat_decimal_times(). Like os.stat_float_times, it would decide the return types of timestamps from os.stat. Or something along that line. Having it affect the results of time.time would be weird, though.
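For reference, the existing switch is process-global state, along these lines (illustrative):

    import os

    os.stat_float_times(False)          # subsequent stat() timestamps are ints
    print(os.stat("/tmp").st_mtime)     # e.g. 1328219742
    os.stat_float_times(True)           # subsequent stat() timestamps are floats
    print(os.stat("/tmp").st_mtime)     # e.g. 1328219742.385039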
And the whole design of os.stat_float_times smells of something being designed wrong in the first place, to need such an API to retain backward compatibility. But I'm not sure it is, even yet, designed for such flexibility.

From guido at python.org Fri Feb 3 00:30:23 2012
From: guido at python.org (Guido van Rossum)
Date: Thu, 2 Feb 2012 15:30:23 -0800
Subject: [Python-Dev] PEP 409 update [was: PEP 409 - final?]
In-Reply-To: <4F2B10C5.20000@g.nevcal.com>
References: <4F28B81B.20801@stoneleaf.us> <4F2B09D7.3020704@stoneleaf.us> <4F2B10C5.20000@g.nevcal.com>
Message-ID: 

Great, PEP 409 is accepted with Ellipsis instead of False!

On Thu, Feb 2, 2012 at 2:40 PM, Glenn Linderman wrote:
> On 2/2/2012 2:10 PM, Ethan Furman wrote:
>
>> * Use /Ellipsis/ as the default value (the /.../ singleton).
>>
>>   Accepted.  There are no other possible values; it cannot be raised as
>>   it is not an acception; it has the connotation of 'fill in the
>>   rest...' as in /__cause__/ is not set, look in /__context__/ for it.
>
> "exception" rather than "acception" (whatever that means)

--
--Guido van Rossum (python.org/~guido)

From yselivanov.ml at gmail.com Fri Feb 3 00:32:09 2012
From: yselivanov.ml at gmail.com (Yury Selivanov)
Date: Thu, 2 Feb 2012 18:32:09 -0500
Subject: [Python-Dev] PEP 409 update [was: PEP 409 - final?]
In-Reply-To: <4F2B09D7.3020704@stoneleaf.us>
References: <4F28B81B.20801@stoneleaf.us> <4F2B09D7.3020704@stoneleaf.us>
Message-ID: 

In my opinion using Ellipsis is just wrong. It is completely non-obvious not only to a beginner, but even to an experienced Python developer. Writing 'raise Something() from None' looks less suspicious, but still strange.

Isn't 'raise Exception().no_context()' or 'raise Exception().no_cause()' more obvious and easier to implement? More readable, less complex, and less ambiguous.

On 2012-02-02, at 5:10 PM, Ethan Furman wrote:
> PEP: 409
> Title: Suppressing exception context
> [snip - the rest of the PEP 409 draft was quoted verbatim; see Ethan's post above]
From ethan at stoneleaf.us Fri Feb 3 00:16:42 2012
From: ethan at stoneleaf.us (Ethan Furman)
Date: Thu, 02 Feb 2012 15:16:42 -0800
Subject: [Python-Dev] open issues on accepted PEPs
Message-ID: <4F2B195A.7060401@stoneleaf.us>

I was looking at the other Open Issues on PEP 3134, thinking I might try to resolve them as well, and discovered via testing that they have already been taken care of.

Is there an established way to get information like that?

I realize that PEPs are partly historical documents, but would it make sense to add a note after an Open Issue (or any other section) that was refined, resolved, or whatever in a later PEP or bug or patch or ...*

~Ethan~

*Yes, I am volunteering to tackle that project.

From ncoghlan at gmail.com Fri Feb 3 00:38:21 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Fri, 3 Feb 2012 09:38:21 +1000
Subject: [Python-Dev] PEP: New timestamp formats
In-Reply-To: <4F2B1028.3000201@g.nevcal.com>
References: <20120202142125.0abb3950@pitrou.net> <20120202152831.5b85cad6@pitrou.net> <4F2B1028.3000201@g.nevcal.com>
Message-ID: 

On Fri, Feb 3, 2012 at 8:37 AM, Glenn Linderman wrote:
> Sorry to bring this up, but the PEP should probably consider another option:
> introducing a precedent-following os.stat_decimal_times(). Like
> os.stat_float_times, it would decide the return types of timestamps from
> os.stat. Or something along that line. Having it affect the results of
> time.time would be weird, though.

We could get away with a global switch for the int->float transition because ints and floats interoperate pretty well. The same is not true for binary floats and decimal.Decimal.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From ncoghlan at gmail.com Fri Feb 3 00:41:40 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Fri, 3 Feb 2012 09:41:40 +1000
Subject: [Python-Dev] PEP 409 update [was: PEP 409 - final?]
In-Reply-To: 
References: <4F28B81B.20801@stoneleaf.us> <4F2B09D7.3020704@stoneleaf.us>
Message-ID: 

On Fri, Feb 3, 2012 at 9:32 AM, Yury Selivanov wrote:
> In my opinion using Ellipsis is just wrong. It is completely
> non-obvious not only to a beginner, but even to an experienced
> Python developer. Writing 'raise Something() from None'
> looks less suspicious, but still strange.

Beginners will never even see it (unless they're printing out __cause__ explicitly for some unknown reason). Experienced devs can go read the language reference or PEP 409 for the rationale (that's one of the reasons we have a PEP process).

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia
From ncoghlan at gmail.com Fri Feb 3 00:47:11 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Fri, 3 Feb 2012 09:47:11 +1000
Subject: [Python-Dev] open issues on accepted PEPs
In-Reply-To: <4F2B195A.7060401@stoneleaf.us>
References: <4F2B195A.7060401@stoneleaf.us>
Message-ID: 

On Fri, Feb 3, 2012 at 9:16 AM, Ethan Furman wrote:
> I was looking at the other Open Issues on PEP 3134, thinking I might try to
> resolve them as well, and discovered via testing that they have already been
> taken care of.
>
> Is there an established way to get information like that?
>
> I realize that PEPs are partly historical documents, but would it make
> sense to add a note after an Open Issue (or any other section) that was
> refined, resolved, or whatever in a later PEP or bug or patch or ...*

If that kind of thing comes up, updating the PEP directly is definitely a reasonable way to clarify things. If people want to see the *exact* state of the PEP when it was accepted, they're all under version control.

What we actually do depends on the specifics of the PEP, though (and whether or not anyone feels motivated to clarify things!).

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From ethan at stoneleaf.us Thu Feb 2 23:52:11 2012
From: ethan at stoneleaf.us (Ethan Furman)
Date: Thu, 02 Feb 2012 14:52:11 -0800
Subject: [Python-Dev] PEP 409 update [was: PEP 409 - final?]
In-Reply-To: <4F2B10C5.20000@g.nevcal.com>
References: <4F28B81B.20801@stoneleaf.us> <4F2B09D7.3020704@stoneleaf.us> <4F2B10C5.20000@g.nevcal.com>
Message-ID: <4F2B139B.7090306@stoneleaf.us>

Glenn Linderman wrote:
> On 2/2/2012 2:10 PM, Ethan Furman wrote:
>>
>> * Use /Ellipsis/ as the default value (the /.../ singleton).
>>
>>   Accepted.  There are no other possible values; it cannot be raised as
>>   it is not an acception; it has the connotation of 'fill in the
>>   rest...' as in /__cause__/ is not set, look in /__context__/ for it.
>
> "exception" rather than "acception" (whatever that means)

Argh. Kinda sounds like a royal ball...

Thanks.

~Ethan~

From guido at python.org Fri Feb 3 01:04:59 2012
From: guido at python.org (Guido van Rossum)
Date: Thu, 2 Feb 2012 16:04:59 -0800
Subject: [Python-Dev] PEP 409 update [was: PEP 409 - final?]
In-Reply-To: 
References: <4F28B81B.20801@stoneleaf.us> <4F2B09D7.3020704@stoneleaf.us>
Message-ID: 

On Thu, Feb 2, 2012 at 3:41 PM, Nick Coghlan wrote:
> On Fri, Feb 3, 2012 at 9:32 AM, Yury Selivanov wrote:
>> In my opinion using Ellipsis is just wrong. It is completely
>> non-obvious not only to a beginner, but even to an experienced
>> Python developer. Writing 'raise Something() from None'
>> looks less suspicious, but still strange.
>
> Beginners will never even see it (unless they're printing out
> __cause__ explicitly for some unknown reason). Experienced devs can go
> read the language reference or PEP 409 for the rationale (that's one of
> the reasons we have a PEP process).

I somehow have a feeling that Yury misread the PEP (or maybe my +1) as saying that the syntax for suppressing the context would be "raise from Ellipsis". That's not the case, it's "from None".

--
--Guido van Rossum (python.org/~guido)

From victor.stinner at haypocalc.com Fri Feb 3 01:21:25 2012
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Fri, 3 Feb 2012 01:21:25 +0100
Subject: [Python-Dev] PEP: New timestamp formats
In-Reply-To: 
References: 
Message-ID: 

I updated and completed my PEP and published the last draft.
It will be available at: http://www.python.org/dev/peps/pep-0410/ ( or read the source: http://hg.python.org/peps/file/tip/pep-0410.txt ) I tried to list all alternatives. Victor From ncoghlan at gmail.com Fri Feb 3 01:34:24 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 3 Feb 2012 10:34:24 +1000 Subject: [Python-Dev] PEP 409 update [was: PEP 409 - final?] In-Reply-To: References: <4F28B81B.20801@stoneleaf.us> <4F2B09D7.3020704@stoneleaf.us> Message-ID: On Fri, Feb 3, 2012 at 10:04 AM, Guido van Rossum wrote: > On Thu, Feb 2, 2012 at 3:41 PM, Nick Coghlan wrote: >> On Fri, Feb 3, 2012 at 9:32 AM, Yury Selivanov wrote: >>> In my opinion using Ellipsis is just wrong. ?It is completely >>> non-obvious not only to a beginner, but even to an experienced >>> python developer. ?Writing 'raise Something() from None' >>> looks less suspicious, but still strange. >> >> Beginners will never even see it (unless they're printing out >> __cause__ explicitly for some unknown reason). Experienced devs can go >> read language reference or PEP 409 for the rationale (that's one of >> the reasons we have a PEP process). > > I somehow have a feeling that Yury misread the PEP (or maybe my +1) as > saying that the syntax for suppressing the context would be "raise > from Ellipsis". That's not the case, it's "from None". Oh right, that objection makes more sense. FWIW, I expect the implementation will *allow* "raise exc from Ellipsis" as an odd synonym for "raise exc". I'd want to allow "exc.__cause__ = Ellipsis" to reset an exception with a previously set __cause__ back to the default state, at which point the synonym follows from the semantics of "raise X from Y" as syntactic sugar for "_exc = X; _exc.__cause__ = Y; raise _exc" Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From ethan at stoneleaf.us Fri Feb 3 01:41:04 2012 From: ethan at stoneleaf.us (Ethan Furman) Date: Thu, 02 Feb 2012 16:41:04 -0800 Subject: [Python-Dev] open issues on accepted PEPs In-Reply-To: References: <4F2B195A.7060401@stoneleaf.us> Message-ID: <4F2B2D20.7080808@stoneleaf.us> Nick Coghlan wrote: > On Fri, Feb 3, 2012 at 9:16 AM, Ethan Furman wrote: >> I was looking at the other Open Issues on PEP 3134, think I might try to >> resolve them as well, and discovered via testing that they have already been >> taken care of. >> >> Is there an established way to get information like that? >> >> I realize that PEPs are partly historical documents, but it would it make >> sense to add a note after an Open Issue (or any other section) that was >> refined, resolved, or whatever in a later PEP or bug or patch or ...* > > If that kind of thing comes up, updating the PEP directly is > definitely a reasonable way to clarify things. If people want to see > the *exact* state of the PEP when it was accepted, they're all under > version control. What we actually do depends on the specifics of the > PEP, though (and whether or not anyone feels motivated to clarify > things!). Okay. I would like to put links to the updates to the Open Issues in PEP3134 -- is there an easier way to find those besides typing in 'exception' in the bug tracker? I would like to complete that task in /this/ lifetime. ;) ~Ethan~ From ethan at stoneleaf.us Fri Feb 3 01:32:52 2012 From: ethan at stoneleaf.us (Ethan Furman) Date: Thu, 02 Feb 2012 16:32:52 -0800 Subject: [Python-Dev] PEP 409 update [was: PEP 409 - final?] 
In-Reply-To: References: <4F28B81B.20801@stoneleaf.us> <4F2B09D7.3020704@stoneleaf.us> <4F2B10C5.20000@g.nevcal.com> Message-ID: <4F2B2B34.8050204@stoneleaf.us> Guido van Rossum wrote: > Great, PEP 409 is accepted with Ellipsis instead of False! Awesome. :) ~Ethan~ From ncoghlan at gmail.com Fri Feb 3 01:59:14 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 3 Feb 2012 10:59:14 +1000 Subject: [Python-Dev] PEP: New timestamp formats In-Reply-To: References: Message-ID: On Fri, Feb 3, 2012 at 10:21 AM, Victor Stinner wrote: > I updated and completed my PEP and published the last draft. It will > be available at: > http://www.python.org/dev/peps/pep-0410/ > ( or read the source: http://hg.python.org/peps/file/tip/pep-0410.txt ) > > I tried to list all alternatives. Looks pretty good to me, just a few comments in regards to the descriptions of the alternatives (I'm not advocating for any the rejected options any more, since I'm happy with the current proposal, but the PEP should be clear on our reasons for rejecting them): decimal.Decimal - using this by default *was* considered, but rejected due to the bootstrapping problem (decimals are not builtin) and the compatibility problem (decimals do not play nicely with binary floats) - the chosen API also relates to the bootstrapping problem - since decimal.Decimal is passed in to the API directly, the builtin modules don't need to perform their own implicit import to get access to the type datetime.datetime - as noted earlier in the thread, total_seconds() actually gives you a decent timestamp value and always returning UTC avoids timezone issues - real problem with the idea is that not all timestamps can be easily made absolute (e.g. some APIs may return "time since system started" or "time since process started") - the complexity argument used against timedelta also applies tuple of integers - option B doesn't force loss of precision, it's just awkward because you have to do a complicated calculation to express the full precision fraction in base 10 - option C only requires that the denominator be expressible as a power of *some* base. That's the case for all interfaces we're aware of (either a power of 2 or a power of 10). protocol - should explicitly note that the "tuple of integers" format discussion is relevant to any such protocol design - explicitly note that this was rejected as being excessive given the requirements, but that the specific syntax proposed allows this to be introduced later if compelling use cases are discovered boolean argument - should note explicitly that this was rejected because we don't generally like having argument *values* change return *types* (in cases like 3.3's IOError, where values can determine the specific *subclass* created, there's still a common parent type). -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From v+python at g.nevcal.com Fri Feb 3 02:23:00 2012 From: v+python at g.nevcal.com (Glenn Linderman) Date: Thu, 02 Feb 2012 17:23:00 -0800 Subject: [Python-Dev] PEP: New timestamp formats In-Reply-To: References: <20120202142125.0abb3950@pitrou.net> <20120202152831.5b85cad6@pitrou.net> <4F2B1028.3000201@g.nevcal.com> Message-ID: <4F2B36F4.90308@g.nevcal.com> On 2/2/2012 3:38 PM, Nick Coghlan wrote: > On Fri, Feb 3, 2012 at 8:37 AM, Glenn Linderman wrote: >> > Sorry to bring this up, but the PEP should probably consider another option: >> > Introducing a precedent following os.stat_decimal_times(). 
Like
>> > os.stat_float_times, it would decide the return types of timestamps from
>> > os.stat. Or something along that line. Having it affect the results of
>> > time.time would be weird, though. And the whole design of
>> > os.stat_float_times smells of something being designed wrong in the first
>> > place, to need such an API to retain backward compatibility. But I'm not
>> > sure it is, even yet, designed for such flexibility.
> We could get away with a global switch for the int->float transition
> because ints and floats interoperate pretty well. The same is not true
> for binary floats and decimal.Decimal.

I agree about the interoperability of the various types, but I don't see
why that means a global switch couldn't work, although I'm not fond of
global switches. Library code that calls os.stat would have to be ready
to handle either return value, but could predetermine it by checking the
switch state. Icky. But possible.

In any case, mentioning it in the PEP, along with why it is a bad idea,
is probably a good idea.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From solipsis at pitrou.net  Fri Feb  3 02:25:53 2012
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Fri, 3 Feb 2012 02:25:53 +0100
Subject: [Python-Dev] PEP: New timestamp formats
References: <20120202142125.0abb3950@pitrou.net>
	<20120202152831.5b85cad6@pitrou.net>
Message-ID: <20120203022553.546bce16@pitrou.net>

On Thu, 2 Feb 2012 16:25:25 +0100
Victor Stinner wrote:
> > That said, I don't understand why we couldn't simply deprecate
> > stat_float_times() right now. Having an option for integer timestamps
> > is pointless, you can just call int() on the result if you want.
>
> So which API do you propose for time.time() to get a Decimal object?
>
> time.time(timestamp=decimal.Decimal)
> time.time(decimal=True) or time.time(hires=True)

time.time(type=decimal.Decimal) sounds fine. If you want a boolean
argument, either `decimal=True` or `exact=True`.

Regards

Antoine.

From guido at python.org  Fri Feb  3 03:39:30 2012
From: guido at python.org (Guido van Rossum)
Date: Thu, 2 Feb 2012 18:39:30 -0800
Subject: [Python-Dev] PEP 409 update [was: PEP 409 - final?]
In-Reply-To: 
References: <4F28B81B.20801@stoneleaf.us>
	<4F2B09D7.3020704@stoneleaf.us>
Message-ID: 

On Thu, Feb 2, 2012 at 4:34 PM, Nick Coghlan wrote:
> On Fri, Feb 3, 2012 at 10:04 AM, Guido van Rossum wrote:
>> On Thu, Feb 2, 2012 at 3:41 PM, Nick Coghlan wrote:
>>> On Fri, Feb 3, 2012 at 9:32 AM, Yury Selivanov wrote:
>>>> In my opinion using Ellipsis is just wrong. It is completely
>>>> non-obvious not only to a beginner, but even to an experienced
>>>> python developer. Writing 'raise Something() from None'
>>>> looks less suspicious, but still strange.
>>>
>>> Beginners will never even see it (unless they're printing out
>>> __cause__ explicitly for some unknown reason). Experienced devs can go
>>> read the language reference or PEP 409 for the rationale (that's one of
>>> the reasons we have a PEP process).
>>
>> I somehow have a feeling that Yury misread the PEP (or maybe my +1) as
>> saying that the syntax for suppressing the context would be "raise
>> from Ellipsis". That's not the case, it's "from None".
>
> Oh right, that objection makes more sense.
>
> FWIW, I expect the implementation will *allow* "raise exc from
> Ellipsis" as an odd synonym for "raise exc".
I'd want to allow > "exc.__cause__ = Ellipsis" to reset an exception with a previously set > __cause__ back to the default state, at which point the synonym > follows from the semantics of "raise X from Y" as syntactic sugar for > "_exc = X; _exc.__cause__ = Y; raise _exc" Sure. But those are all rather obscure cases. Ellipsis reads no less or more obscure than False when written explicitly. But that doesn't bother me. -- --Guido van Rossum (python.org/~guido) From ncoghlan at gmail.com Fri Feb 3 03:49:55 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 3 Feb 2012 12:49:55 +1000 Subject: [Python-Dev] PEP 409 update [was: PEP 409 - final?] In-Reply-To: <4F2B49AC.1000101@stoneleaf.us> References: <4F28B81B.20801@stoneleaf.us> <4F2B09D7.3020704@stoneleaf.us> <4F2B49AC.1000101@stoneleaf.us> Message-ID: On Fri, Feb 3, 2012 at 12:42 PM, Ethan Furman wrote: > Nick Coghlan wrote: >> >> FWIW, I expect the implementation will *allow* "raise exc from >> Ellipsis" as an odd synonym for "raise exc". > > > Are we sure we want that? ?Raising from something not an exception seems > counter-intuitive (None being the obvious exception). It isn't so much a matter of wanting it as "Is it problematic enough to put any effort into preventing it?" (since allowing it is a natural outcome of the obvious implementation). Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From guido at python.org Fri Feb 3 03:54:04 2012 From: guido at python.org (Guido van Rossum) Date: Thu, 2 Feb 2012 18:54:04 -0800 Subject: [Python-Dev] PEP 409 update [was: PEP 409 - final?] In-Reply-To: References: <4F28B81B.20801@stoneleaf.us> <4F2B09D7.3020704@stoneleaf.us> <4F2B49AC.1000101@stoneleaf.us> Message-ID: On Thu, Feb 2, 2012 at 6:49 PM, Nick Coghlan wrote: > On Fri, Feb 3, 2012 at 12:42 PM, Ethan Furman wrote: >> Nick Coghlan wrote: >>> >>> FWIW, I expect the implementation will *allow* "raise exc from >>> Ellipsis" as an odd synonym for "raise exc". >> >> >> Are we sure we want that? ?Raising from something not an exception seems >> counter-intuitive (None being the obvious exception). > > It isn't so much a matter of wanting it as "Is it problematic enough > to put any effort into preventing it?" (since allowing it is a natural > outcome of the obvious implementation). I would say yes we want that. It would be strange if you couldn't reset a variable explicitly to its default value. I don't expect people to do this often. But somebody might want to do a deep copy of an exception (with some systematic change), or there might be other reasons. I'm sure a few Python zen items apply here. :-) -- --Guido van Rossum (python.org/~guido) From ethan at stoneleaf.us Fri Feb 3 03:58:47 2012 From: ethan at stoneleaf.us (Ethan Furman) Date: Thu, 02 Feb 2012 18:58:47 -0800 Subject: [Python-Dev] PEP 409 update [was: PEP 409 - final?] In-Reply-To: References: <4F28B81B.20801@stoneleaf.us> <4F2B09D7.3020704@stoneleaf.us> <4F2B49AC.1000101@stoneleaf.us> Message-ID: <4F2B4D67.10407@stoneleaf.us> Guido van Rossum wrote: > On Thu, Feb 2, 2012 at 6:49 PM, Nick Coghlan wrote: >> On Fri, Feb 3, 2012 at 12:42 PM, Ethan Furman wrote: >>> Nick Coghlan wrote: >>>> FWIW, I expect the implementation will *allow* "raise exc from >>>> Ellipsis" as an odd synonym for "raise exc". >>> >>> Are we sure we want that? Raising from something not an exception seems >>> counter-intuitive (None being the obvious exception). 
>> It isn't so much a matter of wanting it as "Is it problematic enough >> to put any effort into preventing it?" (since allowing it is a natural >> outcome of the obvious implementation). > > I would say yes we want that. It would be strange if you couldn't > reset a variable explicitly to its default value. I don't expect > people to do this often. But somebody might want to do a deep copy of > an exception (with some systematic change), or there might be other > reasons. I'm sure a few Python zen items apply here. :-) Okey-doke, I'll get it going. ~Ethan~ From timothy.c.delaney at gmail.com Fri Feb 3 04:34:26 2012 From: timothy.c.delaney at gmail.com (Tim Delaney) Date: Fri, 3 Feb 2012 14:34:26 +1100 Subject: [Python-Dev] PEP 409 update [was: PEP 409 - final?] In-Reply-To: References: <4F28B81B.20801@stoneleaf.us> <4F2B09D7.3020704@stoneleaf.us> <4F2B49AC.1000101@stoneleaf.us> Message-ID: On 3 February 2012 13:54, Guido van Rossum wrote: > On Thu, Feb 2, 2012 at 6:49 PM, Nick Coghlan wrote: > > On Fri, Feb 3, 2012 at 12:42 PM, Ethan Furman > wrote: > >> Nick Coghlan wrote: > >>> > >>> FWIW, I expect the implementation will *allow* "raise exc from > >>> Ellipsis" as an odd synonym for "raise exc". > >> > >> > >> Are we sure we want that? Raising from something not an exception seems > >> counter-intuitive (None being the obvious exception). > > > > It isn't so much a matter of wanting it as "Is it problematic enough > > to put any effort into preventing it?" (since allowing it is a natural > > outcome of the obvious implementation). > > I would say yes we want that. It would be strange if you couldn't > reset a variable explicitly to its default value. In that case, would the best syntax be: raise Exception() from Ellipsis or: raise Exception() from ... ? I kinda like the second - it feels more self-descriptive to me than "from Ellipsis" - but there's the counter-argument that it could look like noise, and I think would require a grammar change to allow it there. Tim Delaney -------------- next part -------------- An HTML attachment was scrubbed... URL: From ethan at stoneleaf.us Fri Feb 3 03:42:52 2012 From: ethan at stoneleaf.us (Ethan Furman) Date: Thu, 02 Feb 2012 18:42:52 -0800 Subject: [Python-Dev] PEP 409 update [was: PEP 409 - final?] In-Reply-To: References: <4F28B81B.20801@stoneleaf.us> <4F2B09D7.3020704@stoneleaf.us> Message-ID: <4F2B49AC.1000101@stoneleaf.us> Nick Coghlan wrote: > FWIW, I expect the implementation will *allow* "raise exc from > Ellipsis" as an odd synonym for "raise exc". Are we sure we want that? Raising from something not an exception seems counter-intuitive (None being the obvious exception). > I'd want to allow > "exc.__cause__ = Ellipsis" to reset an exception with a previously set > __cause__ back to the default state, Already done. :) > at which point the synonym > follows from the semantics of "raise X from Y" as syntactic sugar for > "_exc = X; _exc.__cause__ = Y; raise _exc" I can see where it would make some sense that way, but it still seems odd. ~Ethan~ From ncoghlan at gmail.com Fri Feb 3 05:02:36 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 3 Feb 2012 14:02:36 +1000 Subject: [Python-Dev] PEP 409 update [was: PEP 409 - final?] In-Reply-To: References: <4F28B81B.20801@stoneleaf.us> <4F2B09D7.3020704@stoneleaf.us> <4F2B49AC.1000101@stoneleaf.us> Message-ID: On Fri, Feb 3, 2012 at 1:34 PM, Tim Delaney wrote: > ? 
I kinda like the second - it feels more self-descriptive to me than "from
> Ellipsis" - but there's the counter-argument that it could look like noise,
> and I think would require a grammar change to allow it there.

Both will be allowed - in 3.x, '...' is just an ordinary expression
that means exactly the same thing as the builtin Ellipsis:

>>> Ellipsis
Ellipsis
>>> ...
Ellipsis

Sane code almost certainly won't include *either* form, though. If
you're reraising an exception, you should generally be leaving
__cause__ and __context__ alone, and if you're raising a *new*
exception, then __cause__ will already be Ellipsis by default - you
only need to use "raise X from Y" to set it to something *else*.

As I noted earlier, supporting Ellipsis in the "raise X from Y" syntax
shouldn't require a code change in Ethan's implementation, just a few
additional tests to ensure it works as expected.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From ethan at stoneleaf.us  Fri Feb  3 04:43:46 2012
From: ethan at stoneleaf.us (Ethan Furman)
Date: Thu, 02 Feb 2012 19:43:46 -0800
Subject: [Python-Dev] PEP 409 update [was: PEP 409 - final?]
In-Reply-To: 
References: <4F28B81B.20801@stoneleaf.us>
	<4F2B09D7.3020704@stoneleaf.us>
	<4F2B49AC.1000101@stoneleaf.us>
Message-ID: <4F2B57F2.7030202@stoneleaf.us>

Tim Delaney wrote:
> In that case, would the best syntax be:
>
> raise Exception() from Ellipsis
>
> or:
>
> raise Exception() from ...
>
> ? I kinda like the second - it feels more self-descriptive to me than
> "from Ellipsis" - but there's the counter-argument that it could look
> like noise, and I think would require a grammar change to allow it there.

raise Exception() from ...

is... well, I am now gleeful -- especially since I went to my fresh copy
of Python 3.3.0a0 and did this:

--> ...
Ellipsis

--> raise ValueError from ...
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError

Have I said lately how much I *love* Python?

~Ethan~

From timothy.c.delaney at gmail.com  Fri Feb  3 06:16:07 2012
From: timothy.c.delaney at gmail.com (Tim Delaney)
Date: Fri, 3 Feb 2012 16:16:07 +1100
Subject: [Python-Dev] PEP 409 update [was: PEP 409 - final?]
In-Reply-To: 
References: <4F28B81B.20801@stoneleaf.us>
	<4F2B09D7.3020704@stoneleaf.us>
	<4F2B49AC.1000101@stoneleaf.us>
Message-ID: 

On 3 February 2012 15:02, Nick Coghlan wrote:

> Both will be allowed - in 3.x, '...' is just an ordinary expression
> that means exactly the same thing as the builtin Ellipsis:
>
> >>> Ellipsis
> Ellipsis
> >>> ...
> Ellipsis
>

I'd totally forgotten that was the case in 3.x ... it's still not exactly
common to use Ellipsis/... directly except in extended slicing.

> Sane code almost certainly won't include *either* form, though. If
> you're reraising an exception, you should generally be leaving
> __cause__ and __context__ alone, and if you're raising a *new*
> exception, then __cause__ will already be Ellipsis by default - you
> only need to use "raise X from Y" to set it to something *else*.
>

Absolutely - I can't think of a reason to want to reraise an existing
exception while suppressing any existing __cause__ in favour of
__context__. But I'm sure someone can.

Tim Delaney
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From s.brunthaler at uci.edu  Fri Feb  3 07:42:12 2012
From: s.brunthaler at uci.edu (stefan brunthaler)
Date: Thu, 2 Feb 2012 22:42:12 -0800
Subject: [Python-Dev] Python 3 optimizations, continued, continued again...
In-Reply-To: 
References: <4F23C657.9050501@hotpy.org>
Message-ID: 

> I really don't think that is a problem. The core contributors can deal
> well with complexity in my experience. :-)
>
No no, I wasn't trying to insinuate anything like that at all. No, I just
figured that having the code generator able to generate 4 optimizations
where only one is supported is a bad idea for several reasons, such as
maintainability, etc.

Anyways, I've just completed the integration of the code generator and
put the corresponding patch on my page
(http://www.ics.uci.edu/~sbruntha/pydev.html) for downloading. The
license thing is still missing; I'll do that tomorrow or sometime next
week.

Regards,
--stefan

From walter at livinglogic.de  Fri Feb  3 08:40:30 2012
From: walter at livinglogic.de (Walter Dörwald)
Date: Fri, 3 Feb 2012 08:40:30 +0100
Subject: [Python-Dev] PEP: New timestamp formats
In-Reply-To: 
References: 
Message-ID: 

On 03.02.2012, at 01:59, Nick Coghlan wrote:

> On Fri, Feb 3, 2012 at 10:21 AM, Victor Stinner
> wrote:
>> I updated and completed my PEP and published the last draft. It will
>> be available at:
>> http://www.python.org/dev/peps/pep-0410/
>> ( or read the source: http://hg.python.org/peps/file/tip/pep-0410.txt )
>>
>> I tried to list all alternatives.
>
> [...]
>
> datetime.datetime
>
> - as noted earlier in the thread, total_seconds() actually gives you a
> decent timestamp value and always returning UTC avoids timezone issues
> - real problem with the idea is that not all timestamps can be easily
> made absolute (e.g. some APIs may return "time since system started"
> or "time since process started")
> - the complexity argument used against timedelta also applies

Wasn't datetime supposed to be the canonical date/time infrastructure
that everybody uses? Why do we need yet another way to express a point
in time? And even if we're going with Decimal, at least datetime.datetime
should be extended to support the higher resolution (in fact it's the
one where this can be done with no or minimal backward compatibility
problems).

> [other alternatives]

Servus,
Walter

From victor.stinner at haypocalc.com  Fri Feb  3 12:57:23 2012
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Fri, 3 Feb 2012 12:57:23 +0100
Subject: [Python-Dev] PEP: New timestamp formats
In-Reply-To: 
References: 
Message-ID: 

> datetime.datetime
>
> - as noted earlier in the thread, total_seconds() actually gives you a
> decent timestamp value and always returning UTC avoids timezone issues

os.stat() and time.time() use the local time. Using UTC would be
completely wrong. It is possible to get the current timezone for
time.time(), but how do you get it for os.stat() (for old timestamps)?
If we know exactly how to get the timezone, tzinfo can be filled. If
there is no reliable way to get the timezone, it is better not to fill
the tzinfo field.

I don't see datetime without tzinfo as an issue.

Being unable to convert a datetime to an epoch timestamp is also not
an issue: if you need an epoch timestamp, just use float or Decimal
types.

> - real problem with the idea is that not all timestamps can be easily
> made absolute (e.g. some APIs may return "time since system started"
> or "time since process started")

There is no such issue: each clock knows the start of the timestamp, or
whether the start is undefined (e.g. monotonic clocks). We can easily
add a field to the internal structure describing a timestamp, and raise
an exception if the start is undefined and the user asked for a datetime
object.
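As an illustration -- this helper is only a sketch of mine, it is not
part of the PEP -- converting a Decimal timestamp into a naive local
datetime without rounding through a binary float could look like:

from decimal import Decimal
import datetime

def decimal_to_datetime(ts):
    # Illustrative helper, not part of the PEP. Split the Decimal into
    # whole seconds and microseconds so the fractional part never passes
    # through a binary float; digits smaller than one microsecond are
    # truncated, since datetime only stores microseconds.
    seconds = int(ts)
    micros = int((ts - seconds) * 1000000)
    return (datetime.datetime.fromtimestamp(seconds)
            + datetime.timedelta(microseconds=micros))

(The exact datetime you get back depends on the local timezone, which is
exactly the tzinfo question above.)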
--

I don't see any real issue with adding datetime as another accepted
type, if Decimal is also accepted. Each type has limitations, and the
user can choose the best type for his/her use case. I dropped datetime
because I prefer incremental changes (and a simpler PEP is also more
easily accepted :-)). We can add datetime later, when most developers
agree that the datetime issues are no longer issues :-)

> tuple of integers
>
> - option B doesn't force loss of precision, it's just awkward because
> you have to do a complicated calculation to express the full precision
> fraction in base 10
> - option C only requires that the denominator be expressible as a
> power of *some* base. That's the case for all interfaces we're aware
> of (either a power of 2 or a power of 10).

If the denominator is coprime with 2 and 10, we cannot express it as a
power of 2 or 10 without loss of precision. Example: denominator=3. For
option C, we would have to use base=denominator and exponent=1 when the
denominator is a prime number, or simply to avoid having to compute a
log at runtime.

I don't see any real advantage of using base^exponent. When the
denominator is unknown at build time: you have to compute the base and
the exponent at runtime, whereas you will have to recompute
base^exponent later to do the division. Why not simply store the
denominator directly? Even if you know the denominator at build time
but it is not a constant number, how do you compute a log2 or log10
using the compiler?

The only advantage that I see is an optimization for float if the base
is 2, or for Decimal if the base is 10. But that looks like a minor
advantage compared to the drawbacks.

clock() uses CLOCKS_PER_SEC (known at build time, depending on the OS),
whereas QueryPerformanceCounter() uses the CPU frequency or another
frequency that is only known at runtime. On Windows 7,
QueryPerformanceFrequency() is 10^8 on my VM. I don't know if it is a
constant. If I remember correctly, it is the CPU frequency on older
Windows versions.

--

I updated the PEP for your other remarks.

Victor

From yselivanov.ml at gmail.com  Fri Feb  3 15:53:07 2012
From: yselivanov.ml at gmail.com (Yury Selivanov)
Date: Fri, 3 Feb 2012 09:53:07 -0500
Subject: [Python-Dev] PEP 409 update [was: PEP 409 - final?]
In-Reply-To: <4F2B57F2.7030202@stoneleaf.us>
References: <4F28B81B.20801@stoneleaf.us>
	<4F2B09D7.3020704@stoneleaf.us>
	<4F2B49AC.1000101@stoneleaf.us>
	<4F2B57F2.7030202@stoneleaf.us>
Message-ID: 

Re "raise ValueError from ..."

So what does it mean now? Just resetting __cause__ to make __context__
printed?

Can you show a down-to-earth snippet of code where such syntax would be
useful?

Speaking of the Zen of Python - I think this stuff contradicts it more
than it follows it.

On 2012-02-02, at 10:43 PM, Ethan Furman wrote:

> Tim Delaney wrote:
>> In that case, would the best syntax be:
>> raise Exception() from Ellipsis
>> or:
>> raise Exception() from ...
>> ? I kinda like the second - it feels more self-descriptive to me than
>> "from Ellipsis" - but there's the counter-argument that it could look
>> like noise, and I think would require a grammar change to allow it there.
>
> raise Exception() from ...
>
> is... well, I am now gleeful -- especially since I went to my fresh copy
> of Python 3.3.0a0 and did this:
>
> --> ...
> Ellipsis
>
> --> raise ValueError from ...
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> ValueError
>
> Have I said lately how much I *love* Python?
>
> ~Ethan~

From ethan at stoneleaf.us  Fri Feb  3 17:52:08 2012
From: ethan at stoneleaf.us (Ethan Furman)
Date: Fri, 03 Feb 2012 08:52:08 -0800
Subject: [Python-Dev] PEP 409 update [was: PEP 409 - final?]
In-Reply-To: 
References: <4F28B81B.20801@stoneleaf.us>
	<4F2B09D7.3020704@stoneleaf.us>
	<4F2B49AC.1000101@stoneleaf.us>
	<4F2B57F2.7030202@stoneleaf.us>
Message-ID: <4F2C10B8.1080906@stoneleaf.us>

Yury Selivanov wrote:
> Re "raise ValueError from ..."
>
> So what does it mean now? Just resetting __cause__ to make __context__ printed?

Whatever __cause__ was before (None, or an actual exception), it is now
Ellipsis -- so __context__ will be printed and the exception chain will
be followed.

> Can you show a down-to-earth snippet of code where such syntax would be useful?

Not sure I'll ever use it this way, but:

try:
    try:
        raise IndexError()
    except:
        raise CustomError() from None
except CustomError as e:
    # nevermind, let's see the whole thing after all
    raise e from Ellipsis

~Ethan~

From merwok at netwok.org  Fri Feb  3 17:52:29 2012
From: merwok at netwok.org (Éric Araujo)
Date: Fri, 03 Feb 2012 17:52:29 +0100
Subject: [Python-Dev] distutils 'depends' management
In-Reply-To: 
References: 
Message-ID: <985f54b13038751cb945fc56db91743d@netwok.org>

Hi Matteo,

> Now setup.py will rebuild all every time, this is because the policy
> of newer_group in build_extension is to consider 'newer' any missing
> file.

Here you certainly mean 'older'.

> [...] Can someone suggest me the reason of this choice

distutils' notion of dependencies directly comes from make. A missing
(not existing) target is perfectly normal: it's usually a generated
file that make needs to create (i.e. compile from source files). In
this world, you want to (re-)compile when the target is older than the
sources, or when the target is missing.

So here your extension module is a target that needs to be created, and
when distutils does not find a file with the name you give in depends,
it just thinks it's another thing that will be generated.

This model is inherently prone to typos; I'm not sure how we can
improve it to let people catch possible typos.

Cheers

From yselivanov.ml at gmail.com  Fri Feb  3 17:56:30 2012
From: yselivanov.ml at gmail.com (Yury Selivanov)
Date: Fri, 3 Feb 2012 11:56:30 -0500
Subject: [Python-Dev] PEP 409 update [was: PEP 409 - final?]
In-Reply-To: <4F2C10B8.1080906@stoneleaf.us>
References: <4F28B81B.20801@stoneleaf.us>
	<4F2B09D7.3020704@stoneleaf.us>
	<4F2B49AC.1000101@stoneleaf.us>
	<4F2B57F2.7030202@stoneleaf.us>
	<4F2C10B8.1080906@stoneleaf.us>
Message-ID: <38CDE0DC-0F51-43D0-A3F0-9EE821A0273F@gmail.com>

While the example is valid, I doubt that it is in any sense a "common"
case. OTOH the language will allow a strange mess of reserved words with
'...', which hurts readability and even gives you an instrument to write
tangled and obscure code.

Most Python code is readable as plain English; that's something a lot of
people are fond of. I can't read 'raise from ...' or 'raise from
Ellipsis', and I even had a mixed understanding of it after reading the
PEP. It's much more than the simple behaviour of "raise from None" (which
many of us eagerly want).

I'm -1 on adding 'raise from ...'.
On 2012-02-03, at 11:52 AM, Ethan Furman wrote: > Yury Selivanov wrote: >> Re "raise ValueError from ..." >> So what does it mean now? Just resetting __cause__ to make __context__ printed? > > Whatever __cause__ was before (None, or an actual exception), it is now Ellipsis -- so __context__ will be printed and the exception chain will be followed. > >> Can you show the down-to-earth snippet of code where such syntax would be useful? > > Not sure I'll ever use it this way, but: > > try: > try: > raise IndexError() > except: > raise CustomError() from None > except CustomError as e: > # nevermind, let's see the whole thing after all > raise e from Ellipsis > > ~Ethan~ From status at bugs.python.org Fri Feb 3 18:07:33 2012 From: status at bugs.python.org (Python tracker) Date: Fri, 3 Feb 2012 18:07:33 +0100 (CET) Subject: [Python-Dev] Summary of Python tracker Issues Message-ID: <20120203170733.845981CC12@psf.upfronthosting.co.za> ACTIVITY SUMMARY (2012-01-27 - 2012-02-03) Python tracker at http://bugs.python.org/ To view or respond to any of the issues listed below, click on the issue. Do NOT respond to this message. Issues counts and deltas: open 3248 (+14) closed 22466 (+29) total 25714 (+43) Open issues with patches: 1392 Issues opened (30) ================== #13892: distutils handling of windows manifest isn't optimal http://bugs.python.org/issue13892 opened by jackjansen #13893: Make CGIHTTPServer capable of redirects (and status other than http://bugs.python.org/issue13893 opened by Giovanni.Funchal #13896: shelf doesn't work with 'with' http://bugs.python.org/issue13896 opened by gruszczy #13897: Move fields relevant to coroutine/generators out of frame into http://bugs.python.org/issue13897 opened by Mark.Shannon #13898: Ignored exception in test_ssl http://bugs.python.org/issue13898 opened by nadeem.vawda #13899: re pattern r"[\A]" should work like "A" but matches nothing. D http://bugs.python.org/issue13899 opened by sjmachin #13902: Sporadic test_threading failure on FreeBSD 6.4 buildbot http://bugs.python.org/issue13902 opened by nadeem.vawda #13903: New shared-keys dictionary implementation http://bugs.python.org/issue13903 opened by Mark.Shannon #13904: Generator as *args: TypeError replaced http://bugs.python.org/issue13904 opened by july #13905: Built-in Types Comparisons should mention rich comparison meth http://bugs.python.org/issue13905 opened by catalin.iacob #13907: test_pprint relies on set/dictionary repr() ordering http://bugs.python.org/issue13907 opened by Mark.Shannon #13909: Ordering of free variables in dis is dependent on dict orderin http://bugs.python.org/issue13909 opened by Mark.Shannon #13910: test_packaging is dependent on dict ordering. 
http://bugs.python.org/issue13910 opened by Mark.Shannon #13911: test_trace depends on dict repr() ordering http://bugs.python.org/issue13911 opened by Mark.Shannon #13912: ImportError using __import__ and relative level 1 http://bugs.python.org/issue13912 opened by jason.coombs #13913: utf-8 or utf8 or utf-8 (codec display name inconsistency) http://bugs.python.org/issue13913 opened by kennyluck #13915: Update Tutorial 6.1.3 for PEP 3145 http://bugs.python.org/issue13915 opened by terry.reedy #13916: disallow the "surrogatepass" handler for non utf-* encodings http://bugs.python.org/issue13916 opened by kennyluck #13918: locale.atof documentation is missing func argument http://bugs.python.org/issue13918 opened by ced #13921: sqlite3: OptimizedUnicode doesn't work in Py3k http://bugs.python.org/issue13921 opened by petri.lehtinen #13922: argparse handling multiple "--" in args improperly http://bugs.python.org/issue13922 opened by Warren.Turkal #13923: new formatter for argparse http://bugs.python.org/issue13923 opened by Warren.Turkal #13924: Mercurial robots.txt should let robots crawl landing pages. http://bugs.python.org/issue13924 opened by Ivaylo.Popov #13926: pydoc - stall when requesting a list of available modules in t http://bugs.python.org/issue13926 opened by Jeroen #13927: Extra spaces in the output of time.ctime http://bugs.python.org/issue13927 opened by Roger.Caldwell #13928: bug in asyncore.dispatcher_with_send http://bugs.python.org/issue13928 opened by adamhj #13929: fnmatch to support escape characters http://bugs.python.org/issue13929 opened by fruch #13930: lib2to3 ability to output files into a different directory and http://bugs.python.org/issue13930 opened by gregory.p.smith #13932: If some test module fails to import another module unittest re http://bugs.python.org/issue13932 opened by sbarthelemy #13933: IDLE:not able to complete the hashlib module http://bugs.python.org/issue13933 opened by ramchandra.apte Most recent 15 issues with no replies (15) ========================================== #13933: IDLE:not able to complete the hashlib module http://bugs.python.org/issue13933 #13932: If some test module fails to import another module unittest re http://bugs.python.org/issue13932 #13930: lib2to3 ability to output files into a different directory and http://bugs.python.org/issue13930 #13929: fnmatch to support escape characters http://bugs.python.org/issue13929 #13923: new formatter for argparse http://bugs.python.org/issue13923 #13922: argparse handling multiple "--" in args improperly http://bugs.python.org/issue13922 #13916: disallow the "surrogatepass" handler for non utf-* encodings http://bugs.python.org/issue13916 #13915: Update Tutorial 6.1.3 for PEP 3145 http://bugs.python.org/issue13915 #13911: test_trace depends on dict repr() ordering http://bugs.python.org/issue13911 #13910: test_packaging is dependent on dict ordering. 
http://bugs.python.org/issue13910 #13909: Ordering of free variables in dis is dependent on dict orderin http://bugs.python.org/issue13909 #13907: test_pprint relies on set/dictionary repr() ordering http://bugs.python.org/issue13907 #13904: Generator as *args: TypeError replaced http://bugs.python.org/issue13904 #13902: Sporadic test_threading failure on FreeBSD 6.4 buildbot http://bugs.python.org/issue13902 #13893: Make CGIHTTPServer capable of redirects (and status other than http://bugs.python.org/issue13893 Most recent 15 issues waiting for review (15) ============================================= #13930: lib2to3 ability to output files into a different directory and http://bugs.python.org/issue13930 #13921: sqlite3: OptimizedUnicode doesn't work in Py3k http://bugs.python.org/issue13921 #13905: Built-in Types Comparisons should mention rich comparison meth http://bugs.python.org/issue13905 #13904: Generator as *args: TypeError replaced http://bugs.python.org/issue13904 #13903: New shared-keys dictionary implementation http://bugs.python.org/issue13903 #13897: Move fields relevant to coroutine/generators out of frame into http://bugs.python.org/issue13897 #13896: shelf doesn't work with 'with' http://bugs.python.org/issue13896 #13893: Make CGIHTTPServer capable of redirects (and status other than http://bugs.python.org/issue13893 #13889: str(float) and round(float) issues with FPU precision http://bugs.python.org/issue13889 #13886: readline-related test_builtin failure http://bugs.python.org/issue13886 #13884: IDLE 2.6.5 Recent Files undocks http://bugs.python.org/issue13884 #13882: PEP 410: Use decimal.Decimal type for timestamps http://bugs.python.org/issue13882 #13879: Argparse does not support subparser aliases in 2.7 http://bugs.python.org/issue13879 #13872: socket.detach doesn't mark socket._closed http://bugs.python.org/issue13872 #13857: Add textwrap.indent() as counterpart to textwrap.dedent() http://bugs.python.org/issue13857 Top 10 most discussed issues (10) ================================= #13703: Hash collision security issue http://bugs.python.org/issue13703 30 msgs #13882: PEP 410: Use decimal.Decimal type for timestamps http://bugs.python.org/issue13882 18 msgs #6210: Exception Chaining missing method for suppressing context http://bugs.python.org/issue6210 14 msgs #10181: Problems with Py_buffer management in memoryobject.c (and else http://bugs.python.org/issue10181 14 msgs #13609: Add "os.get_terminal_size()" function http://bugs.python.org/issue13609 11 msgs #13889: str(float) and round(float) issues with FPU precision http://bugs.python.org/issue13889 11 msgs #13734: Add a generic directory walker method to avoid symlink attacks http://bugs.python.org/issue13734 10 msgs #13856: xmlrpc / httplib changes to allow for certificate verification http://bugs.python.org/issue13856 10 msgs #11457: os.stat(): add new fields to get timestamps as Decimal objects http://bugs.python.org/issue11457 9 msgs #13912: ImportError using __import__ and relative level 1 http://bugs.python.org/issue13912 9 msgs Issues closed (27) ================== #5231: Change format of a memoryview http://bugs.python.org/issue5231 closed by skrah #8828: Atomic function to rename a file http://bugs.python.org/issue8828 closed by pitrou #13402: Document absoluteness of sys.executable http://bugs.python.org/issue13402 closed by python-dev #13506: IDLE sys.path does not contain Current Working Directory http://bugs.python.org/issue13506 closed by terry.reedy #13676: sqlite3: Zero byte truncates string 
contents http://bugs.python.org/issue13676 closed by python-dev #13777: socket: communicating with Mac OS X KEXT controls http://bugs.python.org/issue13777 closed by loewis #13806: Audioop decompression frames size check fix http://bugs.python.org/issue13806 closed by pitrou #13817: deadlock in subprocess while running several threads using Pop http://bugs.python.org/issue13817 closed by neologix #13836: Define key failed http://bugs.python.org/issue13836 closed by terry.reedy #13848: io.open() doesn't check for embedded NUL characters http://bugs.python.org/issue13848 closed by pitrou #13851: Packaging distutils2 for Fedora http://bugs.python.org/issue13851 closed by eric.araujo #13868: Add hyphen doc fix http://bugs.python.org/issue13868 closed by georg.brandl #13874: test_faulthandler: read_null test fails with current clang http://bugs.python.org/issue13874 closed by haypo #13890: test_importlib failures under Windows http://bugs.python.org/issue13890 closed by brett.cannon #13891: CPU DoS With Python's socket module http://bugs.python.org/issue13891 closed by neologix #13894: threading._CRLock should not be tested if _thread.RLock isn't http://bugs.python.org/issue13894 closed by neologix #13895: test_ssl hangs on Ubuntu http://bugs.python.org/issue13895 closed by pitrou #13900: documentation page on email.parser contains self-referential n http://bugs.python.org/issue13900 closed by georg.brandl #13901: test_get_outputs (test_distutils) failure with --enable-shared http://bugs.python.org/issue13901 closed by ned.deily #13906: mimetypes.py under windows - bad exception catch http://bugs.python.org/issue13906 closed by r.david.murray #13908: PyType_FromSpec() lacks PyType_Ready() call http://bugs.python.org/issue13908 closed by python-dev #13914: In module re the repeat interval {} doesn't accept numbers gre http://bugs.python.org/issue13914 closed by haypo #13917: Python 2.7.2 and 3.2.2 execl crash http://bugs.python.org/issue13917 closed by pitrou #13919: invalid http://bugs.python.org/issue13919 closed by nadeem.vawda #13920: intern() doc wrong spelling http://bugs.python.org/issue13920 closed by brian.curtin #13925: making assignments to an empty two dimensional list http://bugs.python.org/issue13925 closed by mark.dickinson #13931: os.path.exists inconsistent between 32 bit and 64 bit http://bugs.python.org/issue13931 closed by zxw From techtonik at gmail.com Fri Feb 3 18:27:50 2012 From: techtonik at gmail.com (anatoly techtonik) Date: Fri, 3 Feb 2012 20:27:50 +0300 Subject: [Python-Dev] Dev In a Box: Pumped for a better UX (with video) Message-ID: Hi, If you don't know, Dev In a Box is "everything you need to contribute to Python in under 700 MB". I've patched it up to the latest standards of colorless console user interfaces and uploaded a video of the process for you to enjoy. http://www.youtube.com/watch?v=jbJcI9MnO_c This tool can be greatly improved to provide entrypoint for other healthy activities. Like improving docs by editing, comparing, building and sending patches for review. Specialized menus can greatly help with automating common tasks, which are not limited by sources fetching. https://bitbucket.org/techtonik/devinabox -- anatoly t. -------------- next part -------------- An HTML attachment was scrubbed... URL: From barry at python.org Fri Feb 3 18:29:11 2012 From: barry at python.org (Barry Warsaw) Date: Fri, 3 Feb 2012 12:29:11 -0500 Subject: [Python-Dev] PEP 409 update [was: PEP 409 - final?] 
In-Reply-To: <4F2C10B8.1080906@stoneleaf.us> References: <4F28B81B.20801@stoneleaf.us> <4F2B09D7.3020704@stoneleaf.us> <4F2B49AC.1000101@stoneleaf.us> <4F2B57F2.7030202@stoneleaf.us> <4F2C10B8.1080906@stoneleaf.us> Message-ID: <20120203122911.4557b633@resist.wooz.org> On Feb 03, 2012, at 08:52 AM, Ethan Furman wrote: >Not sure I'll ever use it this way, but: > >try: > try: > raise IndexError() > except: > raise CustomError() from None >except CustomError as e: > # nevermind, let's see the whole thing after all > raise e from Ellipsis In that context, I have to say that the last line, even if it were written raise e from ... is certainly cute, but not very informative. Triple-dots will be confusing and difficult to read in documentation and code, and Ellipsis has no logical connection to the purpose of this PEP. So while I'm +1 on everything else in the PEP, I'm -1 on this particular decision. One of the alternatives states: Create a special exception class, __NoException__. Rejected as possibly confusing, possibly being mistakenly raised by users, and not being a truly unique value as None, True, and False are. I think this should be revisited. First, `__NoException__` doesn't need to be an exception class. Ellipsis isn't so this doesn't need to be either. I have no problem adding a new non-exception derived singleton to mark this. And while __NoException__ may be a little confusing, something like __NoCause__ reads great and can't be mistaken for a raiseable exception. So your example would then be: try: try: raise IndexError() except: raise CustomError() from None except CustomError as e: # nevermind, let's see the whole thing after all raise e from __NoCause__ Cheers, -Barry From yselivanov.ml at gmail.com Fri Feb 3 18:52:02 2012 From: yselivanov.ml at gmail.com (Yury Selivanov) Date: Fri, 3 Feb 2012 12:52:02 -0500 Subject: [Python-Dev] PEP 409 update [was: PEP 409 - final?] In-Reply-To: <4F2C1975.6070101@stoneleaf.us> References: <4F28B81B.20801@stoneleaf.us> <4F2B09D7.3020704@stoneleaf.us> <4F2B49AC.1000101@stoneleaf.us> <4F2B57F2.7030202@stoneleaf.us> <4F2C10B8.1080906@stoneleaf.us> <38CDE0DC-0F51-43D0-A3F0-9EE821A0273F@gmail.com> <4F2C1975.6070101@stoneleaf.us> Message-ID: <94A87DC0-99CD-44C0-836A-D5CA60FE673A@gmail.com> I got it, and I think it's fine to use explicit __cause__ reset, using Ellipsis, or even some __NoException__ special object if we decide to introduce one. I'm against allowing 'from ...' syntax. On 2012-02-03, at 12:29 PM, Ethan Furman wrote: > Yury Selivanov wrote: >> While the example is valid, I doubt that it is in any sense "common" case. > > No it is a corner case. Another way to spell it is: > > try: > try: > raise IndexError() > except: > raise CustomError() from None > except CustomError as e: > # nevermind, let's see the whole thing after all > e.__cause__ = Ellipsis > raise e > > Ethan From ethan at stoneleaf.us Fri Feb 3 18:29:25 2012 From: ethan at stoneleaf.us (Ethan Furman) Date: Fri, 03 Feb 2012 09:29:25 -0800 Subject: [Python-Dev] PEP 409 update [was: PEP 409 - final?] In-Reply-To: <38CDE0DC-0F51-43D0-A3F0-9EE821A0273F@gmail.com> References: <4F28B81B.20801@stoneleaf.us> <4F2B09D7.3020704@stoneleaf.us> <4F2B49AC.1000101@stoneleaf.us> <4F2B57F2.7030202@stoneleaf.us> <4F2C10B8.1080906@stoneleaf.us> <38CDE0DC-0F51-43D0-A3F0-9EE821A0273F@gmail.com> Message-ID: <4F2C1975.6070101@stoneleaf.us> Yury Selivanov wrote: > While the example is valid, I doubt that it is in any sense > "common" case. No it is a corner case. 
> Another way to spell it is:
>
> try:
>     try:
>         raise IndexError()
>     except:
>         raise CustomError() from None
> except CustomError as e:
>     # nevermind, let's see the whole thing after all
>     e.__cause__ = Ellipsis
>     raise e
>
> Ethan

From ethan at stoneleaf.us  Fri Feb  3 18:29:25 2012
From: ethan at stoneleaf.us (Ethan Furman)
Date: Fri, 03 Feb 2012 09:29:25 -0800
Subject: [Python-Dev] PEP 409 update [was: PEP 409 - final?]
In-Reply-To: <38CDE0DC-0F51-43D0-A3F0-9EE821A0273F@gmail.com>
References: <4F28B81B.20801@stoneleaf.us>
	<4F2B09D7.3020704@stoneleaf.us>
	<4F2B49AC.1000101@stoneleaf.us>
	<4F2B57F2.7030202@stoneleaf.us>
	<4F2C10B8.1080906@stoneleaf.us>
	<38CDE0DC-0F51-43D0-A3F0-9EE821A0273F@gmail.com>
Message-ID: <4F2C1975.6070101@stoneleaf.us>

Yury Selivanov wrote:
> While the example is valid, I doubt that it is in any sense
> a "common" case.

No, it is a corner case. Another way to spell it is:

try:
    try:
        raise IndexError()
    except:
        raise CustomError() from None
except CustomError as e:
    # nevermind, let's see the whole thing after all
    e.__cause__ = Ellipsis
    raise e

Ethan

From yselivanov.ml at gmail.com  Fri Feb  3 18:54:33 2012
From: yselivanov.ml at gmail.com (Yury Selivanov)
Date: Fri, 3 Feb 2012 12:54:33 -0500
Subject: [Python-Dev] PEP 409 update [was: PEP 409 - final?]
In-Reply-To: <20120203122911.4557b633@resist.wooz.org>
References: <4F28B81B.20801@stoneleaf.us>
	<4F2B09D7.3020704@stoneleaf.us>
	<4F2B49AC.1000101@stoneleaf.us>
	<4F2B57F2.7030202@stoneleaf.us>
	<4F2C10B8.1080906@stoneleaf.us>
	<20120203122911.4557b633@resist.wooz.org>
Message-ID: 

Funny thing -- it seems you're misreading it the same way I did at
first. His example is more like:

try:
    try:
        raise IndexError()
    except:
        raise CustomError() from __NoContext__
except CustomError as e:
    # nevermind, let's see the whole thing after all
    raise e from __OupsLooksLikeINeedContextAfterAll__

On 2012-02-03, at 12:29 PM, Barry Warsaw wrote:

> On Feb 03, 2012, at 08:52 AM, Ethan Furman wrote:
>
>> Not sure I'll ever use it this way, but:
>>
>> try:
>>     try:
>>         raise IndexError()
>>     except:
>>         raise CustomError() from None
>> except CustomError as e:
>>     # nevermind, let's see the whole thing after all
>>     raise e from Ellipsis
>
> In that context, I have to say that the last line, even if it were written
>
>     raise e from ...
>
> is certainly cute, but not very informative. Triple-dots will be confusing
> and difficult to read in documentation and code, and Ellipsis has no logical
> connection to the purpose of this PEP. So while I'm +1 on everything else in
> the PEP, I'm -1 on this particular decision.
>
> One of the alternatives states:
>
>     Create a special exception class, __NoException__.
>
>     Rejected as possibly confusing, possibly being mistakenly raised by users,
>     and not being a truly unique value as None, True, and False are.
>
> I think this should be revisited. First, `__NoException__` doesn't need to be
> an exception class. Ellipsis isn't, so this doesn't need to be either. I have
> no problem adding a new non-exception derived singleton to mark this. And
> while __NoException__ may be a little confusing, something like __NoCause__
> reads great and can't be mistaken for a raiseable exception.
>
> So your example would then be:
>
> try:
>     try:
>         raise IndexError()
>     except:
>         raise CustomError() from None
> except CustomError as e:
>     # nevermind, let's see the whole thing after all
>     raise e from __NoCause__
>
> Cheers,
> -Barry

From yselivanov.ml at gmail.com  Fri Feb  3 19:00:45 2012
From: yselivanov.ml at gmail.com (Yury Selivanov)
Date: Fri, 3 Feb 2012 13:00:45 -0500
Subject: [Python-Dev] PEP 409 update [was: PEP 409 - final?]
In-Reply-To: <4F2C2112.7000709@stoneleaf.us>
References: <4F28B81B.20801@stoneleaf.us>
	<4F2B09D7.3020704@stoneleaf.us>
	<4F2B49AC.1000101@stoneleaf.us>
	<4F2B57F2.7030202@stoneleaf.us>
	<4F2C10B8.1080906@stoneleaf.us>
	<38CDE0DC-0F51-43D0-A3F0-9EE821A0273F@gmail.com>
	<4F2C1975.6070101@stoneleaf.us>
	<94A87DC0-99CD-44C0-836A-D5CA60FE673A@gmail.com>
	<4F2C2112.7000709@stoneleaf.us>
Message-ID: 

;) I completely understand what the Ellipsis object and its short
syntax (...) are.
And for me 'raise from Ellipsis' or 'raise from ...', or even using the
Ellipsis object internally instead of a special __NoContext__ object, is
godawful. Why do we want to use some completely unrelated singleton in
exception contexts? Is the policy of "think before adding a new builtin
object" really worth it in this concrete case?

On 2012-02-03, at 1:01 PM, Ethan Furman wrote:

> Yury Selivanov wrote:
>> I got it, and I think it's fine to use explicit __cause__ reset,
>> using Ellipsis, or even some __NoException__ special object if we
>> decide to introduce one.
>> I'm against allowing 'from ...' syntax.
>
> Well, ... /is/ Ellipsis -- no way to tell them apart by the time this
> part of the code sees it.
>
> ~Ethan~

From guido at python.org  Fri Feb  3 19:20:31 2012
From: guido at python.org (Guido van Rossum)
Date: Fri, 3 Feb 2012 10:20:31 -0800
Subject: [Python-Dev] PEP 409 update [was: PEP 409 - final?]
In-Reply-To: <20120203122911.4557b633@resist.wooz.org>
References: <4F28B81B.20801@stoneleaf.us>
	<4F2B09D7.3020704@stoneleaf.us>
	<4F2B49AC.1000101@stoneleaf.us>
	<4F2B57F2.7030202@stoneleaf.us>
	<4F2C10B8.1080906@stoneleaf.us>
	<20120203122911.4557b633@resist.wooz.org>
Message-ID: 

On Fri, Feb 3, 2012 at 9:29 AM, Barry Warsaw wrote:
> On Feb 03, 2012, at 08:52 AM, Ethan Furman wrote:
>
>> Not sure I'll ever use it this way, but:
>>
>> try:
>>     try:
>>         raise IndexError()
>>     except:
>>         raise CustomError() from None
>> except CustomError as e:
>>     # nevermind, let's see the whole thing after all
>>     raise e from Ellipsis
>
> In that context, I have to say that the last line, even if it were written
>
>     raise e from ...
>
> is certainly cute, but not very informative.

Please. Let's stop this. There is no known use case to ever write
that. We're just not putting specific measures in place to prevent it.
Writing

>>> a = ...

is likewise cute but not very informative. But it is valid syntax.

-- 
--Guido van Rossum (python.org/~guido)

From ethan at stoneleaf.us  Fri Feb  3 18:48:34 2012
From: ethan at stoneleaf.us (Ethan Furman)
Date: Fri, 03 Feb 2012 09:48:34 -0800
Subject: [Python-Dev] PEP 409 update [was: PEP 409 - final?]
In-Reply-To: <20120203122911.4557b633@resist.wooz.org>
References: <4F28B81B.20801@stoneleaf.us>
	<4F2B09D7.3020704@stoneleaf.us>
	<4F2B49AC.1000101@stoneleaf.us>
	<4F2B57F2.7030202@stoneleaf.us>
	<4F2C10B8.1080906@stoneleaf.us>
	<20120203122911.4557b633@resist.wooz.org>
Message-ID: <4F2C1DF2.9080909@stoneleaf.us>

Barry Warsaw wrote:
> raise e from ...
>
> is certainly cute, but not very informative. Triple-dots will be confusing
> and difficult to read in documentation and code, and Ellipsis has no logical
> connection to the purpose of this PEP. So while I'm +1 on everything else in
> the PEP, I'm -1 on this particular decision.
>
> One of the alternatives states:
>
>     Create a special exception class, __NoException__.
>
>     Rejected as possibly confusing, possibly being mistakenly raised by users,
>     and not being a truly unique value as None, True, and False are.
>
> I think this should be revisited. First, `__NoException__` doesn't need to be
> an exception class. Ellipsis isn't so this doesn't need to be either. I have
> no problem adding a new non-exception derived singleton to mark this. And
> while __NoException__ may be a little confusing, something like __NoCause__
> reads great and can't be mistaken for a raiseable exception.

The problem I have with names like __NoException__, __NoCause__,
__NoWhatever__ is that it sounds a lot like None -- in other words, like
there won't be any chaining.
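To be concrete about the chaining I mean -- a sketch of the accepted
PEP 409 semantics, with made-up exceptions:

# default: __cause__ stays Ellipsis, so the ZeroDivisionError context
# is still displayed before the KeyError in the traceback
try:
    1/0
except ZeroDivisionError:
    raise KeyError('oops')

# explicit None: __cause__ is None, so the context is suppressed
try:
    1/0
except ZeroDivisionError:
    raise KeyError('oops') from None

A name like __NoCause__ reads as if it selects the second behaviour,
when what the default actually says is "no *explicit* cause, fall back
to __context__".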
The problem I have with names like __NoException__, __NoCause__, __NoWhatever__ is that is sounds a lot like None -- in otherwords, like there won't be any chaining. > So your example would then be: > > try: > try: > raise IndexError() > except: > raise CustomError() from None > except CustomError as e: > # nevermind, let's see the whole thing after all > raise e from __NoCause__ If we do switch from Ellipsis to something else I think a name like __Chain__ would be more appropriate. Or __ExcChain__. raise e from __ExcChain__ Less cute, but probably less confusing. ~Ethan~ From yselivanov.ml at gmail.com Fri Feb 3 19:46:19 2012 From: yselivanov.ml at gmail.com (Yury Selivanov) Date: Fri, 3 Feb 2012 13:46:19 -0500 Subject: [Python-Dev] PEP 409 update [was: PEP 409 - final?] In-Reply-To: References: <4F28B81B.20801@stoneleaf.us> <4F2B09D7.3020704@stoneleaf.us> <4F2B49AC.1000101@stoneleaf.us> <4F2B57F2.7030202@stoneleaf.us> <4F2C10B8.1080906@stoneleaf.us> <20120203122911.4557b633@resist.wooz.org> Message-ID: <59CE7F6F-C363-4C58-A89A-B05C4358C3C3@gmail.com> On 2012-02-03, at 1:20 PM, Guido van Rossum wrote: > Please. Let's stop this. There is no known use case to ever write > that. We're just not putting specific measures to prevent it. Writing > >>>> a = ... > > Is likewise cute but not very informative. But it is valid syntax. Well, right now you'll get TypeError if you want to raise an exception from something that is not an exception. 'raise from None' will loosen the check allowing None values, in the 'raise from' statement, but that should be it. To achieve the same effect as 'raise from ...' just do 'e.__cause__ = ...'. On the question of using Ellipsis instead of some new singleton like __NoContext__: how's Ellipsis semantically related to exceptions after all? - Yury From jyasskin at gmail.com Fri Feb 3 19:48:13 2012 From: jyasskin at gmail.com (Jeffrey Yasskin) Date: Fri, 3 Feb 2012 10:48:13 -0800 Subject: [Python-Dev] PEP: New timestamp formats In-Reply-To: References: Message-ID: On Fri, Feb 3, 2012 at 3:57 AM, Victor Stinner wrote: >> datetime.datetime >> >> - as noted earlier in the thread, total_seconds() actually gives you a >> decent timestamp value and always returning UTC avoids timezone issues > > os.stat() and time.time() use the local time. The documentation disagrees with you. http://docs.python.org/dev/library/time.html#time.time says "Return the time as a floating point number expressed in seconds since the epoch, in UTC." os.stat is documented less clearly, but it's implemented by forwarding to the system stat(), and that's defined at http://www.opengroup.org/sud/sud1/xsh/sysstat.h.htm#sysstat.h-file-desc-stru to return times since the Epoch. http://pubs.opengroup.org/onlinepubs/000095399/functions/localtime.html says "January 1, 1970 0:00 UTC (the Epoch)" > I don't see datetime without tzinfo as an issue. > > Being unable to convert a datetime to an epoch timestamp is also not > an issue: if you need an epoch timestamp, just use float or Decimal > types. Your PEP still says, 'there is no easy way to convert it into "seconds since the epoch"'. If you don't actually think it's an issue (which it's not, because there is an easy way to convert it), then take that out of the PEP. Jeffrey From ethan at stoneleaf.us Fri Feb 3 19:01:54 2012 From: ethan at stoneleaf.us (Ethan Furman) Date: Fri, 03 Feb 2012 10:01:54 -0800 Subject: [Python-Dev] PEP 409 update [was: PEP 409 - final?] 
In-Reply-To: <94A87DC0-99CD-44C0-836A-D5CA60FE673A@gmail.com>
References: <4F28B81B.20801@stoneleaf.us> <4F2B09D7.3020704@stoneleaf.us>
	<4F2B49AC.1000101@stoneleaf.us> <4F2B57F2.7030202@stoneleaf.us>
	<4F2C10B8.1080906@stoneleaf.us> <38CDE0DC-0F51-43D0-A3F0-9EE821A0273F@gmail.com>
	<4F2C1975.6070101@stoneleaf.us> <94A87DC0-99CD-44C0-836A-D5CA60FE673A@gmail.com>
Message-ID: <4F2C2112.7000709@stoneleaf.us>

Yury Selivanov wrote:
> I got it, and I think it's fine to use explicit __cause__ reset,
> using Ellipsis, or even some __NoException__ special object if
> we decide to introduce one.
>
> I'm against allowing 'from ...' syntax.

Well, ... /is/ Ellipsis -- no way to tell them apart by the time this
part of the code sees it.

~Ethan~

From ethan at stoneleaf.us  Fri Feb  3 20:03:20 2012
From: ethan at stoneleaf.us (Ethan Furman)
Date: Fri, 03 Feb 2012 11:03:20 -0800
Subject: [Python-Dev] PEP 409 update [was: PEP 409 - final?]
In-Reply-To: <59CE7F6F-C363-4C58-A89A-B05C4358C3C3@gmail.com>
References: <4F28B81B.20801@stoneleaf.us> <4F2B09D7.3020704@stoneleaf.us>
	<4F2B49AC.1000101@stoneleaf.us> <4F2B57F2.7030202@stoneleaf.us>
	<4F2C10B8.1080906@stoneleaf.us> <20120203122911.4557b633@resist.wooz.org>
	<59CE7F6F-C363-4C58-A89A-B05C4358C3C3@gmail.com>
Message-ID: <4F2C2F78.1000206@stoneleaf.us>

Yury Selivanov wrote:
> On 2012-02-03, at 1:20 PM, Guido van Rossum wrote:
>> Please. Let's stop this. There is no known use case to ever write
>> that. We're just not putting in specific measures to prevent it. Writing
>>
>>>>> a = ...
>> is likewise cute but not very informative. But it is valid syntax.
>
> Well, right now you'll get a TypeError if you try to raise an exception
> from something that is not an exception.  'raise from None' will
> loosen the check to allow None values in the 'raise from' statement,
> but that should be it.
>
> To achieve the same effect as 'raise from ...' just do
> 'e.__cause__ = ...'.
>
> On the question of using Ellipsis instead of some new singleton like
> __NoContext__: how's Ellipsis semantically related to exceptions after
> all?

Merriam-Webster says:
---------------------
el·lip·sis
noun \i-ˈlip-səs, e-\
plural el·lip·ses \-ˌsēz\
Definition of ELLIPSIS
1
a : the omission of one or more words that are obviously understood but
that must be supplied to make a construction grammatically complete
---------------------

Relation to exceptions:
Two places to look:  __context__ and __cause__
Priority?  __cause__
When do we check __context__?  if __cause__ is omitted (or Ellipsis)

~Ethan~

From chris at simplistix.co.uk  Fri Feb  3 19:00:07 2012
From: chris at simplistix.co.uk (Chris Withers)
Date: Fri, 03 Feb 2012 18:00:07 +0000
Subject: [Python-Dev] PEP 408 -- Standard library __preview__ package
In-Reply-To: <20120127160934.2ad5e0bf@pitrou.net>
References: <20120127160934.2ad5e0bf@pitrou.net>
Message-ID: <4F2C20A7.9060303@simplistix.co.uk>

On 27/01/2012 15:09, Antoine Pitrou wrote:
> On Fri, 27 Jan 2012 15:21:33 +0200
> Eli Bendersky wrote:
>>
>> Following an earlier discussion on python-ideas [1], we would like to
>> propose the following PEP for review. Discussion is welcome. The PEP
>> can also be viewed in HTML form at
>> http://www.python.org/dev/peps/pep-0408/
>
> A big +1 from me.

Actually a pretty big -1 from me. I'd prefer to see the standard
library getting smaller, not bigger, and packages being upgradeable
independently from the Python version as a result.
Every time I see things like the following I cry a little inside:

try:
    try:
        from py2stdliblocation import FooBar as Foo
    except ImportError:
        from py3stdliblocation import foo as Foo
except ImportError:
    from pypilocation import Foo

Now we're talking about having to add __preview__ into that mix too? :'(

Chris

-- 
Simplistix - Content Management, Batch Processing & Python Consulting
            - http://www.simplistix.co.uk

From jyasskin at gmail.com  Fri Feb  3 20:04:14 2012
From: jyasskin at gmail.com (Jeffrey Yasskin)
Date: Fri, 3 Feb 2012 11:04:14 -0800
Subject: [Python-Dev] PEP: New timestamp formats
In-Reply-To: 
References: 
Message-ID: 

On Thu, Feb 2, 2012 at 4:59 PM, Nick Coghlan wrote:
> datetime.datetime
>
> - real problem with the idea is that not all timestamps can be easily
> made absolute (e.g. some APIs may return "time since system started"
> or "time since process started")

I think this is an argument for returning the appropriate one of
datetime or timedelta from all of these functions: users need to keep
track of whether they've got an absolute time, or an offset from an
unspecified starting point, and that's a type-like distinction.

> - the complexity argument used against timedelta also applies

A plain number of seconds is superficially simpler, but it forces more
complexity onto the user, who has to track what that number
represents. datetime and timedelta are even available from a C module,
which I had expected to be a blocking issue.

The biggest problem I see with using datetime and timedelta for
everything is that switching to them is very backwards-incompatible.

Jeffrey

From yselivanov.ml at gmail.com  Fri Feb  3 20:18:53 2012
From: yselivanov.ml at gmail.com (Yury Selivanov)
Date: Fri, 3 Feb 2012 14:18:53 -0500
Subject: [Python-Dev] PEP 409 update [was: PEP 409 - final?]
In-Reply-To: <4F2C2F78.1000206@stoneleaf.us>
References: <4F28B81B.20801@stoneleaf.us> <4F2B09D7.3020704@stoneleaf.us>
	<4F2B49AC.1000101@stoneleaf.us> <4F2B57F2.7030202@stoneleaf.us>
	<4F2C10B8.1080906@stoneleaf.us> <20120203122911.4557b633@resist.wooz.org>
	<59CE7F6F-C363-4C58-A89A-B05C4358C3C3@gmail.com> <4F2C2F78.1000206@stoneleaf.us>
Message-ID: 

That's a bit far-fetched.  By the same level of argumentation we could
use even `0`: `raise e from 0` (or `-1`), and use the `0` object instead
of Ellipsis.

Anyway, if the PEP is not yet fully approved, I'm minus one on allowing
anything other than an Exception instance or None in the 'raise from'
statement.

On 2012-02-03, at 2:03 PM, Ethan Furman wrote:
> Yury Selivanov wrote:
>> On 2012-02-03, at 1:20 PM, Guido van Rossum wrote:
>>> Please. Let's stop this. There is no known use case to ever write
>>> that. We're just not putting in specific measures to prevent it. Writing
>>>
>>>>>> a = ...
>>> is likewise cute but not very informative. But it is valid syntax.
>> Well, right now you'll get a TypeError if you try to raise an exception
>> from something that is not an exception.  'raise from None' will
>> loosen the check to allow None values in the 'raise from' statement,
>> but that should be it.
>> To achieve the same effect as 'raise from ...' just do 'e.__cause__ = ...'.
>> On the question of using Ellipsis instead of some new singleton like
>> __NoContext__: how's Ellipsis semantically related to exceptions after all?
>
>
> Merriam-Webster says:
> ---------------------
> el·lip·sis
> noun \i-ˈlip-səs, e-\
> plural el·lip·ses \-ˌsēz\
> Definition of ELLIPSIS
> 1
> a : the omission of one or more words that are obviously understood but
> that must be supplied to make a construction grammatically complete
> ---------------------
>
> Relation to exceptions:
> Two places to look:  __context__ and __cause__
> Priority?  __cause__
> When do we check __context__?  if __cause__ is omitted (or Ellipsis)
>
> ~Ethan~

From solipsis at pitrou.net  Fri Feb  3 20:17:12 2012
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Fri, 3 Feb 2012 20:17:12 +0100
Subject: [Python-Dev] PEP: New timestamp formats
References: 
Message-ID: <20120203201712.5d653857@pitrou.net>

On Fri, 3 Feb 2012 11:04:14 -0800
Jeffrey Yasskin wrote:
> On Thu, Feb 2, 2012 at 4:59 PM, Nick Coghlan wrote:
> > datetime.datetime
> >
> > - real problem with the idea is that not all timestamps can be easily
> > made absolute (e.g. some APIs may return "time since system started"
> > or "time since process started")
>
> I think this is an argument for returning the appropriate one of
> datetime or timedelta from all of these functions: users need to keep
> track of whether they've got an absolute time, or an offset from an
> unspecified starting point, and that's a type-like distinction.

Keep in mind timedelta has a microsecond resolution. The use cases
meant for the PEP imply nanosecond resolution (POSIX' clock_gettime(),
for example).

> A plain number of seconds is superficially simpler, but it forces more
> complexity onto the user, who has to track what that number
> represents.

If all you are doing is comparing timestamps (which I guess is most of
what people do with e.g. st_mtime), a number is fine.

If you want the current time and date in a high-level form, you can
already use datetime.now() or datetime.utcnow() (which "only" has
microsecond resolution as well :-)). We don't need another way to spell
it.

Regards

Antoine.

From jyasskin at gmail.com  Fri Feb  3 20:32:48 2012
From: jyasskin at gmail.com (Jeffrey Yasskin)
Date: Fri, 3 Feb 2012 11:32:48 -0800
Subject: [Python-Dev] PEP: New timestamp formats
In-Reply-To: <20120203201712.5d653857@pitrou.net>
References: <20120203201712.5d653857@pitrou.net>
Message-ID: 

On Fri, Feb 3, 2012 at 11:17 AM, Antoine Pitrou wrote:
> On Fri, 3 Feb 2012 11:04:14 -0800
> Jeffrey Yasskin wrote:
>> On Thu, Feb 2, 2012 at 4:59 PM, Nick Coghlan wrote:
>> > datetime.datetime
>> >
>> > - real problem with the idea is that not all timestamps can be easily
>> > made absolute (e.g. some APIs may return "time since system started"
>> > or "time since process started")
>>
>> I think this is an argument for returning the appropriate one of
>> datetime or timedelta from all of these functions: users need to keep
>> track of whether they've got an absolute time, or an offset from an
>> unspecified starting point, and that's a type-like distinction.
>
> Keep in mind timedelta has a microsecond resolution. The use cases
> meant for the PEP imply nanosecond resolution (POSIX' clock_gettime(),
> for example).

Yes, I think someone had noted that datetime and timedelta would need
to be extended to support nanosecond resolution.

>> A plain number of seconds is superficially simpler, but it forces more
>> complexity onto the user, who has to track what that number
>> represents.
>
> If all you are doing is comparing timestamps (which I guess is most of
> what people do with e.g. st_mtime), a number is fine.

Sure.
I don't think the argument for datetime is totally convincing, just
that it's stronger than the PEP currently presents.

> If you want the current time and date in a high-level form, you can
> already use datetime.now() or datetime.utcnow() (which "only" has
> microsecond resolution as well :-)). We don't need another way to spell
> it.

Whoops, yes, there's no need to extend time() to return a datetime.

From guido at python.org  Fri Feb  3 21:02:21 2012
From: guido at python.org (Guido van Rossum)
Date: Fri, 3 Feb 2012 12:02:21 -0800
Subject: [Python-Dev] PEP 409 update [was: PEP 409 - final?]
In-Reply-To: 
References: <4F28B81B.20801@stoneleaf.us> <4F2B09D7.3020704@stoneleaf.us>
	<4F2B49AC.1000101@stoneleaf.us> <4F2B57F2.7030202@stoneleaf.us>
	<4F2C10B8.1080906@stoneleaf.us> <20120203122911.4557b633@resist.wooz.org>
	<59CE7F6F-C363-4C58-A89A-B05C4358C3C3@gmail.com> <4F2C2F78.1000206@stoneleaf.us>
Message-ID: 

On Fri, Feb 3, 2012 at 11:18 AM, Yury Selivanov wrote:
> That's a bit far-fetched.  By the same level of argumentation we could
> use even `0`: `raise e from 0` (or `-1`), and use the `0` object instead
> of Ellipsis.
>
> Anyway, if the PEP is not yet fully approved, I'm minus one on allowing
> anything other than an Exception instance or None in the 'raise from'
> statement.

I read your objection and disagree. The PEP *is* fully approved.

> On 2012-02-03, at 2:03 PM, Ethan Furman wrote:
>
>> Yury Selivanov wrote:
>>> On 2012-02-03, at 1:20 PM, Guido van Rossum wrote:
>>>> Please. Let's stop this. There is no known use case to ever write
>>>> that. We're just not putting in specific measures to prevent it. Writing
>>>>
>>>>>>> a = ...
>>>> is likewise cute but not very informative. But it is valid syntax.
>>> Well, right now you'll get a TypeError if you try to raise an exception
>>> from something that is not an exception.  'raise from None' will
>>> loosen the check to allow None values in the 'raise from' statement,
>>> but that should be it.
>>> To achieve the same effect as 'raise from ...' just do 'e.__cause__ = ...'.
>>> On the question of using Ellipsis instead of some new singleton like
>>> __NoContext__: how's Ellipsis semantically related to exceptions after all?
>>
>> Merriam-Webster says:
>> ---------------------
>> el·lip·sis
>> noun \i-ˈlip-səs, e-\
>> plural el·lip·ses \-ˌsēz\
>> Definition of ELLIPSIS
>> 1
>> a : the omission of one or more words that are obviously understood but
>> that must be supplied to make a construction grammatically complete
>> ---------------------
>>
>> Relation to exceptions:
>> Two places to look:  __context__ and __cause__
>> Priority?  __cause__
>> When do we check __context__?  if __cause__ is omitted (or Ellipsis)
>>
>> ~Ethan~

--
--Guido van Rossum (python.org/~guido)

From p.f.moore at gmail.com  Fri Feb  3 21:08:10 2012
From: p.f.moore at gmail.com (Paul Moore)
Date: Fri, 3 Feb 2012 20:08:10 +0000
Subject: [Python-Dev] PEP 409 update [was: PEP 409 - final?]
In-Reply-To: 
References: <4F28B81B.20801@stoneleaf.us> <4F2B09D7.3020704@stoneleaf.us>
	<4F2B49AC.1000101@stoneleaf.us> <4F2B57F2.7030202@stoneleaf.us>
	<4F2C10B8.1080906@stoneleaf.us> <20120203122911.4557b633@resist.wooz.org>
	<59CE7F6F-C363-4C58-A89A-B05C4358C3C3@gmail.com> <4F2C2F78.1000206@stoneleaf.us>
Message-ID: 

On 3 February 2012 19:18, Yury Selivanov wrote:
> That's a bit far-fetched.  By the same level of argumentation we could
> use even `0`: `raise e from 0` (or `-1`), and use the `0` object instead
> of Ellipsis.
>
> Anyway, if the PEP is not yet fully approved, I'm minus one on allowing
> anything other than an Exception instance or None in the 'raise from'
> statement.

I may have missed something here, but as far as I am aware, the PEP is
fundamentally only about allowing raise...from None to suppress
chaining. There is an extremely obscure case where certain (generally
library, not end user) code might want to reinstate chaining. For that
very obscure case, the PEP suggests setting __cause__ to a sentinel
value, and Ellipsis is used rather than inventing a new singleton for
such a rare case. Purely by accident, the form "raise X from ..." will
do the same as explicitly setting __cause__, and it's not worth the
effort of testing for and rejecting this case.

This issue is so not worth arguing about, it's silly.

Paul.

From ethan at stoneleaf.us  Fri Feb  3 21:14:31 2012
From: ethan at stoneleaf.us (Ethan Furman)
Date: Fri, 03 Feb 2012 12:14:31 -0800
Subject: [Python-Dev] PEP 409 - Accepted!
In-Reply-To: <4F28B81B.20801@stoneleaf.us>
References: <4F28B81B.20801@stoneleaf.us>
Message-ID: <4F2C4027.2050002@stoneleaf.us>

Good news!  PEP 409 has been accepted!

Not so good news:  There is no one assigned to Issue 6210 to review the
patches... any volunteers?

http://bugs.python.org/issue6210

~Ethan~

From greg at krypto.org  Fri Feb  3 22:09:22 2012
From: greg at krypto.org (Gregory P. Smith)
Date: Fri, 3 Feb 2012 13:09:22 -0800
Subject: [Python-Dev] PEP: New timestamp formats
In-Reply-To: 
References: <20120203201712.5d653857@pitrou.net> 
Message-ID: 

Why is the PEP promoting the float type as the default on the
new-in-3.3 APIs that were added explicitly to provide nanosecond level
resolution that cannot be represented by a float?

The *new* APIs should default to the high precision return value (be
that datetime/timedelta or decimal).

-gps

On Fri, Feb 3, 2012 at 11:32 AM, Jeffrey Yasskin wrote:

> On Fri, Feb 3, 2012 at 11:17 AM, Antoine Pitrou
> wrote:
> > On Fri, 3 Feb 2012 11:04:14 -0800
> > Jeffrey Yasskin wrote:
> >> On Thu, Feb 2, 2012 at 4:59 PM, Nick Coghlan wrote:
> >> > datetime.datetime
> >> >
> >> > - real problem with the idea is that not all timestamps can be easily
> >> > made absolute (e.g. some APIs may return "time since system started"
> >> > or "time since process started")
> >>
> >> I think this is an argument for returning the appropriate one of
> >> datetime or timedelta from all of these functions: users need to keep
> >> track of whether they've got an absolute time, or an offset from an
> >> unspecified starting point, and that's a type-like distinction.
> >
> > Keep in mind timedelta has a microsecond resolution. The use cases
> > meant for the PEP imply nanosecond resolution (POSIX' clock_gettime(),
> > for example).
>
> Yes, I think someone had noted that datetime and timedelta would need
> to be extended to support nanosecond resolution.
>
> >> A plain number of seconds is superficially simpler, but it forces more
> >> complexity onto the user, who has to track what that number
> >> represents.
> >
> > If all you are doing is comparing timestamps (which I guess is most of
> > what people do with e.g. st_mtime), a number is fine.
>
> Sure. I don't think the argument for datetime is totally convincing,
> just that it's stronger than the PEP currently presents.
>
> > If you want the current time and date in a high-level form, you can
> > already use datetime.now() or datetime.utcnow() (which "only" has
> > microsecond resolution as well :-)). We don't need another way to spell
> > it.
>
> Whoops, yes, there's no need to extend time() to return a datetime.

From barry at python.org  Fri Feb  3 23:02:33 2012
From: barry at python.org (Barry Warsaw)
Date: Fri, 3 Feb 2012 17:02:33 -0500
Subject: [Python-Dev] PEP 409 update [was: PEP 409 - final?]
In-Reply-To: 
References: <4F28B81B.20801@stoneleaf.us> <4F2B09D7.3020704@stoneleaf.us>
	<4F2B49AC.1000101@stoneleaf.us> <4F2B57F2.7030202@stoneleaf.us>
	<4F2C10B8.1080906@stoneleaf.us> <20120203122911.4557b633@resist.wooz.org>
Message-ID: <20120203170233.03da7389@resist.wooz.org>

On Feb 03, 2012, at 10:20 AM, Guido van Rossum wrote:

> >>> a = ...
>
> is likewise cute but not very informative. But it is valid syntax.

FWIW (probably not much at this point), it's not the syntax I have a
problem with, but the semantics as described in the PEP of setting
__cause__ to Ellipsis to mean use __context__.

-Barry

From steve at pearwood.info  Sat Feb  4 00:18:37 2012
From: steve at pearwood.info (Steven D'Aprano)
Date: Sat, 04 Feb 2012 10:18:37 +1100
Subject: [Python-Dev] PEP 408 -- Standard library __preview__ package
In-Reply-To: <4F2C20A7.9060303@simplistix.co.uk>
References: <20120127160934.2ad5e0bf@pitrou.net> <4F2C20A7.9060303@simplistix.co.uk>
Message-ID: <4F2C6B4D.7020608@pearwood.info>

Chris Withers wrote:
> Every time I see things like the following I cry a little inside:
>
> try:
>     try:
>         from py2stdliblocation import FooBar as Foo
>     except ImportError:
>         from py3stdliblocation import foo as Foo
> except ImportError:
>     from pypilocation import Foo

The syntax is inelegant, but the concept is straightforward and simple
and not worth tears. "I need a thing called Foo, which can be found
here, or here, or here. Use the first one found."

In principle this is not terribly different from the idea of a search
PATH when looking for an executable, except the executable can be found
under different names as well as different locations.

> Now we're talking about having to add __preview__ into that mix too?

As I understand it, Guido nixed that idea. (Or did I imagine that?)
Preview modules will be just added to the std lib as normal, and you
have to read the docs to find out they're preview.
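For what it's worth, the "first one found" pattern above is trivial to
wrap up as a helper.  A toy sketch only -- the name import_any is made
up for illustration, not a proposal for the stdlib:

    import importlib

    def import_any(*names):
        # Try each candidate module name in order; return the first
        # one that imports successfully.
        for name in names:
            try:
                return importlib.import_module(name)
            except ImportError:
                pass
        raise ImportError('none of %r could be imported' % (names,))

    # Foo = import_any('py2stdliblocation', 'py3stdliblocation',
    #                  'pypilocation').Foo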
-- 
Steven

From victor.stinner at haypocalc.com  Sat Feb  4 00:39:55 2012
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Sat, 4 Feb 2012 00:39:55 +0100
Subject: [Python-Dev] PEP: New timestamp formats
In-Reply-To: 
References: <20120203201712.5d653857@pitrou.net>
Message-ID: 

> consider changing the default on any of these that return a time
> value.  these for example:
>  * time.clock_gettime()
>  * time.wallclock() (reuse time.clock_gettime(time.CLOCK_MONOTONIC))

Ah. Nanosecond resolution is overkill in common cases; float is enough,
and it is faster. I prefer to use the same type (float) by default for
all functions creating timestamps.

Victor

From benjamin at python.org  Sat Feb  4 01:19:21 2012
From: benjamin at python.org (Benjamin Peterson)
Date: Fri, 3 Feb 2012 19:19:21 -0500
Subject: [Python-Dev] PEP 409 - Accepted!
In-Reply-To: <4F2C4027.2050002@stoneleaf.us>
References: <4F28B81B.20801@stoneleaf.us> <4F2C4027.2050002@stoneleaf.us>
Message-ID: 

2012/2/3 Ethan Furman :
> Good news!  PEP 409 has been accepted!

It may be too late for this, but I find the whole Ellipsis business
most unpleasant. Why not just have an extra attribute on exception
objects like __chain__ = False/True?

-- 
Regards,
Benjamin

From tjreedy at udel.edu  Sat Feb  4 01:23:41 2012
From: tjreedy at udel.edu (Terry Reedy)
Date: Fri, 03 Feb 2012 19:23:41 -0500
Subject: [Python-Dev] PEP 409 update [was: PEP 409 - final?]
In-Reply-To: 
References: <4F28B81B.20801@stoneleaf.us> <4F2B09D7.3020704@stoneleaf.us>
	<4F2B49AC.1000101@stoneleaf.us> <4F2B57F2.7030202@stoneleaf.us>
Message-ID: 

On 2/3/2012 9:53 AM, Yury Selivanov wrote:
> Re "raise ValueError from ..."

The use cases for Ellipsis/... are 99.99% internal. The typical Python
programmer will never see or have cause to worry about such a thing.

The problem is that we really want an exception attribute that is
missing in certain cases. But C does not allow missing struct members
(the corresponding block of memory *will* have some bit pattern!). So
unset attributes would require a dict instead of slots (I am presuming
each built-in exception class uses slots now) and the use of the C
equivalent of hasattr (or try: except:) and delattr.

So instead the proposal is to use a marker value that effectively means
'unset' or 'unspecified'. But what? None cannot be used because it is
being used as a set value. Ethan initially proposed 'False', but then
realized that 'True' fits as well, so neither fit. I proposed a new
internal exception class primarily to get us thinking about alternatives
to True/False.

Ellipsis, properly understood, comes close to meaning 'unspecified'. My
memory is that that is how it is used in NumPy slicings. The manual
gives no meaning for Ellipsis, only saying that it is used in slicings.
The linked slicings section does not mention it.

Ethan: I think the PEP should say more about ... being a grammatical
placeholder in English, much like 'pass' is in Python. Otherwise, we
will see periodic posts objecting to it in python-list.

-- 
Terry Jan Reedy

From solipsis at pitrou.net  Sat Feb  4 01:23:26 2012
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Sat, 4 Feb 2012 01:23:26 +0100
Subject: [Python-Dev] PEP 409 - Accepted!
References: <4F28B81B.20801@stoneleaf.us> <4F2C4027.2050002@stoneleaf.us>
Message-ID: <20120204012326.2eb7746e@pitrou.net>

On Fri, 3 Feb 2012 19:19:21 -0500
Benjamin Peterson wrote:
> 2012/2/3 Ethan Furman :
> > Good news!  PEP 409 has been accepted!
>
> It may be too late for this, but I find the whole Ellipsis business
> most unpleasant.
> Why not just have an extra attribute on exception
> objects like __chain__ = False/True?

Incredibly agreed with Benjamin.

Regards

Antoine.

From victor.stinner at haypocalc.com  Sat Feb  4 01:34:33 2012
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Sat, 4 Feb 2012 01:34:33 +0100
Subject: [Python-Dev] PEP: New timestamp formats
In-Reply-To: 
References: 
Message-ID: 

> I don't see any real issue of adding datetime as another accepted
> type, if Decimal is also accepted. Each type has limitations, and the
> user can choose the best type for his/her use case.
>
> I dropped datetime because I prefer incremental changes (and a simpler
> PEP is also more easily accepted :-)). We can add datetime later when
> most developers agree that datetime issues are no more issues :-)

About incremental changes, I wrote a patch (timestamp_datetime.patch)
to support the datetime.datetime type using my API:
http://bugs.python.org/issue13882#msg152571

Example:

$ ./python
>>> import datetime, os, time
>>> open("x", "wb").close(); print(datetime.datetime.now())
2012-02-04 01:17:27.593834
>>> print(os.stat("x", timestamp=datetime.datetime).st_ctime)
2012-02-04 00:17:27.592284+00:00
>>> print(time.time(timestamp=datetime.datetime))
2012-02-04 00:18:21.329012+00:00
>>> time.clock(timestamp=datetime.datetime)
ValueError: clock has an unspecified starting point
>>> print(time.clock_gettime(time.CLOCK_REALTIME, timestamp=datetime.datetime))
2012-02-04 00:21:37.815663+00:00
>>> print(time.clock_gettime(time.CLOCK_MONOTONIC, timestamp=datetime.datetime))
ValueError: clock has an unspecified starting point

I still don't know if using UTC is correct.

Victor

From victor.stinner at haypocalc.com  Sat Feb  4 02:38:36 2012
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Sat, 4 Feb 2012 02:38:36 +0100
Subject: [Python-Dev] PEP: New timestamp formats
In-Reply-To: <20120203201712.5d653857@pitrou.net>
References: <20120203201712.5d653857@pitrou.net>
Message-ID: 

> Keep in mind timedelta has a microsecond resolution. The use cases
> meant for the PEP imply nanosecond resolution (POSIX' clock_gettime(),
> for example).

datetime.datetime and datetime.timedelta can be patched to support
nanoseconds.

>> A plain number of seconds is superficially simpler, but it forces more
>> complexity onto the user, who has to track what that number
>> represents.
>
> If all you are doing is comparing timestamps (which I guess is most of
> what people do with e.g. st_mtime), a number is fine.
>
> If you want the current time and date in a high-level form, you can
> already use datetime.now() or datetime.utcnow() (which "only" has
> microsecond resolution as well :-)). We don't need another way to spell
> it.

datetime.datetime is interesting with os.stat() if you want to display
the creation, modification or last access timestamp to the user. With
datetime.datetime, you don't have to read the documentation to learn
the reference date (the Epoch, 1970-01-01, for os.stat()) or the
timezone (UTC for os.stat()?). So datetime.datetime carries two pieces
of information (reference date and timezone) that int, float and
Decimal cannot store.

Supporting datetime.datetime just for os.stat(), whereas time.clock(),
time.wallclock(), time.clock_gettime() and time.clock_getres() fail for
this format, is maybe a bad idea. There is an exception:
time.clock_gettime(time.CLOCK_REALTIME, timestamp=datetime.datetime)
would be accepted, and you can get a timestamp with a nanosecond
resolution...
But datetime.datetime doesn't support nanoseconds currently :-)

The best reason to reject datetime.datetime is that it would only
"work" with some functions, whereas it would fail (with a ValueError)
in most cases.

Victor

From yselivanov.ml at gmail.com  Sat Feb  4 03:11:29 2012
From: yselivanov.ml at gmail.com (Yury Selivanov)
Date: Fri, 3 Feb 2012 21:11:29 -0500
Subject: [Python-Dev] PEP 409 update [was: PEP 409 - final?]
In-Reply-To: 
References: <4F28B81B.20801@stoneleaf.us> <4F2B09D7.3020704@stoneleaf.us>
	<4F2B49AC.1000101@stoneleaf.us> <4F2B57F2.7030202@stoneleaf.us>
Message-ID: <9CDD84C0-2724-4BB9-9644-3D28EC1B3ED5@gmail.com>

On 2012-02-03, at 7:23 PM, Terry Reedy wrote:
> On 2/3/2012 9:53 AM, Yury Selivanov wrote:
>> Re "raise ValueError from ..."
>
> The use cases for Ellipsis/... are 99.99% internal. The typical Python
> programmer will never see or have cause to worry about such a thing.

I get your points. But I don't like this argument about some spherical
"typical Python programmer". Any programmer at some point may go and
investigate some bug in the stdlib or any other library and see this
"raise Exc() from ...", or "e = Exc(); e.__cause__ = ...; raise e"
nonsense.

BTW, will the "raise .. from .." statement allow raising only from
exceptions, None, and Ellipsis exclusively, or can any Python object be
used? Right now it throws a TypeError if you try to raise from None or
Ellipsis.

And as Benjamin said in a later message of his -- a simple __chain__
attribute will be much more understandable and easier to document and
explain. That could be used as simply as writing
"raise Exception().no_chaining()" or something like that.

- Yury

From eliben at gmail.com  Sat Feb  4 04:28:09 2012
From: eliben at gmail.com (Eli Bendersky)
Date: Sat, 4 Feb 2012 05:28:09 +0200
Subject: [Python-Dev] PEP 409 - Accepted!
In-Reply-To: <4F2C4027.2050002@stoneleaf.us>
References: <4F28B81B.20801@stoneleaf.us> <4F2C4027.2050002@stoneleaf.us>
Message-ID: 

On Fri, Feb 3, 2012 at 22:14, Ethan Furman wrote:
> Good news!  PEP 409 has been accepted!
>
> Not so good news:  There is no one assigned to Issue 6210 to review the
> patches... any volunteers?
>
> http://bugs.python.org/issue6210

Hi Ethan,

I've just looked at PEP 409 online
(http://www.python.org/dev/peps/pep-0409/) and I'm not sure where it
details the final syntax that was chosen.

The "Proposal" section says:

"
    I propose going with the second option:

        raise NewException from None
"

This makes no mention of ellipsis / ....

Could you please clarify the PEP to make it detail the new syntax and
its proposed semantics more precisely?

Thanks in advance,
Eli

From guido at python.org  Sat Feb  4 05:41:58 2012
From: guido at python.org (Guido van Rossum)
Date: Fri, 3 Feb 2012 20:41:58 -0800
Subject: [Python-Dev] PEP 409 - Accepted!
In-Reply-To: 
References: <4F28B81B.20801@stoneleaf.us> <4F2C4027.2050002@stoneleaf.us>
Message-ID: 

There is no new syntax! It's going to remain "raise <expr> from <expr>".
The types of the expressions are constrained by the runtime, not by the
syntax. If either type is unacceptable, a TypeError (with the default
context :-) will be raised. None of that is new. Really, there is no new
syntax to clarify, only new allowable types for <expr>, and a new
meaning assigned to those types.

On Fri, Feb 3, 2012 at 7:28 PM, Eli Bendersky wrote:
> On Fri, Feb 3, 2012 at 22:14, Ethan Furman wrote:
>> Good news!  PEP 409 has been accepted!
>>
>> Not so good news:  There is no one assigned to Issue 6210 to review the
>> patches... any volunteers?
>> http://bugs.python.org/issue6210
>>
>
> Hi Ethan,
>
> I've just looked at PEP 409 online
> (http://www.python.org/dev/peps/pep-0409/) and I'm not sure where it
> details the final syntax that was chosen.
>
> The "Proposal" section says:
>
> "
>     I propose going with the second option:
>
>         raise NewException from None
> "
>
> This makes no mention of ellipsis / ....
>
> Could you please clarify the PEP to make it detail the new syntax and
> its proposed semantics more precisely?
>
> Thanks in advance,
> Eli

--
--Guido van Rossum (python.org/~guido)

From tjreedy at udel.edu  Sat Feb  4 06:06:33 2012
From: tjreedy at udel.edu (Terry Reedy)
Date: Sat, 04 Feb 2012 00:06:33 -0500
Subject: [Python-Dev] PEP 408 -- Standard library __preview__ package
In-Reply-To: <4F2C6B4D.7020608@pearwood.info>
References: <20120127160934.2ad5e0bf@pitrou.net> <4F2C20A7.9060303@simplistix.co.uk>
	<4F2C6B4D.7020608@pearwood.info>
Message-ID: 

On 2/3/2012 6:18 PM, Steven D'Aprano wrote:

>> Now we're talking about having to add __preview__ into that mix too?
>
> As I understand it, Guido nixed that idea.
> (Or did I imagine that?)

No, you are right, discussion should cease. It is already marked
'rejected' and listed under Abandoned, Withdrawn, and Rejected PEPs.

> Preview modules will be just added to the std lib as normal, and you
> have to read the docs to find out they're preview.

What's New should say so too.

-- 
Terry Jan Reedy

From anacrolix at gmail.com  Sat Feb  4 06:43:51 2012
From: anacrolix at gmail.com (Matt Joiner)
Date: Sat, 4 Feb 2012 13:43:51 +0800
Subject: [Python-Dev] PEP 408 -- Standard library __preview__ package
In-Reply-To: 
References: <20120127160934.2ad5e0bf@pitrou.net> <4F2C20A7.9060303@simplistix.co.uk>
	<4F2C6B4D.7020608@pearwood.info>
Message-ID: 

Woohoo! :)

From meadori at gmail.com  Sat Feb  4 07:11:53 2012
From: meadori at gmail.com (Meador Inge)
Date: Sat, 4 Feb 2012 00:11:53 -0600
Subject: [Python-Dev] OS X build break
Message-ID: 

On Sat, Dec 31, 2011 at 5:56 PM, Guido van Rossum wrote:

> PS. I would propose a specific fix but I can't seem to build a working
> CPython from the trunk on my laptop (OS X 10.6, Xcode 4.1). I get this
> error late in the build:
>
> ./python.exe -SE -m sysconfig --generate-posix-vars
> Fatal Python error: Py_Initialize: can't initialize sys standard streams
> Traceback (most recent call last):
>   File "/Users/guido/cpython/Lib/io.py", line 60, in <module>
> make: *** [Lib/_sysconfigdata.py] Abort trap

I am having this problem now too. I am running OS X 10.7.2.
3.2 still builds for me, but I can't build default.

Did you ever get past it? Anyone else seeing this?

--
# Meador

From steve at pearwood.info  Sat Feb  4 12:25:05 2012
From: steve at pearwood.info (Steven D'Aprano)
Date: Sat, 04 Feb 2012 22:25:05 +1100
Subject: [Python-Dev] PEP 408 -- Standard library __preview__ package
In-Reply-To: 
References: <20120127160934.2ad5e0bf@pitrou.net> <4F2C20A7.9060303@simplistix.co.uk>
	<4F2C6B4D.7020608@pearwood.info>
Message-ID: <4F2D1591.9000806@pearwood.info>

Terry Reedy wrote:
> On 2/3/2012 6:18 PM, Steven D'Aprano wrote:
>
>>> Now we're talking about having to add __preview__ into that mix too?
>>
>> As I understand it, Guido nixed that idea. (Or did I imagine that?)
>
> No, you are right, discussion should cease. It is already marked
> 'rejected' and listed under Abandoned, Withdrawn, and Rejected PEPs.
>
>> Preview modules will be just added to the std lib as normal, and you
>> have to read the docs to find out they're preview.
>
> What's New should say so too.

A thought comes to mind...

It strikes me that it would be helpful sometimes to programmatically
recognise "preview" modules in the std lib. Could we have a
recommendation in PEP 8 that such modules should have a global variable
called PREVIEW, and non-preview modules should not, so that the
recommended way of telling them apart is with hasattr(module, "PREVIEW")?

-- 
Steven

From p.f.moore at gmail.com  Sat Feb  4 13:35:57 2012
From: p.f.moore at gmail.com (Paul Moore)
Date: Sat, 4 Feb 2012 12:35:57 +0000
Subject: [Python-Dev] PEP 408 -- Standard library __preview__ package
In-Reply-To: <4F2D1591.9000806@pearwood.info>
References: <20120127160934.2ad5e0bf@pitrou.net> <4F2C20A7.9060303@simplistix.co.uk>
	<4F2C6B4D.7020608@pearwood.info> <4F2D1591.9000806@pearwood.info>
Message-ID: 

On 4 February 2012 11:25, Steven D'Aprano wrote:
> It strikes me that it would be helpful sometimes to programmatically
> recognise "preview" modules in the std lib. Could we have a
> recommendation in PEP 8 that such modules should have a global variable
> called PREVIEW, and non-preview modules should not, so that the
> recommended way of telling them apart is with hasattr(module, "PREVIEW")?

In what situation would you want that when you weren't referring to a
specific module? If you're referring to a specific module and you
really care, just check sys.version. (That's annoying and ugly enough
that it'd probably make you think about why you are doing it - I
cannot honestly think of a case where I'd actually want to check in
code if a module is a preview - hence my question as to what your use
case is).

Feels like YAGNI to me.
Paul.

From nad at acm.org  Sat Feb  4 14:35:58 2012
From: nad at acm.org (Ned Deily)
Date: Sat, 04 Feb 2012 14:35:58 +0100
Subject: [Python-Dev] OS X build break
References: 
Message-ID: 

In article ,
 Meador Inge wrote:
> On Sat, Dec 31, 2011 at 5:56 PM, Guido van Rossum wrote:
>
> > PS. I would propose a specific fix but I can't seem to build a working
> > CPython from the trunk on my laptop (OS X 10.6, Xcode 4.1). I get this
> > error late in the build:
> >
> > ./python.exe -SE -m sysconfig --generate-posix-vars
> > Fatal Python error: Py_Initialize: can't initialize sys standard streams
> > Traceback (most recent call last):
> >   File "/Users/guido/cpython/Lib/io.py", line 60, in <module>
> > make: *** [Lib/_sysconfigdata.py] Abort trap
>
> I am having this problem now too. I am running OS X 10.7.2.
> 3.2 still builds for me, but I can't build default.
>
> Did you ever get past it? Anyone else seeing this?

Chances are you are using llvm-gcc-4.2, the default CC for Xcode 4.2.
There is a critical compile error with it (Issue13241) when building
default (3.3).
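A quick way to see which compiler a particular Python will hand to
distutils when building extensions is the stock sysconfig module -- a
minimal sketch (what it reports depends entirely on how that Python was
built):

    import sysconfig
    # CC as recorded when that Python was configured; the current
    # python.org installer builds record gcc-4.2, which the newer
    # Xcode 4 releases no longer ship.
    print(sysconfig.get_config_var('CC'))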
My current recommendations (despite some test failures):

- for OS X 10.7.x, use the latest released Xcode, currently Xcode 4.2.1,
and build with clang and debug:

./configure --with-pydebug CC=clang MACOSX_DEPLOYMENT_TARGET=10.7

- for OS X 10.6.x, if possible, continue to use the last released Xcode
3.2 (3.2.6), which includes Apple gcc-4.2 (/usr/bin/gcc-4.2, not
llvm-gcc-4.2):

/usr/bin/gcc-4.2 --version
./configure MACOSX_DEPLOYMENT_TARGET=10.6
or
./configure --with-pydebug MACOSX_DEPLOYMENT_TARGET=10.6

- for OS X 10.6.x with Xcode 4 installed (which does not include Apple
gcc-4.2), use the latest Xcode 4.2 for 10.6 and use clang and debug:

./configure --with-pydebug CC=clang MACOSX_DEPLOYMENT_TARGET=10.6

Unfortunately, testing and sorting out the issues with the current OS X
compilers has taken much much longer than anticipated, primarily because
it's a big task and, until several days ago, I have had no time to
devote to it.  But I'm making progress now with installer builds
completed for all of default, 3.2-tip, 3.2.2, 2.7-tip, and 2.7.2, each
with all of the major compiler combinations on 10.5, 10.6 (Xcode 3.2 and
4.2), and 10.7 (4.1 and 4.2); the tests are running now on each of the
applicable environments (that will take about another week to complete).
Realistically, we should be able to have everything tested, fixed, and
documented by the end of the PyCon sprints next month.  We will also
have some recommendations for buildbot changes.

BTW, the current test failures with clang without pydebug include a
number of ctypes test failures (in ctypes.test.test_cfuncs.CFunctions).
If anyone has time to further investigate those, it would be very
helpful (Issue13370).

-- 
Ned Deily,
nad at acm.org

From anacrolix at gmail.com  Sat Feb  4 15:02:55 2012
From: anacrolix at gmail.com (Matt Joiner)
Date: Sat, 4 Feb 2012 22:02:55 +0800
Subject: [Python-Dev] PEP 408 -- Standard library __preview__ package
In-Reply-To: 
References: <20120127160934.2ad5e0bf@pitrou.net> <4F2C20A7.9060303@simplistix.co.uk>
	<4F2C6B4D.7020608@pearwood.info> <4F2D1591.9000806@pearwood.info>
Message-ID: 

+1

On Feb 4, 2012 8:37 PM, "Paul Moore" wrote:
>
> On 4 February 2012 11:25, Steven D'Aprano wrote:
> > It strikes me that it would be helpful sometimes to programmatically
> > recognise "preview" modules in the std lib. Could we have a
> > recommendation in PEP 8 that such modules should have a global variable
> > called PREVIEW, and non-preview modules should not, so that the
> > recommended way of telling them apart is with hasattr(module, "PREVIEW")?
>
> In what situation would you want that when you weren't referring to a
> specific module? If you're referring to a specific module and you
> really care, just check sys.version. (That's annoying and ugly enough
> that it'd probably make you think about why you are doing it - I
> cannot honestly think of a case where I'd actually want to check in
> code if a module is a preview - hence my question as to what your use
> case is).
>
> Feels like YAGNI to me.
> Paul.
From meadori at gmail.com  Sat Feb  4 19:59:18 2012
From: meadori at gmail.com (Meador Inge)
Date: Sat, 4 Feb 2012 12:59:18 -0600
Subject: [Python-Dev] OS X build break
In-Reply-To: 
References: 
Message-ID: 

On Sat, Feb 4, 2012 at 7:35 AM, Ned Deily wrote:

> Chances are you are using llvm-gcc-4.2, the default CC for Xcode 4.2.

Yup:

motherbrain:python meadori$ sw_vers
ProductName:	Mac OS X
ProductVersion:	10.7.2
BuildVersion:	11C74
motherbrain:python meadori$ gcc --version
i686-apple-darwin11-llvm-gcc-4.2 (GCC) 4.2.1
(Based on Apple Inc. build 5658) (LLVM build 2336.1.00)
Copyright (C) 2007 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
motherbrain:python meadori$ clang --version
Apple clang version 3.0 (tags/Apple/clang-211.12) (based on LLVM 3.0svn)
Target: x86_64-apple-darwin11.2.0
Thread model: posix

> There is a critical compile error with it (Issue13241) when building
> default (3.3).  My current recommendations (despite some test failures):
>
> - for OS X 10.7.x, use the latest released Xcode, currently Xcode 4.2.1,
> and build with clang and debug:
>
> ./configure --with-pydebug CC=clang MACOSX_DEPLOYMENT_TARGET=10.7

That worked.  Thanks!

> Unfortunately, testing and sorting out the issues with the current OS X
> compilers has taken much much longer than anticipated, primarily because
> it's a big task and, until several days ago, I have had no time to
> devote to it.  But I'm making progress now with installer builds
> completed for all of default, 3.2-tip, 3.2.2, 2.7-tip, and 2.7.2, each
> with all of the major compiler combinations on 10.5, 10.6 (Xcode 3.2 and
> 4.2), and 10.7 (4.1 and 4.2); the tests are running now on each of the
> applicable environments (that will take about another week to complete).
> Realistically, we should be able to have everything tested, fixed, and
> documented by the end of the PyCon sprints next month.  We will also
> have some recommendations for buildbot changes.

I volunteer to help out if there is anything I can do.

> BTW, the current test failures with clang without pydebug include a
> number of ctypes test failures (in ctypes.test.test_cfuncs.CFunctions).
> If anyone has time to further investigate those, it would be very
> helpful (Issue13370).

I will look into those.

--
# Meador

From breamoreboy at yahoo.co.uk  Sun Feb  5 18:44:07 2012
From: breamoreboy at yahoo.co.uk (Blockheads Oi Oi)
Date: Sun, 05 Feb 2012 17:44:07 +0000
Subject: [Python-Dev] Volunteer
Message-ID: 

You may remember me from a couple of years ago when I was trying to help
out with Python. Unfortunately I trod on a few toes. I now know why. I
have been diagnosed with Asperger Syndrome at 55 years old.

I would like to give it another go.

-- 
Cheers.

Mark Lawrence.

From tjreedy at udel.edu  Sun Feb  5 19:00:23 2012
From: tjreedy at udel.edu (Terry Reedy)
Date: Sun, 05 Feb 2012 13:00:23 -0500
Subject: [Python-Dev] Volunteer
In-Reply-To: 
References: 
Message-ID: 

On 2/5/2012 12:44 PM, Blockheads Oi Oi wrote:
> You may remember me from a couple of years ago when I was trying to help
> out with Python. Unfortunately I trod on a few toes. I now know why. I
> have been diagnosed with Asperger Syndrome at 55 years old.
> I would like to give it another go.

Hi Mark, I noticed you posting recently in python-list. Welcome back.
I will let others speak for themselves as far as the tracker goes.
-- 
Terry Jan Reedy

From nad at acm.org  Sun Feb  5 20:23:50 2012
From: nad at acm.org (Ned Deily)
Date: Sun, 05 Feb 2012 20:23:50 +0100
Subject: [Python-Dev] peps: Update with bugfix releases.
References: 
Message-ID: 

In article ,
 georg.brandl wrote:
> +Bugfix Releases
> +===============
> +
> +- 3.2.1: released July 10, 2011
> +- 3.2.2: released September 4, 2011
> +
> +- 3.2.3: planned February 10-17, 2012

I would like to propose that we plan for 3.2.3 and 2.7.3 immediately
after PyCon, so approximately March 17, if that works for all involved.
My primary rationale is to allow time to address all of the OS X Xcode 4
issues for 10.6 and 10.7.  They need to be fixed in 2.7.x, 3.2.x, and
3.3: right now it is not possible to build C extension modules in some
sets of configurations.  As I mentioned the other day, it is going to
take a few more weeks to finish testing and generate all the fixes.

-- 
Ned Deily,
nad at acm.org

From benjamin at python.org  Sun Feb  5 20:25:13 2012
From: benjamin at python.org (Benjamin Peterson)
Date: Sun, 5 Feb 2012 14:25:13 -0500
Subject: [Python-Dev] peps: Update with bugfix releases.
In-Reply-To: 
References: 
Message-ID: 

2012/2/5 Ned Deily :
> In article ,
>  georg.brandl wrote:
>> +Bugfix Releases
>> +===============
>> +
>> +- 3.2.1: released July 10, 2011
>> +- 3.2.2: released September 4, 2011
>>
>> +- 3.2.3: planned February 10-17, 2012
>
> I would like to propose that we plan for 3.2.3 and 2.7.3 immediately
> after PyCon, so approximately March 17, if that works for all involved.
> My primary rationale is to allow time to address all of the OS X Xcode 4
> issues for 10.6 and 10.7.  They need to be fixed in 2.7.x, 3.2.x, and
> 3.3: right now it is not possible to build C extension modules in some
> sets of configurations.  As I mentioned the other day, it is going to
> take a few more weeks to finish testing and generate all the fixes.

The reason 3.2.3 is so soon is the need to patch the hash collision
attack.

--
Regards,
Benjamin

From nad at acm.org  Sun Feb  5 20:28:04 2012
From: nad at acm.org (Ned Deily)
Date: Sun, 5 Feb 2012 20:28:04 +0100
Subject: [Python-Dev] peps: Update with bugfix releases.
In-Reply-To: 
References: 
Message-ID: 

On Feb 5, 2012, at 20:25 , Benjamin Peterson wrote:
> 2012/2/5 Ned Deily :
>> In article ,
>>  georg.brandl wrote:
>>> +Bugfix Releases
>>> +===============
>>> +
>>> +- 3.2.1: released July 10, 2011
>>> +- 3.2.2: released September 4, 2011
>>> +
>>> +- 3.2.3: planned February 10-17, 2012
>>
>> I would like to propose that we plan for 3.2.3 and 2.7.3 immediately
>> after PyCon, so approximately March 17, if that works for all involved.
>> My primary rationale is to allow time to address all of the OS X Xcode 4
>> issues for 10.6 and 10.7.  They need to be fixed in 2.7.x, 3.2.x, and
>> 3.3: right now it is not possible to build C extension modules in some
>> sets of configurations.  As I mentioned the other day, it is going to
>> take a few more weeks to finish testing and generate all the fixes.
>
> The reason 3.2.3 is so soon is the need to patch the hash collision
> attack.

I understand that but, to me, it makes no sense to send out truly
broken releases.  Besides, the hash collision attack is not exactly new
either.  Another few weeks can't make that much of a difference.

--
Ned Deily
nad at acm.org

From martin at v.loewis.de  Sun Feb  5 20:45:51 2012
From: martin at v.loewis.de (martin at v.loewis.de)
Date: Sun, 05 Feb 2012 20:45:51 +0100
Subject: [Python-Dev] peps: Update with bugfix releases.
In-Reply-To: 
References: 
Message-ID: <20120205204551.Horde.NCdeYVNNcXdPLtxvnkzi1lA@webmail.df.eu>

> I understand that but, to me, it makes no sense to send out truly
> broken releases.  Besides, the hash collision attack is not exactly new
> either.  Another few weeks can't make that much of a difference.

Why would the release be truly broken? It surely can't be worse than
the current releases (which apparently aren't truly broken, else
there would have been no point in releasing them back then).

Regards,
Martin

From nad at acm.org  Sun Feb  5 21:34:26 2012
From: nad at acm.org (Ned Deily)
Date: Sun, 05 Feb 2012 21:34:26 +0100
Subject: [Python-Dev] peps: Update with bugfix releases.
References: <20120205204551.Horde.NCdeYVNNcXdPLtxvnkzi1lA@webmail.df.eu>
Message-ID: 

In article <20120205204551.Horde.NCdeYVNNcXdPLtxvnkzi1lA at webmail.df.eu>,
 martin at v.loewis.de wrote:
>> I understand that but, to me, it makes no sense to send out truly
>> broken releases.  Besides, the hash collision attack is not exactly new
>> either.  Another few weeks can't make that much of a difference.
>
> Why would the release be truly broken? It surely can't be worse than
> the current releases (which apparently aren't truly broken, else
> there would have been no point in releasing them back then).

They were broken by the release of OS X 10.7 and Xcode 4.2, which were
subsequent to the previous releases.  None of the currently available
python.org installers provide a fully working system on OS X 10.7, or
on OS X 10.6 if the user has installed Xcode 4.2 for 10.6.

-- 
Ned Deily,
nad at acm.org

From ncoghlan at gmail.com  Sun Feb  5 21:36:28 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 6 Feb 2012 06:36:28 +1000
Subject: [Python-Dev] peps: Update with bugfix releases.
In-Reply-To: <20120205204551.Horde.NCdeYVNNcXdPLtxvnkzi1lA@webmail.df.eu>
References: <20120205204551.Horde.NCdeYVNNcXdPLtxvnkzi1lA@webmail.df.eu>
Message-ID: 

On Mon, Feb 6, 2012 at 5:45 AM, wrote:
>
>> I understand that but, to me, it makes no sense to send out truly
>> broken releases.  Besides, the hash collision attack is not exactly new
>> either.  Another few weeks can't make that much of a difference.
>
> Why would the release be truly broken? It surely can't be worse than
> the current releases (which apparently aren't truly broken, else
> there would have been no point in releasing them back then).

Because Apple wasn't publishing versions of gcc-llvm that miscompile
Python when those releases were made. (However, that's just a
clarification of what changed to break the Mac OS X builds, I don't
think it's a reason to hold up the hash security fix, even if it means
spinning 3.2.4 not long after PyCon to sort out the Xcode build
problems).

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From nad at acm.org  Sun Feb  5 22:12:43 2012
From: nad at acm.org (Ned Deily)
Date: Sun, 05 Feb 2012 22:12:43 +0100
Subject: [Python-Dev] peps: Update with bugfix releases.
References: <20120205204551.Horde.NCdeYVNNcXdPLtxvnkzi1lA@webmail.df.eu>
Message-ID: 

In article ,
 Nick Coghlan wrote:
> Because Apple wasn't publishing versions of gcc-llvm that miscompile
> Python when those releases were made.

More importantly, Apple removed gcc-4.2 with the current versions of
Xcode 4, and the Pythons installed by our current installers require
gcc-4.2 to build extension modules.  That will be changed, but the
situation is much more complex than when the previous set of releases
went out.
> (However, that's just a clarification of what changed to break the Mac
> OS X builds, I don't think it's a reason to hold up the hash security
> fix, even if it means spinning 3.2.4 not long after PyCon to sort out
> the Xcode build problems).

I don't think it is a service to any of our users to hurry out two
releases with minimal testing and with the knowledge that a major
platform is crippled and with the expectation that another set of
releases will be issued within 4 to 6 weeks, all just because of a
fairly obscure problem that has been around for years (even if not
publicized).  Releases add a lot of work and risk for everyone in the
Python chain, especially distributors of Python and end-users.  That's
just my take on it, of course.  I can live with either option.

-- 
Ned Deily,
nad at acm.org

From ben+python at benfinney.id.au  Sun Feb  5 22:42:27 2012
From: ben+python at benfinney.id.au (Ben Finney)
Date: Mon, 06 Feb 2012 08:42:27 +1100
Subject: [Python-Dev] Volunteer
References: 
Message-ID: <87mx8x3qkc.fsf@benfinney.id.au>

Blockheads Oi Oi writes:

> I would like to give it another go.

Welcome back.

Your signature shows the name "Mark Lawrence". It would help with
initial impressions if your "From" field, instead of the pseudonym
currently shown, shows your name. Could you please change it to that?

-- 
 \       "I washed a sock. Then I put it in the dryer. When I took it |
  `\                            out, it was gone."  --Steven Wright |
_o__)                                                                  |
Ben Finney

From barry at python.org  Mon Feb  6 00:01:58 2012
From: barry at python.org (Barry Warsaw)
Date: Sun, 5 Feb 2012 18:01:58 -0500
Subject: [Python-Dev] peps: Update with bugfix releases.
In-Reply-To: 
References: 
Message-ID: <20120205180158.2bf49ba2@limelight.wooz.org>

On Feb 05, 2012, at 02:25 PM, Benjamin Peterson wrote:

> The reason 3.2.3 is so soon is the need to patch the hash collision
> attack.

Also remember that we are coordinating releases between several versions
of Python for this issue, some of which are in security-only mode.  The
RMs of the active stable branches agree it's best to get these
coordinated security releases out as soon as possible.

-Barry

From steve at pearwood.info  Mon Feb  6 01:33:58 2012
From: steve at pearwood.info (Steven D'Aprano)
Date: Mon, 06 Feb 2012 11:33:58 +1100
Subject: [Python-Dev] PEP 408 -- Standard library __preview__ package
In-Reply-To: 
References: <20120127160934.2ad5e0bf@pitrou.net> <4F2C20A7.9060303@simplistix.co.uk>
	<4F2C6B4D.7020608@pearwood.info> <4F2D1591.9000806@pearwood.info>
Message-ID: <4F2F1FF6.7070607@pearwood.info>

Paul Moore wrote:
> On 4 February 2012 11:25, Steven D'Aprano wrote:
>> It strikes me that it would be helpful sometimes to programmatically
>> recognise "preview" modules in the std lib. Could we have a
>> recommendation in PEP 8 that such modules should have a global variable
>> called PREVIEW, and non-preview modules should not, so that the
>> recommended way of telling them apart is with hasattr(module, "PREVIEW")?
>
> In what situation would you want that when you weren't referring to a
> specific module? If you're referring to a specific module and you
> really care, just check sys.version. (That's annoying and ugly enough
> that it'd probably make you think about why you are doing it - I
> cannot honestly think of a case where I'd actually want to check in
> code if a module is a preview - hence my question as to what your use
> case is).

What's the use-case for any sort of introspection functionality?
I would say that the ability to perform introspection is valuable in
and of itself, regardless of any other concrete benefits.

We expect that modules may change APIs between the preview and
non-preview ("stable") releases. I can see value in (say) being
forewarned of API changes from the interactive interpreter, without
having to troll through documentation looking for changes, or waiting
for an exception. Or having to remember exactly which version modules
were added in, and when they left preview. (Will this *always* be one
release later? I doubt it.)

If you don't believe that preview modules will change APIs, or that it
would be useful to detect this programmatically when using such a
module, then there's probably nothing I can say to convince you
otherwise. But I think it will be useful. Python itself has a
sys.version so you can detect feature sets and changes in semantics;
this is just the same thing, only milder.

The one obvious way[1] is to explicitly tag modules as preview, and the
simplest way to do this is with an attribute. (Non-preview modules
shouldn't have the attribute at all -- non-preview is the default
state, averaged over the entire lifetime of a module in the standard
library.)

It would be just nice to sit down at the interactive interpreter and
see whether a module you just imported was preview or not, without
having to look it up in the docs. I do nearly everything at the
interpreter: I read docs using help(), I check where modules are
located using module.__file__. This is just more of the same.

Some alternatives:

1) Don't try to detect whether it is a preview module, but use EAFP to
detect features that have changed:

try:
    result = spam.foo(x, y)
except AttributeError:
    # Must be a preview release. Do something else.
    result = spam.bar(y, x)

This is preferred so long as the differences between preview and stable
releases are big, obvious changes like a function being renamed. But if
there are subtle changes that you care about, things get dicey. spam.foo
may not raise an exception, but just do something completely unexpected.

2) As you suggest, write version-specific code:

if sys.version >= "3.4":
    result = spam.foo(x, y)
else:
    # Preview release.
    result = spam.bar(y, x)

This starts to get messy fast, particularly if (worst case, and I
*really* hope this doesn't happen!) modules get put into preview, then
get withdrawn, then a few releases later get put back in. This sort of
mess shouldn't ever happen with non-preview modules, but preview modules
explicitly have weaker guarantees. And I can never remember when modules
were added to the std lib.

> Feels like YAGNI to me.

When people talk about YAGNI, they are referring to the principle that
you shouldn't waste time and effort over-engineering a complex solution
or providing significant additional functionality for no obvious gain.
I don't think that

PREVIEW = True

in a module *quite* counts as over-engineered.

[1] Disclaimer: I am not Dutch.

-- 
Steven

From brett at python.org  Mon Feb  6 01:39:55 2012
From: brett at python.org (Brett Cannon)
Date: Sun, 5 Feb 2012 19:39:55 -0500
Subject: [Python-Dev] [Python-checkins] cpython (3.2): remove unused import
In-Reply-To: 
References: 
Message-ID: 

I'm going to assume pylint or pyflakes would throw too many warnings on
the stdlib, but would it be worth someone's time to write a simple
unused import checker to run over the stdlib on occasion? I bet even one
that did nothing more than a regex search for matched import statements
would be good enough.
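Something quick and dirty in that spirit might look like the sketch
below -- a rough illustration only (a real checker would want the AST
rather than regexes, and this one ignores multi-line imports):

    import re
    import sys

    IMPORT_RE = re.compile(r'^\s*(?:from\s+[\w.]+\s+)?import\s+(.+)$')

    def imported_names(line):
        # Pull the bound names out of a single-line import statement.
        match = IMPORT_RE.match(line)
        if not match:
            return []
        names = []
        for part in match.group(1).split(','):
            part = part.strip().strip('()')
            if ' as ' in part:
                part = part.split(' as ')[-1]
            if part and part != '*':
                names.append(part.split('.')[0].strip())
        return names

    def check(path):
        with open(path, encoding='utf-8') as source:
            lines = source.readlines()
        # Search for each imported name in the non-import lines only.
        body = ''.join(l for l in lines if not IMPORT_RE.match(l))
        first_seen = {}
        for lineno, line in enumerate(lines, 1):
            for name in imported_names(line):
                first_seen.setdefault(name, lineno)
        for name, lineno in sorted(first_seen.items(), key=lambda kv: kv[1]):
            if not re.search(r'\b%s\b' % re.escape(name), body):
                print('%s:%s: %r looks unused' % (path, lineno, name))

    if __name__ == '__main__':
        for path in sys.argv[1:]:
            check(path)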
On Fri, Feb 3, 2012 at 19:09, benjamin.peterson wrote: > http://hg.python.org/cpython/rev/9eb5fec8674b > changeset: 74749:9eb5fec8674b > branch: 3.2 > parent: 74746:5eb47e1732a0 > user: Benjamin Peterson > date: Fri Feb 03 19:07:30 2012 -0500 > summary: > remove unused import > > files: > Lib/threading.py | 1 - > 1 files changed, 0 insertions(+), 1 deletions(-) > > > diff --git a/Lib/threading.py b/Lib/threading.py > --- a/Lib/threading.py > +++ b/Lib/threading.py > @@ -5,7 +5,6 @@ > > from time import time as _time, sleep as _sleep > from traceback import format_exc as _format_exc > -from collections import deque > from _weakrefset import WeakSet > > # Note regarding PEP 8 compliant names > > -- > Repository URL: http://hg.python.org/cpython From lists at cheimes.de Mon Feb 6 01:53:48 2012 From: lists at cheimes.de (Christian Heimes) Date: Mon, 06 Feb 2012 01:53:48 +0100 Subject: [Python-Dev] cpython (3.2): remove unused import In-Reply-To: References: Message-ID: Am 06.02.2012 01:39, schrieb Brett Cannon: > I'm going to assume pylint or pyflakes would throw too many warnings on > the stdlib, but would it be worth someone's time to write a simple > unused import checker to run over the stdlib on occasion? I bet even one > that did nothing more than a regex search for matched import statements > would be good enough. Zope 3 has an import checker that uses the compiler package and AST tree to check for unused imports. It seems like a better approach than a simple regex search. http://svn.zope.org/Zope3/trunk/utilities/importchecker.py?rev=25177&view=auto The importorder tool uses the tokenizer module to order import statements. http://svn.zope.org/Zope3/trunk/utilities/importorder.py?rev=25177&view=auto Both are written by Jim Fulton. Christian From ericsnowcurrently at gmail.com Mon Feb 6 04:25:51 2012 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Sun, 5 Feb 2012 20:25:51 -0700 Subject: [Python-Dev] PEP 408 -- Standard library __preview__ package In-Reply-To: <4F2F1FF6.7070607@pearwood.info> References: <20120127160934.2ad5e0bf@pitrou.net> <4F2C20A7.9060303@simplistix.co.uk> <4F2C6B4D.7020608@pearwood.info> <4F2D1591.9000806@pearwood.info> <4F2F1FF6.7070607@pearwood.info> Message-ID: On Feb 5, 2012 5:36 PM, "Steven D'Aprano" wrote: > > Paul Moore wrote: [...] > > What's the use-case for any sort of introspection functionality?
I would say that the ability to perform introspection is valuable in and of itself, regardless of any other concrete benefits. [...] > When people talk about YAGNI, they are referring to the principle that you shouldn't waste time and effort over-engineering a complex solution or providing significant additional functionality for no obvious gain. I don't think that > > PREVIEW = True > > in a module *quite* counts as over-engineered. How about sys.preview_modules to list all the preview modules in the current release? This would be useful at the interactive prompt, at the least. -eric From g.brandl at gmx.net Mon Feb 6 07:11:04 2012 From: g.brandl at gmx.net (Georg Brandl) Date: Mon, 06 Feb 2012 07:11:04 +0100 Subject: [Python-Dev] peps: Update with bugfix releases.
In-Reply-To: <20120205180158.2bf49ba2@limelight.wooz.org> References: <20120205180158.2bf49ba2@limelight.wooz.org> Message-ID: Am 06.02.2012 00:01, schrieb Barry Warsaw: > On Feb 05, 2012, at 02:25 PM, Benjamin Peterson wrote: > >>The reason 3.2.3 is so soon is the need to patch the hash collision attack. > > Also remember that we are coordinating releases between several versions of > Python for this issue, some of which are in security-only mode. The RMs of > the active stable branches agree it's best to get these coordinated security > releases out as soon as possible. Well, one way to do it would be to release an rc now-ish, giving the community time to test it, and to already use it productively in critical cases, and release the final with the OSX fixes after/at PyCon. Georg From breamoreboy at yahoo.co.uk Mon Feb 6 09:50:13 2012 From: breamoreboy at yahoo.co.uk (Mark Lawrence) Date: Mon, 06 Feb 2012 08:50:13 +0000 Subject: [Python-Dev] Volunteer In-Reply-To: <87mx8x3qkc.fsf@benfinney.id.au> References: <87mx8x3qkc.fsf@benfinney.id.au> Message-ID: On 05/02/2012 21:42, Ben Finney wrote: > Blockheads Oi Oi writes: > >> I would like to give it another go. > > Welcome back. > > Your signature shows the name "Mark Lawrence". It would help with > initial impressions if your "From" field, instead of the pseudonym > currently shown, showed your name. Could you please change it to that? > Done :) -- Cheers. Mark Lawrence. From matteo at naufraghi.net Mon Feb 6 10:35:55 2012 From: matteo at naufraghi.net (Matteo Bertini) Date: Mon, 6 Feb 2012 10:35:55 +0100 Subject: [Python-Dev] distutils 'depends' management In-Reply-To: <985f54b13038751cb945fc56db91743d@netwok.org> References: <985f54b13038751cb945fc56db91743d@netwok.org> Message-ID: On Fri, Feb 3, 2012 at 5:52 PM, Éric Araujo wrote: > Hi Matteo, > > Now setup.py will rebuild all every time, this is because the policy of >> newer_group in build_extension is to consider 'newer' any missing file. >> > > Here you certainly mean "older". > No, and this is the problem: newer_group(depends, ext_path, 'newer')) if (some dep is newer than the target): rebuild > [...] Can someone suggest me the reason of this choice >> > > distutils' notion of dependencies directly comes from make. A missing > (not existing) target is perfectly normal: it's usually a generated file > that make needs to create (i.e. compile from source files). In this > world, you want to (re-)compile when the target is older than the > sources, or when the target is missing. > Here is a simple Makefile that has the behavior I was expecting from distutils too: $ cat Makefile all: missing.dep echo "Done!" $ make make: *** No rule to make target `missing.dep', needed by `all'. Stop. So here your extension module is a target that needs to be created, and > when distutils does not find a file with the name you give in depends, it > just thinks it's another thing that will be generated. > So, if I understand correctly, starting today a better name could be 'generates' instead of 'depends'? This model is inherently prone to typos; I'm not sure how we can improve > it to let people catch possible typos. > Yes, perhaps the name of the list and the explanation in the docs are both a bit confusing: http://docs.python.org/distutils/apiref.html#distutils.ccompiler.CCompiler.compile *depends*, if given, is a list of filenames that all targets depend on. If > a source file is older than any file in depends, then the source file will > be recompiled. Can this be a better explanation?
"If a source file is older than any file in depends {+or if some depend is missing+}" Cheers -- Matteo Bertini http://www.slug.it/naufraghi -------------- next part -------------- An HTML attachment was scrubbed... URL: From eliben at gmail.com Mon Feb 6 13:29:29 2012 From: eliben at gmail.com (Eli Bendersky) Date: Mon, 6 Feb 2012 14:29:29 +0200 Subject: [Python-Dev] Fixing the XML batteries In-Reply-To: References: Message-ID: On Fri, Dec 9, 2011 at 10:02, Stefan Behnel wrote: > Hi everyone, > > I think Py3.3 would be a good milestone for cleaning up the stdlib support > for XML. Note upfront: you may or may not know me as the maintainer of lxml, > the de-facto non-stdlib standard Python XML tool. This (lengthy) post was > triggered by the following kind of conversation that I keep having with new > XML users in Python (mostly on c.l.py), which hints at some serious flaw in > the stdlib. > AFAIU nothing really happened with this. The discussion started with a lot of +1s but then got derailed. The related Issue 11379 also got stuck nearly two months ago. It would be great if some sort of consensus could be reached here, since this is an important issue :-) Eli From ironfroggy at gmail.com Mon Feb 6 13:48:00 2012 From: ironfroggy at gmail.com (Calvin Spealman) Date: Mon, 6 Feb 2012 07:48:00 -0500 Subject: [Python-Dev] Fixing the XML batteries In-Reply-To: References: Message-ID: On Dec 9, 2011 3:04 AM, "Stefan Behnel" wrote: > > Hi everyone, > > I think Py3.3 would be a good milestone for cleaning up the stdlib support for XML. Note upfront: you may or may not know me as the maintainer of lxml, the de-facto non-stdlib standard Python XML tool. This (lengthy) post was triggered by the following kind of conversation that I keep having with new XML users in Python (mostly on c.l.py), which hints at some serious flaw in the stdlib. > > User: I'm trying to do XML stuff XYZ in Python and have problem ABC. > Me: What library are you using? Could you show us some code? > User: My code looks like this snippet: ... > Me: You are using minidom which is known to be hard to use, slow and uses lots of memory. Use the xml.etree.ElementTree package instead, or rather its C implementation cElementTree, also in the stdlib. > User (coming back after a while): thanks, that was exactly what [I didn't know] I was looking for. > > What does this tell us? > > 1) MiniDOM is what new users find first. It's highly visible because there are still lots of ancient "Python and XML" web pages out there that date back from the time before Python 2.5 (or rather something like 2.2), when it was the only XML tree library in the stdlib. It's also the first hit from the top when you search for "XML" on the stdlib docs page and contains the (to some people) familiar word "DOM", which lets users stop their search and start writing code, not expecting to find a separate alternative in the same stdlib, way further down. And the description as "mini", "simple" and "lightweight" suggests to users that it's going to be easy to use and efficient. > > 2) MiniDOM is not what users want. It leads to complicated, unpythonic code and lots of problems. It is neither easy to use, nor efficient, nor "lightweight", "simple" or "mini", not in absolute numbers (see http://bugs.python.org/issue11379#msg148584 and following for a recent discussion). 
It's also badly maintained in the sense that its performance characteristics could likely be improved, but no-one is seriously interested in doing that, because it would not lead to something that actually *is* fast or memory friendly compared to any of the 'real' alternatives that are available right now. > > 3) ElementTree is what users should use, MiniDOM is not. ElementTree was added to the stdlib in Py2.5 on popular demand, exactly because it is very easy to use, very fast, and very memory friendly. And because users did not want to use MiniDOM any more. Today, ElementTree has a rather straight upgrade path towards lxml.etree if more XML features like validation or XSLT are needed. MiniDOM has nothing like that to offer. It's a dead end. > > 4) In the stdlib, cElementTree is independent of ElementTree, but totally hidden in the documentation. In conversations like the above, it's unnecessarily complex to explain to users that there is ElementTree (which is documented in the stdlib), but that what they want to use is really cElementTree, which has the same API but does not have a stdlib documentation page that I can send them to. Note that the other Python implementations simply provide cElementTree as an alias for ElementTree. That leaves CPython as the only Python implementation that really has these two separate modules. > > So, there are many problems here. And I think they make it unnecessarily complicated for users to process XML in Python and that the current situation helps in turning away new users from Python as a language for XML processing. Python does have impressively great tools for working with XML. It's just that the stdlib and its documentation do not reflect or even appreciate that. > > What should change? > > a) The stdlib documentation should help users to choose the right tool right from the start. Instead of using the totally misleading wording that it uses now, it should be honest about the performance characteristics of MiniDOM and should actively suggest that those who don't know what to choose (or even *that* they can choose) should not use MiniDOM in the first place. I created a ticket (issue11379) for a minor step in this direction, but given the responses, I'm rather convinced that there's a lot more that can be done and should be done, and that it should be done now, right for the next release. > > b) cElementTree should finally lose its "special" status as a separate library and disappear as an accelerator module behind ElementTree. This has been suggested a couple of times already, and AFAIR, there was some opposition because 1) ET was maintained outside of the stdlib and 2) the APIs of both were not identical. However, getting ET 1.3 into Py2.7 and 3.2 was a U-turn. Today, ET is *only* being maintained in the stdlib by Florent Xicluna (who is doing a good job with it), and ET 1.3 has basically made the APIs of both implementations compatible again. So, 3.3 would be the right milestone for fixing the "two libs for one" quirk. > > Given that this is the third time during the last couple of years that I'm suggesting to finally fix the stdlib and its documentation, I won't provide any further patches before it has finally been accepted that a) this is a problem and b) it should be fixed, thus allowing the patches to actually serve a purpose. If we can agree on that, I'll happily help in making this change happen.
> > Stefan > > this gets a strong +1 from me and, I suspect, anyone else who spends a significant amount of time in any of the python support communities (python-list, #python, etc). Defaults exist not only in our code, but also in our documentation and presentation, and those defaults are wrong here. From eliben at gmail.com Mon Feb 6 14:01:57 2012 From: eliben at gmail.com (Eli Bendersky) Date: Mon, 6 Feb 2012 15:01:57 +0200 Subject: [Python-Dev] Fixing the XML batteries In-Reply-To: References: Message-ID: > What should change? > > a) The stdlib documentation should help users to choose the right tool right > from the start. [...] On one hand I agree that ET should be emphasized since it's the better API with a much faster implementation. But I also understand Martin's point of view that minidom has its place, so IMHO some sort of compromise should be reached. Perhaps we can recommend using ET for those not specifically interested in the DOM interface, but for those who *are*, minidom is still a good stdlib option (?). Tying this doc clarification with an optimization in minidom is not something that makes sense. This is just delaying a much needed change forever. > > b) cElementTree should finally lose its "special" status as a separate > library and disappear as an accelerator module behind ElementTree. [...] This, at least in my view, is the more important point which unfortunately got much less attention in the thread. I was a bit shocked to see that in 3.3 trunk we still have both the Python and C versions exposed and only formally document ElementTree (the Python version). The only reference to cElementTree is an un-emphasized note: A C implementation of this API is available as xml.etree.cElementTree. Is there anything that *really* blocks providing cElementTree on "import ElementTree" and removing the explicit cElementTree for 3.3 (or at least leaving it with a deprecation warning)?
Eli From brett at python.org Mon Feb 6 15:57:56 2012 From: brett at python.org (Brett Cannon) Date: Mon, 6 Feb 2012 09:57:56 -0500 Subject: [Python-Dev] need help with frozen module/marshal/gc issue involving sub-interpreters for importlib bootstrapping Message-ID: So my grand quest for bootstrapping importlib into CPython is damn close to coming to fruition; I have one nasty bug blocking my way and I can't figure out what could be causing it. I'm hoping someone here will either know the solution off the top of their head or will have the time to have a quick look to see if they can figure it out as my brain is mush at this point. First, the bug tracking all of this is http://bugs.python.org/issue2377 and the repo where I have been doing my work is ssh:// hg at hg.python.org/sandbox/bcannon/#bootstrap_importlib (change as needed if you want an HTTPS checkout). Everything works fine as long as you don't use sub-interpreters via test_capi (sans some test failures based on some assumptions which can easily be fixed; the bug I'm talking about is the only real showstopper at this point). Here is the issue: if you run test_capi the code triggers an assertion of ``test_subinterps (__main__.TestPendingCalls) ... Assertion failed: (gc->gc.gc_refs != 0), function visit_decref, file Modules/gcmodule.c, line 327.``. If you run the test under gdb you will discover that the assertion is related to ref counts when collecting for a generation (basically the ref updating is hitting 0 when it shouldn't). Now the odd thing is that this is happening while importing frozen module code (something I didn't touch) which is calling marshal (something else I didn't touch) and while it is in the middle of unmarshaling the frozen module code it is triggering the assertion. Does anyone have any idea what is going on? Am I possibly doing something stupid with refcounts which is only manifesting when using sub-interpreters? All relevant code for bootstrapping is contained in Python/pythonrun.c:import_init() (with a little tweaking in the _io module to delay importing the os module and making import.c always use __import__ instead of using the C code). I'm storing the __import__ function in the PyInterpreterState to keep separate state from the other interpreters (i.e. separate sys modules so as to use the proper sys.modules, etc.). But as I said, this all works in a single interpreter view of the world (the entire test suite doesn't trigger a nasty error like this). Thanks for any help people can provide me on this now 5 year quest to get this work finished. -Brett From barry at python.org Mon Feb 6 17:00:50 2012 From: barry at python.org (Barry Warsaw) Date: Mon, 6 Feb 2012 11:00:50 -0500 Subject: [Python-Dev] peps: Update with bugfix releases. In-Reply-To: References: <20120205180158.2bf49ba2@limelight.wooz.org> Message-ID: <20120206110050.5b26ff83@resist.wooz.org> On Feb 06, 2012, at 07:11 AM, Georg Brandl wrote: >Well, one way to do it would be to release an rc now-ish, giving the community >time to test it, and to already use it productively in critical cases, and >release the final with the OSX fixes after/at PyCon. That could work well. I'd be happy to release a 2.6.8 rc next week.
-Barry From guido at python.org Mon Feb 6 17:05:24 2012 From: guido at python.org (Guido van Rossum) Date: Mon, 6 Feb 2012 08:05:24 -0800 Subject: [Python-Dev] need help with frozen module/marshal/gc issue involving sub-interpreters for importlib bootstrapping In-Reply-To: References: Message-ID: Usually this means that you're not doing an INCREF in a place where you should, and the object is kept alive by something else. Do you know which object it is? That might really help... Possibly deleting the last subinterpreter makes the refcount of that object go to zero. Of course it could also be that you're doing a DECREF you shouldn't be doing... But the identity of the object seems key in any case. --Guido On Mon, Feb 6, 2012 at 6:57 AM, Brett Cannon wrote: > So my grand quest for bootstrapping importlib into CPython is damn close > to coming to fruition; I have one nasty bug blocking my way and I can't > figure out what could be causing it. [...] > Thanks for any help people can provide me on this now 5 year quest to get > this work finished. > > -Brett -- --Guido van Rossum (python.org/~guido)
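(One low-tech, pure-Python way to chase the "identity of the object" question before reaching for gdb is to ask the gc module who still holds a reference. This is only an illustrative sketch, assuming the suspect object can be reached from a test:)

import gc
import sys

def report_keepers(obj):
    # sys.getrefcount() includes the temporary reference created by
    # the call itself, hence the - 1.
    print('refcount:', sys.getrefcount(obj) - 1)
    for referrer in gc.get_referrers(obj):
        # The repr usually makes the surviving owner (a module dict,
        # an interned tuple, a cached frame) easy to recognize.
        print(type(referrer).__name__, repr(referrer)[:80])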
From jimjjewett at gmail.com Mon Feb 6 17:28:40 2012 From: jimjjewett at gmail.com (Jim Jewett) Date: Mon, 6 Feb 2012 11:28:40 -0500 Subject: [Python-Dev] Is this safe enough? Re: [Python-checkins] cpython: _Py_Identifier are always ASCII strings Message-ID: I realize that _Py_Identifier is a private name, and that PEP 3131 requires anything (except test cases) in the standard library to stick with ASCII ... but somehow, that feels like too long of a chain. I would prefer to see _Py_Identifier renamed to _Py_ASCII_Identifier, or at least a comment stating that Identifiers will (per PEP 3131) always be ASCII -- preferably with an assert to back that up. -jJ On Sat, Feb 4, 2012 at 7:46 PM, victor.stinner wrote: > http://hg.python.org/cpython/rev/d2c1521ad0a1 > changeset: 74772:d2c1521ad0a1 > user: Victor Stinner > date: Sun Feb 05 01:45:45 2012 +0100 > summary: > _Py_Identifier are always ASCII strings > > files: > Objects/unicodeobject.c | 5 ++--- > 1 files changed, 2 insertions(+), 3 deletions(-) > > > diff --git a/Objects/unicodeobject.c b/Objects/unicodeobject.c > --- a/Objects/unicodeobject.c > +++ b/Objects/unicodeobject.c > @@ -1744,9 +1744,8 @@ > _PyUnicode_FromId(_Py_Identifier *id) > { > if (!id->object) { > - id->object = PyUnicode_DecodeUTF8Stateful(id->string, > - strlen(id->string), > - NULL, NULL); > + id->object = unicode_fromascii((unsigned char*)id->string, > + strlen(id->string)); > if (!id->object) > return NULL; > PyUnicode_InternInPlace(&id->object); > > -- > Repository URL: http://hg.python.org/cpython From benjamin at python.org Mon Feb 6 17:32:09 2012 From: benjamin at python.org (Benjamin Peterson) Date: Mon, 6 Feb 2012 11:32:09 -0500 Subject: [Python-Dev] need help with frozen module/marshal/gc issue involving sub-interpreters for importlib bootstrapping In-Reply-To: References: Message-ID: 2012/2/6 Brett Cannon : > Thanks for any help people can provide me on this now 5 year quest to get > this work finished. Fixed. (_PyExc_Init was behaving badly.) -- Regards, Benjamin From martin at v.loewis.de Mon Feb 6 18:13:00 2012 From: martin at v.loewis.de (martin at v.loewis.de) Date: Mon, 06 Feb 2012 18:13:00 +0100 Subject: [Python-Dev] Is this safe enough? Re: [Python-checkins] cpython: _Py_Identifier are always ASCII strings In-Reply-To: References: Message-ID: <20120206181300.Horde.9mBXbFNNcXdPMAocJPTAMHA@webmail.df.eu> > I would prefer to see _Py_Identifier renamed to _Py_ASCII_Identifier, > or at least a comment stating that Identifiers will (per PEP 3131) > always be ASCII -- preferably with an assert to back that up. Please ... no. This is a *convenience* interface, whose sole purpose is to make something more convenient. Adding naming clutter destroys this objective. I'd rather restore support for allowing UTF-8 source here (I don't think that requiring ASCII really improves much), than rename the macro. The ASCII requirement is actually more in the C compiler than in Python. Since not all of the C compilers that we compile Python with support non-ASCII identifiers, failure to comply with the ASCII requirement will trigger a C compilation failure.
Regards, Martin From brett at python.org Mon Feb 6 18:57:38 2012 From: brett at python.org (Brett Cannon) Date: Mon, 6 Feb 2012 12:57:38 -0500 Subject: [Python-Dev] cpython (3.2): remove unused import In-Reply-To: References: Message-ID: On Sun, Feb 5, 2012 at 19:53, Christian Heimes wrote: > Am 06.02.2012 01:39, schrieb Brett Cannon: [...] > Zope 3 has an import checker that uses the compiler package and AST tree > to check for unused imports. It seems like a better approach than a > simple regex search. > > http://svn.zope.org/Zope3/trunk/utilities/importchecker.py?rev=25177&view=auto > > The importorder tool uses the tokenizer module to order import statements. > > http://svn.zope.org/Zope3/trunk/utilities/importorder.py?rev=25177&view=auto > > Both are written by Jim Fulton. > Ah, but does it run against Python 3? If so then this is something to suggest on python-mentor for someone to get their feet wet for contributing. From lists at cheimes.de Mon Feb 6 19:07:51 2012 From: lists at cheimes.de (Christian Heimes) Date: Mon, 06 Feb 2012 19:07:51 +0100 Subject: [Python-Dev] cpython (3.2): remove unused import In-Reply-To: References: Message-ID: <4F3016F7.7010002@cheimes.de> Am 06.02.2012 18:57, schrieb Brett Cannon: > Ah, but does it run against Python 3? If so then this is something to > suggest on python-mentor for someone to get their feet wet for contributing. Probably not, the code was last modified seven years ago. The compiler package has been removed from Python 3, too. A similar approach should yield better results than a simple regexp search. The 2to3 / 3to2 infrastructure could be reused to parse the AST and search for imports and used names. From brett at python.org Mon Feb 6 19:28:55 2012 From: brett at python.org (Brett Cannon) Date: Mon, 6 Feb 2012 13:28:55 -0500 Subject: [Python-Dev] cpython (3.2): remove unused import In-Reply-To: <4F3016F7.7010002@cheimes.de> References: <4F3016F7.7010002@cheimes.de> Message-ID: On Mon, Feb 6, 2012 at 13:07, Christian Heimes wrote: > Am 06.02.2012 18:57, schrieb Brett Cannon: [...] > A similar approach should yield better results than a simple regexp > search. The 2to3 / 3to2 infrastructure could be reused to parse the AST > and search for imports and used names. > If that's the case I might as well add it as part of my mnfy project's verification run I do over the stdlib if someone doesn't beat me to it.
From francismb at email.de Mon Feb 6 19:52:54 2012 From: francismb at email.de (francis) Date: Mon, 06 Feb 2012 19:52:54 +0100 Subject: [Python-Dev] cpython (3.2): remove unused import In-Reply-To: References: <4F3016F7.7010002@cheimes.de> Message-ID: <4F302186.5010401@email.de> Hi Brett, > If that's the case I might as well add it as part of my mnfy project's > verification run I do over the stdlib if someone doesn't beat me to it. Is that devinabox? Thanks in advance! francis From tjreedy at udel.edu Mon Feb 6 20:35:55 2012 From: tjreedy at udel.edu (Terry Reedy) Date: Mon, 06 Feb 2012 14:35:55 -0500 Subject: [Python-Dev] Fixing the XML batteries In-Reply-To: References: Message-ID: On 2/6/2012 8:01 AM, Eli Bendersky wrote: > On one hand I agree that ET should be emphasized since it's the better > API with a much faster implementation. But I also understand Martin's > point of view that minidom has its place, so IMHO some sort of > compromise should be reached. Perhaps we can recommend using ET for > those not specifically interested in the DOM interface, but for those > who *are*, minidom is still a good stdlib option (?). If you can, go ahead and write a patch saying something like that. It should not be hard to come up with something that is a definite improvement. Create a tracker issue for comment, but don't let it sit forever. > Tying this doc clarification with an optimization in minidom is not > something that makes sense. This is just delaying a much needed change > forever. Right. > This, at least in my view, is the more important point which > unfortunately got much less attention in the thread. I was a bit > shocked to see that in 3.3 trunk we still have both the Python and C > versions exposed and only formally document ElementTree (the Python > version). The only reference to cElementTree is an un-emphasized note: > > A C implementation of this API is available as xml.etree.cElementTree. Since the current policy seems to be to hide C behind Python when there is both, I assume that finishing the transition here is something just not gotten around to yet. Open another issue if there is not one. > Is there anything that *really* blocks providing cElementTree on > "import ElementTree" and removing the explicit cElementTree for 3.3 > (or at least leaving it with a deprecation warning)? If cElementTree were renamed _ElementTree for import from ElementTree, then a new cElementTree.py could raise the warning and then import _ElementTree also. -- Terry Jan Reedy From solipsis at pitrou.net Mon Feb 6 20:49:48 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 6 Feb 2012 20:49:48 +0100 Subject: [Python-Dev] importlib quest References: Message-ID: <20120206204948.184cddf8@pitrou.net> On Mon, 6 Feb 2012 09:57:56 -0500 Brett Cannon wrote: > Thanks for any help people can provide me on this now 5 year quest to get > this work finished. Do you have any plan to solve the performance issue?
From solipsis at pitrou.net Mon Feb 6 21:44:48 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 6 Feb 2012 21:44:48 +0100 Subject: [Python-Dev] importlib quest References: <20120206204948.184cddf8@pitrou.net> Message-ID: <20120206214448.41d0e782@pitrou.net> On Mon, 6 Feb 2012 20:49:48 +0100 Antoine Pitrou wrote: > On Mon, 6 Feb 2012 09:57:56 -0500 > Brett Cannon wrote: > > Thanks for any help people can provide me on this now 5 year quest to get > > this work finished. > > Do you have any plan to solve the performance issue? [...] The culprit for the overhead is likely to be PathFinder.find_module: $ ./python -m timeit -s "import sys; mod='struct'; from importlib._bootstrap import _DefaultPathFinder; finder=_DefaultPathFinder" "finder.find_module('struct')" 1000 loops, best of 3: 355 usec per loop $ ./python -S -m timeit -s "import sys; mod='struct'; from importlib._bootstrap import _DefaultPathFinder; finder=_DefaultPathFinder" "finder.find_module('struct')" 10000 loops, best of 3: 176 usec per loop Note how it's dependent on sys.path length. On an installed Python with many additional sys.path entries (e.g. because of distribute-based module installs), import times will be much worse. Regards Antoine. From brett at python.org Mon Feb 6 21:50:59 2012 From: brett at python.org (Brett Cannon) Date: Mon, 6 Feb 2012 15:50:59 -0500 Subject: [Python-Dev] need help with frozen module/marshal/gc issue involving sub-interpreters for importlib bootstrapping In-Reply-To: References: Message-ID: On Mon, Feb 6, 2012 at 11:32, Benjamin Peterson wrote: > 2012/2/6 Brett Cannon : > > Thanks for any help people can provide me on this now 5 year quest to get > > this work finished. > > Fixed. (_PyExc_Init was behaving badly.) That did it! Thanks, Benjamin! Doing one more -uall test run before I declare the bootstrap working. From ncoghlan at gmail.com Mon Feb 6 21:58:10 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 7 Feb 2012 06:58:10 +1000 Subject: [Python-Dev] cpython (3.2): remove unused import In-Reply-To: <4F302186.5010401@email.de> References: <4F3016F7.7010002@cheimes.de> <4F302186.5010401@email.de> Message-ID: On Tue, Feb 7, 2012 at 4:52 AM, francis wrote: > Hi Brett, > >> If that's the case I might as well add it as part of my mnfy project's >> verification run I do over the stdlib if someone doesn't beat me to it. > > Is that devinabox? No, it's Brett's Python minifier: http://pypi.python.org/pypi/mnfy devinabox is the "everything you need to get started with contributing to CPython and Python standard library development" Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From victor.stinner at haypocalc.com Mon Feb 6 22:57:46 2012 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Mon, 6 Feb 2012 22:57:46 +0100 Subject: [Python-Dev] Is this safe enough? Re: [Python-checkins] cpython: _Py_Identifier are always ASCII strings In-Reply-To: References: Message-ID: 2012/2/6 Jim Jewett : > I realize that _Py_Identifier is a private name, and that PEP 3131 > requires anything (except test cases) in the standard library to stick > with ASCII ...
but somehow, that feels like too long of a chain. > > I would prefer to see _Py_Identifier renamed to _Py_ASCII_Identifier, > or at least a comment stating that Identifiers will (per PEP 3131) > always be ASCII -- preferably with an assert to back that up. _Py_IDENTIFIER(xxx) defines a variable called PyId_xxx, so xxx can only be ASCII: the C language doesn't accept non-ASCII identifiers. I thought that the _Py_IDENTIFIER() macro was the only way to create an identifier and so ASCII was enough... but there is also _Py_static_string. _Py_static_string(name, value) allows to specify an arbitrary string, so you may pass a non-ASCII value. I don't see any use case where you need a non-ASCII value in Python core. >> - id->object = PyUnicode_DecodeUTF8Stateful(id->string, >> - strlen(id->string), >> - NULL, NULL); >> + id->object = unicode_fromascii((unsigned char*)id->string, >> + strlen(id->string)); This is just an optimization. If you think that _Py_static_string() is useful, I can revert my change. Otherwise, _Py_static_string() should be removed. Victor From solipsis at pitrou.net Mon Feb 6 22:59:36 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 6 Feb 2012 22:59:36 +0100 Subject: [Python-Dev] Is this safe enough? Re: [Python-checkins] cpython: _Py_Identifier are always ASCII strings References: Message-ID: <20120206225936.6b07672e@pitrou.net> On Mon, 6 Feb 2012 22:57:46 +0100 Victor Stinner wrote: [...] > This is just an optimization. Is the optimization even worthwhile? This code is typically called once for every static string. Regards Antoine. From martin at v.loewis.de Tue Feb 7 09:23:34 2012 From: martin at v.loewis.de (Martin v. Löwis) Date: Tue, 07 Feb 2012 09:23:34 +0100 Subject: [Python-Dev] Is this safe enough? Re: [Python-checkins] cpython: _Py_Identifier are always ASCII strings In-Reply-To: References: Message-ID: <4F30DF86.4030709@v.loewis.de> > _Py_IDENTIFIER(xxx) defines a variable called PyId_xxx, so xxx can > only be ASCII: the C language doesn't accept non-ASCII identifiers. That's not exactly true. In C89, source code is in the "source character set", which is implementation-defined, except that it must contain the "basic character set". I believe that it allows for implementation-defined characters in identifiers. In C99, this is extended to include "universal character names" (\u escapes). They may appear in identifiers as long as the characters named are listed in annex D.59 (which I cannot locate). In C 2011, annexes D.1 and D.2 specify the characters that you can use in an identifier: D.1 Ranges of characters allowed 1. 00A8, 00AA, 00AD, 00AF, 00B2-00B5, 00B7-00BA, 00BC-00BE, 00C0-00D6, 00D8-00F6, 00F8-00FF 2. 0100-167F, 1681-180D, 180F-1FFF 3. 200B-200D, 202A-202E, 203F-2040, 2054, 2060-206F 4. 2070-218F, 2460-24FF, 2776-2793, 2C00-2DFF, 2E80-2FFF 5. 3004-3007, 3021-302F, 3031-303F 6. 3040-D7FF 7. F900-FD3D, FD40-FDCF, FDF0-FE44, FE47-FFFD 8.
10000-1FFFD, 20000-2FFFD, 30000-3FFFD, 40000-4FFFD, 50000-5FFFD, 60000-6FFFD, 70000-7FFFD, 80000-8FFFD, 90000-9FFFD, A0000-AFFFD, B0000-BFFFD, C0000-CFFFD, D0000-DFFFD, E0000-EFFFD D.2 Ranges of characters disallowed initially 1. 0300-036F, 1DC0-1DFF, 20D0-20FF, FE20-FE2F Regards, Martin From victor.stinner at haypocalc.com Tue Feb 7 09:55:06 2012 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Tue, 7 Feb 2012 09:55:06 +0100 Subject: [Python-Dev] Is this safe enough? Re: [Python-checkins] cpython: _Py_Identifier are always ASCII strings In-Reply-To: <4F30DF86.4030709@v.loewis.de> References: <4F30DF86.4030709@v.loewis.de> Message-ID: 2012/2/7 "Martin v. Löwis" : >> _Py_IDENTIFIER(xxx) defines a variable called PyId_xxx, so xxx can >> only be ASCII: the C language doesn't accept non-ASCII identifiers. > > That's not exactly true. In C89, source code is in the "source character > set", which is implementation-defined, except that it must contain > the "basic character set". I believe that it allows for > implementation-defined characters in identifiers. Hum, I hope that these C89 compilers use UTF-8. > In C99, this is > extended to include "universal character names" (\u escapes). They may > appear in identifiers > as long as the characters named are listed in annex D.59 (which I cannot > locate). Does C99 specify the encoding? Can we expect UTF-8? Python is supposed to work on many platforms and so support a lot of compilers, not only compilers supporting non-ASCII identifiers. Victor From dreamingforward at gmail.com Tue Feb 7 17:55:12 2012 From: dreamingforward at gmail.com (Mark Janssen) Date: Tue, 7 Feb 2012 09:55:12 -0700 Subject: [Python-Dev] [Python-ideas] matrix operations on dict :) Message-ID: On Mon, Feb 6, 2012 at 6:12 PM, Steven D'Aprano wrote: > On Mon, Feb 06, 2012 at 09:01:29PM +0100, julien tayon wrote: > > Hello, > > > > Proposing vector operations on dict, and acknowledging there was an > > homeomorphism from rooted n-ary trees to dict, was inducing the > > possibility of making matrix of dict / trees. > > This seems interesting to me, but I don't see that they are important > enough to be built-in to dicts. [...] > > Otherwise, this looks rather like a library of functions looking for a > use. It might help if you demonstrate what concrete problems this helps > you solve. > > I have the problem looking for this solution! The application for this functionality is in coding a fractal graph (or "multigraph" in the literature). This is the most powerful structure that Computer Science has ever conceived. If you look at the evolution of data structures in compsci, the fractal graph is the ultimate. From lists to trees to graphs to multigraphs. The latter elements can always encompass the former with only O(1) extra cost. It has the potential to encode *any* relationship from the very small to the very large (as well as across or *laterally*) in one unified structure. Optimize this one data structure and the whole standard library could be refactored and simplified by an order of magnitude. Not only that, it will pave the way for the "re-factored" internet that's being worked on which creates a content-centric Internet beyond the graph-level, hypertext internet. Believe, it will be awesome. Slowing down.... mark
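(An aside on the dict-arithmetic part of the quoted proposal: the closest existing precedent in the stdlib is collections.Counter, whose + and - operators already do elementwise work on dict-like values and make a useful baseline for comparison:)

from collections import Counter

a = Counter(x=1, y=2)
b = Counter(y=3, z=4)
print(a + b)  # Counter({'y': 5, 'z': 4, 'x': 1})
print(a - b)  # Counter({'x': 1}); negative counts are dropped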
From brett at python.org Tue Feb 7 19:26:17 2012 From: brett at python.org (Brett Cannon) Date: Tue, 7 Feb 2012 13:26:17 -0500 Subject: [Python-Dev] importlib quest In-Reply-To: <20120206204948.184cddf8@pitrou.net> References: <20120206204948.184cddf8@pitrou.net> Message-ID: On Mon, Feb 6, 2012 at 14:49, Antoine Pitrou wrote: > On Mon, 6 Feb 2012 09:57:56 -0500 > Brett Cannon wrote: > > Thanks for any help people can provide me on this now 5 year quest to get > > this work finished. > > Do you have any plan to solve the performance issue? > I have not even looked at performance or attempted to profile the code, so I suspect there is room for improvement. [...] > Startup time is already much worse in 3.3 than in 2.7. With such a > slowdown in importing fresh modules, applications using many batteries > (third-party or not) will be heavily impacted. > I have a benchmark suite for importing modules directly at importlib.test.benchmark, but it doesn't explicitly cover searching far down sys.path. I will see if any of the existing tests implicitly do that and if not add it. From greg at krypto.org Tue Feb 7 20:10:05 2012 From: greg at krypto.org (Gregory P. Smith) Date: Tue, 7 Feb 2012 11:10:05 -0800 Subject: [Python-Dev] Is this safe enough? Re: [Python-checkins] cpython: _Py_Identifier are always ASCII strings In-Reply-To: References: <4F30DF86.4030709@v.loewis.de> Message-ID: Why do we still care about C89? It is 2012 and we're talking about Python 3. What compiler on what platform that anyone actually cares about does not support C99? -gps From amauryfa at gmail.com Tue Feb 7 20:31:00 2012 From: amauryfa at gmail.com (Amaury Forgeot d'Arc) Date: Tue, 7 Feb 2012 20:31:00 +0100 Subject: [Python-Dev] Is this safe enough? Re: [Python-checkins] cpython: _Py_Identifier are always ASCII strings In-Reply-To: References: <4F30DF86.4030709@v.loewis.de> Message-ID: 2012/2/7 Gregory P. Smith > Why do we still care about C89? It is 2012 and we're talking about > Python 3. What compiler on what platform that anyone actually cares > about does not support C99? > The Microsoft compilers on Windows do not support C99: - Declarations must be at the start of a block - No designated initializers for structures - ASCII-only identifiers: http://msdn.microsoft.com/en-us/library/e7f8y25b.aspx -- Amaury Forgeot d'Arc From brett at python.org Tue Feb 7 21:07:24 2012 From: brett at python.org (Brett Cannon) Date: Tue, 7 Feb 2012 15:07:24 -0500 Subject: [Python-Dev] requirements for moving __import__ over to importlib? Message-ID: I'm going to start this off with the caveat that hg.python.org/sandbox/bcannon#bootstrap_importlib is not completely at feature parity, but getting there shouldn't be hard. There is a FAILING file that has a list of the tests that are not passing because of importlib bootstrapping and a comment as to why (I think) they are failing. But no switch would ever happen until the test suite passes.
Anyway, to start this conversation I'm going to open with why I think removing most of the C code in Python/import.c and replacing it with importlib/_bootstrap.py is a positive thing. One is maintainability. Antoine mentioned how if change occurs everyone is going to have to be able to fix code in importlib, and that's the point! I don't know about the rest of you but I find Python code easier to work with than C code (and if you don't you might be subscribed to the wrong mailing list =). I would assume the ability to make changes or to fix bugs will be a lot easier with importlib than import.c. So maintainability should be easier when it comes to imports. Two is APIs. PEP 302 introduced this idea of an API for objects that can perform imports so that people can control it, enhance it, introspect it, etc. But as it stands right now, import.c implements none of PEP 302 for any built-in import mechanism. This mostly stems from positive thing #1 I just mentioned, but since I was able to do this code from scratch I was able to design for (and extend) PEP 302 compliance in order to make sure the entire import system was exposed cleanly. This means it is much easier now to write a custom importer for quirky syntax, a different storage mechanism, etc. Three is multi-VM support. IronPython, Jython, and PyPy have all said they would love importlib to become the default import implementation so that all VMs have the same implementation. Some people have even said they will use importlib regardless of what CPython does simply to ease their coding burden, but obviously that still leads to the possibility of subtle semantic differences that would go away if all VMs used the same implementation. So switching would lead to one less possible semantic difference between the various VMs. So, those are the positives. What are the negatives? Performance, of course. Now I'm going to be upfront and say I really did not want to have this performance conversation now as I have done *NO* profiling or analysis of the algorithms used in importlib in order to tune performance (e.g. the function that handles case-sensitivity, which is on the critical path for importing source code, has a platform check which could go away if I instead had platform-specific versions of the function that were assigned to a global variable at startup). I also know that people have a bad habit of latching on to micro-benchmark numbers, especially for something like import which involves startup or can easily be measured. I mean I wrote importlib.test.benchmark to help measure the performance impact of any algorithmic changes I might make, but it isn't a real-world benchmark like what Unladen Swallow gave us (e.g. the two start-up benchmarks that use real-world apps -- hg and bzr -- aren't available on Python 3 so only normal_startup and nosite_startup can be used ATM). IOW I really do not look forward to someone saying "importlib is so much slower at importing a module containing ``pass``" when (a) that never happens, and (b) most programs do not spend their time importing but instead doing interesting work. For instance, right now importlib runs ``python -c "import decimal"`` (which, BTW, is the largest module in the stdlib) 25% slower on my machine with a pydebug build (a non-debug build would probably be in my favor as I have more Python objects being used in importlib and thus more sanity checks).
But if you do something (very) slightly more interesting like ``python -m calendar`` where there is a slight amount of work then importlib is currently only 16% slower. So it all depends on how we measure (as usual). So, if there is going to be some baseline performance target I need to hit to make people happy I would prefer to know what that (real-world) benchmark is and what the performance target is going to be on a non-debug build. And if people are not worried about the performance then I'm happy with that as well. =) From barry at python.org Tue Feb 7 21:24:35 2012 From: barry at python.org (Barry Warsaw) Date: Tue, 7 Feb 2012 15:24:35 -0500 Subject: [Python-Dev] requirements for moving __import__ over to importlib? In-Reply-To: References: Message-ID: <20120207152435.379ac6f4@resist.wooz.org> Brett, thanks for persevering on importlib! Given how complicated imports are in Python, I really appreciate you pushing this forward. I've been knee deep in both import.c and importlib at various times. ;) On Feb 07, 2012, at 03:07 PM, Brett Cannon wrote: >One is maintainability. Antoine mentioned how if change occurs everyone is >going to have to be able to fix code in importlib, and that's the point! I >don't know about the rest of you but I find Python code easier to work with >than C code (and if you don't you might be subscribed to the wrong mailing >list =). I would assume the ability to make changes or to fix bugs will be >a lot easier with importlib than import.c. So maintainability should be >easier when it comes to imports. I think it's *really* critical that importlib be well-documented. Not just its API, but also design documents (what classes are there, and why it's decomposed that way), descriptions of how to extend and subclass, maybe even examples for doing some typical hooks. Maybe even a guided tour or tutorial for people digging into importlib for the first time. >So, those are the positives. What are the negatives? Performance, of course. That's okay. Get it complete, right, and usable first and then unleash the Pythonic hordes to bang on performance. >IOW I really do not look forward to someone saying "importlib is so much >slower at importing a module containing ``pass``" when (a) that never >happens, and (b) most programs do not spend their time importing but >instead doing interesting work. Identifying the use cases is important here. For example, even if it were a lot slower, Mailman wouldn't care (*I* might care because it takes longer to run my test, but my users wouldn't). But Bazaar or Mercurial users would care
Anyway, I think there was enough of a python3 port for Mercurial (from various GSoC students) that you can probably run some of the very simple commands (like hg parents or hg id), which should be enough for your purposes, right? Cheers, Dirkjan From solipsis at pitrou.net Tue Feb 7 21:49:48 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 7 Feb 2012 21:49:48 +0100 Subject: [Python-Dev] requirements for moving __import__ over to importlib? References: Message-ID: <20120207214948.38d4503e@pitrou.net> On Tue, 7 Feb 2012 15:07:24 -0500 Brett Cannon wrote: > > Now I'm going to be upfront and say I really did not want to have this > performance conversation now as I have done *NO* profiling or analysis of > the algorithms used in importlib in order to tune performance (e.g. the > function that handles case-sensitivity, which is on the critical path for > importing source code, has a platform check which could go away if I > instead had platform-specific versions of the function that were assigned > to a global variable at startup). From a cursory look, I think you're gonna have to break (special-case) some abstractions and have some inner loop coded in C for the common cases. That said, I think profiling and solving performance issues is critical *before* integrating this work. It doesn't need to be done by you, but the python-dev community shouldn't feel strong-armed to solve the issue. > IOW I really do not look forward to someone saying "importlib is so much > slower at importing a module containing ``pass``" when (a) that never > happens, and (b) most programs do not spend their time importing but > instead doing interesting work. Well, import time is so important that the Mercurial developers have written an on-demand import mechanism, to reduce the latency of command-line operations. But it's not only important for Mercurial and the like. Even if you're developing a Web app, making imports slower will make restarts slower, and development more tedious in the first place. > So, if there is going to be some baseline performance target I need to hit > to make people happy I would prefer to know what that (real-world) > benchmark is and what the performance target is going to be on a non-debug > build.

- No significant slowdown in startup time.
- Within 25% of current performance when importing, say, the "struct" module (Lib/struct.py) from bytecode.

Regards Antoine. From p.f.moore at gmail.com Tue Feb 7 22:19:19 2012 From: p.f.moore at gmail.com (Paul Moore) Date: Tue, 7 Feb 2012 21:19:19 +0000 Subject: [Python-Dev] requirements for moving __import__ over to importlib? In-Reply-To: <20120207214948.38d4503e@pitrou.net> References: <20120207214948.38d4503e@pitrou.net> Message-ID: On 7 February 2012 20:49, Antoine Pitrou wrote: > Well, import time is so important that the Mercurial developers have > written an on-demand import mechanism, to reduce the latency of > command-line operations. One question here, I guess - does the importlib integration do anything to make writing on-demand import mechanisms easier (I'd suspect not, but you never know...) If it did, then performance issues might be somewhat less of a sticking point, as usual depending on use cases. Paul. From martin at v.loewis.de Tue Feb 7 22:38:44 2012 From: martin at v.loewis.de ("Martin v. Löwis") Date: Tue, 07 Feb 2012 22:38:44 +0100 Subject: [Python-Dev] Is this safe enough?
Re: [Python-checkins] cpython: _Py_Identifier are always ASCII strings In-Reply-To: References: <4F30DF86.4030709@v.loewis.de> Message-ID: <4F3199E4.6030107@v.loewis.de> > Does C99 specify the encoding? Can we expect UTF-8? No, it's implementation-defined. However, that really doesn't matter much for the macro (it does matter for the Mercurial repository): The files on disk are mapped, in an implementation-defined manner, into the source character set. All processing is done there, including any stringification. Then, for string literals, the source character set is converted into the execution character set. So for the definition of the _Py_identifier macro, it really matters what the run-time encoding of the stringified identifiers is. > Python is supposed to work on many platforms and so support a lot of > compilers, not only compilers supporting non-ASCII identifiers. And your point is? Regards, Martin From martin at v.loewis.de Tue Feb 7 22:41:37 2012 From: martin at v.loewis.de ("Martin v. Löwis") Date: Tue, 07 Feb 2012 22:41:37 +0100 Subject: [Python-Dev] Is this safe enough? Re: [Python-checkins] cpython: _Py_Identifier are always ASCII strings In-Reply-To: References: <4F30DF86.4030709@v.loewis.de> Message-ID: <4F319A91.4020500@v.loewis.de> On 07.02.2012 20:10, Gregory P. Smith wrote: > Why do we still care about C89? It is 2012 and we're talking about > Python 3. What compiler on what platform that anyone actually cares > about does not support C99? As Amaury says: Visual Studio still doesn't support C99. The story is both funny and sad: In Visual Studio 2002, the release notes included a comment that they couldn't consider C99 (in 2002), because of lack of time, and the standard came so quickly. In 2003, they kept this notice. In VS 2005 (IIRC), they said that there was too little customer demand for C99 so that they didn't implement it; they recommended to use C++ or C#, anyway. Now C2011 has been published. Regards, Martin From pje at telecommunity.com Tue Feb 7 22:51:05 2012 From: pje at telecommunity.com (PJ Eby) Date: Tue, 7 Feb 2012 16:51:05 -0500 Subject: [Python-Dev] requirements for moving __import__ over to importlib? In-Reply-To: References: Message-ID: On Tue, Feb 7, 2012 at 3:07 PM, Brett Cannon wrote: > So, if there is going to be some baseline performance target I need to hit > to make people happy I would prefer to know what that (real-world) > benchmark is and what the performance target is going to be on a non-debug > build. And if people are not worried about the performance then I'm happy > with that as well. =) > One thing I'm a bit worried about is repeated imports, especially ones that are inside frequently-called functions. In today's versions of Python, this is a performance win for "command-line tool platform" systems like Mercurial and PEAK, where you want to delay importing as long as possible, in case the code that needs the import is never called at all... but, if it *is* used, you may still need to use it a lot of times. When writing that kind of code, I usually just unconditionally import inside the function, because the C code check for an already-imported module is faster than the Python "if" statement I'd have to clutter up my otherwise-clean function with. So, in addition to the things other people have mentioned as performance targets, I'd like to keep the slowdown factor low for this type of scenario as well.
Specifically, the slowdown shouldn't be so much as to motivate lazy importers like Mercurial and PEAK to need to rewrite in-function imports to do the already-imported check ourselves. ;-) (Disclaimer: I haven't actually seen Mercurial's delayed/dynamic import code, so I can't say for 100% sure if they'd be affected the same way.) -------------- next part -------------- An HTML attachment was scrubbed... URL: From victor.stinner at haypocalc.com Tue Feb 7 23:05:51 2012 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Tue, 7 Feb 2012 23:05:51 +0100 Subject: [Python-Dev] Is this safe enough? Re: [Python-checkins] cpython: _Py_Identifier are always ASCII strings In-Reply-To: <20120206181300.Horde.9mBXbFNNcXdPMAocJPTAMHA@webmail.df.eu> References: <20120206181300.Horde.9mBXbFNNcXdPMAocJPTAMHA@webmail.df.eu> Message-ID: > I'd rather restore support for allowing UTF-8 source here (I don't think > that requiring ASCII really improves much), than rename the macro. Done, I reverted my change. Victor From brett at python.org Tue Feb 7 23:16:18 2012 From: brett at python.org (Brett Cannon) Date: Tue, 7 Feb 2012 17:16:18 -0500 Subject: [Python-Dev] requirements for moving __import__ over to importlib? In-Reply-To: <20120207214948.38d4503e@pitrou.net> References: <20120207214948.38d4503e@pitrou.net> Message-ID: On Tue, Feb 7, 2012 at 15:49, Antoine Pitrou wrote: > On Tue, 7 Feb 2012 15:07:24 -0500 > Brett Cannon wrote: > > > > Now I'm going to be upfront and say I really did not want to have this > > performance conversation now as I have done *NO* profiling or analysis of > > the algorithms used in importlib in order to tune performance (e.g. the > > function that handles case-sensitivity, which is on the critical path for > > importing source code, has a platform check which could go away if I > > instead had platform-specific versions of the function that were assigned > > to a global variable at startup). > > >From a cursory look, I think you're gonna have to break (special-case) > some abstractions and have some inner loop coded in C for the common > cases. > Wouldn't shock me if it came to that, but obviously I would like to try to avoid it. > > That said, I think profiling and solving performance issues is critical > *before* integrating this work. It doesn't need to be done by you, but > the python-dev community shouldn't feel strong-armed to solve the issue. > > That part of the discussion I'm staying out of since I want to see this in so I'm biased. > > IOW I really do not look forward to someone saying "importlib is so much > > slower at importing a module containing ``pass``" when (a) that never > > happens, and (b) most programs do not spend their time importing but > > instead doing interesting work. > > Well, import time is so important that the Mercurial developers have > written an on-demand import mechanism, to reduce the latency of > command-line operations. > Sure, but they are a somewhat extreme case. > > But it's not only important for Mercurial and the like. Even if you're > developing a Web app, making imports slower will make restarts slower, > and development more tedious in the first place. > > Fine, startup cost from a hard crash I can buy when you are getting 1000 QPS, but development more tedious? > > So, if there is going to be some baseline performance target I need to > hit > > to make people happy I would prefer to know what that (real-world) > > benchmark is and what the performance target is going to be on a > non-debug > > build. 
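Back on PJ's repeated-import scenario, the two idioms he contrasts look like this -- a minimal illustration, with arbitrary module and function names:

    import sys

    def clean(data):
        # Unconditional in-function import: after the first call this is a
        # cheap, C-level lookup of the already-imported module.
        import json
        return json.dumps(data)

    def cluttered(data):
        # The hand-rolled already-imported check he'd rather not write:
        json = sys.modules.get('json')
        if json is None:
            import json
        return json.dumps(data)

The worry is that a pure-Python __import__ makes the first version slow enough to push people toward the second.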
> > - No significant slowdown in startup time. > What's significant and measuring what exactly? I mean startup already has a ton of imports as it is, so this would wash out the point of measuring practically anything else for anything small. This is why I said I want a benchmark to target which does actual work since flat-out startup time measures nothing meaningful but busy work. I would get more out of code that just stat'ed every file in Lib since at least that did some work. > > - Within 25% of current performance when importing, say, the "struct" > module (Lib/struct.py) from bytecode. > Why struct? It's such a small module that it isn't really a typical module. The median file size of Lib is 11K (e.g. tabnanny.py), not 238 bytes (which is barely past Hello World). And is this just importing struct or is this from startup, e.g. ``python -c "import struct"``? -------------- next part -------------- An HTML attachment was scrubbed... URL: From brett at python.org Tue Feb 7 23:17:38 2012 From: brett at python.org (Brett Cannon) Date: Tue, 7 Feb 2012 17:17:38 -0500 Subject: [Python-Dev] requirements for moving __import__ over to importlib? In-Reply-To: <20120207152435.379ac6f4@resist.wooz.org> References: <20120207152435.379ac6f4@resist.wooz.org> Message-ID: On Tue, Feb 7, 2012 at 15:24, Barry Warsaw wrote: > Brett, thanks for persevering on importlib! Given how complicated imports > are > in Python, I really appreciate you pushing this forward. I've been knee > deep > in both import.c and importlib at various times. ;) > > On Feb 07, 2012, at 03:07 PM, Brett Cannon wrote: > > >One is maintainability. Antoine mentioned how if change occurs everyone is > >going to have to be able to fix code in importlib, and that's the point! > I > >don't know about the rest of you but I find Python code easier to work > with > >than C code (and if you don't you might be subscribed to the wrong mailing > >list =). I would assume the ability to make changes or to fix bugs will be > >a lot easier with importlib than import.c. So maintainability should be > >easier when it comes to imports. > > I think it's *really* critical that importlib be well-documented. Not just > its API, but also design documents (what classes are there, and why it's > decomposed that way), descriptions of how to extend and subclass, maybe > even > examples for doing some typical hooks. Maybe even a guided tour or > tutorial > for people digging into importlib for the first time. > That's fine and not difficult to do. > > >So, those are the positives. What are the negatives? Performance, of course. > > That's okay. Get it complete, right, and usable first and then unleash the > Pythonic hordes to bang on performance. > > >IOW I really do not look forward to someone saying "importlib is so much > >slower at importing a module containing ``pass``" when (a) that never > >happens, and (b) most programs do not spend their time importing but > >instead doing interesting work. > > Identifying the use cases is important here. For example, even if it > were a > lot slower, Mailman wouldn't care (*I* might care because it takes longer > to > run my test, but my users wouldn't). But Bazaar or Mercurial users would > care > a lot. > Right, which is why I'm looking for some agreed upon, concrete benchmark I can use which isn't fluff.
-Brett > > -Barry > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > http://mail.python.org/mailman/options/python-dev/brett%40python.org > -------------- next part -------------- An HTML attachment was scrubbed... URL: From brett at python.org Tue Feb 7 23:21:20 2012 From: brett at python.org (Brett Cannon) Date: Tue, 7 Feb 2012 17:21:20 -0500 Subject: [Python-Dev] requirements for moving __import__ over to importlib? In-Reply-To: References: <20120207214948.38d4503e@pitrou.net> Message-ID: On Tue, Feb 7, 2012 at 16:19, Paul Moore wrote: > On 7 February 2012 20:49, Antoine Pitrou wrote: > > Well, import time is so important that the Mercurial developers have > > written an on-demand import mechanism, to reduce the latency of > > command-line operations. > > One question here, I guess - does the importlib integration do > anything to make writing on-demand import mechanisms easier (I'd > suspect not, but you never know...) If it did, then performance issues > might be somewhat less of a sticking point, as usual depending on use > cases. Depends on what your feature set is. I have a fully working mixin you can add to any loader which makes it lazy if you trigger the import on reading an attribute from the module: http://code.google.com/p/importers/source/browse/importers/lazy.py . But if you want to trigger the import on *writing* an attribute then I have yet to make that work in Python source (maybe people have an idea on how to make that work since __setattr__ doesn't mix well with __getattribute__). -------------- next part -------------- An HTML attachment was scrubbed... URL: From brett at python.org Tue Feb 7 23:21:59 2012 From: brett at python.org (Brett Cannon) Date: Tue, 7 Feb 2012 17:21:59 -0500 Subject: [Python-Dev] requirements for moving __import__ over to importlib? In-Reply-To: References: <20120207152435.379ac6f4@resist.wooz.org> Message-ID: On Tue, Feb 7, 2012 at 15:28, Dirkjan Ochtman wrote: > On Tue, Feb 7, 2012 at 21:24, Barry Warsaw wrote: > > Identifying the use cases is important here. For example, even if it > were a > > lot slower, Mailman wouldn't care (*I* might care because it takes > longer to > > run my test, but my users wouldn't). But Bazaar or Mercurial users > would care > > a lot. > > Yeah, startup performance getting worse kinda sucks for command-line > apps. And IIRC it's been getting worse over the past few releases... > > Anyway, I think there was enough of a python3 port for Mercurial (from > various GSoC students) that you can probably run some of the very > simple commands (like hg parents or hg id), which should be enough for > your purposes, right? > Possibly. Where is the code? -------------- next part -------------- An HTML attachment was scrubbed... URL: From brett at python.org Tue Feb 7 23:24:21 2012 From: brett at python.org (Brett Cannon) Date: Tue, 7 Feb 2012 17:24:21 -0500 Subject: [Python-Dev] requirements for moving __import__ over to importlib? In-Reply-To: References: Message-ID: On Tue, Feb 7, 2012 at 16:51, PJ Eby wrote: > On Tue, Feb 7, 2012 at 3:07 PM, Brett Cannon wrote: > >> So, if there is going to be some baseline performance target I need to >> hit to make people happy I would prefer to know what that (real-world) >> benchmark is and what the performance target is going to be on a non-debug >> build. And if people are not worried about the performance then I'm happy >> with that as well.
=) >> > > One thing I'm a bit worried about is repeated imports, especially ones > that are inside frequently-called functions. In today's versions of > Python, this is a performance win for "command-line tool platform" systems > like Mercurial and PEAK, where you want to delay importing as long as > possible, in case the code that needs the import is never called at all... > but, if it *is* used, you may still need to use it a lot of times. > > When writing that kind of code, I usually just unconditionally import > inside the function, because the C code check for an already-imported > module is faster than the Python "if" statement I'd have to clutter up my > otherwise-clean function with. > > So, in addition to the things other people have mentioned as performance > targets, I'd like to keep the slowdown factor low for this type of scenario > as well. Specifically, the slowdown shouldn't be so much as to motivate > lazy importers like Mercurial and PEAK to need to rewrite in-function > imports to do the already-imported check ourselves. ;-) > > (Disclaimer: I haven't actually seen Mercurial's delayed/dynamic import > code, so I can't say for 100% sure if they'd be affected the same way.) > IOW you want the sys.modules case fast, which I will never be able to match compared to C code since that is pure execution with no I/O. -------------- next part -------------- An HTML attachment was scrubbed... URL: From greg at krypto.org Tue Feb 7 23:24:32 2012 From: greg at krypto.org (Gregory P. Smith) Date: Tue, 7 Feb 2012 14:24:32 -0800 Subject: [Python-Dev] which C language standard CPython must conform to Message-ID: On Tue, Feb 7, 2012 at 1:41 PM, "Martin v. Löwis" wrote: > On 07.02.2012 20:10, Gregory P. Smith wrote: >> Why do we still care about C89? It is 2012 and we're talking about >> Python 3. What compiler on what platform that anyone actually cares >> about does not support C99? > > As Amaury says: Visual Studio still doesn't support C99. The story is > both funny and sad: In Visual Studio 2002, the release notes included > a comment that they couldn't consider C99 (in 2002), because of lack of > time, and the standard came so quickly. In 2003, they kept this notice. > In VS 2005 (IIRC), they said that there was too little customer demand > for C99 so that they didn't implement it; they recommended to use C++ > or C#, anyway. Now C2011 has been published. Thanks! I've probably asked this question before. Maybe I'll learn this time. ;) Some quick searching shows that there is at least hope Microsoft is on board with C++11 (not so surprising, their crown jewels are written in C++). We should at some point demand a C++ compiler for CPython and pick a subset of C++ features to allow use of, but that is likely reserved for the Python 4 timeframe (a topic for another thread and time entirely, it isn't feasible for today's codebase). In that timeframe another alternative question may make sense to ask: Do we need a single unified all-platform-from-one-codebase python interpreter? If we can get other VM implementations up to date language feature wise and manage to sufficiently decouple standard library development from CPython itself that becomes possible. One of the difficulties with that would obviously be new language feature development if it meant updating more than one VM at a time in order to ship an implementation of a new PEP.
-gps From solipsis at pitrou.net Tue Feb 7 23:42:24 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 7 Feb 2012 23:42:24 +0100 Subject: [Python-Dev] requirements for moving __import__ over to importlib? References: Message-ID: <20120207234224.1ae8602e@pitrou.net> On Tue, 7 Feb 2012 17:24:21 -0500 Brett Cannon wrote: > > IOW you want the sys.modules case fast, which I will never be able to match > compared to C code since that is pure execution with no I/O. Why wouldn't you continue using C code for that? It's trivial (just a dict lookup). Regards Antoine. From barry at python.org Tue Feb 7 23:59:03 2012 From: barry at python.org (Barry Warsaw) Date: Tue, 7 Feb 2012 17:59:03 -0500 Subject: [Python-Dev] requirements for moving __import__ over to importlib? In-Reply-To: References: <20120207214948.38d4503e@pitrou.net> Message-ID: <20120207175903.31bc2122@resist.wooz.org> On Feb 07, 2012, at 09:19 PM, Paul Moore wrote: >One question here, I guess - does the importlib integration do >anything to make writing on-demand import mechanisms easier (I'd >suspect not, but you never know...) If it did, then performance issues >might be somewhat less of a sticking point, as usual depending on use >cases. It might even be a feature-win if a standard on-demand import mechanism could be added on top of importlib so all these projects wouldn't have to roll their own. -Barry From solipsis at pitrou.net Wed Feb 8 00:08:37 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 8 Feb 2012 00:08:37 +0100 Subject: [Python-Dev] requirements for moving __import__ over to importlib? In-Reply-To: References: <20120207214948.38d4503e@pitrou.net> Message-ID: <20120208000837.79dcb863@pitrou.net> On Tue, 7 Feb 2012 17:16:18 -0500 Brett Cannon wrote: > > > > IOW I really do not look forward to someone saying "importlib is so much > > > slower at importing a module containing ``pass``" when (a) that never > > > happens, and (b) most programs do not spend their time importing but > > > instead doing interesting work. > > > > Well, import time is so important that the Mercurial developers have > > written an on-demand import mechanism, to reduce the latency of > > command-line operations. > > > > Sure, but they are a somewhat extreme case. I don't think Mercurial is extreme. Any command-line tool written in Python applies. For example, yum (Fedora's apt-get) is written in Python. And I'm sure many people do small administration scripts in Python. These tools may then be run in a loop by whatever other script.
I don't understand your sentence. Yes, startup has a ton of imports and that's why I'm fearing it may be negatively impacted :) ("a ton" being a bit less than 50 currently) > This is why I said I want a > benchmark to target which does actual work since flat-out startup time > measures nothing meaningful but busy work. "Actual work" can be very small in some cases. For example, if you run "hg branch" I'm quite sure it doesn't do a lot of work except importing many modules and then reading a single file in .hg (the one named ".hg/branch" probably, but I'm not a Mercurial dev). In the absence of more "real world" benchmarks, I think the startup benchmarks in the benchmarks repo are a good baseline. That said you could also install my 3.x port of Twisted here: https://bitbucket.org/pitrou/t3k/ and then run e.g. "python3 bin/trial -h". > I would get more out of code > that just stat'ed every file in Lib since at least that did some work. stat()ing files is not really representative of import work. There are many indirections in the import machinery. (actually, even import.c appears quite a bit slower than a bunch of stat() calls would imply) > > - Within 25% of current performance when importing, say, the "struct" > > module (Lib/struct.py) from bytecode. > > Why struct? It's such a small module that it isn't really a typical module. Precisely to measure the overhead. Typical module size will vary depending on development style. Some people may prefer writing many small modules. Or they may be using many small libraries, or using libraries that have adopted such a development style. Measuring the overhead on small modules will make sure we aren't overly confident. > The median file size of Lib is 11K (e.g. tabnanny.py), not 238 bytes (which > is barely past Hello World). And is this just importing struct or is this > from startup, e.g. ``python -c "import struct"``? Just importing struct, as with the timeit snippets in the other thread. Regards Antoine. From alex.gaynor at gmail.com Wed Feb 8 00:26:21 2012 From: alex.gaynor at gmail.com (Alex Gaynor) Date: Tue, 7 Feb 2012 23:26:21 +0000 (UTC) Subject: [Python-Dev] requirements for moving __import__ over to importlib? References: Message-ID: Brett Cannon <brett at python.org> writes: > IOW you want the sys.modules case fast, which I will never be able to match compared to C code since that is pure execution with no I/O. > Sure you can: have a really fast Python VM. Constructive: if you can run this code under PyPy it'd be easy to just:

    $ pypy -mtimeit "import struct"
    $ pypy -mtimeit -s "import importlib" "importlib.import_module('struct')"

Or whatever the right API is. Alex From tjreedy at udel.edu Wed Feb 8 00:40:37 2012 From: tjreedy at udel.edu (Terry Reedy) Date: Tue, 07 Feb 2012 18:40:37 -0500 Subject: [Python-Dev] requirements for moving __import__ over to importlib? In-Reply-To: References: Message-ID: On 2/7/2012 4:51 PM, PJ Eby wrote: > One thing I'm a bit worried about is repeated imports, especially ones > that are inside frequently-called functions. In today's versions of > Python, this is a performance win for "command-line tool platform" > systems like Mercurial and PEAK, where you want to delay importing as > long as possible, in case the code that needs the import is never called > at all... but, if it *is* used, you may still need to use it a lot of > times.
> > When writing that kind of code, I usually just unconditionally import > inside the function, because the C code check for an already-imported > module is faster than the Python "if" statement I'd have to clutter up > my otherwise-clean function with. importlib could provide a parameterized decorator for functions that are the only consumers of an import. It could operate much like this:

    def imps(mod):
        def makewrap(f):
            def wrapped(*args, **kwds):
                print('first/only call to wrapper')
                g = globals()
                g[mod] = __import__(mod)
                g[f.__name__] = f  # rebind the name so later calls skip the wrapper
                return f(*args, **kwds)  # return the result (the original dropped it)
            wrapped.__name__ = f.__name__
            return wrapped
        return makewrap

    @imps('itertools')
    def ic():
        print(itertools.count)

    ic()
    ic()
    # "first/only call to wrapper" is printed on the first call only

-- Terry Jan Reedy From victor.stinner at haypocalc.com Wed Feb 8 01:02:20 2012 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Wed, 8 Feb 2012 01:02:20 +0100 Subject: [Python-Dev] Add a new "locale" codec? Message-ID: Hi, I added PyUnicode_DecodeLocale(), PyUnicode_DecodeLocaleAndSize() and PyUnicode_EncodeLocale() to Python 3.3 to fix bugs. I hesitate to expose this codec in Python: it can be useful in some cases, especially if you need to interact with C functions. The glib library has functions using the *current* locale encoding, g_locale_from_utf8() for example. Related issue with more information: http://bugs.python.org/issue13619 Victor From pje at telecommunity.com Wed Feb 8 03:27:04 2012 From: pje at telecommunity.com (PJ Eby) Date: Tue, 7 Feb 2012 21:27:04 -0500 Subject: [Python-Dev] requirements for moving __import__ over to importlib? In-Reply-To: References: Message-ID: On Tue, Feb 7, 2012 at 5:24 PM, Brett Cannon wrote: > > On Tue, Feb 7, 2012 at 16:51, PJ Eby wrote: > >> On Tue, Feb 7, 2012 at 3:07 PM, Brett Cannon wrote: >> >>> So, if there is going to be some baseline performance target I need to >>> hit to make people happy I would prefer to know what that (real-world) >>> benchmark is and what the performance target is going to be on a non-debug >>> build. And if people are not worried about the performance then I'm happy >>> with that as well. =) >>> >> >> One thing I'm a bit worried about is repeated imports, especially ones >> that are inside frequently-called functions. In today's versions of >> Python, this is a performance win for "command-line tool platform" systems >> like Mercurial and PEAK, where you want to delay importing as long as >> possible, in case the code that needs the import is never called at all... >> but, if it *is* used, you may still need to use it a lot of times. >> >> When writing that kind of code, I usually just unconditionally import >> inside the function, because the C code check for an already-imported >> module is faster than the Python "if" statement I'd have to clutter up my >> otherwise-clean function with. >> >> So, in addition to the things other people have mentioned as performance >> targets, I'd like to keep the slowdown factor low for this type of scenario >> as well. Specifically, the slowdown shouldn't be so much as to motivate >> lazy importers like Mercurial and PEAK to need to rewrite in-function >> imports to do the already-imported check ourselves. ;-) >> >> (Disclaimer: I haven't actually seen Mercurial's delayed/dynamic import >> code, so I can't say for 100% sure if they'd be affected the same way.) >> > > IOW you want the sys.modules case fast, which I will never be able to > match compared to C code since that is pure execution with no I/O.
>

Couldn't you just prefix the __import__ function with something like this:

    ...
    try:
        module = sys.modules[name]
    except KeyError:
        # slow code path

(Admittedly, the import lock is still a problem; initially I thought you could just skip it for this case, but the problem is that another thread could be in the middle of executing the module.) -------------- next part -------------- An HTML attachment was scrubbed... URL: From pje at telecommunity.com Wed Feb 8 03:35:48 2012 From: pje at telecommunity.com (PJ Eby) Date: Tue, 7 Feb 2012 21:35:48 -0500 Subject: [Python-Dev] requirements for moving __import__ over to importlib? In-Reply-To: References: Message-ID: On Tue, Feb 7, 2012 at 6:40 PM, Terry Reedy wrote:

> importlib could provide a parameterized decorator for functions that are
> the only consumers of an import. It could operate much like this:
>
> def imps(mod):
>     def makewrap(f):
>         def wrapped(*args, **kwds):
>             print('first/only call to wrapper')
>             g = globals()
>             g[mod] = __import__(mod)
>             g[f.__name__] = f
>             return f(*args, **kwds)
>         wrapped.__name__ = f.__name__
>         return wrapped
>     return makewrap
>
> @imps('itertools')
> def ic():
>     print(itertools.count)
>
> ic()
> ic()
> # first/only call to wrapper

If I were going to rewrite code, I'd just use lazy imports (see http://pypi.python.org/pypi/Importing ). They're even faster than this approach (or using plain import statements), as they have zero per-call function call overhead. It's just that not everything I write can depend on Importing. Throw an equivalent into the stdlib, though, and I guess I wouldn't have to worry about dependencies... (To be clearer; I'm talking about the http://peak.telecommunity.com/DevCenter/Importing#lazy-imports feature, which sticks a dummy module subclass instance into sys.modules, whose __getattribute__ does a reload() of the module, forcing the normal import process to run, after first changing the dummy object's type to something that doesn't have the __getattribute__ any more. This ensures that all accesses after the first one are at normal module attribute access speed. That, and the "whenImported" decorator from Importing would probably be of general stdlib usefulness too.) -------------- next part -------------- An HTML attachment was scrubbed... URL: From tjreedy at udel.edu Wed Feb 8 03:54:50 2012 From: tjreedy at udel.edu (Terry Reedy) Date: Tue, 07 Feb 2012 21:54:50 -0500 Subject: [Python-Dev] requirements for moving __import__ over to importlib? In-Reply-To: References: Message-ID: On 2/7/2012 9:35 PM, PJ Eby wrote: > On Tue, Feb 7, 2012 at 6:40 PM, Terry Reedy > wrote: > > importlib could provide a parameterized decorator for functions that > are the only consumers of an import. It could operate much like this:
>
> def imps(mod):
>     def makewrap(f):
>         def wrapped(*args, **kwds):
>             print('first/only call to wrapper')
>             g = globals()
>             g[mod] = __import__(mod)
>             g[f.__name__] = f
>             return f(*args, **kwds)
>         wrapped.__name__ = f.__name__
>         return wrapped
>     return makewrap
>
> @imps('itertools')
> def ic():
>     print(itertools.count)
>
> ic()
> ic()
> # first/only call to wrapper
>
> If I were going to rewrite code, I'd just use lazy imports (see > http://pypi.python.org/pypi/Importing ). They're even faster than this > approach (or using plain import statements), as they have zero per-call > function call overhead. My code above and Importing, as I understand it, both delay imports until needed by using a dummy object that gets replaced at first access.
(Now that I am reminded, sys.modules is the better place for the dummy objects. I just wanted to show that there is a simple solution (though more specialized) even for existing code.) The cost of delay, which might mean never, is a bit of one-time extra overhead. Both have no extra overhead after the first call. Unless delayed importing is made standard, both require a bit of extra code somewhere. > It's just that not everything I write can depend on Importing. > Throw an equivalent into the stdlib, though, and I guess I wouldn't have to worry about dependencies... And that is what I think (agree?) should be done to counteract the likely slowdown from using importlib. > (To be clearer; I'm talking about the > http://peak.telecommunity.com/DevCenter/Importing#lazy-imports feature, > which sticks a dummy module subclass instance into sys.modules, whose > __getattribute__ does a reload() of the module, forcing the normal > import process to run, after first changing the dummy object's type to > something that doesn't have the __getattribute__ any more. This ensures > that all accesses after the first one are at normal module attribute > access speed. That, and the "whenImported" decorator from Importing > would probably be of general stdlib usefulness too.) -- Terry Jan Reedy From eliben at gmail.com Wed Feb 8 04:46:38 2012 From: eliben at gmail.com (Eli Bendersky) Date: Wed, 8 Feb 2012 05:46:38 +0200 Subject: [Python-Dev] Fixing the XML batteries In-Reply-To: References: Message-ID: >> On one hand I agree that ET should be emphasized since it's the better >> API with a much faster implementation. But I also understand Martin's >> point of view that minidom has its place, so IMHO some sort of >> compromise should be reached. Perhaps we can recommend using ET for >> those not specifically interested in the DOM interface, but for those >> who *are*, minidom is still a good stdlib option (?). > > If you can, go ahead and write a patch saying something like that. It should > not be hard to come up with something that is a definite improvement. Create > a tracker issue for comment. But don't let it sit forever. > A tracker issue already exists for this - http://bugs.python.org/issue11379 - I see no reason to open a new one. I will add my opinion there - feel free to do that too. > Since the current policy seems to be to hide C behind Python when there are > both, I assume that finishing the transition here is something just not > gotten around to yet. Open another issue if there is not one. > I will open a separate discussion on this. Eli From ncoghlan at gmail.com Wed Feb 8 04:47:21 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 8 Feb 2012 13:47:21 +1000 Subject: [Python-Dev] requirements for moving __import__ over to importlib? In-Reply-To: References: Message-ID: On Wed, Feb 8, 2012 at 12:54 PM, Terry Reedy wrote: > On 2/7/2012 9:35 PM, PJ Eby wrote: >> It's just that not everything I write can depend on Importing. >> Throw an equivalent into the stdlib, though, and I guess I wouldn't have >> to worry about dependencies... > > And that is what I think (agree?) should be done to counteract the likely > slowdown from using importlib. Yeah, this is one frequently reinvented wheel that could definitely do with a standard implementation. Christian Heimes made an initial attempt at such a thing years ago with PEP 369, but an importlib based __import__ would let the implementation largely be pure Python (with all the increase in power and flexibility that implies).
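That frequently reinvented wheel is small enough to sketch. The following toy version is my own construction, not the Importing package's actual code, and it leans on module __class__ assignment, which current Python 3 permits:

    import importlib
    import sys
    import types

    class _LazyModule(types.ModuleType):
        def __getattribute__(self, attr):
            # First attribute access: become a plain module, then run the
            # real import machinery; every later lookup is normal speed.
            self.__class__ = types.ModuleType
            importlib.reload(self)
            return getattr(self, attr)

    def lazy_import(name):
        # Install a placeholder that loads `name` on first use.
        if name not in sys.modules:
            sys.modules[name] = _LazyModule(name)
        return sys.modules[name]

    json = lazy_import('json')  # nothing actually imported yet
    print(json.dumps([1, 2]))   # first touch triggers the real import

One sharp edge a stdlib version would have to sand down: any attribute access at all (even a repr() in a debugger) triggers the load.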
I'm not sure such an addition would help much with the base interpreter start up time though - most of the modules we bring in are because we're actually using them for some reason. The other thing that shouldn't be underrated here is the value in making the builtin import system PEP 302 compliant from a *documentation* perspective. I've made occasional attempts at fully documenting the import system over the years, and I always end up giving up because the combination of the pre-PEP 302 builtin mechanisms in import.c and the PEP 302 compliant mechanisms for things like zipimport just degenerate into a mess of special cases that are impossible to justify beyond "nobody got around to fixing this yet". The fact that we have an undocumented PEP 302 based reimplementation of imports squirrelled away in pkgutil to make pkgutil and runpy work is sheer insanity (replacing *that* with importlib might actually be a good first step towards full integration). Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From eliben at gmail.com Wed Feb 8 04:59:21 2012 From: eliben at gmail.com (Eli Bendersky) Date: Wed, 8 Feb 2012 05:59:21 +0200 Subject: [Python-Dev] folding cElementTree behind ElementTree in 3.3 Message-ID: Hello, Here's a note from "What's new in Python 3.0": """A common pattern in Python 2.x is to have one version of a module implemented in pure Python, with an optional accelerated version implemented as a C extension; for example, pickle and cPickle. This places the burden of importing the accelerated version and falling back on the pure Python version on each user of these modules. In Python 3.0, the accelerated versions are considered implementation details of the pure Python versions. Users should always import the standard version, which attempts to import the accelerated version and falls back to the pure Python version. The pickle / cPickle pair received this treatment. The profile module is on the list for 3.1. The StringIO module has been turned into a class in the io module.""" Is there a good reason why xml.etree.ElementTree / xml.etree.cElementTree did not "receive this treatment"? In the case of this module, it's quite unfortunate because: 1. The accelerated module is much faster and memory efficient (see recent benchmarks here: http://bugs.python.org/issue11379), and XML processing is an area where processing matters 2. The accelerated module implements the same API 3. It's very hard to even find out about the existence of the accelerated module. Its sole mention in the docs is this un-emphasized line in http://docs.python.org/dev/py3k/library/xml.etree.elementtree.html: "A C implementation of this API is available as xml.etree.cElementTree." Even to an experienced user who carefully reads the whole documentation it's not easy to notice. For the typical user who just jumps around to functions/methods he's interested in, it's essentially invisible. Eli From ncoghlan at gmail.com Wed Feb 8 05:15:26 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 8 Feb 2012 14:15:26 +1000 Subject: [Python-Dev] folding cElementTree behind ElementTree in 3.3 In-Reply-To: References: Message-ID: On Wed, Feb 8, 2012 at 1:59 PM, Eli Bendersky wrote: > Is there a good reason why xml.etree.ElementTree / > xml.etree.cElementTree did not "receive this treatment"? See PEP 360, which lists "Externally Maintained Packages". In the past we allowed additions to the standard library without requiring that the standard library version become the master version. 
These days we expect python.org to become the master version, perhaps with backports and experimental features published on PyPI (cf. packaging vs distutils2, unittest vs unittest2, contextlib vs contextlib2). ElementTree was one of the last of those externally maintained modules added to the standard library - as documented in the PEP, it's still officially maintained by Fredrik Lundh. Folding the two implementations together in the standard library would mean officially declaring that xml.etree is now an independently maintained fork of Fredrik's version rather than just a "snapshot in time" of a particular version (which is what it has been historically). So the reasons for keeping these two separate to date aren't technical, it's because Fredrik publishes them as separate modules. Regards, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From brian at python.org Wed Feb 8 05:29:12 2012 From: brian at python.org (Brian Curtin) Date: Tue, 7 Feb 2012 22:29:12 -0600 Subject: [Python-Dev] folding cElementTree behind ElementTree in 3.3 In-Reply-To: References: Message-ID: On Tue, Feb 7, 2012 at 22:15, Nick Coghlan wrote: > Folding the two > implementations together in the standard library would mean officially > declaring that xml.etree is now an independently maintained fork of > Fredrik's version rather than just a "snapshot in time" of a > particular version (which is what it has been historically). Is ElementTree even still maintained externally? I seem to remember Florent going through headaches to get changes into this area, and I can't find an external repository for this code. From eliben at gmail.com Wed Feb 8 05:31:58 2012 From: eliben at gmail.com (Eli Bendersky) Date: Wed, 8 Feb 2012 06:31:58 +0200 Subject: [Python-Dev] folding cElementTree behind ElementTree in 3.3 In-Reply-To: References: Message-ID: On Wed, Feb 8, 2012 at 06:15, Nick Coghlan wrote: > On Wed, Feb 8, 2012 at 1:59 PM, Eli Bendersky wrote: >> Is there a good reason why xml.etree.ElementTree / >> xml.etree.cElementTree did not "receive this treatment"? > > See PEP 360, which lists "Externally Maintained Packages". In the past > we allowed additions to the standard library without requiring that > the standard library version become the master version. These days we > expect python.org to become the master version, perhaps with backports > and experimental features published on PyPI (cf. packaging vs > distutils2, unittest vs unittest2, contextlib vs contextlib2). > > ElementTree was one of the last of those externally maintained modules > added to the standard library - as documented in the PEP, it's still > officially maintained by Fredrik Lundh. Folding the two > implementations together in the standard library would mean officially > declaring that xml.etree is now an independently maintained fork of > Fredrik's version rather than just a "snapshot in time" of a > particular version (which is what it has been historically). > > So the reasons for keeping these two separate to date aren't technical, > it's because Fredrik publishes them as separate modules. > The idea is to import the C module when xml.etree.ElementTree is imported, falling back to the Python module if that fails for some reason. So this is not modifying the modules, just the Python stdlib facade for them. Besides, in http://mail.python.org/pipermail/python-dev/2011-December/114812.html Stefan Behnel said "[...] Today, ET is *only* being maintained in the stdlib by Florent Xicluna [...]". Is this not true? Eli P.S.
Would declaring that xml.etree is now independently maintained by pydev be a bad thing? Why? From ericsnowcurrently at gmail.com Wed Feb 8 05:36:28 2012 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Tue, 7 Feb 2012 21:36:28 -0700 Subject: [Python-Dev] requirements for moving __import__ over to importlib? In-Reply-To: References: Message-ID: On Tue, Feb 7, 2012 at 8:47 PM, Nick Coghlan wrote: > On Wed, Feb 8, 2012 at 12:54 PM, Terry Reedy wrote: >> On 2/7/2012 9:35 PM, PJ Eby wrote: >>> ?It's just that not everything I write can depend on Importing. >>> Throw an equivalent into the stdlib, though, and I guess I wouldn't have >>> to worry about dependencies... >> >> And that is what I think (agree?) should be done to counteract the likely >> slowdown from using importlib. > > Yeah, this is one frequently reinvented wheel that could definitely do > with a standard implementation. Christian Heimes made an initial > attempt at such a thing years ago with PEP 369, but an importlib based > __import__ would let the implementation largely be pure Python (with > all the increase in power and flexibility that implies). > > I'm not sure such an addition would help much with the base > interpreter start up time though - most of the modules we bring in are > because we're actually using them for some reason. > > The other thing that shouldn't be underrated here is the value in > making the builtin import system PEP 302 compliant from a > *documentation* perspective. I've made occasional attempts at fully > documenting the import system over the years, and I always end up > giving up because the combination of the pre-PEP 302 builtin > mechanisms in import.c and the PEP 302 compliant mechanisms for things > like zipimport just degenerate into a mess of special cases that are > impossible to justify beyond "nobody got around to fixing this yet". > The fact that we have an undocumented PEP 302 based reimplementation > of imports squirrelled away in pkgutil to make pkgutil and runpy work > is sheer insanity (replacing *that* with importlib might actually be a > good first step towards full integration). +1 on all counts -eric From fdrake at acm.org Wed Feb 8 05:41:14 2012 From: fdrake at acm.org (Fred Drake) Date: Tue, 7 Feb 2012 23:41:14 -0500 Subject: [Python-Dev] folding cElementTree behind ElementTree in 3.3 In-Reply-To: References: Message-ID: On Tue, Feb 7, 2012 at 11:31 PM, Eli Bendersky wrote: > Besides, in http://mail.python.org/pipermail/python-dev/2011-December/114812.html > Stefan Behnel said "[...] Today, ET is *only* being maintained in the > stdlib by Florent Xicluna [...]". Is this not true? I don't know. I took this to be an observation rather than a declaration of intent by the package owner (Fredrik Lundh). > P.S. Would declaring that xml.etree is now independently maintained by > pydev be a bad thing? Why? So long as Fredrik owns the package, I think forking it for the standard library would be a bad thing, though not for technical reasons. Fredrik provided his libraries for the standard library in good faith, and we still list him as the external maintainer. Until *that* changes, forking would be inappropriate. I'd much rather see a discussion with Fredrik about the future maintenance plan for ElementTree and cElementTree. -Fred -- Fred L. Drake, Jr.? ? "A person who won't read has no advantage over one who can't read." ?? 
--Samuel Langhorne Clemens From eliben at gmail.com Wed Feb 8 05:46:46 2012 From: eliben at gmail.com (Eli Bendersky) Date: Wed, 8 Feb 2012 06:46:46 +0200 Subject: [Python-Dev] folding cElementTree behind ElementTree in 3.3 In-Reply-To: References: Message-ID: On Wed, Feb 8, 2012 at 06:41, Fred Drake wrote: > On Tue, Feb 7, 2012 at 11:31 PM, Eli Bendersky wrote: >> Besides, in http://mail.python.org/pipermail/python-dev/2011-December/114812.html >> Stefan Behnel said "[...] Today, ET is *only* being maintained in the >> stdlib by Florent Xicluna [...]". Is this not true? > > I don't know. I took this to be an observation rather than a declaration > of intent by the package owner (Fredrik Lundh). > >> P.S. Would declaring that xml.etree is now independently maintained by >> pydev be a bad thing? Why? > > So long as Fredrik owns the package, I think forking it for the standard > library would be a bad thing, though not for technical reasons. Fredrik > provided his libraries for the standard library in good faith, and we still > list him as the external maintainer. Until *that* changes, forking would > be inappropriate. I'd much rather see a discussion with Fredrik about the > future maintenance plan for ElementTree and cElementTree. > Yes, I realize this is a loaded issue and I agree that all steps in this direction should be taken with Fredrik's agreement. However, to re-focus: The initial proposal of changing *the stdlib import facade* for xml.etree.ElementTree to use the C accelerator (_elementtree) by default. Will that somehow harm Fredrik's sovereignty over ET? Are there any other problems hidden here? Because if not, it appears like a change of only a few lines of code could provide a significantly better XML processing experience in 3.3 for a lot of users (and save some keystrokes for the ones who already know to look for cElementTree). Eli From fdrake at acm.org Wed Feb 8 06:10:03 2012 From: fdrake at acm.org (Fred Drake) Date: Wed, 8 Feb 2012 00:10:03 -0500 Subject: [Python-Dev] folding cElementTree behind ElementTree in 3.3 In-Reply-To: References: Message-ID: On Tue, Feb 7, 2012 at 11:46 PM, Eli Bendersky wrote: > The initial proposal of changing *the stdlib > import facade* for xml.etree.ElementTree to use the C accelerator > (_elementtree) by default. I guess this is one source of confusion: what are you referring to as an "import façade"? When I look in Lib/xml/etree/, I see the ElementTree, ElementPath, and ElementInclude modules, and a wrapper for cElementTree's extension module.
There isn't any sort of façade for ElementTree; are you proposing to add one, perhaps in xml.etree/__init__.py? -Fred -- Fred L. Drake, Jr. "A person who won't read has no advantage over one who can't read." --Samuel Langhorne Clemens From eliben at gmail.com Wed Feb 8 07:07:42 2012 From: eliben at gmail.com (Eli Bendersky) Date: Wed, 8 Feb 2012 08:07:42 +0200 Subject: [Python-Dev] folding cElementTree behind ElementTree in 3.3 In-Reply-To: References: Message-ID: On Wed, Feb 8, 2012 at 07:10, Fred Drake wrote: > On Tue, Feb 7, 2012 at 11:46 PM, Eli Bendersky wrote: >> The initial proposal of changing *the stdlib >> import facade* for xml.etree.ElementTree to use the C accelerator >> (_elementtree) by default. > > I guess this is one source of confusion: what are you referring to as > an "import façade"? When I look in Lib/xml/etree/, I see the ElementTree, > ElementPath, and ElementInclude modules, and a wrapper for cElementTree's > extension module. > > There isn't any sort of façade for ElementTree; are you proposing to add > one, perhaps in xml.etree/__init__.py? AFAICS ElementPath is a helper used by ElementTree, and cElementTree has one of its own. It's not documented for stand-alone use. ElementInclude also isn't documented and doesn't appear to be used anywhere. The facade can be added to xml/etree/ElementTree.py since that's the only documented module. It can attempt to do:

from _elementtree import *

(which is what cElementTree.py does), and on failure, just go on doing what it does now. Eli From stefan_ml at behnel.de Wed Feb 8 08:03:11 2012 From: stefan_ml at behnel.de (Stefan Behnel) Date: Wed, 08 Feb 2012 08:03:11 +0100 Subject: [Python-Dev] folding cElementTree behind ElementTree in 3.3 In-Reply-To: References: Message-ID: Eli Bendersky, 08.02.2012 07:07: > On Wed, Feb 8, 2012 at 07:10, Fred Drake wrote: >> On Tue, Feb 7, 2012 at 11:46 PM, Eli Bendersky wrote: >>> The initial proposal of changing *the stdlib >>> import facade* for xml.etree.ElementTree to use the C accelerator >>> (_elementtree) by default. >> >> I guess this is one source of confusion: what are you referring to as >> an "import façade"? When I look in Lib/xml/etree/, I see the ElementTree, >> ElementPath, and ElementInclude modules, and a wrapper for cElementTree's >> extension module. > > AFAICS ElementPath is a helper used by ElementTree, and cElementTree > has one of its own. It's not documented for stand-alone use. > ElementInclude also isn't documented and doesn't appear to be used > anywhere. > > The facade can be added to xml/etree/ElementTree.py since that's the > only documented module. It can attempt to do: > > from _elementtree import * > > (which is what cElementTree.py does), and on failure, just go on doing > what it does now. Basically, cElementTree (actually the accelerator module) reuses everything from ElementTree that it does not implement itself, e.g. the serialiser or the ElementPath implementation in ElementPath.py (which is not commonly being used by itself anyway). ElementInclude is meant to be independently imported by user code and works with both implementations, although it uses plain ElementTree by default and currently needs explicit configuring for cElementTree. It looks like that need would vanish when ElementTree uses the accelerator module internally. So, ElementTree.py is a superset of cElementTree's C module, and importing that C module into ElementTree.py instead of only importing it into cElementTree.py would just make ElementTree.py faster, that's basically it. Stefan From eliben at gmail.com Wed Feb 8 08:31:16 2012 From: eliben at gmail.com (Eli Bendersky) Date: Wed, 8 Feb 2012 09:31:16 +0200 Subject: [Python-Dev] folding cElementTree behind ElementTree in 3.3 In-Reply-To: References: Message-ID: >> The facade can be added to xml/etree/ElementTree.py since that's the >> only documented module. It can attempt to do: >> >> from _elementtree import * >> >> (which is what cElementTree.py does), and on failure, just go on doing >> what it does now. > Basically, cElementTree (actually the accelerator module) reuses everything > from ElementTree that it does not implement itself, e.g. the serialiser or > the ElementPath implementation in ElementPath.py (which is not commonly > being used by itself anyway).
> ElementInclude is meant to be independently imported by user code and works > with both implementations, although it uses plain ElementTree by default and currently needs explicit configuring for cElementTree. It looks like that need would vanish when ElementTree uses the accelerator module internally. > So, ElementTree.py is a superset of cElementTree's C module, and importing > that C module into ElementTree.py instead of only importing it into > cElementTree.py would just make ElementTree.py faster, that's basically it. > Yep. Any objections from pydev? Stefan, in the other thread (... XML batteries ) you said you would contact Fredrik, did you manage to get hold of him? Eli From hodgestar+pythondev at gmail.com Wed Feb 8 08:34:52 2012 From: hodgestar+pythondev at gmail.com (Simon Cross) Date: Wed, 8 Feb 2012 09:34:52 +0200 Subject: [Python-Dev] Add a new "locale" codec? In-Reply-To: References: Message-ID: Is the idea to have:

    b"foo".decode("locale")

be roughly equivalent to

    encoding = locale.getpreferredencoding(False)
    b"foo".decode(encoding)

? From stefan_ml at behnel.de Wed Feb 8 08:37:50 2012 From: stefan_ml at behnel.de (Stefan Behnel) Date: Wed, 08 Feb 2012 08:37:50 +0100 Subject: [Python-Dev] folding cElementTree behind ElementTree in 3.3 In-Reply-To: References: Message-ID: Fred Drake, 08.02.2012 05:41: > On Tue, Feb 7, 2012 at 11:31 PM, Eli Bendersky wrote: >> Besides, in http://mail.python.org/pipermail/python-dev/2011-December/114812.html >> Stefan Behnel said "[...] Today, ET is *only* being maintained in the >> stdlib by Florent Xicluna [...]". Is this not true? > > I don't know. I took this to be an observation rather than a declaration > of intent by the package owner (Fredrik Lundh). This observation resulted from the fact that Fredrik hasn't updated the code in his public ElementTree repository(ies) since 2009, i.e. way before the release of Python 2.7 and 3.2 that integrated these changes. https://bitbucket.org/effbot/et-2009-provolone/overview The integration of ElementTree 1.3 into the standard library was almost exclusively done by Florent, with some supporting comments by Fredrik. Note that ElementTree 1.3 has not even been officially released yet, so the only "final" public release of it is in the standard library. Since then, Florent has been actively working on bug tickets, most of which have not received any reaction from Fredrik. That makes me consider it the reality that "today, ET is only being maintained in the stdlib". >> P.S. Would declaring that xml.etree is now independently maintained by >> pydev be a bad thing? Why? > > So long as Fredrik owns the package, I think forking it for the standard > library would be a bad thing, though not for technical reasons. Fredrik > provided his libraries for the standard library in good faith, and we still > list him as the external maintainer. Until *that* changes, forking would > be inappropriate. I'd much rather see a discussion with Fredrik about the > future maintenance plan for ElementTree and cElementTree. I didn't get a response from him to my e-mails since early 2010. Maybe others have more luck if they try, but I don't have the impression that waiting another two years gets us anywhere interesting.
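Put concretely, the facade change being sketched in this thread would be on the order of a few lines at the very end of Lib/xml/etree/ElementTree.py -- a hedged illustration of the idea, not an actual patch:

    try:
        # Shadow the pure-Python classes and functions with their C
        # counterparts whenever the accelerator module is available.
        from _elementtree import *
    except ImportError:
        # No accelerator built: the pure-Python definitions above stand.
        pass

User code keeps writing ``import xml.etree.ElementTree as ET`` and transparently gets the fast implementation where one exists.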
Given that it was two months ago that I started the "Fixing the XML batteries" thread (and years since I brought up the topic for the first time), it seems to be hard enough already to get anyone on python-dev actually do something for Python's XML support, instead of just actively discouraging those who invest time and work into it. Stefan From victor.stinner at haypocalc.com Wed Feb 8 10:12:40 2012 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Wed, 8 Feb 2012 10:12:40 +0100 Subject: [Python-Dev] Add a new "locale" codec? In-Reply-To: References: Message-ID: 2012/2/8 Simon Cross : > Is the idea to have: > > ?b"foo".decode("locale") > > be roughly equivalent to > > ?encoding = locale.getpreferredencoding(False) > ?b"foo".decode(encoding) > > ? Yes. Whereas: b"foo".decode(sys.getfilesystemencoding()) is equivalent to encoding = locale.getpreferredencoding(True) b"foo".decode(encoding) Victor From hodgestar+pythondev at gmail.com Wed Feb 8 10:28:55 2012 From: hodgestar+pythondev at gmail.com (Simon Cross) Date: Wed, 8 Feb 2012 11:28:55 +0200 Subject: [Python-Dev] Add a new "locale" codec? In-Reply-To: References: Message-ID: I think I'm -1 on a "locale" encoding because it refers to different actual encodings depending on where and when it's run, which seems surprising, and there's already a more explicit way to achieve the same effect. The documentation on .getpreferredencoding() says some scary things about needing to call .setlocale() sometimes but doesn't really say when or why. Could any of those cases make "locale" do weird things because it doesn't call setlocale()? From dirkjan at ochtman.nl Wed Feb 8 10:36:19 2012 From: dirkjan at ochtman.nl (Dirkjan Ochtman) Date: Wed, 8 Feb 2012 10:36:19 +0100 Subject: [Python-Dev] folding cElementTree behind ElementTree in 3.3 In-Reply-To: References: Message-ID: On Wed, Feb 8, 2012 at 08:37, Stefan Behnel wrote: > I didn't get a response from him to my e-mails since early 2010. Maybe > others have more luck if they try, but I don't have the impression that > waiting another two years gets us anywhere interesting. > > Given that it was two months ago that I started the "Fixing the XML > batteries" thread (and years since I brought up the topic for the first > time), it seems to be hard enough already to get anyone on python-dev > actually do something for Python's XML support, instead of just actively > discouraging those who invest time and work into it. I concur. It's important that we consider Fredrik's ownership of the modules, but if he fails to reply to email and doesn't update his repositories, there should be enough cause for python-dev to go on and appropriate the stdlib versions of those modules. Cheers, Dirkjan From eliben at gmail.com Wed Feb 8 10:49:40 2012 From: eliben at gmail.com (Eli Bendersky) Date: Wed, 8 Feb 2012 11:49:40 +0200 Subject: [Python-Dev] folding cElementTree behind ElementTree in 3.3 In-Reply-To: References: Message-ID: On Wed, Feb 8, 2012 at 11:36, Dirkjan Ochtman wrote: > On Wed, Feb 8, 2012 at 08:37, Stefan Behnel wrote: >> I didn't get a response from him to my e-mails since early 2010. Maybe >> others have more luck if they try, but I don't have the impression that >> waiting another two years gets us anywhere interesting. 
>> >> Given that it was two months ago that I started the "Fixing the XML >> batteries" thread (and years since I brought up the topic for the first >> time), it seems to be hard enough already to get anyone on python-dev >> actually do something for Python's XML support, instead of just actively >> discouraging those who invest time and work into it. > > I concur. It's important that we consider Fredrik's ownership of the > modules, but if he fails to reply to email and doesn't update his > repositories, there should be enough cause for python-dev to go on and > appropriate the stdlib versions of those modules. > +1. That said, I think that the particular change discussed in this thread can be made anyway, since it doesn't really modify ET's APIs or functionality, just the way it gets imported from stdlib. Eli From and-dev at doxdesk.com Wed Feb 8 11:25:07 2012 From: and-dev at doxdesk.com (And Clover) Date: Wed, 08 Feb 2012 10:25:07 +0000 Subject: [Python-Dev] Add a new "locale" codec? In-Reply-To: References: Message-ID: <4F324D83.6080509@doxdesk.com> On 2012-02-08 09:28, Simon Cross wrote: > I think I'm -1 on a "locale" encoding because it refers to different > actual encodings depending on where and when it's run, which seems > surprising, and there's already a more explicit way to achieve the > same effect. I'd agree that this is undesirable, and I don't really want locale-specific behaviour to leak out in other places that accept a encoding name (eg ), but we already have this behaviour with the "mbcs" encoding on Windows which refers to the locale-specific 'ANSI' code page. -- And Clover mailto:and at doxdesk.com http://www.doxdesk.com/ gtalk:chat?jid=bobince at doxdesk.com From p.f.moore at gmail.com Wed Feb 8 12:11:07 2012 From: p.f.moore at gmail.com (Paul Moore) Date: Wed, 8 Feb 2012 11:11:07 +0000 Subject: [Python-Dev] folding cElementTree behind ElementTree in 3.3 In-Reply-To: References: Message-ID: On 8 February 2012 09:49, Eli Bendersky wrote: >> I concur. It's important that we consider Fredrik's ownership of the >> modules, but if he fails to reply to email and doesn't update his >> repositories, there should be enough cause for python-dev to go on and >> appropriate the stdlib versions of those modules. > > +1. > > That said, I think that the particular change discussed in this thread > can be made anyway, since it doesn't really modify ET's APIs or > functionality, just the way it gets imported from stdlib. I would suggest that, assuming python-dev want to take ownership of the module, one last-ditch attempt be made to contact Fredrik. We should email him, and copy python-dev (and maybe even python-list) asking for his view, and ideally his blessing on the stdlib version being forked and maintained independently going forward. Put a time limit on responses ("if we don't hear by XXX, we'll assume Fredrik is either uncontactable or not interested, and therefore we can go ahead with maintaining the stdlib version independently"). It's important to respect Fredrik's wishes and ownership, but we can't leave part of the stdlib frozen and abandoned just because he's not available any longer. Paul. PS The only other options I can see are to remove elementtree from the stdlib altogether, or explicitly document it as frozen and no longer maintained. 
From solipsis at pitrou.net Wed Feb 8 13:04:19 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 8 Feb 2012 13:04:19 +0100 Subject: [Python-Dev] folding cElementTree behind ElementTree in 3.3 References: Message-ID: <20120208130419.3ae6bbae@pitrou.net> On Wed, 8 Feb 2012 11:11:07 +0000 Paul Moore wrote: > On 8 February 2012 09:49, Eli Bendersky wrote: > >> I concur. It's important that we consider Fredrik's ownership of the > >> modules, but if he fails to reply to email and doesn't update his > >> repositories, there should be enough cause for python-dev to go on and > >> appropriate the stdlib versions of those modules. > > > > +1. > > > > That said, I think that the particular change discussed in this thread > > can be made anyway, since it doesn't really modify ET's APIs or > > functionality, just the way it gets imported from stdlib. > > I would suggest that, assuming python-dev want to take ownership of > the module, one last-ditch attempt be made to contact Fredrik. We > should email him, and copy python-dev (and maybe even python-list) > asking for his view, and ideally his blessing on the stdlib version > being forked and maintained independently going forward. Put a time > limit on responses ("if we don't hear by XXX, we'll assume Fredrik is > either uncontactable or not interested, and therefore we can go ahead > with maintaining the stdlib version independently"). > > It's important to respect Fredrik's wishes and ownership, but we can't > leave part of the stdlib frozen and abandoned just because he's not > available any longer. It's not frozen, it's actually maintained. Regards Antoine. From ncoghlan at gmail.com Wed Feb 8 13:21:13 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 8 Feb 2012 22:21:13 +1000 Subject: [Python-Dev] folding cElementTree behind ElementTree in 3.3 In-Reply-To: <20120208130419.3ae6bbae@pitrou.net> References: <20120208130419.3ae6bbae@pitrou.net> Message-ID: On Wed, Feb 8, 2012 at 10:04 PM, Antoine Pitrou wrote: > On Wed, 8 Feb 2012 11:11:07 +0000 > Paul Moore wrote: >> It's important to respect Fredrik's wishes and ownership, but we can't >> leave part of the stdlib frozen and abandoned just because he's not >> available any longer. > > It's not frozen, it's actually maintained. Indeed, it sounds like the most appropriate course (if we don't hear otherwise from Fredrik) may be to just update PEP 360 to acknowledge current reality (i.e. the most current release of ElementTree is actually the one maintained by Florent in the stdlib). I'll note that this change isn't *quite* as simple as Eli's description earlier in the thread may suggest, though - the test suite also needs to be updated to ensure that the Python version is still fully exercised without the C acceleration applied. And such an an alteration would definitely be an explicit fork, even though the user facing API doesn't change - we're changing the structure of the code in a way that means some upstream deltas (if they happen to occur) may not apply cleanly. Regards, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? 
Brisbane, Australia From ncoghlan at gmail.com Wed Feb 8 13:46:59 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 8 Feb 2012 22:46:59 +1000 Subject: [Python-Dev] [Python-checkins] Daily reference leaks (140f7de4d2a5): sum=888 In-Reply-To: References: Message-ID: On Tue, Feb 7, 2012 at 2:34 PM, wrote: > results for 140f7de4d2a5 on branch "default" > -------------------------------------------- > > test_capi leaked [296, 296, 296] references, sum=888 This appears to have started shortly after Benjamin's _PyExc_Init bltinmod refcounting change to fix Brett's crash when bootstrapping importlib. Perhaps we have a leak in import.c that was being masked by the DECREF in _PyExc_Init? Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From eliben at gmail.com Wed Feb 8 13:48:00 2012 From: eliben at gmail.com (Eli Bendersky) Date: Wed, 8 Feb 2012 14:48:00 +0200 Subject: [Python-Dev] folding cElementTree behind ElementTree in 3.3 In-Reply-To: References: <20120208130419.3ae6bbae@pitrou.net> Message-ID: >> It's not frozen, it's actually maintained. > > Indeed, it sounds like the most appropriate course (if we don't hear > otherwise from Fredrik) may be to just update PEP 360 to acknowledge > current reality (i.e. the most current release of ElementTree is > actually the one maintained by Florent in the stdlib). > > I'll note that this change isn't *quite* as simple as Eli's > description earlier in the thread may suggest, though - the test suite > also needs to be updated to ensure that the Python version is still > fully exercised without the C acceleration applied. Sure thing. I suppose similar machinery already exists for things like pickle / cPickle. I still maintain that it's a simple change :-) > And such an an > alteration would definitely be an explicit fork, even though the user > facing API doesn't change - we're changing the structure of the code > in a way that means some upstream deltas (if they happen to occur) may > not apply cleanly. This is a very minimal delta, however. I think it can even be made simpler by replacing ElementTree with a facade module that either imports _elementtree or the Python ElementTree. So the delta vs. upstream would only be in file placement. But these are two conflicting discussions - if changes were made in stdlib *already* that were not propagated upstream, what use is a clean delta? Eli From benjamin at python.org Wed Feb 8 14:11:46 2012 From: benjamin at python.org (Benjamin Peterson) Date: Wed, 8 Feb 2012 08:11:46 -0500 Subject: [Python-Dev] [Python-checkins] Daily reference leaks (140f7de4d2a5): sum=888 In-Reply-To: References: Message-ID: 2012/2/8 Nick Coghlan : > On Tue, Feb 7, 2012 at 2:34 PM, ? wrote: >> results for 140f7de4d2a5 on branch "default" >> -------------------------------------------- >> >> test_capi leaked [296, 296, 296] references, sum=888 > > This appears to have started shortly after Benjamin's _PyExc_Init > bltinmod refcounting change to fix Brett's crash when bootstrapping > importlib. Perhaps we have a leak in import.c that was being masked by > the DECREF in _PyExc_Init? According to test_capi, it's expected to leak? -- Regards, Benjamin From victor.stinner at haypocalc.com Wed Feb 8 14:25:36 2012 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Wed, 8 Feb 2012 14:25:36 +0100 Subject: [Python-Dev] Add a new "locale" codec? 
In-Reply-To: 
References: 
Message-ID: 

2012/2/8 Simon Cross :
> I think I'm -1 on a "locale" encoding because it refers to different
> actual encodings depending on where and when it's run, which seems
> surprising, and there's already a more explicit way to achieve the
> same effect.

The following code is just an example to explain how locale is
supposed to work, but the implementation is completely different:

encoding = locale.getpreferredencoding(False)
... execute some code ...
text = bytes.decode(encoding)
bytes = text.encode(encoding)

The current locale is process-wide: if a thread changes the locale,
all threads are affected. Some functions have to use the current
locale encoding, and not the locale encoding read at startup. Examples
with C functions: strerror(), strftime(), tzname, etc.

My codec implementation uses mbstowcs() and wcstombs() which don't
touch the current locale, but just use it. Said differently, the
locale codec would just give access to these functions.

> The documentation on .getpreferredencoding() says some scary things
> about needing to call .setlocale() sometimes but doesn't really say
> when or why.

locale.getpreferredencoding() always calls setlocale() by default.
locale.getpreferredencoding(False) doesn't call setlocale().

setlocale() is not called on Windows or if locale.CODESET is not
available (it is available on FreeBSD, Mac OS X, Linux, etc.).

> Could any of those cases make "locale" do weird things because it
> doesn't call setlocale()?

Sorry, I don't understand what you mean by "weird things". The
"locale" codec doesn't touch the locale.

From hodgestar+pythondev at gmail.com  Wed Feb  8 14:30:15 2012
From: hodgestar+pythondev at gmail.com (Simon Cross)
Date: Wed, 8 Feb 2012 15:30:15 +0200
Subject: [Python-Dev] Add a new "locale" codec?
In-Reply-To: 
References: 
Message-ID: 

On Wed, Feb 8, 2012 at 3:25 PM, Victor Stinner wrote:
> Sorry, I don't understand what you mean by "weird things". The
> "locale" codec doesn't touch the locale.

Sorry for being unclear. My question was about the following lines
from http://docs.python.org/library/locale.html#locale.getpreferredencoding:

"""On some systems, it is necessary to invoke setlocale() to obtain
the user preferences, so this function is not thread-safe. If invoking
setlocale is not necessary or desired, do_setlocale should be set to
False."""

So my question was about what happens on such systems where invoking
setlocale is necessary to obtain the user preferences?

Schiavo
Simon

From hodgestar+pythondev at gmail.com  Wed Feb  8 14:33:07 2012
From: hodgestar+pythondev at gmail.com (Simon Cross)
Date: Wed, 8 Feb 2012 15:33:07 +0200
Subject: [Python-Dev] Add a new "locale" codec?
In-Reply-To: 
References: 
Message-ID: 

On Wed, Feb 8, 2012 at 3:25 PM, Victor Stinner wrote:
> The current locale is process-wide: if a thread changes the locale,
> all threads are affected. Some functions have to use the current
> locale encoding, and not the locale encoding read at startup. Examples
> with C functions: strerror(), strftime(), tzname, etc.

Could a core part of Python break because of a sequence like:

1) Encode unicode to bytes using the locale codec.
2) Silly third-party library code changes the locale.
3) Attempt to decode bytes back to unicode using the locale codec
(which is now a different underlying codec).

?
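
Concretely, I'm imagining something like this hypothetical session
(the "locale" codec here is the proposed one, and the locale name is
an assumption - whatever Latin-1 locale is installed would do):

import locale
# locale encoding is UTF-8 at this point
data = "caf\xe9".encode("locale")      # hypothetically encodes as UTF-8
# silly third-party code changes the locale behind our back:
locale.setlocale(locale.LC_ALL, "fr_FR.ISO8859-1")
text = data.decode("locale")           # now hypothetically decodes as Latin-1
# text would be 'caf\xc3\xa9' (mojibake), not 'caf\xe9'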
Schiavo Simon From p.f.moore at gmail.com Wed Feb 8 14:40:38 2012 From: p.f.moore at gmail.com (Paul Moore) Date: Wed, 8 Feb 2012 13:40:38 +0000 Subject: [Python-Dev] folding cElementTree behind ElementTree in 3.3 In-Reply-To: References: <20120208130419.3ae6bbae@pitrou.net> Message-ID: On 8 February 2012 12:21, Nick Coghlan wrote: > On Wed, Feb 8, 2012 at 10:04 PM, Antoine Pitrou wrote: >> On Wed, 8 Feb 2012 11:11:07 +0000 >> Paul Moore wrote: >>> It's important to respect Fredrik's wishes and ownership, but we can't >>> leave part of the stdlib frozen and abandoned just because he's not >>> available any longer. >> >> It's not frozen, it's actually maintained. > > Indeed, it sounds like the most appropriate course (if we don't hear > otherwise from Fredrik) may be to just update PEP 360 to acknowledge > current reality (i.e. the most current release of ElementTree is > actually the one maintained by Florent in the stdlib). Ah, OK. My apologies, I had misunderstood the previous discussion. In which case I agree with Nick, lets' update PEP 360 and move forward. On that basis, +1 to Eli's suggestion of making cElementTree a transparent accelerator. Paul From mark at hotpy.org Wed Feb 8 16:16:25 2012 From: mark at hotpy.org (Mark Shannon) Date: Wed, 08 Feb 2012 15:16:25 +0000 Subject: [Python-Dev] A new dictionary implementation In-Reply-To: <4F2AE13C.6010900@hotpy.org> References: <4F252014.3080900@hotpy.org> <20120129160841.2343b62f@pitrou.net> <4F256EDC.70707@hotpy.org> <4F25D686.9070907@pearwood.info> <4F2AE13C.6010900@hotpy.org> Message-ID: <4F3291C9.9070305@hotpy.org> Hi, Version 2 is now available. Version 2 makes as few changes to tunable constants as possible, and generally does not change iteration order (so repr() is unchanged). All tests pass (the only changes to tests are for sys.getsizeof() ). Repository: https://bitbucket.org/markshannon/cpython_new_dict Issue http://bugs.python.org/issue13903 Performance changes are basically zero for non-OO code. Average -0.5% speed change on 2n3 benchamrks, a few benchmarks show a small reduction in memory use. (see notes below) GCbench uses 47% less memory and is 12% faster. 2to3, which seems to be the only "realistic" benchmark that runs on Py3, shows no change in speed and uses 10% less memory. All benchmarks and tests performed on old, slow 32bit machine with linux. Do please try it on your machine(s). If accepted, the new dict implementation will allow a useful optimisation of the LOAD_GLOBAL (and possibly LOAD_ATTR) bytecode: By testing to see if the (immutable) keys-tables is the expected table, the value can accessed directly by index, rather than by name. Cheers, Mark. Notes: All benchmarks from http://hg.python.org/benchmarks/ using the -m flag to get memory usage data. I've ignored the json benchmarks which shows unstable behaviour on my machine. Tiny changes to the dict being serialized or to the random seed can change the relative speed of my implementation vs CPython from -25% to +10%. From brett at python.org Wed Feb 8 17:01:35 2012 From: brett at python.org (Brett Cannon) Date: Wed, 8 Feb 2012 11:01:35 -0500 Subject: [Python-Dev] requirements for moving __import__ over to importlib? 
In-Reply-To: <20120207234224.1ae8602e@pitrou.net> References: <20120207234224.1ae8602e@pitrou.net> Message-ID: On Tue, Feb 7, 2012 at 17:42, Antoine Pitrou wrote: > On Tue, 7 Feb 2012 17:24:21 -0500 > Brett Cannon wrote: > > > > IOW you want the sys.modules case fast, which I will never be able to > match > > compared to C code since that is pure execution with no I/O. > > Why wouldn't continue using C code for that? It's trivial (just a dict > lookup). > Sure, but it's all the code between the function call and hitting sys.modules which would also need to get shoved into the C code. As I said, I have not tried to optimize anything yet (and unfortunately a lot of the upfront costs are over stupid things like checking if __import__ is being called with a string for the module name). -------------- next part -------------- An HTML attachment was scrubbed... URL: From brett at python.org Wed Feb 8 17:07:10 2012 From: brett at python.org (Brett Cannon) Date: Wed, 8 Feb 2012 11:07:10 -0500 Subject: [Python-Dev] requirements for moving __import__ over to importlib? In-Reply-To: <20120208000837.79dcb863@pitrou.net> References: <20120207214948.38d4503e@pitrou.net> <20120208000837.79dcb863@pitrou.net> Message-ID: On Tue, Feb 7, 2012 at 18:08, Antoine Pitrou wrote: > On Tue, 7 Feb 2012 17:16:18 -0500 > Brett Cannon wrote: > > > > > > IOW I really do not look forward to someone saying "importlib is so > much > > > > slower at importing a module containing ``pass``" when (a) that never > > > > happens, and (b) most programs do not spend their time importing but > > > > instead doing interesting work. > > > > > > Well, import time is so important that the Mercurial developers have > > > written an on-demand import mechanism, to reduce the latency of > > > command-line operations. > > > > > > > Sure, but they are a somewhat extreme case. > > I don't think Mercurial is extreme. Any command-line tool written in > Python applies. For example, yum (Fedora's apt-get) is written in > Python. And I'm sure many people do small administration scripts in > Python. These tools may then be run in a loop by whatever other script. > > > > But it's not only important for Mercurial and the like. Even if you're > > > developing a Web app, making imports slower will make restarts slower, > > > and development more tedious in the first place. > > > > > > > > Fine, startup cost from a hard crash I can buy when you are getting 1000 > > QPS, but development more tedious? > > Well, waiting several seconds when reloading a development server is > tedious. Anyway, my point was that other cases (than command-line > tools) can be negatively impacted by import time. > > > > > So, if there is going to be some baseline performance target I need > to > > > hit > > > > to make people happy I would prefer to know what that (real-world) > > > > benchmark is and what the performance target is going to be on a > > > non-debug > > > > build. > > > > > > - No significant slowdown in startup time. > > > > > > > What's significant and measuring what exactly? I mean startup already > has a > > ton of imports as it is, so this would wash out the point of measuring > > practically anything else for anything small. > > I don't understand your sentence. Yes, startup has a ton of imports and > that's why I'm fearing it may be negatively impacted :) > > ("a ton" being a bit less than 50 currently) > So you want less than a 50% startup cost on the standard startup benchmarks? 
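
(For concreteness: a rough way to put a number on what a single
command-line invocation costs end-to-end - interpreter startup and
teardown only, no real work. This is just an illustrative snippet, not
one of the official startup benchmarks:

$ python3 -m timeit -s "import subprocess, sys" \
      "subprocess.call([sys.executable, '-c', 'pass'])"

Whatever __import__ does at startup gets multiplied across every run
like that.)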
> > > This is why I said I want a > > benchmark to target which does actual work since flat-out startup time > > measures nothing meaningful but busy work. > > "Actual work" can be very small in some cases. For example, if you run > "hg branch" I'm quite sure it doesn't do a lot of work except importing > many modules and then reading a single file in .hg (the one named > ".hg/branch" probably, but I'm not a Mercurial dev). > > In the absence of more "real world" benchmarks, I think the startup > benchmarks in the benchmarks repo are a good baseline. > > That said you could also install my 3.x port of Twisted here: > https://bitbucket.org/pitrou/t3k/ > > and then run e.g. "python3 bin/trial -h". > > > I would get more out of code > > that just stat'ed every file in Lib since at least that did some work. > > stat()ing files is not really representative of import work. There are > many indirections in the import machinery. > (actually, even import.c appears quite slower than a bunch of stat() > calls would imply) > > > > - Within 25% of current performance when importing, say, the "struct" > > > module (Lib/struct.py) from bytecode. > > > > > > > Why struct? It's such a small module that it isn't really a typical > module. > > Precisely to measure the overhead. Typical module size will vary > depending on development style. Some people may prefer writing many > small modules. Or they may be using many small libraries, or using > libraries that have adoptes such a development style. > > Measuring the overhead on small modules will make sure we aren't overly > confident. > > > The median file size of Lib is 11K (e.g. tabnanny.py), not 238 bytes > (which > > is barely past Hello World). And is this just importing struct or is this > > from startup, e.g. ``python -c "import struct"``? > > Just importing struct, as with the timeit snippets in the other thread. OK, so less than 25% slowdown when importing a module with pre-existing bytecode that is very small. And here I was worrying you were going to suggest easy goals to reach for. ;) -------------- next part -------------- An HTML attachment was scrubbed... URL: From brett at python.org Wed Feb 8 17:09:36 2012 From: brett at python.org (Brett Cannon) Date: Wed, 8 Feb 2012 11:09:36 -0500 Subject: [Python-Dev] requirements for moving __import__ over to importlib? In-Reply-To: References: Message-ID: On Tue, Feb 7, 2012 at 21:27, PJ Eby wrote: > > > On Tue, Feb 7, 2012 at 5:24 PM, Brett Cannon wrote: > >> >> On Tue, Feb 7, 2012 at 16:51, PJ Eby wrote: >> >>> On Tue, Feb 7, 2012 at 3:07 PM, Brett Cannon wrote: >>> >>>> So, if there is going to be some baseline performance target I need to >>>> hit to make people happy I would prefer to know what that (real-world) >>>> benchmark is and what the performance target is going to be on a non-debug >>>> build. And if people are not worried about the performance then I'm happy >>>> with that as well. =) >>>> >>> >>> One thing I'm a bit worried about is repeated imports, especially ones >>> that are inside frequently-called functions. In today's versions of >>> Python, this is a performance win for "command-line tool platform" systems >>> like Mercurial and PEAK, where you want to delay importing as long as >>> possible, in case the code that needs the import is never called at all... >>> but, if it *is* used, you may still need to use it a lot of times. 
>>> >>> When writing that kind of code, I usually just unconditionally import >>> inside the function, because the C code check for an already-imported >>> module is faster than the Python "if" statement I'd have to clutter up my >>> otherwise-clean function with. >>> >>> So, in addition to the things other people have mentioned as performance >>> targets, I'd like to keep the slowdown factor low for this type of scenario >>> as well. Specifically, the slowdown shouldn't be so much as to motivate >>> lazy importers like Mercurial and PEAK to need to rewrite in-function >>> imports to do the already-imported check ourselves. ;-) >>> >>> (Disclaimer: I haven't actually seen Mercurial's delayed/dynamic import >>> code, so I can't say for 100% sure if they'd be affected the same way.) >>> >> >> IOW you want the sys.modules case fast, which I will never be able to >> match compared to C code since that is pure execution with no I/O. >> > > Couldn't you just prefix the __import__ function with something like this: > > ... > try: > module = sys.modules[name] > except KeyError: > # slow code path > > (Admittedly, the import lock is still a problem; initially I thought you > could just skip it for this case, but the problem is that another thread > could be in the middle of executing the module.) > I practically do already. As of right now there are some 'if' checks that come ahead of it that I could shift around to fast path this even more (since who cares about types and such if the module name happens to be in sys.modules), but it isn't that far off as-is. -------------- next part -------------- An HTML attachment was scrubbed... URL: From solipsis at pitrou.net Wed Feb 8 17:09:17 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 08 Feb 2012 17:09:17 +0100 Subject: [Python-Dev] requirements for moving __import__ over to importlib? In-Reply-To: References: <20120207234224.1ae8602e@pitrou.net> Message-ID: <1328717357.3387.22.camel@localhost.localdomain> Le mercredi 08 f?vrier 2012 ? 11:01 -0500, Brett Cannon a ?crit : > > > On Tue, Feb 7, 2012 at 17:42, Antoine Pitrou > wrote: > On Tue, 7 Feb 2012 17:24:21 -0500 > Brett Cannon wrote: > > > > IOW you want the sys.modules case fast, which I will never > be able to match > > compared to C code since that is pure execution with no I/O. > > > Why wouldn't continue using C code for that? It's trivial > (just a dict > lookup). > > > Sure, but it's all the code between the function call and hitting > sys.modules which would also need to get shoved into the C code. As I > said, I have not tried to optimize anything yet (and unfortunately a > lot of the upfront costs are over stupid things like checking if > __import__ is being called with a string for the module name). I guess my point was: why is there a function call in that case? The "import" statement could look up sys.modules directly. Or the built-in __import__ could still be written in C, and only defer to importlib when the module isn't found in sys.modules. Practicality beats purity. Regards Antoine. From brett at python.org Wed Feb 8 17:13:17 2012 From: brett at python.org (Brett Cannon) Date: Wed, 8 Feb 2012 11:13:17 -0500 Subject: [Python-Dev] requirements for moving __import__ over to importlib? In-Reply-To: References: Message-ID: On Tue, Feb 7, 2012 at 22:47, Nick Coghlan wrote: > On Wed, Feb 8, 2012 at 12:54 PM, Terry Reedy wrote: > > On 2/7/2012 9:35 PM, PJ Eby wrote: > >> It's just that not everything I write can depend on Importing. 
> >> Throw an equivalent into the stdlib, though, and I guess I wouldn't have > >> to worry about dependencies... > > > > And that is what I think (agree?) should be done to counteract the likely > > slowdown from using importlib. > > Yeah, this is one frequently reinvented wheel that could definitely do > with a standard implementation. Christian Heimes made an initial > attempt at such a thing years ago with PEP 369, but an importlib based > __import__ would let the implementation largely be pure Python (with > all the increase in power and flexibility that implies). > > I'll see if I can come up with a pure Python way to handle setting attributes on the module since that is the one case that my importers project code can't handle. > I'm not sure such an addition would help much with the base > interpreter start up time though - most of the modules we bring in are > because we're actually using them for some reason. > It wouldn't. This would be for third-parties only. > > The other thing that shouldn't be underrated here is the value in > making the builtin import system PEP 302 compliant from a > *documentation* perspective. I've made occasional attempts at fully > documenting the import system over the years, and I always end up > giving up because the combination of the pre-PEP 302 builtin > mechanisms in import.c and the PEP 302 compliant mechanisms for things > like zipimport just degenerate into a mess of special cases that are > impossible to justify beyond "nobody got around to fixing this yet". > The fact that we have an undocumented PEP 302 based reimplementation > of imports squirrelled away in pkgutil to make pkgutil and runpy work > is sheer insanity (replacing *that* with importlib might actually be a > good first step towards full integration). > I actually have never bothered to explain import as it is currently implemented in any of my PyCon import talks precisely because it is such a mess. It's just easier to explain from a PEP 302 perspective since you can actually comprehend that. -------------- next part -------------- An HTML attachment was scrubbed... URL: From brett at python.org Wed Feb 8 17:15:14 2012 From: brett at python.org (Brett Cannon) Date: Wed, 8 Feb 2012 11:15:14 -0500 Subject: [Python-Dev] requirements for moving __import__ over to importlib? In-Reply-To: References: Message-ID: On Tue, Feb 7, 2012 at 22:47, Nick Coghlan wrote [SNIP] > The fact that we have an undocumented PEP 302 based reimplementation > of imports squirrelled away in pkgutil to make pkgutil and runpy work > is sheer insanity (replacing *that* with importlib might actually be a > good first step towards full integration). > It easily goes beyond runpy. You could ditch much of imp's C code (e.g. load_module()), you could write py_compile and compileall using importlib, you could rewrite zipimport, etc. Anything that touches import could be refactored to (a) use just Python code, and (b) reshare code so as to not re-invent the wheel constantly. -------------- next part -------------- An HTML attachment was scrubbed... URL: From brett at python.org Wed Feb 8 17:16:24 2012 From: brett at python.org (Brett Cannon) Date: Wed, 8 Feb 2012 11:16:24 -0500 Subject: [Python-Dev] requirements for moving __import__ over to importlib? In-Reply-To: References: Message-ID: On Tue, Feb 7, 2012 at 18:26, Alex Gaynor wrote: > Brett Cannon python.org> writes: > > > > IOW you want the sys.modules case fast, which I will never be able to > match > compared to C code since that is pure execution with no I/O. 
> > > > > Sure you can: have a really fast Python VM. > > Constructive: if you can run this code under PyPy it'd be easy to just: > > $ pypy -mtimeit "import struct" > $ pypy -mtimeit -s "import importlib" "importlib.import_module('struct')" > > Or whatever the right API is. I'm not worried about PyPy. =) I assume you will just flat-out use importlib regardless of what happens with CPython since it is/will be fully compatible and is already written for you. -------------- next part -------------- An HTML attachment was scrubbed... URL: From brett at python.org Wed Feb 8 17:24:58 2012 From: brett at python.org (Brett Cannon) Date: Wed, 8 Feb 2012 11:24:58 -0500 Subject: [Python-Dev] requirements for moving __import__ over to importlib? In-Reply-To: <1328717357.3387.22.camel@localhost.localdomain> References: <20120207234224.1ae8602e@pitrou.net> <1328717357.3387.22.camel@localhost.localdomain> Message-ID: On Wed, Feb 8, 2012 at 11:09, Antoine Pitrou wrote: > Le mercredi 08 f?vrier 2012 ? 11:01 -0500, Brett Cannon a ?crit : > > > > > > On Tue, Feb 7, 2012 at 17:42, Antoine Pitrou > > wrote: > > On Tue, 7 Feb 2012 17:24:21 -0500 > > Brett Cannon wrote: > > > > > > IOW you want the sys.modules case fast, which I will never > > be able to match > > > compared to C code since that is pure execution with no I/O. > > > > > > Why wouldn't continue using C code for that? It's trivial > > (just a dict > > lookup). > > > > > > Sure, but it's all the code between the function call and hitting > > sys.modules which would also need to get shoved into the C code. As I > > said, I have not tried to optimize anything yet (and unfortunately a > > lot of the upfront costs are over stupid things like checking if > > __import__ is being called with a string for the module name). > > I guess my point was: why is there a function call in that case? The > "import" statement could look up sys.modules directly. > Because people like to do wacky stuff with their imports and so fully bypassing __import__ would be bad. > Or the built-in __import__ could still be written in C, and only defer > to importlib when the module isn't found in sys.modules. > Practicality beats purity. It's a possibility, although that would require every function call to fetch the PyInterpreterState to get at the cached __import__ (so the proper sys and imp modules are used) and I don't know how expensive that would be (probably as not as expensive as calling out to Python code but I'm thinking out loud). -------------- next part -------------- An HTML attachment was scrubbed... URL: From brett at python.org Wed Feb 8 17:28:31 2012 From: brett at python.org (Brett Cannon) Date: Wed, 8 Feb 2012 11:28:31 -0500 Subject: [Python-Dev] requirements for moving __import__ over to importlib? In-Reply-To: References: Message-ID: On Wed, Feb 8, 2012 at 11:15, Brett Cannon wrote: > > > On Tue, Feb 7, 2012 at 22:47, Nick Coghlan wrote > > [SNIP] > > >> The fact that we have an undocumented PEP 302 based reimplementation >> of imports squirrelled away in pkgutil to make pkgutil and runpy work >> is sheer insanity (replacing *that* with importlib might actually be a >> good first step towards full integration). >> > > It easily goes beyond runpy. You could ditch much of imp's C code (e.g. > load_module()), you could write py_compile and compileall using importlib, > you could rewrite zipimport, etc. Anything that touches import could be > refactored to (a) use just Python code, and (b) reshare code so as to not > re-invent the wheel constantly. 
And taking it even farther, all of the blackbox aspects of import go
away. For instance, the implicit, hidden importers for built-in
modules, frozen modules, extensions, and source could actually be set
on sys.path_hooks. The Meta path importer that handles sys.path could
actually exist on sys.meta_path.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From victor.stinner at haypocalc.com  Wed Feb  8 17:40:03 2012
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Wed, 8 Feb 2012 17:40:03 +0100
Subject: [Python-Dev] Add a new "locale" codec?
In-Reply-To: 
References: 
Message-ID: 

>> The current locale is process-wide: if a thread changes the locale,
>> all threads are affected. Some functions have to use the current
>> locale encoding, and not the locale encoding read at startup. Examples
>> with C functions: strerror(), strftime(), tzname, etc.
>
> Could a core part of Python break because of a sequence like:
>
> 1) Encode unicode to bytes using the locale codec.
> 2) Silly third-party library code changes the locale.
> 3) Attempt to decode bytes back to unicode using the locale codec
> (which is now a different underlying codec).

When you decode data from the OS, you have to use the current locale
encoding. If you use a variable to store the encoding and the locale
is changed, you have to update your variable or you get mojibake.

Example with Python 2:

lisa$ python2.7
Python 2.7.2+ (default, Oct  4 2011, 20:06:09)
>>> import locale, os
>>> encoding=locale.getpreferredencoding(False)
>>> encoding
'ANSI_X3.4-1968'
>>> os.strerror(23).decode(encoding)
u'Too many open files in system'
>>> locale.setlocale(locale.LC_ALL, '')   # set the locale
'fr_FR.UTF-8'
>>> os.strerror(23).decode(encoding)
Traceback (most recent call last):
  ...
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position
37: ordinal not in range(128)
>>> encoding=locale.getpreferredencoding(False)
>>> encoding
'UTF-8'
>>> os.strerror(23).decode(encoding)
u'Trop de fichiers ouverts dans le syst\xe8me'

You have to update the encoding manually because setlocale() changed
not only the LC_MESSAGES locale category (message language) but also
the LC_CTYPE locale category (encoding). Using the "locale" encoding,
you always get the current locale encoding.

In some cases, you must use sys.getfilesystemencoding() (e.g. write
into the console or encode/decode filenames), in other cases, you must
use the current locale encoding (e.g. strerror() or strftime()).

Python 3 does most of the work for you, so you don't have to care
about the locale encoding (you just manipulate Unicode, it decodes
bytes or encodes back to bytes for you). But in some cases, you have
to decode or encode manually using the right encoding. In this case,
the "locale" codec can help you.

The documentation will have to explain exactly what this new codec is,
because as expected, it is confusing :-)

Victor

From mark at hotpy.org  Wed Feb  8 18:13:04 2012
From: mark at hotpy.org (Mark Shannon)
Date: Wed, 08 Feb 2012 17:13:04 +0000
Subject: [Python-Dev] Code review tool uses my old email address
Message-ID: <4F32AD20.80609@hotpy.org>

Hi,

I changed my email address (about a year ago) and updated my bug
tracker settings to my new address (late last year).
However, the code review tool still shows my old email address.
How do I change it?

Cheers,
Mark.
From nadeem.vawda at gmail.com Wed Feb 8 19:52:31 2012 From: nadeem.vawda at gmail.com (Nadeem Vawda) Date: Wed, 8 Feb 2012 20:52:31 +0200 Subject: [Python-Dev] Code review tool uses my old email address In-Reply-To: <4F32AD20.80609@hotpy.org> References: <4F32AD20.80609@hotpy.org> Message-ID: This may be a bug in the tracker, possibly related to http://psf.upfronthosting.co.za/roundup/meta/issue402 - it seems like changes to a user's details on bugs.python.org are not propagated to the review tool. Cheers, Nadeem From mark at hotpy.org Wed Feb 8 20:18:14 2012 From: mark at hotpy.org (Mark Shannon) Date: Wed, 08 Feb 2012 19:18:14 +0000 Subject: [Python-Dev] PEP for new dictionary implementation Message-ID: <4F32CA76.5040307@hotpy.org> Proposed PEP for new dictionary implementation, PEP 410? is attached. Cheers, Mark. -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: pep-new-dict.txt URL: From tjreedy at udel.edu Wed Feb 8 20:57:55 2012 From: tjreedy at udel.edu (Terry Reedy) Date: Wed, 08 Feb 2012 14:57:55 -0500 Subject: [Python-Dev] requirements for moving __import__ over to importlib? In-Reply-To: References: Message-ID: On 2/8/2012 11:13 AM, Brett Cannon wrote: > On Tue, Feb 7, 2012 at 22:47, Nick Coghlan I'm not sure such an addition would help much with the base > interpreter start up time though - most of the modules we bring in are > because we're actually using them for some reason. > It wouldn't. This would be for third-parties only. such as hg. That is what I had in mind. Would the following work? Treat a function as a 'loop' in that it may be executed repeatedly. Treat 'import x' in a function as what it is, an __import__ call plus a local assignment. Apply a version of the usual optimization: put a sys.modules-based lazy import outside of the function (at the top of the module?) and leave the local assignment "x = sys.modules['x']" in the function. Change sys.modules.__delattr__ to replace a module with a dummy, so the function will still work after a deletion, as it does now. -- Terry Jan Reedy From solipsis at pitrou.net Wed Feb 8 21:11:33 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 8 Feb 2012 21:11:33 +0100 Subject: [Python-Dev] requirements for moving __import__ over to importlib? In-Reply-To: References: <20120207214948.38d4503e@pitrou.net> <20120208000837.79dcb863@pitrou.net> Message-ID: <20120208211133.079e74d6@pitrou.net> On Wed, 8 Feb 2012 11:07:10 -0500 Brett Cannon wrote: > > > > > > > So, if there is going to be some baseline performance target I need > > to > > > > hit > > > > > to make people happy I would prefer to know what that (real-world) > > > > > benchmark is and what the performance target is going to be on a > > > > non-debug > > > > > build. > > > > > > > > - No significant slowdown in startup time. > > > > > > > > > > What's significant and measuring what exactly? I mean startup already > > has a > > > ton of imports as it is, so this would wash out the point of measuring > > > practically anything else for anything small. > > > > I don't understand your sentence. Yes, startup has a ton of imports and > > that's why I'm fearing it may be negatively impacted :) > > > > ("a ton" being a bit less than 50 currently) > > > > So you want less than a 50% startup cost on the standard startup benchmarks? No, ~50 is the number of imports at startup. I think startup time should grow by less than 10%. 
(even better if it shrinks of course :)) > And here I was worrying you were going to suggest easy goals to reach for. > ;) He. Well, if importlib enabled user-level functionality, I guess it could be attractive to trade a slice of performance against it. But from an user's point of view, bootstrapping importlib is mostly an implementation detail with not much of a positive impact. Regards Antoine. From brett at python.org Wed Feb 8 21:16:54 2012 From: brett at python.org (Brett Cannon) Date: Wed, 8 Feb 2012 15:16:54 -0500 Subject: [Python-Dev] requirements for moving __import__ over to importlib? In-Reply-To: References: Message-ID: On Wed, Feb 8, 2012 at 14:57, Terry Reedy wrote: > On 2/8/2012 11:13 AM, Brett Cannon wrote: > >> On Tue, Feb 7, 2012 at 22:47, Nick Coghlan > > > I'm not sure such an addition would help much with the base >> interpreter start up time though - most of the modules we bring in are >> because we're actually using them for some reason. >> > > It wouldn't. This would be for third-parties only. >> > > such as hg. That is what I had in mind. > > Would the following work? Treat a function as a 'loop' in that it may be > executed repeatedly. Treat 'import x' in a function as what it is, an > __import__ call plus a local assignment. Apply a version of the usual > optimization: put a sys.modules-based lazy import outside of the function > (at the top of the module?) and leave the local assignment "x = > sys.modules['x']" in the function. Change sys.modules.__delattr__ to > replace a module with a dummy, so the function will still work after a > deletion, as it does now. Probably, but I would hate to force people to code in a specific way for it to work. -------------- next part -------------- An HTML attachment was scrubbed... URL: From tjreedy at udel.edu Wed Feb 8 21:31:24 2012 From: tjreedy at udel.edu (Terry Reedy) Date: Wed, 08 Feb 2012 15:31:24 -0500 Subject: [Python-Dev] requirements for moving __import__ over to importlib? In-Reply-To: References: Message-ID: On 2/8/2012 3:16 PM, Brett Cannon wrote: > On Wed, Feb 8, 2012 at 14:57, Terry Reedy Would the following work? Treat a function as a 'loop' in that it > may be executed repeatedly. Treat 'import x' in a function as what > it is, an __import__ call plus a local assignment. Apply a version > of the usual optimization: put a sys.modules-based lazy import > outside of the function (at the top of the module?) and leave the > local assignment "x = sys.modules['x']" in the function. Change > sys.modules.__delattr__ to replace a module with a dummy, so the > function will still work after a deletion, as it does now. > > Probably, but I would hate to force people to code in a specific way for > it to work. The intent of what I proposed it to be transparent for imports within functions. It would be a minor optimization if anything, but it would mean that there is a lazy mechanism in place. For top-level imports, unless *all* are made lazy, then there *must* be some indication in the code of whether to make it lazy or not. -- Terry Jan Reedy From martin at v.loewis.de Wed Feb 8 21:46:22 2012 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 08 Feb 2012 21:46:22 +0100 Subject: [Python-Dev] peps: Update with bugfix releases. 
In-Reply-To: References: <20120205204551.Horde.NCdeYVNNcXdPLtxvnkzi1lA@webmail.df.eu> Message-ID: <4F32DF1E.40205@v.loewis.de> Am 05.02.2012 21:34, schrieb Ned Deily: > In article > <20120205204551.Horde.NCdeYVNNcXdPLtxvnkzi1lA at webmail.df.eu>, > martin at v.loewis.de wrote: > >>> I understand that but, to me, it makes no sense to send out truly >>> broken releases. Besides, the hash collision attack is not exactly >>> new either. Another few weeks can't make that much of a difference. >> >> Why would the release be truly broken? It surely can't be worse than >> the current releases (which apparently aren't truly broken, else >> there would have been no point in releasing them back then). > > They were broken by the release of OS X 10.7 and Xcode 4.2 which were > subsequent to the previous releases. None of the currently available > python.org installers provide a fully working system on OS X 10.7, or on > OS X 10.6 if the user has installed Xcode 4.2 for 10.6. In what way are the current releases not fully working? Are you referring to issues with building extension modules? If it's that, I wouldn't call that "truly broken". Plus, the releases continue to work fine on older OS X releases. So when you build a bug fix release, just build it with the same tool chain as the previous bug fix release, and all is fine. Regards, Martin From martin at v.loewis.de Wed Feb 8 21:55:20 2012 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 08 Feb 2012 21:55:20 +0100 Subject: [Python-Dev] which C language standard CPython must conform to In-Reply-To: References: Message-ID: <4F32E138.30505@v.loewis.de> > Some quick searching shows that there is at least hope Microsoft is on > board with C++11x (not so surprising, their crown jewels are written > in C++). We should at some point demand a C++ compiler for CPython > and pick of subset of C++ features to allow use of but that is likely > reserved for the Python 4 timeframe (a topic for another thread and > time entirely, it isn't feasible for today's codebase). See my earlier post on building Python as a Windows 8 Metro App. As one strategy, I tried compiling Python as C++ code (as it wasn't clear whether C is fully supported; this is now resolved). It is actually feasible to change Python so that it compiles with a C++ compiler and still continues to compile as C also, with just a few ifdefs. This is, of course, off-topic wrt. the original question: even C++11 compilers often don't support non-ASCII identifiers. Regards, Martin From brett at python.org Wed Feb 8 22:08:29 2012 From: brett at python.org (Brett Cannon) Date: Wed, 8 Feb 2012 16:08:29 -0500 Subject: [Python-Dev] requirements for moving __import__ over to importlib? In-Reply-To: References: Message-ID: On Wed, Feb 8, 2012 at 15:31, Terry Reedy wrote: > On 2/8/2012 3:16 PM, Brett Cannon wrote: > >> On Wed, Feb 8, 2012 at 14:57, Terry Reedy > Would the following work? Treat a function as a 'loop' in that it >> may be executed repeatedly. Treat 'import x' in a function as what >> it is, an __import__ call plus a local assignment. Apply a version >> of the usual optimization: put a sys.modules-based lazy import >> outside of the function (at the top of the module?) and leave the >> local assignment "x = sys.modules['x']" in the function. Change >> sys.modules.__delattr__ to replace a module with a dummy, so the >> function will still work after a deletion, as it does now. 
> >> Probably, but I would hate to force people to code in a specific way for
> >> it to work.
>
> The intent of what I proposed is to be transparent for imports within
> functions. It would be a minor optimization if anything, but it would mean
> that there is a lazy mechanism in place.
>
> For top-level imports, unless *all* are made lazy, then there *must* be
> some indication in the code of whether to make it lazy or not.

Not true; importlib would make it dead-simple to whitelist what modules to
make lazy (e.g. your app code lazy but all stdlib stuff not, etc.).
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From tjreedy at udel.edu  Wed Feb  8 22:10:47 2012
From: tjreedy at udel.edu (Terry Reedy)
Date: Wed, 08 Feb 2012 16:10:47 -0500
Subject: [Python-Dev] PEP for new dictionary implementation
In-Reply-To: <4F32CA76.5040307@hotpy.org>
References: <4F32CA76.5040307@hotpy.org>
Message-ID: 

On 2/8/2012 2:18 PM, Mark Shannon wrote:

A pretty clear draft PEP.

> Changes to repr() output and iteration order:
> For most cases, this will be unchanged.
> However for some split-table dictionaries the iteration order will
> change.
>
> Neither of these cons should be a problem.
> Modules which meddle with the internals of the dictionary
> implementation are already broken and should be fixed to use the API.

So are modules that depend on set and dict iteration order and the
consequent representations.

> The iteration order of dictionaries was never defined and has always been
> arbitrary; it is different for Jython and PyPy.

I am pretty sure iteration order has changed between CPython versions in
the past (and that when it did, people got caught). The documentation
for doctest has section 25.2.3.6. Warnings. It starts with this very issue!
'''
doctest is serious about requiring exact matches in expected output. If
even a single character doesn't match, the test fails. This will
probably surprise you a few times, as you learn exactly what Python does
and doesn't guarantee about output. For example, when printing a dict,
Python doesn't guarantee that the key-value pairs will be printed in any
particular order, so a test like

>>> foo()
{"Hermione": "hippogryph", "Harry": "broomstick"}

is vulnerable! One workaround is to do

>>> foo() == {"Hermione": "hippogryph", "Harry": "broomstick"}
True

instead. Another is to do

>>> d = sorted(foo().items())
>>> d
[('Harry', 'broomstick'), ('Hermione', 'hippogryph')]
'''
(Object addresses and full-precision float representations are also
discussed.)

-- 
Terry Jan Reedy

From nad at acm.org  Wed Feb  8 22:13:29 2012
From: nad at acm.org (Ned Deily)
Date: Wed, 08 Feb 2012 22:13:29 +0100
Subject: [Python-Dev] peps: Update with bugfix releases.
References: <20120205204551.Horde.NCdeYVNNcXdPLtxvnkzi1lA@webmail.df.eu>
 <4F32DF1E.40205@v.loewis.de>
Message-ID: 

In article <4F32DF1E.40205 at v.loewis.de>,
 "Martin v. Lowis" wrote:
> Am 05.02.2012 21:34, schrieb Ned Deily:
> > In article
> > <20120205204551.Horde.NCdeYVNNcXdPLtxvnkzi1lA at webmail.df.eu>,
> > martin at v.loewis.de wrote:
> >
> >>> I understand that but, to me, it makes no sense to send out truly
> >>> broken releases. Besides, the hash collision attack is not exactly
> >>> new either. Another few weeks can't make that much of a difference.
> >>
> >> Why would the release be truly broken? It surely can't be worse than
> >> the current releases (which apparently aren't truly broken, else
> >> there would have been no point in releasing them back then).
> > > > They were broken by the release of OS X 10.7 and Xcode 4.2 which were
> > > > subsequent to the previous releases. None of the currently available
> > > > python.org installers provide a fully working system on OS X 10.7, or on
> > > > OS X 10.6 if the user has installed Xcode 4.2 for 10.6.
>
> In what way are the current releases not fully working? Are you
> referring to issues with building extension modules?

Yes

> If it's that, I wouldn't call that "truly broken". Plus, the releases
> continue to work fine on older OS X releases.

If not "truly", then how about "seriously broken"? And it's not quite
the case that the releases work fine on older OS X releases. The
installers in question, the 64-/32-bit installer variants, work only on
OS X 10.6 and above. If the user installed the optional Xcode 4.2 for
10.6, then they have the same problem with building extension modules
as 10.7 users do.

> So when you build a bug fix release, just build it with the same tool
> chain as the previous bug fix release, and all is fine.

I am not proposing changing the build tool chain for 3.2.x and 2.7.x
bug fix releases. But, users not being able to build extension modules
out of the box with the default vendor-supplied build tools as they
have in the past is not a case of all is fine, IMO.

However, this may all be a moot point now as I've subsequently proposed
a patch to Distutils to smooth over the problem by checking for the
case of gcc-4.2 being required but not available and, if so,
automatically substituting clang instead.
(http://bugs.python.org/issue13590) This trades off a certain risk of
using clang for extension modules against the 100% certainty of users
being unable to build extension modules.

-- 
 Ned Deily,
 nad at acm.org

From mark at hotpy.org  Wed Feb  8 22:23:48 2012
From: mark at hotpy.org (Mark Shannon)
Date: Wed, 08 Feb 2012 21:23:48 +0000
Subject: [Python-Dev] PEP for new dictionary implementation
In-Reply-To: 
References: <4F32CA76.5040307@hotpy.org>
Message-ID: <4F32E7E4.6070207@hotpy.org>

Terry Reedy wrote:
> On 2/8/2012 2:18 PM, Mark Shannon wrote:
>
> A pretty clear draft PEP.
>
>> Changes to repr() output and iteration order:
>> For most cases, this will be unchanged.
>> However for some split-table dictionaries the iteration order will
>> change.
>>
>> Neither of these cons should be a problem.
>> Modules which meddle with the internals of the dictionary
>> implementation are already broken and should be fixed to use the API.
>
> So are modules that depend on set and dict iteration order and the
> consequent representations.
>
>> The iteration order of dictionaries was never defined and has always been
>> arbitrary; it is different for Jython and PyPy.
>
> I am pretty sure iteration order has changed between CPython versions in
> the past (and that when it did, people got caught). The documentation
> for doctest has section 25.2.3.6. Warnings. It starts with this very issue!
> '''
> doctest is serious about requiring exact matches in expected output. If
> even a single character doesn't match, the test fails. This will
> probably surprise you a few times, as you learn exactly what Python does
> and doesn't guarantee about output. For example, when printing a dict,
> Python doesn't guarantee that the key-value pairs will be printed in any
> particular order, so a test like
>
> >>> foo()
> {"Hermione": "hippogryph", "Harry": "broomstick"}
>
> is vulnerable! One workaround is to do
>
> >>> foo() == {"Hermione": "hippogryph", "Harry": "broomstick"}
> True
>
> instead.
Another is to do > > >>> d = sorted(foo().items()) > >>> d > [('Harry', 'broomstick'), ('Hermione', 'hippogryph')] > ''' > (Object addresses and full-precision float representations are also > discussed.) > There are a few things in the standard lib that rely on dict repr ordering: http://bugs.python.org/issue13907 http://bugs.python.org/issue13909 I expect that the long-awaited fix to the hash-collision security issue will expose a few more. Version 2 of the new dict passes all these tests, but that doesn't mean the tests are correct. Cheers, Mark. From francismb at email.de Thu Feb 9 00:11:20 2012 From: francismb at email.de (francis) Date: Thu, 09 Feb 2012 00:11:20 +0100 Subject: [Python-Dev] A new dictionary implementation In-Reply-To: <4F3291C9.9070305@hotpy.org> References: <4F252014.3080900@hotpy.org> <20120129160841.2343b62f@pitrou.net> <4F256EDC.70707@hotpy.org> <4F25D686.9070907@pearwood.info> <4F2AE13C.6010900@hotpy.org> <4F3291C9.9070305@hotpy.org> Message-ID: <4F330118.5050109@email.de> Just more info: changeset is: 74843:20702d1acf17 Cheers, francis From dgoulet at efficios.com Thu Feb 9 00:03:29 2012 From: dgoulet at efficios.com (David Goulet) Date: Wed, 08 Feb 2012 18:03:29 -0500 Subject: [Python-Dev] ctypes/util.py problem Message-ID: <4F32FF41.4080704@efficios.com> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Hi everyone, I'm working with the LTTng (Linux Tracing) team and we came across a problem with our user-space tracer and Python default behavior. We provide a libc wrapper that instruments free() and malloc() via a simple LD_PRELOAD of that lib. This lib *was* named "liblttng-ust-libc.so" and we came across python software registering to our trace registry daemon (meaning that somehow the python binary is using our in-process library). We dug a bit and found this: Lib/ctypes/util.py: def _findLib_ldconfig(name): # XXX assuming GLIBC's ldconfig (with option -p) expr = r'/[^\(\)\s]*lib%s\.[^\(\)\s]*' % re.escape(name) res = re.search(expr, os.popen('/sbin/ldconfig -p 2>/dev/null').read()) and, at least, also found in _findLib_gcc(name) and _findSoname_ldconfig(name). This causes Python to use any library ending with "libc.so" to be loaded.... I don't know the reasons behind this but we are concerned about "future issues" that can occur with this kind of behavior. For now, we renamed our lib so everything is fine. Thanks a lot guys. David -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (GNU/Linux) iQEcBAEBAgAGBQJPMv9BAAoJEELoaioR9I02jwkIALmLg0esubJL+TrZFEahNwz7 85RUKSa/GKDx2sagsi62PWy5RfvRABs5Ij6ldtyQoszyuZuOlM5B7rMrpDvO588P WqO1lzT6rdO9uyq2B6vPZRjjAr++StLKyIBbQodQd8PJkEsdN0kJISdRgIrSFL/E 0+2aUllrRgsVxc/oOF2LG+u7828iAYPfB71pC4euj2PgiwffZZ6J5gH4Q+mrUqt0 KiYU5X+vCEzWLv+ZLtq+h2rVrLNk8cFTL5N092iMwFfooSC70urD5a0cTR6pf/iI UfFvuIVROsqiT2MwQxHApyChkrLnX0eWDPdeZZAFjnWVm4QPy8q09m6qX5eHloA= =9wj8 -----END PGP SIGNATURE----- From francismb at email.de Thu Feb 9 00:09:48 2012 From: francismb at email.de (francis) Date: Thu, 09 Feb 2012 00:09:48 +0100 Subject: [Python-Dev] A new dictionary implementation In-Reply-To: <4F3291C9.9070305@hotpy.org> References: <4F252014.3080900@hotpy.org> <20120129160841.2343b62f@pitrou.net> <4F256EDC.70707@hotpy.org> <4F25D686.9070907@pearwood.info> <4F2AE13C.6010900@hotpy.org> <4F3291C9.9070305@hotpy.org> Message-ID: <4F3300BC.5080201@email.de> Hi Mark, I've just cloned: > > Repository: https://bitbucket.org/markshannon/cpython_new_dict .... > Do please try it on your machine(s). 
that's a: Linux random 3.1.0-1-amd64 #1 SMP Tue Jan 10 05:01:58 UTC 2012 x86_64 GNU/Linux and I'm getting: gcc -pthread -c -Wno-unused-result -g -O0 -Wall -Wstrict-prototypes -I. -I./Include -DPy_BUILD_CORE -o Objects/dictobject.o Objects/dictobject.c gcc -pthread -c -Wno-unused-result -g -O0 -Wall -Wstrict-prototypes -I. -I./Include -DPy_BUILD_CORE -o Objects/memoryobject.o Objects/memoryobject.c Objects/dictobject.c: In function 'dict_popitem': Objects/dictobject.c:2208:5: error: 'PyDictKeyEntry' has no member named 'me_value' make: *** [Objects/dictobject.o] Error 1 make: *** Waiting for unfinished jobs.... Cheers francis From steve at pearwood.info Thu Feb 9 01:35:33 2012 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 09 Feb 2012 11:35:33 +1100 Subject: [Python-Dev] Add a new "locale" codec? In-Reply-To: References: Message-ID: <4F3314D5.8090907@pearwood.info> Simon Cross wrote: > I think I'm -1 on a "locale" encoding because it refers to different > actual encodings depending on where and when it's run, which seems > surprising Why is it surprising? Surely that's the whole point of a locale encoding: to use the locale encoding, whatever that happens to be at the time. Perhaps I'm missing something, but I don't see how this proposal is any more surprising than the fact that (say) Decimal uses a global context if you don't specify one explicitly. Only this should be *less* surprising, because Decimal uses the global context by default, while this will use the global locale encoding only if you explicitly tell it to. -- Steven From steve at pearwood.info Thu Feb 9 01:40:01 2012 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 09 Feb 2012 11:40:01 +1100 Subject: [Python-Dev] folding cElementTree behind ElementTree in 3.3 In-Reply-To: References: Message-ID: <4F3315E1.3090408@pearwood.info> Paul Moore wrote: > I would suggest that, assuming python-dev want to take ownership of > the module, one last-ditch attempt be made to contact Fredrik. We > should email him, I wouldn't call email "last-ditch". I call email "first-ditch". I would expect that a last-ditch attempt would include trying to call him by phone, sending him a dead-tree letter by post, and if important enough, actually driving out to his home or place of work and trying to see him face to face. (All depending on the importance of making contact, naturally.) -- Steven From brett at python.org Thu Feb 9 01:48:26 2012 From: brett at python.org (Brett Cannon) Date: Wed, 8 Feb 2012 19:48:26 -0500 Subject: [Python-Dev] ctypes/util.py problem In-Reply-To: <4F32FF41.4080704@efficios.com> References: <4F32FF41.4080704@efficios.com> Message-ID: Could you file a bug at bugs.python.org, David, so we don't lose track of this? On Wed, Feb 8, 2012 at 18:03, David Goulet wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > Hi everyone, > > I'm working with the LTTng (Linux Tracing) team and we came across a > problem > with our user-space tracer and Python default behavior. We provide a libc > wrapper that instruments free() and malloc() via a simple LD_PRELOAD of > that lib. > > This lib *was* named "liblttng-ust-libc.so" and we came across python > software > registering to our trace registry daemon (meaning that somehow the python > binary is using our in-process library). 
We dug a bit and found this: > > Lib/ctypes/util.py: > > def _findLib_ldconfig(name): > # XXX assuming GLIBC's ldconfig (with option -p) > expr = r'/[^\(\)\s]*lib%s\.[^\(\)\s]*' % re.escape(name) > res = re.search(expr, > os.popen('/sbin/ldconfig -p 2>/dev/null').read()) > > and, at least, also found in _findLib_gcc(name) and > _findSoname_ldconfig(name). > > This causes Python to use any library ending with "libc.so" to be loaded.... > > I don't know the reasons behind this but we are concerned about "future > issues" > that can occur with this kind of behavior. > > For now, we renamed our lib so everything is fine. > > Thanks a lot guys. > David > -----BEGIN PGP SIGNATURE----- > Version: GnuPG v1.4.10 (GNU/Linux) > > iQEcBAEBAgAGBQJPMv9BAAoJEELoaioR9I02jwkIALmLg0esubJL+TrZFEahNwz7 > 85RUKSa/GKDx2sagsi62PWy5RfvRABs5Ij6ldtyQoszyuZuOlM5B7rMrpDvO588P > WqO1lzT6rdO9uyq2B6vPZRjjAr++StLKyIBbQodQd8PJkEsdN0kJISdRgIrSFL/E > 0+2aUllrRgsVxc/oOF2LG+u7828iAYPfB71pC4euj2PgiwffZZ6J5gH4Q+mrUqt0 > KiYU5X+vCEzWLv+ZLtq+h2rVrLNk8cFTL5N092iMwFfooSC70urD5a0cTR6pf/iI > UfFvuIVROsqiT2MwQxHApyChkrLnX0eWDPdeZZAFjnWVm4QPy8q09m6qX5eHloA= > =9wj8 > -----END PGP SIGNATURE----- > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > http://mail.python.org/mailman/options/python-dev/brett%40python.org > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Thu Feb 9 02:26:01 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 9 Feb 2012 11:26:01 +1000 Subject: [Python-Dev] requirements for moving __import__ over to importlib? In-Reply-To: <1328717357.3387.22.camel@localhost.localdomain> References: <20120207234224.1ae8602e@pitrou.net> <1328717357.3387.22.camel@localhost.localdomain> Message-ID: On Thu, Feb 9, 2012 at 2:09 AM, Antoine Pitrou wrote: > I guess my point was: why is there a function call in that case? The > "import" statement could look up sys.modules directly. > Or the built-in __import__ could still be written in C, and only defer > to importlib when the module isn't found in sys.modules. > Practicality beats purity. I quite like the idea of having builtin __import__ be a *very* thin veneer around importlib that just does the "is this in sys.modules already so we can just return it from there?" checks and delegates other more complex cases to Python code in importlib. Poking around in importlib.__import__ [1] (as well as importlib._gcd_import), I'm thinking what we may want to do is break up the logic a bit so that there are multiple helper functions that a C version can call back into so that we can optimise certain simple code paths to not call back into Python at all, and others to only do so selectively. Step 1: separate out the "fromlist" processing from __import__ into a separate helper function def _process_fromlist(module, fromlist): # Perform any required imports as per existing code: # http://hg.python.org/cpython/file/aba513307f78/Lib/importlib/_bootstrap.py#l987 Step 2: separate out the relative import resolution from _gcd_import into a separate helper function. 
def _resolve_relative_name(name, package, level): assert hasattr(name, 'rpartition') assert hasattr(package, 'rpartition') assert level > 0 name = # Recalculate as per the existing code: # http://hg.python.org/cpython/file/aba513307f78/Lib/importlib/_bootstrap.py#l889 return name Step 3: Implement builtin __import__ in C (pseudo-code below): def __import__(name, globals={}, locals={}, fromlist=[], level=0): if level > 0: name = importlib._resolve_relative_name(name, globals.get('__package__'), level) try: module = sys.modules[name] except KeyError: # Not cached yet, need to invoke the full import machinery # We already resolved any relative imports though, so # treat it as an absolute import return importlib.__import__(name, globals, locals, fromlist, 0) # Got a hit in the cache, see if there's any more work to do if not fromlist: # Duplicate relevant importlib.__import__ logic as C code # to find the right module to return from sys.modules elif hasattr(module, "__path__"): importlib._process_fromlist(module, fromlist) return module This would then be similar to the way main.c already works when it interacts with runpy - simple cases are handled directly in C, more complex cases get handed over to the Python module. Cheers, Nick. [1] http://hg.python.org/cpython/file/default/Lib/importlib/_bootstrap.py#l950 -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From pje at telecommunity.com Thu Feb 9 02:28:56 2012 From: pje at telecommunity.com (PJ Eby) Date: Wed, 8 Feb 2012 20:28:56 -0500 Subject: [Python-Dev] requirements for moving __import__ over to importlib? In-Reply-To: References: Message-ID: On Wed, Feb 8, 2012 at 4:08 PM, Brett Cannon wrote: > > On Wed, Feb 8, 2012 at 15:31, Terry Reedy wrote: > >> For top-level imports, unless *all* are made lazy, then there *must* be >> some indication in the code of whether to make it lazy or not. >> > > Not true; importlib would make it dead-simple to whitelist what modules to > make lazy (e.g. your app code lazy but all stdlib stuff not, etc.). > There are actually only a few things stopping all imports from being lazy. "from x import y" immediately de-lazies them, after all. ;-) The main two reasons you wouldn't want imports to *always* be lazy are: 1. Changing sys.path or other parameters between the import statement and the actual import 2. ImportErrors are likewise deferred until point-of-use, so conditional importing with try/except would break. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Thu Feb 9 02:43:02 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 9 Feb 2012 11:43:02 +1000 Subject: [Python-Dev] requirements for moving __import__ over to importlib? In-Reply-To: References: Message-ID: On Thu, Feb 9, 2012 at 11:28 AM, PJ Eby wrote: > The main two reasons you wouldn't want imports to *always* be lazy are: > > 1. Changing sys.path or other parameters between the import statement and > the actual import > 2. ImportErrors are likewise deferred until point-of-use, so conditional > importing with try/except would break. 3. Module level code may have non-local side effects (e.g. installing codecs, pickle handlers, atexit handlers) A white-listing based approach to lazy imports would let you manage all those issues without having to change all the code that actually *does* the imports. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | 
Brisbane, Australia From ncoghlan at gmail.com Thu Feb 9 05:48:22 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 9 Feb 2012 14:48:22 +1000 Subject: [Python-Dev] [Python-checkins] cpython: PEP 410 In-Reply-To: References: Message-ID: On Thu, Feb 9, 2012 at 7:52 AM, victor.stinner wrote: > http://hg.python.org/cpython/rev/f8409b3d6449 > changeset: 74832:f8409b3d6449 > user: Victor Stinner > date: Wed Feb 08 14:31:50 2012 +0100 > summary: > PEP 410 Ah, even when written by a core dev, a PEP should still be at Accepted before we check anything in. PEP 410 is still at Draft. Did Guido accept this one by private email? (He never made me his delegate, and without that, my agreement doesn't count as acceptance of the PEP). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Thu Feb 9 05:49:22 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 9 Feb 2012 14:49:22 +1000 Subject: [Python-Dev] [Python-checkins] cpython: PEP 410 In-Reply-To: References: Message-ID: On Thu, Feb 9, 2012 at 2:48 PM, Nick Coghlan wrote: > On Thu, Feb 9, 2012 at 7:52 AM, victor.stinner > wrote: >> http://hg.python.org/cpython/rev/f8409b3d6449 >> changeset: 74832:f8409b3d6449 >> user: Victor Stinner >> date: Wed Feb 08 14:31:50 2012 +0100 >> summary: >> PEP 410 > > Ah, even when written by a core dev, a PEP should still be at Accepted > before we check anything in. PEP 410 is still at Draft. Never mind, I just saw the checkin that reverted the change. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From hodgestar+pythondev at gmail.com Thu Feb 9 07:43:02 2012 From: hodgestar+pythondev at gmail.com (Simon Cross) Date: Thu, 9 Feb 2012 08:43:02 +0200 Subject: [Python-Dev] Add a new "locale" codec? In-Reply-To: <4F3314D5.8090907@pearwood.info> References: <4F3314D5.8090907@pearwood.info> Message-ID: On Thu, Feb 9, 2012 at 2:35 AM, Steven D'Aprano wrote: > Simon Cross wrote: >> >> I think I'm -1 on a "locale" encoding because it refers to different >> actual encodings depending on where and when it's run, which seems >> surprising > > > Why is it surprising? Surely that's the whole point of a locale encoding: to > use the locale encoding, whatever that happens to be at the time. I think there's a general expectation that if you encode something with one codec you will be able to decode it with the same codec. That's not necessarily true for the locale encoding. From victor.stinner at haypocalc.com Thu Feb 9 10:30:11 2012 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Thu, 9 Feb 2012 10:30:11 +0100 Subject: [Python-Dev] Add a new "locale" codec? In-Reply-To: References: <4F3314D5.8090907@pearwood.info> Message-ID: > I think there's a general expectation that if you encode something > with one codec you will be able to decode it with the same codec. > That's not necessarily true for the locale encoding. There is the same problem with the filesystem encoding (sys.getfilesystemencoding()), which is the user locale encoding (LC_ALL, LANG or LC_CTYPE) or the Windows ANSI code page. If you wrote a file using this encoding, you may not be able to read it if the filesystem encoding changes between two runs, or on another computer. I agree that it is more surprising because the current locale encoding can change anytime, not only between two runs or when you use another computer. Don't you think that this special behaviour can be documented? 
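To illustrate, here is a quick sketch (assuming a system where both a UTF-8 and a latin-1 locale are installed; the exact locale names vary by platform) of how data can stop round-tripping when the locale changes underneath you:

import locale

locale.setlocale(locale.LC_CTYPE, '')        # adopt the user's locale
enc = locale.getpreferredencoding(False)     # e.g. 'UTF-8'
data = '\xe9'.encode(enc)                    # b'\xc3\xa9' under UTF-8

# The locale changes (forced explicitly here, but it can also happen
# behind your back, e.g. in an embedded interpreter or a C extension):
locale.setlocale(locale.LC_CTYPE, 'fr_FR.ISO8859-1')
enc = locale.getpreferredencoding(False)     # now 'ISO8859-1'
print(data.decode(enc))                      # 'Ã©' -- mojibake

Any data encoded with a "locale" codec would only decode correctly under the same locale settings, which is exactly the property that cannot be guaranteed.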
Victor From victor.stinner at haypocalc.com Thu Feb 9 10:32:29 2012 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Thu, 9 Feb 2012 10:32:29 +0100 Subject: [Python-Dev] [Python-checkins] cpython: PEP 410 In-Reply-To: References: Message-ID: >>> changeset: 74832:f8409b3d6449 >>> user: Victor Stinner >>> date: Wed Feb 08 14:31:50 2012 +0100 >>> summary: >>> PEP 410 >> >> Ah, even when written by a core dev, a PEP should still be at Accepted >> before we check anything in. PEP 410 is still at Draft. > > Never mind, I just saw the checkin that reverted the change. Yeah, I should use a clone of the repository instead of always working in the same repository. I pushed the commit by mistake. It is difficult to manipulate such a huge patch. I just created a clone on my computer to avoid similar mistakes :-) Victor From mark at hotpy.org Thu Feb 9 12:18:57 2012 From: mark at hotpy.org (Mark Shannon) Date: Thu, 09 Feb 2012 11:18:57 +0000 Subject: [Python-Dev] A new dictionary implementation In-Reply-To: <4F3300BC.5080201@email.de> References: <4F252014.3080900@hotpy.org> <20120129160841.2343b62f@pitrou.net> <4F256EDC.70707@hotpy.org> <4F25D686.9070907@pearwood.info> <4F2AE13C.6010900@hotpy.org> <4F3291C9.9070305@hotpy.org> <4F3300BC.5080201@email.de> Message-ID: <4F33ABA1.6020601@hotpy.org> francis wrote: > Hi Mark, > I've just cloned: >> >> Repository: https://bitbucket.org/markshannon/cpython_new_dict > .... >> Do please try it on your machine(s). > that's a: > Linux random 3.1.0-1-amd64 #1 SMP Tue Jan 10 05:01:58 UTC 2012 x86_64 > GNU/Linux > > > and I'm getting: > > gcc -pthread -c -Wno-unused-result -g -O0 -Wall -Wstrict-prototypes -I. > -I./Include -DPy_BUILD_CORE -o Objects/dictobject.o Objects/dictobject.c > gcc -pthread -c -Wno-unused-result -g -O0 -Wall -Wstrict-prototypes -I. > -I./Include -DPy_BUILD_CORE -o Objects/memoryobject.o > Objects/memoryobject.c > Objects/dictobject.c: In function 'dict_popitem': > Objects/dictobject.c:2208:5: error: 'PyDictKeyEntry' has no member named > 'me_value' > make: *** [Objects/dictobject.o] Error 1 > make: *** Waiting for unfinished jobs.... Bah... typo in assert statement. My fault for not testing the debug build (release build worked fine). Both builds working now. Cheers, Mark. From fuzzyman at voidspace.org.uk Thu Feb 9 12:51:31 2012 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Thu, 09 Feb 2012 11:51:31 +0000 Subject: [Python-Dev] A new dictionary implementation In-Reply-To: <4F3291C9.9070305@hotpy.org> References: <4F252014.3080900@hotpy.org> <20120129160841.2343b62f@pitrou.net> <4F256EDC.70707@hotpy.org> <4F25D686.9070907@pearwood.info> <4F2AE13C.6010900@hotpy.org> <4F3291C9.9070305@hotpy.org> Message-ID: <4F33B343.1050801@voidspace.org.uk> On 08/02/2012 15:16, Mark Shannon wrote: > Hi, > > Version 2 is now available. > > Version 2 makes as few changes to tunable constants as possible, and > generally does not change iteration order (so repr() is unchanged). > All tests pass (the only changes to tests are for sys.getsizeof() ). > > Repository: https://bitbucket.org/markshannon/cpython_new_dict > Issue http://bugs.python.org/issue13903 > > Performance changes are basically zero for non-OO code. > Average -0.5% speed change on 2n3 benchmarks, a few benchmarks show > a small reduction in memory use. (see notes below) > > GCbench uses 47% less memory and is 12% faster. > 2to3, which seems to be the only "realistic" benchmark that runs on Py3, > shows no change in speed and uses 10% less memory. 
In your first version 2to3 used 28% less memory. Do you know why it's worse in this version? Michael > > All benchmarks and tests performed on old, slow 32bit machine > with linux. > Do please try it on your machine(s). > > If accepted, the new dict implementation will allow a useful > optimisation of the LOAD_GLOBAL (and possibly LOAD_ATTR) bytecode: > By testing to see if the (immutable) keys-table is the expected table, > the value can be accessed directly by index, rather than by name. > > Cheers, > Mark. > > > Notes: > All benchmarks from http://hg.python.org/benchmarks/ > using the -m flag to get memory usage data. > > I've ignored the json benchmarks which show unstable behaviour > on my machine. > Tiny changes to the dict being serialized or to the random seed can > change the relative speed of my implementation vs CPython from -25% to > +10%. > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > http://mail.python.org/mailman/options/python-dev/fuzzyman%40voidspace.org.uk > -- http://www.voidspace.org.uk/ May you do good and not evil May you find forgiveness for yourself and forgive others May you share freely, never taking more than you give. -- the sqlite blessing http://www.sqlite.org/different.html From ncoghlan at gmail.com Thu Feb 9 13:40:41 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 9 Feb 2012 22:40:41 +1000 Subject: [Python-Dev] [Python-checkins] cpython: PEP 410 In-Reply-To: References: Message-ID: On Thu, Feb 9, 2012 at 7:32 PM, Victor Stinner wrote: >>>> changeset: 74832:f8409b3d6449 >>>> user: Victor Stinner >>>> date: Wed Feb 08 14:31:50 2012 +0100 >>>> summary: >>>> PEP 410 >>> >>> Ah, even when written by a core dev, a PEP should still be at Accepted >>> before we check anything in. PEP 410 is still at Draft. >> >> Never mind, I just saw the checkin that reverted the change. > > Yeah, I should use a clone of the repository instead of always working > in the same repository. I pushed the commit by mistake. It is > difficult to manipulate such a huge patch. I just created a clone on my > computer to avoid similar mistakes :-) I maintain a separate sandbox clone for the same reason. I think I'm finally starting to get the hang of the mq extension for working with smaller "not yet ready" changes, too. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From solipsis at pitrou.net Thu Feb 9 13:42:54 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 9 Feb 2012 13:42:54 +0100 Subject: [Python-Dev] Add a new "locale" codec? References: <4F3314D5.8090907@pearwood.info> Message-ID: <20120209134254.6a7cb62c@pitrou.net> On Thu, 9 Feb 2012 08:43:02 +0200 Simon Cross wrote: > On Thu, Feb 9, 2012 at 2:35 AM, Steven D'Aprano wrote: > > Simon Cross wrote: > >> > >> I think I'm -1 on a "locale" encoding because it refers to different > >> actual encodings depending on where and when it's run, which seems > >> surprising > > > > > > Why is it surprising? Surely that's the whole point of a locale encoding: to > > use the locale encoding, whatever that happens to be at the time. > > I think there's a general expectation that if you encode something > with one codec you will be able to decode it with the same codec. > That's not necessarily true for the locale encoding. As And pointed out, this is already the behaviour of the "mbcs" codec under Windows. 
"locale" would be the moral (*) equivalent of that under Unix. (*) or perhaps immoral :-) Regards Antoine. From amauryfa at gmail.com Thu Feb 9 13:55:17 2012 From: amauryfa at gmail.com (Amaury Forgeot d'Arc) Date: Thu, 9 Feb 2012 13:55:17 +0100 Subject: [Python-Dev] Add a new "locale" codec? In-Reply-To: <20120209134254.6a7cb62c@pitrou.net> References: <4F3314D5.8090907@pearwood.info> <20120209134254.6a7cb62c@pitrou.net> Message-ID: 2012/2/9 Antoine Pitrou > > I think there's a general expectation that if you encode something > > with one codec you will be able to decode it with the same codec. > > That's not necessarily true for the locale encoding. > > As And pointed out, this is already the behaviour of the "mbcs" codec > under Windows. "locale" would be the moral (*) equivalent of that under > Unix. With the difference that mbcs cannot change during execution. I don't even know if it is possible to change it at all, except by reinstalling Windows. -- Amaury Forgeot d'Arc -------------- next part -------------- An HTML attachment was scrubbed... URL: From victor.stinner at haypocalc.com Thu Feb 9 14:14:07 2012 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Thu, 9 Feb 2012 14:14:07 +0100 Subject: [Python-Dev] Add a new "locale" codec? In-Reply-To: References: <4F3314D5.8090907@pearwood.info> <20120209134254.6a7cb62c@pitrou.net> Message-ID: > With the difference that mbcs cannot change during execution. It is possible to change the "thread ANSI code page" (CP_THREAD_ACP) at runtime, but setting the system ANSI code page (CP_ACP) requires to restart Windows. > I don't even know if it is possible to change it at all, except by > reinstalling Windows. The system ANSI code page can be set in the regional dialog of the control panel. If I remember correctly, it is badly called the "language". Victor From victor.stinner at haypocalc.com Thu Feb 9 14:16:15 2012 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Thu, 9 Feb 2012 14:16:15 +0100 Subject: [Python-Dev] Add a new "locale" codec? In-Reply-To: <20120209134254.6a7cb62c@pitrou.net> References: <4F3314D5.8090907@pearwood.info> <20120209134254.6a7cb62c@pitrou.net> Message-ID: > As And pointed out, this is already the behaviour of the "mbcs" codec > under Windows. "locale" would be the moral (*) equivalent of that under > Unix. On Windows, the ANSI code page codec will be accessible using 3 different names: "locale", "mbcs" and the real encoding name (sys.getfilesystemencoding())! Victor From victor.stinner at haypocalc.com Thu Feb 9 14:41:34 2012 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Thu, 9 Feb 2012 14:41:34 +0100 Subject: [Python-Dev] patch Message-ID: -------------- next part -------------- A non-text attachment was scrubbed... Name: patch Type: application/octet-stream Size: 24118 bytes Desc: not available URL: From brett at python.org Thu Feb 9 15:58:40 2012 From: brett at python.org (Brett Cannon) Date: Thu, 9 Feb 2012 09:58:40 -0500 Subject: [Python-Dev] requirements for moving __import__ over to importlib? In-Reply-To: References: Message-ID: On Wed, Feb 8, 2012 at 20:28, PJ Eby wrote: > > > On Wed, Feb 8, 2012 at 4:08 PM, Brett Cannon wrote: > >> >> On Wed, Feb 8, 2012 at 15:31, Terry Reedy wrote: >> >>> For top-level imports, unless *all* are made lazy, then there *must* be >>> some indication in the code of whether to make it lazy or not. >>> >> >> Not true; importlib would make it dead-simple to whitelist what modules >> to make lazy (e.g. 
your app code lazy but all stdlib stuff not, etc.). >> > > There are actually only a few things stopping all imports from being lazy. > "from x import y" immediately de-lazies them, after all. ;-) > > The main two reasons you wouldn't want imports to *always* be lazy are: > > 1. Changing sys.path or other parameters between the import statement and > the actual import > 2. ImportErrors are likewise deferred until point-of-use, so conditional > importing with try/except would break. > This actually depends on the type of ImportError. My current solution actually would trigger an ImportError at the import statement if no finder could locate the module. But if some ImportError was raised because of some other issue during load then that would come up at first use. -------------- next part -------------- An HTML attachment was scrubbed... URL: From stephen at xemacs.org Thu Feb 9 15:59:48 2012 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Thu, 09 Feb 2012 23:59:48 +0900 Subject: [Python-Dev] Add a new "locale" codec? In-Reply-To: References: <4F3314D5.8090907@pearwood.info> Message-ID: <87aa4sxdaz.fsf@uwakimon.sk.tsukuba.ac.jp> Victor Stinner writes: > There is the same problem [that encode-decode with the 'locale' > codec doesn't roundtrip reliably] with the filesystem encoding > (sys.getfilesystemencoding()), -1 on a query to the OS that pretends to be a constant. You see, it's not the same problem. The difference is that 'locale' is a constant and should correspond to a constant encoding, while 'sys.getfilesystemencoding()' is a library function that queries the system, and it's obvious from the syntax that this is expected to change in various circumstances, so if you want roundtripping you need to save the result. Having a nondeterministic "locale" codec is just begging application (and maybe a few middleware) programmers to use it everywhere they don't feel like thinking about I18N. Experience shows that that is everywhere! If this is needed, it should be spelled "os.getlocaleencoding()" (or "sys.getlocaleencoding()"?) Possibly there should be corresponding getlocalelanguage(), getlocaleregion(), and getlocalemodifier() functions, and they should take an optional string argument whose appropriate component is returned. Or maybe there should be a "parselocalestring()" function that returns a named tuple. Or maybe this three-line function doesn't need to be a builtin? From brett at python.org Thu Feb 9 16:05:22 2012 From: brett at python.org (Brett Cannon) Date: Thu, 9 Feb 2012 10:05:22 -0500 Subject: [Python-Dev] requirements for moving __import__ over to importlib? In-Reply-To: References: <20120207234224.1ae8602e@pitrou.net> <1328717357.3387.22.camel@localhost.localdomain> Message-ID: On Wed, Feb 8, 2012 at 20:26, Nick Coghlan wrote: > On Thu, Feb 9, 2012 at 2:09 AM, Antoine Pitrou > wrote: > > I guess my point was: why is there a function call in that case? The > > "import" statement could look up sys.modules directly. > > Or the built-in __import__ could still be written in C, and only defer > > to importlib when the module isn't found in sys.modules. > > Practicality beats purity. > > I quite like the idea of having builtin __import__ be a *very* thin > veneer around importlib that just does the "is this in sys.modules > already so we can just return it from there?" checks and delegates > other more complex cases to Python code in importlib. > > Poking around in importlib.__import__ [1] (as well as > importlib._gcd_import), I'm thinking what we may want to do is break > up the logic a bit so that there are multiple helper functions that a > C version can call back into so that we can optimise certain simple > code paths to not call back into Python at all, and others to only do > so selectively. > > Step 1: separate out the "fromlist" processing from __import__ into a > separate helper function > > def _process_fromlist(module, fromlist): > # Perform any required imports as per existing code: > # > http://hg.python.org/cpython/file/aba513307f78/Lib/importlib/_bootstrap.py#l987 > > Fine by me. 
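(Roughly the shape I'd expect that helper to take, by the way -- an untested sketch that ignores the '*'/__all__ expansion the real code also has to handle:)

def _process_fromlist(module, fromlist):
    # Submodule imports only make sense for packages.
    if not hasattr(module, '__path__'):
        return module
    for name in fromlist:
        if not hasattr(module, name):
            try:
                # Importing the submodule binds it as an attribute
                # of the package as a side effect.
                __import__('{}.{}'.format(module.__name__, name))
            except ImportError:
                # Let the "from x import y" binding raise instead.
                pass
    return module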
> > Poking around in importlib.__import__ [1] (as well as > importlib._gcd_import), I'm thinking what we may want to do is break > up the logic a bit so that there are multiple helper functions that a > C version can call back into so that we can optimise certain simple > code paths to not call back into Python at all, and others to only do > so selectively. > > Step 1: separate out the "fromlist" processing from __import__ into a > separate helper function > > def _process_fromlist(module, fromlist): > # Perform any required imports as per existing code: > # > http://hg.python.org/cpython/file/aba513307f78/Lib/importlib/_bootstrap.py#l987 > > Fine by me. > > Step 2: separate out the relative import resolution from _gcd_import > into a separate helper function. > > def _resolve_relative_name(name, package, level): > assert hasattr(name, 'rpartition') > assert hasattr(package, 'rpartition') > assert level > 0 > name = # Recalculate as per the existing code: > # > http://hg.python.org/cpython/file/aba513307f78/Lib/importlib/_bootstrap.py#l889 > return name > I was actually already thinking of exposing this as importlib.resolve_name() so breaking it out makes sense. I also think it might be possible to expose a sort of importlib.find_module() that does nothing more than find the loader for a module (if available). > > Step 3: Implement builtin __import__ in C (pseudo-code below): > > def __import__(name, globals={}, locals={}, fromlist=[], level=0): > if level > 0: > name = importlib._resolve_relative_import(name) > try: > module = sys.modules[name] > except KeyError: > # Not cached yet, need to invoke the full import machinery > # We already resolved any relative imports though, so > # treat it as an absolute import > return importlib.__import__(name, globals, locals, fromlist, 0) > # Got a hit in the cache, see if there's any more work to do > if not fromlist: > # Duplicate relevant importlib.__import__ logic as C code > # to find the right module to return from sys.modules > elif hasattr(module, "__path__"): > importlib._process_fromlist(module, fromlist) > return module > > This would then be similar to the way main.c already works when it > interacts with runpy - simple cases are handled directly in C, more > complex cases get handed over to the Python module. > I suspect that if people want the case where you load from bytecode is fast then this will have to expand beyond this to include C functions and/or classes which can be used as accelerators; while this accelerates the common case of sys.modules, this (probably) won't make Antoine happy enough for importing a small module from bytecode (importing large modules like decimal are already fast enough). -------------- next part -------------- An HTML attachment was scrubbed... URL: From pje at telecommunity.com Thu Feb 9 19:43:22 2012 From: pje at telecommunity.com (PJ Eby) Date: Thu, 9 Feb 2012 13:43:22 -0500 Subject: [Python-Dev] requirements for moving __import__ over to importlib? Message-ID: On Feb 9, 2012 9:58 AM, "Brett Cannon" wrote: > This actually depends on the type of ImportError. My current solution actually would trigger an ImportError at the import statement if no finder could locate the module. But if some ImportError was raised because of some other issue during load then that would come up at first use. That's not really a lazy import then, or at least not as lazy as what Mercurial or PEAK use for general lazy importing. If you have a lot of them, that module-finding time really adds up. 
Again, the goal is fast startup of command-line tools that only use a small subset of the overall framework; doing disk access for lazy imports goes against that goal. -------------- next part -------------- An HTML attachment was scrubbed... URL: From florent.xicluna at gmail.com Thu Feb 9 19:49:16 2012 From: florent.xicluna at gmail.com (Florent) Date: Thu, 9 Feb 2012 19:49:16 +0100 Subject: [Python-Dev] folding cElementTree behind ElementTree in 3.3 In-Reply-To: References: <20120208130419.3ae6bbae@pitrou.net> Message-ID: 2012/2/8 Nick Coghlan > On Wed, Feb 8, 2012 at 10:04 PM, Antoine Pitrou > wrote: > > It's not frozen, it's actually maintained. > > Indeed, it sounds like the most appropriate course (if we don't hear > otherwise from Fredrik) may be to just update PEP 360 to acknowledge > current reality (i.e. the most current release of ElementTree is > actually the one maintained by Florent in the stdlib). > Actually, it was part of my learning curve in Python development, as you can see on the thread of the issue http://bugs.python.org/issue6472 . I spent some time between December 2009 and March 2010 to merge the "experimental" 1.3 in the standard library, both for 2.7 and 3.2. Upstream, there were 2 different test suites for the Python and the C implementation, but I merged them in a single test suite, and I've patched the C accelerator to conform to the same behaviour as the Python reference module. With the knowledge I acquired, I chased some other bugs related to ElementTree at the same time. With the feedback and some support coming from Antoine, Fredrik and Stefan we shaped a decent ElementTree 1.3 for the standard library. I am not aware of any effort to maintain the ElementTree package outside of the standard library since I did this merge. So, in the current state, we could consider the standard library package as the most up to date and stable version of ElementTree. I concur with Eli's proposal to set the C accelerator as default: the test suite ensures that both implementations behave the same. 
-- Florent Xicluna -------------- next part -------------- An HTML attachment was scrubbed... URL: From brett at python.org Thu Feb 9 20:19:59 2012 From: brett at python.org (Brett Cannon) Date: Thu, 9 Feb 2012 14:19:59 -0500 Subject: [Python-Dev] requirements for moving __import__ over to importlib? In-Reply-To: References: Message-ID: On Thu, Feb 9, 2012 at 13:43, PJ Eby wrote: > > On Feb 9, 2012 9:58 AM, "Brett Cannon" wrote: > > This actually depends on the type of ImportError. My current solution > actually would trigger an ImportError at the import statement if no finder > could locate the module. But if some ImportError was raised because of some > other issue during load then that would come up at first use. > > That's not really a lazy import then, or at least not as lazy as what > Mercurial or PEAK use for general lazy importing. If you have a lot of > them, that module-finding time really adds up. > > Again, the goal is fast startup of command-line tools that only use a > small subset of the overall framework; doing disk access for lazy imports > goes against that goal. > Depends if you consider stat calls the overhead vs. the actual disk read/write to load the data. Anyway, this is going to lead down to a discussion/argument over design parameters which I'm not up to having since I'm not actively working on a lazy loader for the stdlib right now. -------------- next part -------------- An HTML attachment was scrubbed... URL: From francismb at email.de Thu Feb 9 20:47:50 2012 From: francismb at email.de (francis) Date: Thu, 09 Feb 2012 20:47:50 +0100 Subject: [Python-Dev] A new dictionary implementation In-Reply-To: <4F33ABA1.6020601@hotpy.org> References: <4F252014.3080900@hotpy.org> <20120129160841.2343b62f@pitrou.net> <4F256EDC.70707@hotpy.org> <4F25D686.9070907@pearwood.info> <4F2AE13C.6010900@hotpy.org> <4F3291C9.9070305@hotpy.org> <4F3300BC.5080201@email.de> <4F33ABA1.6020601@hotpy.org> Message-ID: <4F3422E6.6050309@email.de> Hi Mark, > Bah... typo in assert statement. > My fault for not testing the debug build (release build worked fine). > Both builds working now. Yeah, now is working and passes all tests also on my machine. I've tried to run the test suite but I'm getting a SyntaxError: (may be you know it's just the first time that I try the tool): ============================= ci at random:~/prog/cpython/benchmarks$ python perf.py -r -b apps python ../cpython_new_dict/python Running 2to3... INFO:root:Running ../cpython_new_dict/python lib/2to3/2to3 -f all lib/2to3_data Traceback (most recent call last): File "perf.py", line 2236, in main(sys.argv[1:]) File "perf.py", line 2192, in main options))) File "perf.py", line 1279, in BM_2to3 return SimpleBenchmark(Measure2to3, *args, **kwargs) File "perf.py", line 706, in SimpleBenchmark *args, **kwargs) File "perf.py", line 1275, in Measure2to3 return MeasureCommand(command, trials, env, options.track_memory) File "perf.py", line 1223, in MeasureCommand CallAndCaptureOutput(command, env=env) File "perf.py", line 1053, in CallAndCaptureOutput raise RuntimeError(u"Benchmark died: " + unicode(stderr, 'ascii')) RuntimeError: Benchmark died: Traceback (most recent call last): File "lib/2to3/2to3", line 3, in from lib2to3.main import main File "/home/ci/prog/cpython/benchmarks/lib/2to3/lib2to3/main.py", line 47 except os.error, err: ^ SyntaxError: invalid syntax ============================= And the baseline is: Python 2.7.2+ (but it also gives me an SyntaxError running on python3 default (e50db1b7ad7b) What I'm doing wrong ? 
(from its doc: "This project is intended to be an authoritative source of benchmarks for all Python implementations.") Thanks in advance! francis From rowen at uw.edu Thu Feb 9 20:52:29 2012 From: rowen at uw.edu (Russell E. Owen) Date: Thu, 09 Feb 2012 11:52:29 -0800 Subject: [Python-Dev] peps: Update with bugfix releases. References: <20120205204551.Horde.NCdeYVNNcXdPLtxvnkzi1lA@webmail.df.eu> <4F32DF1E.40205@v.loewis.de> Message-ID: In article <4F32DF1E.40205 at v.loewis.de>, "Martin v. Löwis" wrote: > On 05.02.2012 21:34, Ned Deily wrote: > > In article > > <20120205204551.Horde.NCdeYVNNcXdPLtxvnkzi1lA at webmail.df.eu>, > > martin at v.loewis.de wrote: > > > >>> I understand that but, to me, it makes no sense to send out truly > >>> broken releases. Besides, the hash collision attack is not exactly > >>> new either. Another few weeks can't make that much of a difference. > >> > >> Why would the release be truly broken? It surely can't be worse than > >> the current releases (which apparently aren't truly broken, else > >> there would have been no point in releasing them back then). > > > > They were broken by the release of OS X 10.7 and Xcode 4.2 which were > > subsequent to the previous releases. None of the currently available > > python.org installers provide a fully working system on OS X 10.7, or on > > OS X 10.6 if the user has installed Xcode 4.2 for 10.6. > > In what way are the current releases not fully working? Are you > referring to issues with building extension modules? One problem I've run into is that the 64-bit Mac python 2.7 does not work properly with ActiveState Tcl/Tk. One symptom shows up when building matplotlib: the resulting build fails -- both versions of Tcl/Tk somehow get linked in. We have had similar problems with the 32-bit python.org python in the past, but recent builds have been fine. I believe the solution that worked for the 32-bit versions was to install ActiveState Tcl/Tk before making the distribution build. The results would work fine with Apple's Tcl/Tk or with ActiveState Tcl/Tk. I don't know if the same solution would work for 64-bit python. I don't know of any issues with the 32-bit build of Python 2.7. I've not tried the Python 3 builds. -- Russell From mwm at mired.org Thu Feb 9 20:53:02 2012 From: mwm at mired.org (Mike Meyer) Date: Thu, 9 Feb 2012 11:53:02 -0800 Subject: [Python-Dev] requirements for moving __import__ over to importlib? In-Reply-To: References: Message-ID: <20120209115302.2ce94321@bhuda.mired.org> On Thu, 9 Feb 2012 14:19:59 -0500 Brett Cannon wrote: > On Thu, Feb 9, 2012 at 13:43, PJ Eby wrote: > > Again, the goal is fast startup of command-line tools that only use a > > small subset of the overall framework; doing disk access for lazy imports > > goes against that goal. > > > Depends if you consider stat calls the overhead vs. the actual disk > read/write to load the data. Anyway, this is going to lead down to a > discussion/argument over design parameters which I'm not up to having since > I'm not actively working on a lazy loader for the stdlib right now. For those of you not watching -ideas, or ignoring the "Python TIOBE -3%" discussion, this would seem to be relevant to any discussion of reworking the import mechanism: http://mail.scipy.org/pipermail/numpy-discussion/2012-January/059801.html http://www.mired.org/ Independent Software developer/SCM consultant, email for more information. 
O< ascii ribbon campaign - stop html mail - www.asciiribbon.org From v+python at g.nevcal.com Thu Feb 9 21:27:27 2012 From: v+python at g.nevcal.com (Glenn Linderman) Date: Thu, 09 Feb 2012 12:27:27 -0800 Subject: [Python-Dev] requirements for moving __import__ over to importlib? In-Reply-To: <20120209115302.2ce94321@bhuda.mired.org> References: <20120209115302.2ce94321@bhuda.mired.org> Message-ID: <4F342C2F.1040004@g.nevcal.com> On 2/9/2012 11:53 AM, Mike Meyer wrote: > On Thu, 9 Feb 2012 14:19:59 -0500 > Brett Cannon wrote: >> On Thu, Feb 9, 2012 at 13:43, PJ Eby wrote: >>> Again, the goal is fast startup of command-line tools that only use a >>> small subset of the overall framework; doing disk access for lazy imports >>> goes against that goal. >>> >> Depends if you consider stat calls the overhead vs. the actual disk >> read/write to load the data. Anyway, this is going to lead down to a >> discussion/argument over design parameters which I'm not up to having since >> I'm not actively working on a lazy loader for the stdlib right now. > For those of you not watching -ideas, or ignoring the "Python TIOBE > -3%" discussion, this would seem to be relevant to any discussion of > reworking the import mechanism: > > http://mail.scipy.org/pipermail/numpy-discussion/2012-January/059801.html > > From ncoghlan at gmail.com Thu Feb 9 21:43:24 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 10 Feb 2012 06:43:24 +1000 Subject: [Python-Dev] Add a new "locale" codec? In-Reply-To: <87aa4sxdaz.fsf@uwakimon.sk.tsukuba.ac.jp> References: <4F3314D5.8090907@pearwood.info> <87aa4sxdaz.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: On Fri, Feb 10, 2012 at 12:59 AM, Stephen J. Turnbull wrote: > If this is needed, it should be spelled "os.getlocaleencoding()" (or > "sys.getlocaleencoding()"?) Or locale.getpreferredencoding(), even ;) FWIW, I agree with Stephen on this one, but take that with the grain of salt that I could probably decode most of the strings I work with as ASCII without breaking anything. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From tjreedy at udel.edu Thu Feb 9 21:56:11 2012 From: tjreedy at udel.edu (Terry Reedy) Date: Thu, 09 Feb 2012 15:56:11 -0500 Subject: [Python-Dev] requirements for moving __import__ over to importlib? In-Reply-To: <4F342C2F.1040004@g.nevcal.com> References: <20120209115302.2ce94321@bhuda.mired.org> <4F342C2F.1040004@g.nevcal.com> Message-ID: On 2/9/2012 3:27 PM, Glenn Linderman wrote: > On 2/9/2012 11:53 AM, Mike Meyer wrote: >> On Thu, 9 Feb 2012 14:19:59 -0500 >> Brett Cannon wrote: >>> On Thu, Feb 9, 2012 at 13:43, PJ Eby wrote: >>>> Again, the goal is fast startup of command-line tools that only use a >>>> small subset of the overall framework; doing disk access for lazy imports >>>> goes against that goal. >>>> >>> Depends if you consider stat calls the overhead vs. the actual disk >>> read/write to load the data. Anyway, this is going to lead down to a >>> discussion/argument over design parameters which I'm not up to having since >>> I'm not actively working on a lazy loader for the stdlib right now. 
>> For those of you not watching -ideas, or ignoring the "Python TIOBE >> -3%" discussion, this would seem to be relevant to any discussion of >> reworking the import mechanism: >> >> http://mail.scipy.org/pipermail/numpy-discussion/2012-January/059801.html "For 32k processes on BlueGene/P, importing 100 trivial C-extension modules takes 5.5 hours, compared to 35 minutes for all other interpreter loading and initialization. We developed a simple pure-Python module (based on knee.py, a hierarchical import example) that cuts the import time from 5.5 hours to 6 minutes." > So what is the implication here? That building a cache of module > locations (cleared when a new module is installed) would be more > effective than optimizing the search for modules on every invocation of > Python? -- Terry Jan Reedy From nad at acm.org Thu Feb 9 22:07:05 2012 From: nad at acm.org (Ned Deily) Date: Thu, 09 Feb 2012 22:07:05 +0100 Subject: [Python-Dev] peps: Update with bugfix releases. References: <20120205204551.Horde.NCdeYVNNcXdPLtxvnkzi1lA@webmail.df.eu> <4F32DF1E.40205@v.loewis.de> Message-ID: In article , "Russell E. Owen" wrote: > One problem I've run into is that the 64-bit Mac python 2.7 does not > work properly with ActiveState Tcl/Tk. One symptom shows up when building > matplotlib: the resulting build fails -- both versions of Tcl/Tk somehow get > linked in. The 64-bit OS X installer is built on and tested on systems with A/S Tcl/Tk 8.5.x and we explicitly recommend its use when possible. http://www.python.org/download/mac/tcltk/ Please open a python bug for this and any other issues you know of regarding the use of current A/S Tcl/Tk 8.5.x with current 2.7.x or 3.2.x installers on OS X 10.6 or 10.7. -- Ned Deily, nad at acm.org From victor.stinner at haypocalc.com Thu Feb 9 22:47:16 2012 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Thu, 9 Feb 2012 22:47:16 +0100 Subject: [Python-Dev] Add a new "locale" codec? In-Reply-To: <87aa4sxdaz.fsf@uwakimon.sk.tsukuba.ac.jp> References: <4F3314D5.8090907@pearwood.info> <87aa4sxdaz.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: > If this is needed, it should be spelled "os.getlocaleencoding()" (or > "sys.getlocaleencoding()"?) There is already a locale.getpreferredencoding(False) function which gives you the current locale encoding. The problem is that the current locale encoding may change and so you have to get the new value each time you would like to encode or decode data. Victor From pje at telecommunity.com Thu Feb 9 23:00:04 2012 From: pje at telecommunity.com (PJ Eby) Date: Thu, 9 Feb 2012 17:00:04 -0500 Subject: [Python-Dev] requirements for moving __import__ over to importlib? In-Reply-To: <20120209115302.2ce94321@bhuda.mired.org> References: <20120209115302.2ce94321@bhuda.mired.org> Message-ID: On Thu, Feb 9, 2012 at 2:53 PM, Mike Meyer wrote: > For those of you not watching -ideas, or ignoring the "Python TIOBE > -3%" discussion, this would seem to be relevant to any discussion of > reworking the import mechanism: > > http://mail.scipy.org/pipermail/numpy-discussion/2012-January/059801.html > > Interesting. This gives me an idea for a way to cut stat calls per sys.path entry per import by roughly 4x, at the cost of a one-time directory read per sys.path entry. That is, an importer created for a particular directory could, upon first use, cache a frozenset(listdir()), and the stat().st_mtime of the directory. 
All the filename checks could then be performed against the frozenset, and the st_mtime of the directory only checked once per import, to verify whether the frozenset() needed refreshing. Since a failed module lookup takes at least 5 stat checks (pyc, pyo, py, directory, and compiled extension (pyd/so)), this cuts it down to only 1, at the price of a listdir(). The big question is how long does a listdir() take, compared to a stat() or failed open()? That would tell us whether the tradeoff is worth making. I did some crude timeit tests on frozenset(listdir()) and trapping failed stat calls. It looks like, for a Windows directory the size of the 2.7 stdlib, you need about four *failed* import attempts to overcome the initial caching cost, or about 8 successful bytecode imports. (For Linux, you might need to double these numbers; my tests showed a different ratio there, perhaps due to the Linux stdlib I tested having nearly twice as many directory entries as the directory I tested on Windows!) However, the numbers are much better for application directories than for the stdlib, since they are located earlier on sys.path. Every successful stdlib import in an application is equal to one failed import attempt for every preceding directory on sys.path, so as long as the average directory on sys.path isn't vastly larger than the stdlib, and the average application imports at least four modules from the stdlib (on Windows, or 8 on Linux), there would be a net performance gain for the application as a whole. (That is, there'd be an improved per-sys.path entry import time for stdlib modules, even if not for any application modules.) For smaller directories, the tradeoff actually gets better. A directory one seventh the size of the 2.7 Windows stdlib has a listdir() that's proportionately faster, but failed stats() in that directory are *not* proportionately faster; they're only somewhat faster. This means that it takes fewer failed module lookups to make caching a win - about 2 in this case, vs. 4 for the stdlib. Now, these numbers are with actual disk or network access abstracted away, because the data's in the operating system cache when I run the tests. It's possible that this strategy could backfire if you used, say, an NFS directory with ten thousand files in it as your first sys.path entry. Without knowing the timings for listdir/stat/failed stat in that setup, it's hard to say how many stdlib imports you need before you come out ahead. When I tried a directory about 7 times larger than the stdlib, creating the frozenset took 10 times as long, but the cost of a failed stat didn't go up by very much. This suggests that there's probably an optimal directory size cutoff for this trick; if only there were some way to check the size of a directory without reading it, we could turn off the caching for oversize directories, and get a major speed boost for everything else. On most platforms, the stat().st_size of the directory itself will give you some idea, but on Windows that's always zero. On Windows, we could work around that by using a lower-level API than listdir() and simply stop reading the directory if we hit the maximum number of entries we're willing to build a cache for, and then call it off. (Another possibility would be to explicitly enable caching by putting a flag file in the directory, or perhaps by putting a special prefix on the sys.path entry, setting the cutoff in an environment variable, etc.)
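(To make the core trick concrete, here's an untested sketch of just the caching layer; a real version would live inside a PEP 302 path hook and hand back an actual loader:)

import os

class DirectoryCache:
    """One listdir() plus one stat() per lookup, instead of several
    failed stat()/open() probes per import attempt."""

    SUFFIXES = ('', '.py', '.pyc', '.pyo', '.pyd', '.so')  # '' = package dir

    def __init__(self, path):
        self.path = path
        self.mtime = -1
        self.entries = frozenset()

    def refresh(self):
        mtime = os.stat(self.path).st_mtime
        if mtime != self.mtime:          # directory changed on disk
            self.entries = frozenset(os.listdir(self.path))
            self.mtime = mtime

    def may_contain(self, modname):
        self.refresh()                   # the only syscall on this path
        base = modname.rsplit('.', 1)[-1]
        return any(base + suffix in self.entries for suffix in self.SUFFIXES)

Every failed filesystem probe becomes a frozenset membership test, which is where the rough 4x figure above comes from.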
In any case, this seems really worth a closer look: in non-pathological cases, it could make directory-based importing as fast as zip imports are. I'd be especially interested in knowing how the listdir/stat/failed stat ratios work on NFS - ISTM that they might be even *more* conducive to this approach, if setup latency dominates the cost of individual system calls. If this works out, it'd be a good example of why importlib is a good idea; i.e., allowing us to play with ideas like this. Brett, wouldn't you love to be able to say importlib is *faster* than the old C-based importing? ;-) -------------- next part -------------- An HTML attachment was scrubbed... URL: From solipsis at pitrou.net Thu Feb 9 23:15:30 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 9 Feb 2012 23:15:30 +0100 Subject: [Python-Dev] requirements for moving __import__ over to importlib? References: <20120209115302.2ce94321@bhuda.mired.org> Message-ID: <20120209231530.3a623057@pitrou.net> On Thu, 9 Feb 2012 17:00:04 -0500 PJ Eby wrote: > On Thu, Feb 9, 2012 at 2:53 PM, Mike Meyer wrote: > > > For those of you not watching -ideas, or ignoring the "Python TIOBE > > -3%" discussion, this would seem to be relevant to any discussion of > > reworking the import mechanism: > > > > http://mail.scipy.org/pipermail/numpy-discussion/2012-January/059801.html > > > > Interesting. This gives me an idea for a way to cut stat calls per > sys.path entry per import by roughly 4x, at the cost of a one-time > directory read per sys.path entry. Why do you even think this is a problem with "stat calls"? From robert.kern at gmail.com Thu Feb 9 23:34:25 2012 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 09 Feb 2012 22:34:25 +0000 Subject: [Python-Dev] requirements for moving __import__ over to importlib? In-Reply-To: <20120209231530.3a623057@pitrou.net> References: <20120209115302.2ce94321@bhuda.mired.org> <20120209231530.3a623057@pitrou.net> Message-ID: On 2/9/12 10:15 PM, Antoine Pitrou wrote: > On Thu, 9 Feb 2012 17:00:04 -0500 > PJ Eby wrote: >> On Thu, Feb 9, 2012 at 2:53 PM, Mike Meyer wrote: >> >>> For those of you not watching -ideas, or ignoring the "Python TIOBE >>> -3%" discussion, this would seem to be relevant to any discussion of >>> reworking the import mechanism: >>> >>> http://mail.scipy.org/pipermail/numpy-discussion/2012-January/059801.html >>> >>> Interesting. This gives me an idea for a way to cut stat calls per >> sys.path entry per import by roughly 4x, at the cost of a one-time >> directory read per sys.path entry. > > Why do you even think this is a problem with "stat calls"? All he said is that reading about that problem and its solution gave him an idea about dealing with stat call overhead. The cost of stat calls has demonstrated itself to be a significant problem in other, more typical contexts. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
-- Umberto Eco

From mark at hotpy.org Thu Feb 9 23:45:28 2012
From: mark at hotpy.org (Mark Shannon)
Date: Thu, 09 Feb 2012 22:45:28 +0000
Subject: [Python-Dev] A new dictionary implementation
In-Reply-To: <4F3422E6.6050309@email.de>
References: <4F252014.3080900@hotpy.org> <20120129160841.2343b62f@pitrou.net> <4F256EDC.70707@hotpy.org> <4F25D686.9070907@pearwood.info> <4F2AE13C.6010900@hotpy.org> <4F3291C9.9070305@hotpy.org> <4F3300BC.5080201@email.de> <4F33ABA1.6020601@hotpy.org> <4F3422E6.6050309@email.de>
Message-ID: <4F344C88.6090501@hotpy.org>

francis wrote:
> Hi Mark,
>> Bah... typo in assert statement. My fault for not testing the debug build (release build worked fine). Both builds working now.
> Yeah, now it's working and passes all tests also on my machine.
>
> I've tried to run the test suite but I'm getting a SyntaxError (maybe you know - it's just the first time that I try the tool):
>
> =============================
> ci@random:~/prog/cpython/benchmarks$ python perf.py -r -b apps python ../cpython_new_dict/python
> Running 2to3...
> INFO:root:Running ../cpython_new_dict/python lib/2to3/2to3 -f all lib/2to3_data
> Traceback (most recent call last):
>   File "perf.py", line 2236, in <module>
>     main(sys.argv[1:])
>   File "perf.py", line 2192, in main
>     options)))
>   File "perf.py", line 1279, in BM_2to3
>     return SimpleBenchmark(Measure2to3, *args, **kwargs)
>   File "perf.py", line 706, in SimpleBenchmark
>     *args, **kwargs)
>   File "perf.py", line 1275, in Measure2to3
>     return MeasureCommand(command, trials, env, options.track_memory)
>   File "perf.py", line 1223, in MeasureCommand
>     CallAndCaptureOutput(command, env=env)
>   File "perf.py", line 1053, in CallAndCaptureOutput
>     raise RuntimeError(u"Benchmark died: " + unicode(stderr, 'ascii'))
> RuntimeError: Benchmark died: Traceback (most recent call last):
>   File "lib/2to3/2to3", line 3, in <module>
>     from lib2to3.main import main
>   File "/home/ci/prog/cpython/benchmarks/lib/2to3/lib2to3/main.py", line 47
>     except os.error, err:
>                    ^
> SyntaxError: invalid syntax
> =============================
>
> And the baseline is Python 2.7.2+ (but it also gives me a SyntaxError running on python3 default (e50db1b7ad7b)).
>
> What am I doing wrong? (From its doc: "This project is intended to be an authoritative source of benchmarks for all Python implementations.")

You need to convert the benchmarks to Python 3 using 2to3. Instructions are in the make_perf3.sh file. You may need to manually fix up the output as well :(

Cheers,
Mark.

From pje at telecommunity.com Fri Feb 10 01:19:46 2012
From: pje at telecommunity.com (PJ Eby)
Date: Thu, 9 Feb 2012 19:19:46 -0500
Subject: [Python-Dev] requirements for moving __import__ over to importlib?
In-Reply-To: 
References: <20120209115302.2ce94321@bhuda.mired.org> <20120209231530.3a623057@pitrou.net>
Message-ID: 

On Thu, Feb 9, 2012 at 5:34 PM, Robert Kern wrote:
> On 2/9/12 10:15 PM, Antoine Pitrou wrote:
>> On Thu, 9 Feb 2012 17:00:04 -0500 PJ Eby wrote:
>>> On Thu, Feb 9, 2012 at 2:53 PM, Mike Meyer wrote:
>>>> For those of you not watching -ideas, or ignoring the "Python TIOBE -3%" discussion, this would seem to be relevant to any discussion of reworking the import mechanism:
>>>> http://mail.scipy.org/pipermail/numpy-discussion/2012-January/059801.html
>>>
>>> Interesting. This gives me an idea for a way to cut stat calls per sys.path entry per import by roughly 4x, at the cost of a one-time directory read per sys.path entry.
>>
>> Why do you even think this is a problem with "stat calls"?
>
> All he said is that reading about that problem and its solution gave him an idea about dealing with stat call overhead. The cost of stat calls has demonstrated itself to be a significant problem in other, more typical contexts.

Right. It was the part of the post that mentioned that all they sped up was knowing which directory the files were in, not the actual loading of bytecode. The thought then occurred to me that this could perhaps be applied to normal importing, as a zipimport-style speedup. (The zipimport module caches each zipfile directory it finds on sys.path, so failed import lookups are extremely fast.)

It occurs to me, too, that applying the caching trick to *only* the stdlib directories would still be a win as soon as you have between four and eight site-packages (or user-specific site-packages) imports in an application, so it might be worth applying unconditionally to system-defined stdlib (non-site) directories.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From ncoghlan at gmail.com Fri Feb 10 01:22:01 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Fri, 10 Feb 2012 10:22:01 +1000
Subject: [Python-Dev] requirements for moving __import__ over to importlib?
In-Reply-To: 
References: <20120207234224.1ae8602e@pitrou.net> <1328717357.3387.22.camel@localhost.localdomain>
Message-ID: 

On Fri, Feb 10, 2012 at 1:05 AM, Brett Cannon wrote:
>> This would then be similar to the way main.c already works when it interacts with runpy - simple cases are handled directly in C, more complex cases get handed over to the Python module.
>
> I suspect that if people want the case where you load from bytecode to be fast, then this will have to expand beyond this to include C functions and/or classes which can be used as accelerators; while this accelerates the common case of sys.modules, this (probably) won't make Antoine happy enough for importing a small module from bytecode (importing large modules like decimal is already fast enough).

No, my suggestion of keeping a de minimis C implementation for the builtin __import__ is purely about ensuring the case of repeated imports (especially those nested inside functions) remains as fast as it is today.

To speed up *first time* imports (regardless of their origin), I think it makes a lot more sense to use better algorithms at the importlib level, and that's much easier in Python than it is in C. It's not like we've ever been philosophically *opposed* to smarter approaches, it's just that import.c was already hairy enough and we had grave doubts about messing with it too much (I still have immense respect for the effort that Victor put in to sorting out most of its problems with Unicode handling). Not having that millstone hanging around our necks should open up *lots* of avenues for improvement without breaking backwards compatibility (since we can really do what we like, so long as the PEP 302 APIs are still invoked in the right order and the various public APIs remain backwards compatible).

Cheers,
Nick.

-- 
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From tjreedy at udel.edu Fri Feb 10 04:14:16 2012
From: tjreedy at udel.edu (Terry Reedy)
Date: Thu, 09 Feb 2012 22:14:16 -0500
Subject: [Python-Dev] requirements for moving __import__ over to importlib?
In-Reply-To: 
References: <20120209115302.2ce94321@bhuda.mired.org> <20120209231530.3a623057@pitrou.net>
Message-ID: 

On 2/9/2012 7:19 PM, PJ Eby wrote:
> Right. It was the part of the post that mentioned that all they sped up was knowing which directory the files were in, not the actual loading of bytecode. The thought then occurred to me that this could perhaps be applied to normal importing, as a zipimport-style speedup. (The zipimport module caches each zipfile directory it finds on sys.path, so failed import lookups are extremely fast.)
>
> It occurs to me, too, that applying the caching trick to *only* the stdlib directories would still be a win as soon as you have between four and eight site-packages (or user-specific site-packages) imports in an application, so it might be worth applying unconditionally to system-defined stdlib (non-site) directories.

It might be worthwhile to store a single file in the directory that contains /Lib with the info import needs to get files in /Lib and its subdirs, and check that it is not outdated relative to /Lib. Since in Python 3, .pyc files go in __pycache__, if /Lib included an empty __pycache__ on installation, /Lib would never be touched on most installations. Ditto for the non-__pycache__ subdirs.

-- 
Terry Jan Reedy

From eliben at gmail.com Fri Feb 10 04:51:51 2012
From: eliben at gmail.com (Eli Bendersky)
Date: Fri, 10 Feb 2012 05:51:51 +0200
Subject: [Python-Dev] folding cElementTree behind ElementTree in 3.3
In-Reply-To: 
References: <20120208130419.3ae6bbae@pitrou.net>
Message-ID: 

>> On Wed, Feb 8, 2012 at 10:04 PM, Antoine Pitrou wrote:
>>> It's not frozen, it's actually maintained.
>>
>> Indeed, it sounds like the most appropriate course (if we don't hear otherwise from Fredrik) may be to just update PEP 360 to acknowledge current reality (i.e. the most current release of ElementTree is actually the one maintained by Florent in the stdlib).
>
> Actually, it was part of my learning curve to the development of Python, as you can see on the thread of the issue http://bugs.python.org/issue6472 . I spent some time between December 2009 and March 2010 to merge the "experimental" 1.3 in the standard library, both for 2.7 and 3.2. Upstream, there were 2 different test suites for the Python and the C implementation, but I merged them in a single test suite, and I've patched the C accelerator to conform to the same behaviour as the Python reference module. With the knowledge I acquired, I chased some other bugs related to ElementTree at the same time. With the feedback and some support coming from Antoine, Fredrik and Stefan, we shaped a decent ElementTree 1.3 for the standard library.
>
> I am not aware of any effort to maintain the ElementTree package outside of the standard library since I did this merge. So, in the current state, we could consider the standard library package as the most up-to-date and stable version of ElementTree. I concur with Eli's proposal to set the C accelerator as default: the test suite ensures that both implementations behave the same.
>
> I cannot commit myself to the long-term maintenance of ElementTree in the standard library, both because I don't have a strong interest in XML parsing, and because I have many other projects which keep me away from core Python development for long periods of time.
>
> However, I think it is a good thing if all the packages which are part of the standard library follow the same rules.
> We should try to find an agreement with Fredrik, explicit or implicit, which delegates the evolution and the maintenance of ElementTree to the Python community.
> IIRC, we have other examples in the standard library where the community support helped a lot to refresh a package whose original maintainer did not have enough time to pursue the work.

Thanks for the input, Florent. So, to paraphrase, there already are code changes in the stdlib version of ET/cET which are not upstream. You made it explicit about the tests, so the question is only left for the modules themselves. Is that right?

Eli

From eliben at gmail.com Fri Feb 10 04:58:49 2012
From: eliben at gmail.com (Eli Bendersky)
Date: Fri, 10 Feb 2012 05:58:49 +0200
Subject: [Python-Dev] folding cElementTree behind ElementTree in 3.3
In-Reply-To: 
References: 
Message-ID: 

>> That said, I think that the particular change discussed in this thread can be made anyway, since it doesn't really modify ET's APIs or functionality, just the way it gets imported from stdlib.
>
> I would suggest that, assuming python-dev want to take ownership of the module, one last-ditch attempt be made to contact Fredrik. We should email him, and copy python-dev (and maybe even python-list), asking for his view, and ideally his blessing on the stdlib version being forked and maintained independently going forward. Put a time limit on responses ("if we don't hear by XXX, we'll assume Fredrik is either uncontactable or not interested, and therefore we can go ahead with maintaining the stdlib version independently").
>
> It's important to respect Fredrik's wishes and ownership, but we can't leave part of the stdlib frozen and abandoned just because he's not available any longer.

IMHO it's no longer a question of "wanting" to take ownership. According to Florent, this has already happened to some extent. Also, given the support history of ET outside the stdlib, we can't in the same breath not take ownership and keep recommending this module. Lack of maintenance makes it a dead end, which is a shame given the choice of alternative modules for XML parsing in the stdlib.

I don't mind sending Fredrik an email as you detailed. Any suggested things to include in it? Also, the most recent email (from 2009) of his I can find is "fredrik at pythonware.com". If anyone knows of anything more up-to-date, please let me know.

Eli

From florent.xicluna at gmail.com Fri Feb 10 09:32:31 2012
From: florent.xicluna at gmail.com (Florent)
Date: Fri, 10 Feb 2012 09:32:31 +0100
Subject: [Python-Dev] folding cElementTree behind ElementTree in 3.3
In-Reply-To: 
References: <20120208130419.3ae6bbae@pitrou.net>
Message-ID: 

2012/2/10 Eli Bendersky:
>
> Thanks for the input, Florent. So, to paraphrase, there already are code changes in the stdlib version of ET/cET which are not upstream. You made it explicit about the tests, so the question is only left for the modules themselves. Is that right?
>

The port of ElementTree to Python 3000 was done in the standard library only. The work was done back in 2006, 2007 and 2008. There was never a public version of ElementTree for Python 3 outside of the standard library. It is already a significant change from the upstream branch (many changes in the C extension code).

Then when I enforced the same test suite for both implementations, I have fixed many things in the C extension module too. To my knowledge, these fixes were not included upstream.
For the past two years there has been regular maintenance of the package in the standard library, but none of the patches were integrated upstream.

I hope it answers the question,

-- 
Florent Xicluna

From stephen at xemacs.org Fri Feb 10 09:54:36 2012
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Fri, 10 Feb 2012 17:54:36 +0900
Subject: [Python-Dev] Add a new "locale" codec?
In-Reply-To: 
References: <4F3314D5.8090907@pearwood.info> <87aa4sxdaz.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <87zkcrvzjn.fsf@uwakimon.sk.tsukuba.ac.jp>

Victor Stinner writes:

>> If this is needed, it should be spelled "os.getlocaleencoding()" (or "sys.getlocaleencoding()"?)
>
> There is already a locale.getpreferredencoding(False) function which gives you the current locale encoding. The problem is that the current locale encoding may change, and so you have to get the new value each time that you would like to encode or decode data.

How can that happen if the programmer (or a module she has imported) isn't messing with the locale? If the programmer is messing with the locale, really they need to be careful. A magic codec whose encoding changes *within* a process is an accident waiting to happen.

Do you have a real use case for the "'locale' codec's encoding changes with the locale within process" feature?

From eliben at gmail.com Fri Feb 10 10:06:27 2012
From: eliben at gmail.com (Eli Bendersky)
Date: Fri, 10 Feb 2012 11:06:27 +0200
Subject: [Python-Dev] folding cElementTree behind ElementTree in 3.3
In-Reply-To: 
References: <20120208130419.3ae6bbae@pitrou.net>
Message-ID: 

On Fri, Feb 10, 2012 at 10:32, Florent wrote:
> 2012/2/10 Eli Bendersky
>>
>> Thanks for the input, Florent. So, to paraphrase, there already are code changes in the stdlib version of ET/cET which are not upstream. You made it explicit about the tests, so the question is only left for the modules themselves. Is that right?
>>
>
> The port of ElementTree to Python 3000 was done in the standard library only. The work was done back in 2006, 2007 and 2008. There was never a public version of ElementTree for Python 3 outside of the standard library. It is already a significant change from the upstream branch (many changes in the C extension code).
>
> Then when I enforced the same test suite for both implementations, I have fixed many things in the C extension module too. To my knowledge, these fixes were not included upstream.
>
> For the past two years there has been regular maintenance of the package in the standard library, but none of the patches were integrated upstream.
>
Eli From martin at v.loewis.de Fri Feb 10 10:37:24 2012 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Fri, 10 Feb 2012 10:37:24 +0100 Subject: [Python-Dev] folding cElementTree behind ElementTree in 3.3 In-Reply-To: References: Message-ID: <4F34E554.7090600@v.loewis.de> > That makes me consider it the reality that "today, ET is only being > maintained in the stdlib". I think different people will have different perceptions of reality here. In my interaction with Fredrik Lundh, I got the impression that he might consider code still maintained even if he didn't touch it for five and more years - as he might get back some time and work on it, and as it may not have significant bugs that need fixing. If someone steps into charge and actually takes over ElementTree maintainance, that would be fine with me (as long as I have some trust that he'll continue to work on it for the next five years). We might have to "write off" contributions from Fredrik Lundh to Python because of that. Notice that the last time something like this came up (bsddb), it actually resulted in a removal of the respective package from the standard library. > Given that it was two months ago that I started the "Fixing the XML > batteries" thread (and years since I brought up the topic for the first > time), it seems to be hard enough already to get anyone on python-dev > actually do something for Python's XML support, instead of just actively > discouraging those who invest time and work into it. It depends on the nature of the changes you want to see done. Just bashing some piece of code is not something that I personally consider a worthwhile thing, so I'll likely continue to discourage changes in a direction that demeans some XML library in favor of some other. Regards, Martin From fijall at gmail.com Fri Feb 10 10:44:48 2012 From: fijall at gmail.com (Maciej Fijalkowski) Date: Fri, 10 Feb 2012 11:44:48 +0200 Subject: [Python-Dev] PyPy 1.8 released Message-ID: ============================ PyPy 1.8 - business as usual ============================ We're pleased to announce the 1.8 release of PyPy. As habitual this release brings a lot of bugfixes, together with performance and memory improvements over the 1.7 release. The main highlight of the release is the introduction of `list strategies`_ which makes homogenous lists more efficient both in terms of performance and memory. This release also upgrades us from Python 2.7.1 compatibility to 2.7.2. Otherwise it's "business as usual" in the sense that performance improved roughly 10% on average since the previous release. you can download the PyPy 1.8 release here: http://pypy.org/download.html .. _`list strategies`: http://morepypy.blogspot.com/2011/10/more-compact-lists-with-list-strategies.html What is PyPy? ============= PyPy is a very compliant Python interpreter, almost a drop-in replacement for CPython 2.7. It's fast (`pypy 1.8 and cpython 2.7.1`_ performance comparison) due to its integrated tracing JIT compiler. This release supports x86 machines running Linux 32/64, Mac OS X 32/64 or Windows 32. Windows 64 work has been stalled, we would welcome a volunteer to handle that. .. _`pypy 1.8 and cpython 2.7.1`: http://speed.pypy.org Highlights ========== * List strategies. Now lists that contain only ints or only floats should be as efficient as storing them in a binary-packed array. It also improves the JIT performance in places that use such lists. There are also special strategies for unicode and string lists. 
* As usual, numerous performance improvements. There are many examples of Python constructs that now should be faster; too many to list them.

* Bugfixes and compatibility fixes with CPython.

* Windows fixes.

* NumPy effort progress; for the exact list of things that have been done, consult the `numpy status page`_. A tentative list of things that have been done:

  * multi-dimensional arrays
  * various sizes of dtypes
  * a lot of ufuncs
  * a lot of other minor changes

  Right now the `numpy` module is available under both `numpy` and `numpypy` names. However, because it's incomplete, you have to `import numpypy` first before doing any imports from `numpy`.

* New JIT hooks that allow you to hook into the JIT process from your Python program. There is a `brief overview`_ of what they offer.

* Standard library upgrade from 2.7.1 to 2.7.2.

Ongoing work
============

As usual, there is quite a bit of ongoing work that either didn't make it to the release or is not ready yet. Highlights include:

* Non-x86 backends for the JIT: ARMv7 (almost ready) and PPC64 (in progress)
* Specialized type instances - allocate instances as efficiently as C structs, including type specialization
* More numpy work
* Since the last release there was a significant breakthrough in PyPy's fundraising. We now have enough funds to work on the first stages of `numpypy`_ and `py3k`_. We would like to thank again everyone who donated.
* It's also probably worth noting that we're considering donations for the Software Transactional Memory project. You can read more about `our plans`_.

Cheers,
The PyPy Team

.. _`brief overview`: http://doc.pypy.org/en/latest/jit-hooks.html
.. _`numpy status page`: http://buildbot.pypy.org/numpy-status/latest.html
.. _`numpy status update blog report`: http://morepypy.blogspot.com/2012/01/numpypy-status-update.html
.. _`numpypy`: http://pypy.org/numpydonate.html
.. _`py3k`: http://pypy.org/py3donate.html
.. _`our plans`: http://morepypy.blogspot.com/2012/01/transactional-memory-ii.html

From martin at v.loewis.de Fri Feb 10 10:43:24 2012
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Fri, 10 Feb 2012 10:43:24 +0100
Subject: [Python-Dev] folding cElementTree behind ElementTree in 3.3
In-Reply-To: 
References: 
Message-ID: <4F34E6BC.3080603@v.loewis.de>

> IMHO it's no longer a question of "wanting" to take ownership. According to Florent, this has already happened to some extent.

"Ownership to some extent" is not a useful concept. Either you have ownership, or you don't.

> I don't mind sending Fredrik an email as you detailed. Any suggested things to include in it?

I'd ask Fredrik if he wants to yield ownership to some (specific) other person.

What really worries me is the question of who that other person is. There is a difference between fixing some issues and actively taking over ownership, with all its consequences (long-term commitment, willingness to defend difficult decisions even if you are constantly being insulted for that decision, and so on).

*Not* having an owner just means that it will be as unmaintained in the future as it appears to be now.
Regards,
Martin

From stefan_ml at behnel.de Fri Feb 10 10:50:46 2012
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Fri, 10 Feb 2012 10:50:46 +0100
Subject: [Python-Dev] folding cElementTree behind ElementTree in 3.3
In-Reply-To: 
References: <20120208130419.3ae6bbae@pitrou.net>
Message-ID: 

Eli Bendersky, 10.02.2012 10:06:
> On Fri, Feb 10, 2012 at 10:32, Florent wrote:
>> 2012/2/10 Eli Bendersky
>>>
>>> Thanks for the input, Florent. So, to paraphrase, there already are code changes in the stdlib version of ET/cET which are not upstream. You made it explicit about the tests, so the question is only left for the modules themselves. Is that right?
>>>
>>
>> The port of ElementTree to Python 3000 was done in the standard library only. The work was done back in 2006, 2007 and 2008. There was never a public version of ElementTree for Python 3 outside of the standard library. It is already a significant change from the upstream branch (many changes in the C extension code).
>>
>> Then when I enforced the same test suite for both implementations, I have fixed many things in the C extension module too. To my knowledge, these fixes were not included upstream.
>>
>> For the past two years there has been regular maintenance of the package in the standard library, but none of the patches were integrated upstream.
>
> Folks, with this in mind, can we just acknowledge that the stdlib ElementTree is de-facto forked from Fredrik Lundh's official releases and get on with our lives? Note the code review discussion here - http://codereview.appspot.com/207048/show - where Fredrik Lundh more or less acknowledges this fact and shows no real objections to it.
>
> By "get on with our lives" I mean keep fixing problems in ElementTree inside the stdlib, as well as work on exposing the C implementation behind the ElementTree API by default, falling back on the Python API (and being true to PEP 399).

+1

None of this would make the situation any worse than it currently is, but it would provide serious improvements to the user experience.

Stefan

From martin at v.loewis.de Fri Feb 10 10:50:38 2012
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Fri, 10 Feb 2012 10:50:38 +0100
Subject: [Python-Dev] Add a new "locale" codec?
In-Reply-To: <20120209134254.6a7cb62c@pitrou.net>
References: <4F3314D5.8090907@pearwood.info> <20120209134254.6a7cb62c@pitrou.net>
Message-ID: <4F34E86E.6030505@v.loewis.de>

> As And pointed out, this is already the behaviour of the "mbcs" codec under Windows. "locale" would be the moral (*) equivalent of that under Unix.

Indeed, and that precedent should be enough reason *not* to include a "locale" encoding. The "mbcs" encoding has caused much user confusion over the years, and it is less useful than people typically think. For example, for some time, people thought that names in zip files ought to be encoded in "mbcs", only to find out that this is incorrect years later.

With a "locale" encoding, the risk of confusion and untestable code is too high (just consider the ongoing saga of the Turkish dotless i (ı)).

Regards,
Martin

From eliben at gmail.com Fri Feb 10 10:57:48 2012
From: eliben at gmail.com (Eli Bendersky)
Date: Fri, 10 Feb 2012 11:57:48 +0200
Subject: [Python-Dev] folding cElementTree behind ElementTree in 3.3
In-Reply-To: <4F34E6BC.3080603@v.loewis.de>
References: <4F34E6BC.3080603@v.loewis.de>
Message-ID: 

On Fri, Feb 10, 2012 at 11:43, "Martin v.
L?wis" wrote: >> IMHO it's no longer a question of "wanting" to take ownership. >> According to Florent, this has already happened to some extent. > > "Ownership to some extent" is not a useful concept. Either you have > ownership, or you don't. > >> I don't mind sending Fredrik an email as you detailed. Any suggested >> things to include in it? > > I'd ask Fredrik if he wants to yield ownership, to some (specific) other > person. > > What really worries me is the question who that other person is. There > is a difference between fixing some issues, and actively taking over > ownership, with all its consequences (long-term commitment, willingness > to defend difficult decisions even if you are constantly being insulted > for that decision, and so on). > > *Not* having an owner just means that it will be as unmaintained in > the future as it appears to be now. How does this differ from any other module in stdlib that may not have a single designated owner, but which at the same time *is* being maintained by the core developers as a group? ISTM that requiring a five-year commitment is just going to scare any contributors away - is that what we want? What worries me most is that there seems to be a flow towards status quo on such things because status quo is the easiest to do. But in some cases, status quo is bad. Here we have a quite popular package in stdlib whose maintainer stopped maintaining it about two years ago. Another person stepped up and did some good work to bring the package up to date, fix bugs, and improve the test suite. What happens now? Do we give up on touching it until Fredrik Lundh decides on a come-back or some person who is willing to commit 5 years is found? Or do we just *keep* maintaining it in the stdlib as we do with other modules, fixing bugs, tests, documentation and so on? Eli From eliben at gmail.com Fri Feb 10 11:32:29 2012 From: eliben at gmail.com (Eli Bendersky) Date: Fri, 10 Feb 2012 12:32:29 +0200 Subject: [Python-Dev] maintenance of the ElementTree / cElementTree packages in the Python standard library Message-ID: Hello Fredrik, Recently a discussion came up on the python-dev mailing list regarding continued maintenance of the ElementTree & cElementTree packages which are part of the standard library, and which were originally contributed by you. There currently exists an unclear situation with respect to the active maintainer(s) of these packages. On one hand, PEP 360 states that the packages are officially maintained in your repositories and all problems should be assigned to you and fixed upstream. On the other hand, it appears that there has already been made a considerable amount of work (e.g. http://codereview.appspot.com/207048/show) solely within the Python repositories, as well as the port to Python 3 which now lives in all 3.x branches. In other words, de-facto the package has been forked in the Python repository. Note that no changes (AFAIU) have been made to the ElementTree *API*, only to the implementations living in the stdlib. I'd like to understand your point of view on this topic. There are currently 23 open issues on the package(s) in the Python tracker, and some additional plans are being made (such as 'import ElementTree' importing the C implementation by default, falling back on the Python implementation if that's unavailable). Is that alright with you that all such new fixes and developments are being made by Python code developers in the Python repositories directly, without waiting for your approval to submit them upstream? 
Thanks in advance,
Eli

From martin at v.loewis.de Fri Feb 10 11:32:49 2012
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Fri, 10 Feb 2012 11:32:49 +0100
Subject: [Python-Dev] folding cElementTree behind ElementTree in 3.3
In-Reply-To: 
References: <4F34E6BC.3080603@v.loewis.de>
Message-ID: <4F34F251.6070407@v.loewis.de>

> How does this differ from any other module in the stdlib that may not have a single designated owner, but which at the same time *is* being maintained by the core developers as a group? ISTM that requiring a five-year commitment is just going to scare any contributors away - is that what we want?

I'm not talking about contributors, I'm talking about a maintainer. When we have a maintainer, that can actually attract contributors. Recognizing that something is a long-term commitment is scary, yes. However, a number of contributors have accepted such commitments, e.g. the release managers. Asking that for a subpackage is not asking too much, IMO.

Compare this to distutils: if we'd had a commitment from the original author to maintain it for a long period, the setuptools, distribute, distutils2, and packaging forks might not have been necessary. In the absence of a maintainer, nobody is able to make difficult decisions.

> What happens now? Do we give up on touching it until Fredrik Lundh decides on a comeback, or until some person who is willing to commit 5 years is found? Or do we just *keep* maintaining it in the stdlib as we do with other modules, fixing bugs, tests, documentation and so on?

If we really can't find somebody dedicated enough to that code base, we should consider removing it from the standard library.

Regards,
Martin

From stefan_ml at behnel.de Fri Feb 10 11:44:11 2012
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Fri, 10 Feb 2012 11:44:11 +0100
Subject: [Python-Dev] folding cElementTree behind ElementTree in 3.3
In-Reply-To: <4F34E554.7090600@v.loewis.de>
References: <4F34E554.7090600@v.loewis.de>
Message-ID: 

"Martin v. Löwis", 10.02.2012 10:37:
>> Given that it was two months ago that I started the "Fixing the XML batteries" thread (and years since I brought up the topic for the first time), it seems to be hard enough already to get anyone on python-dev to actually do something for Python's XML support, instead of just actively discouraging those who invest time and work into it.
>
> It depends on the nature of the changes you want to see done. Just bashing some piece of code is not something that I personally consider a worthwhile thing, so I'll likely continue to discourage changes in a direction that demeans some XML library in favor of some other.

This is getting off-topic for this thread, but anyway. What I meant with my paragraph above was that none of the topics I brought up has received any action from those with commit rights yet, regardless of how obvious they were and how much or little dispute there was about them.

IMHO, all of this boils down to whether or not we should make it easier for users to efficiently use the stdlib. Backing ElementTree by the accelerator module helps here, and fixing the docs to point (new) users to ElementTree instead of MiniDOM helps as well. I can happily accept that you have a different opinion on the latter topic than I do.
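(As a concrete aside: "backing" the Python module by the accelerator just means applying the usual stdlib fallback pattern at the bottom of the module, the same way pickle pulls in _pickle. A rough sketch only - not the actual patch, and `_elementtree` stands in for whatever the accelerator module ends up being called:

    # Hypothetical tail of xml/etree/ElementTree.py:
    try:
        # Overwrite the pure Python classes and functions with their
        # C accelerator equivalents, when available.
        from _elementtree import *
    except ImportError:
        # No accelerator built - keep the pure Python definitions.
        pass

Users would then simply write "from xml.etree import ElementTree" and automatically get the fastest available implementation, which is exactly the behaviour PEP 399 asks for.)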
What I cannot accept is that, as we speak, this leads to users getting drawn into using the wrong tool for their job, into wasting their time (both for development and runtime) and potentially into getting drawn away from the (IMHO) perfect language for XML processing.

I don't think bashing is the right word here. Everyone who, knowing the alternatives, decides to use MiniDOM is welcome to do so. I'm just stating, both from my personal experience and from discussions on c.l.py, that the current documentation makes it easier for new users to take the wrong decision for them than to make this decision in an informed way. MiniDOM *may* be the right thing further down the road in some cases. It's almost *never* the right thing to start with, simply because if you do, it inherently takes way too much time until the evidence becomes obvious that it actually was the wrong decision. The documentation should allow innocent users to see this risk clearly before they start wasting their time.

So, getting back to the topic again, is there any reason why you would oppose backing the ElementTree module in the stdlib by cElementTree's accelerator module? Or can we just consider this part of the discussion settled and start getting work done?

Stefan

From eliben at gmail.com Fri Feb 10 11:45:14 2012
From: eliben at gmail.com (Eli Bendersky)
Date: Fri, 10 Feb 2012 12:45:14 +0200
Subject: [Python-Dev] folding cElementTree behind ElementTree in 3.3
In-Reply-To: <4F34F251.6070407@v.loewis.de>
References: <4F34E6BC.3080603@v.loewis.de> <4F34F251.6070407@v.loewis.de>
Message-ID: 

>> What happens now? Do we give up on touching it until Fredrik Lundh decides on a comeback, or until some person who is willing to commit 5 years is found? Or do we just *keep* maintaining it in the stdlib as we do with other modules, fixing bugs, tests, documentation and so on?
>
> If we really can't find somebody dedicated enough to that code base, we should consider removing it from the standard library.

Does this imply that each and every package in the stdlib currently has a dedicated maintainer who promised to be dedicated to it? Or otherwise, should those packages that *don't* have a maintainer be removed from the standard library? Isn't that a bit harsh?

ElementTree is an overall functional library and AFAIK the preferred stdlib tool for processing XML for many developers. It currently needs some attention to fix a few issues, expose the fast C implementation by default when ElementTree is imported, and improve the documentation. At this point, I'm interested enough to work on these - given that the political issue with Fredrik Lundh is resolved. However, I can't *honestly* say I promise to maintain the package until 2017. So, what's next?

Eli

From stefan_ml at behnel.de Fri Feb 10 12:10:36 2012
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Fri, 10 Feb 2012 12:10:36 +0100
Subject: [Python-Dev] folding cElementTree behind ElementTree in 3.3
In-Reply-To: <4F34F251.6070407@v.loewis.de>
References: <4F34E6BC.3080603@v.loewis.de> <4F34F251.6070407@v.loewis.de>
Message-ID: 

"Martin v. Löwis", 10.02.2012 11:32:
>> What happens now? Do we give up on touching it until Fredrik Lundh decides on a comeback, or until some person who is willing to commit 5 years is found? Or do we just *keep* maintaining it in the stdlib as we do with other modules, fixing bugs, tests, documentation and so on?
>
> If we really can't find somebody dedicated enough to that code base, we should consider removing it from the standard library.

Well, that's totally not the current situation, though. There has been a large amount of maintenance going into the ElementTree modules already, so there is evidently a substantial interest in a) having them in the stdlib and b) keeping them working well. The current decisions could easily be taken by the interested parties, of which there seem to be enough involved in the relevant python-dev threads so far. Note that even decisions taken by a maintainer are not guaranteed to pass easily and without opposition.

On a related note, it may be worth mentioning that it's been generally known for several years now that the MiniDOM library has very serious performance problems, and there doesn't seem to be any maintainer around who has made a visible effort to solve them. Maybe we should remove MiniDOM from the stdlib, because no-one seems to be dedicated enough to that code base to fix it.

Stefan

From eliben at gmail.com Fri Feb 10 12:26:22 2012
From: eliben at gmail.com (Eli Bendersky)
Date: Fri, 10 Feb 2012 13:26:22 +0200
Subject: [Python-Dev] Fwd: maintenance of the ElementTree / cElementTree packages in the Python standard library
In-Reply-To: 
References: 
Message-ID: 

---------- Forwarded message ----------
From: Fredrik Lundh
Date: Fri, Feb 10, 2012 at 13:16
Subject: Re: maintenance of the ElementTree / cElementTree packages in the Python standard library
To: Eli Bendersky

Hi Eli, thanks for reaching out. I'll get back to you with a more "formal" reply later, but yeah, that sounds like a plan -- I have very limited time for core Python work these days anyway (as you guys have probably noticed :-). But feel free to loop me in on suggested API changes going forward (and I'll dig up my notes on additions that would be nice to have if someone wants to work on that).

2012/2/10 Eli Bendersky:
> Hello Fredrik,
>
> Recently a discussion came up on the python-dev mailing list regarding continued maintenance of the ElementTree & cElementTree packages which are part of the standard library, and which were originally contributed by you.
>
> There currently exists an unclear situation with respect to the active maintainer(s) of these packages. On one hand, PEP 360 states that the packages are officially maintained in your repositories and all problems should be assigned to you and fixed upstream. On the other hand, it appears that a considerable amount of work (e.g. http://codereview.appspot.com/207048/show) has already been done solely within the Python repositories, as well as the port to Python 3 which now lives in all 3.x branches. In other words, de-facto the package has been forked in the Python repository. Note that no changes (AFAIU) have been made to the ElementTree *API*, only to the implementations living in the stdlib.
>
> I'd like to understand your point of view on this topic. There are currently 23 open issues on the package(s) in the Python tracker, and some additional plans are being made (such as 'import ElementTree' importing the C implementation by default, falling back on the Python implementation if that's unavailable). Is it alright with you that all such new fixes and developments are being made by Python core developers in the Python repositories directly, without waiting for your approval to submit them upstream?
>
> Thanks in advance,
> Eli

From nad at acm.org Fri Feb 10 13:39:11 2012
From: nad at acm.org (Ned Deily)
Date: Fri, 10 Feb 2012 13:39:11 +0100
Subject: [Python-Dev] peps: Update with bugfix releases.
References: <20120205204551.Horde.NCdeYVNNcXdPLtxvnkzi1lA@webmail.df.eu> <4F32DF1E.40205@v.loewis.de>
Message-ID: 

In article , Ned Deily wrote:
> However, this may all be a moot point now, as I've subsequently proposed a patch to Distutils to smooth over the problem by checking for the case of gcc-4.2 being required but not available and, if so, automatically substituting clang instead (http://bugs.python.org/issue13590). This trades off a certain risk of using clang for extension modules against the 100% certainty of users being unable to build extension modules.

And I've now committed the patch for 2.7.x and 3.2.x, so I no longer consider this a release-blocking issue for 2.7.3 and 3.2.3.

-- 
Ned Deily, nad at acm.org

From ncoghlan at gmail.com Fri Feb 10 13:44:14 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Fri, 10 Feb 2012 22:44:14 +1000
Subject: [Python-Dev] folding cElementTree behind ElementTree in 3.3
In-Reply-To: <4F34E554.7090600@v.loewis.de>
References: <4F34E554.7090600@v.loewis.de>
Message-ID: 

On Fri, Feb 10, 2012 at 7:37 PM, "Martin v. Löwis" wrote:
> Notice that the last time something like this came up (bsddb), it actually resulted in a removal of the respective package from the standard library.

bsddb was a *very* different case - it was actively causing buildbot stability problems and various reports on the tracker due to changes in the external Berkeley DB API. Once we had sqlite3 in the standard lib as an alternate DB-API backend, it was hard to justify the ongoing maintenance hassles *despite* Jesus Cea stepping up as the maintainer (and he still maintains the pybsddb version - that was actually a big factor in *letting* us drop it, since we could just direct current users towards the PyPI version).

Most orphan modules in the stdlib aren't like that - yes, their APIs stagnate (because nobody feels they have the authority and/or expertise to make potentially controversial decisions), but for many of them, that's not a particularly bad thing. For others, the world has moved on around them and they become traps for the unwary, but still, taking the modules out is unwarranted, since we'd be breaking code without giving affected users a good alternative (for orphan modules, nobody is likely to take the time to maintain them on PyPI if they weren't willing to do so in the stdlib - this actually stands in stark *contrast* to the bsddb case, which was decidedly *not* an orphan module when it was removed).

Regards,
Nick.

-- 
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From florent.xicluna at gmail.com Fri Feb 10 14:03:10 2012
From: florent.xicluna at gmail.com (Florent)
Date: Fri, 10 Feb 2012 14:03:10 +0100
Subject: [Python-Dev] folding cElementTree behind ElementTree in 3.3
In-Reply-To: 
References: <4F34E554.7090600@v.loewis.de>
Message-ID: 

2012/2/10 Nick Coghlan:
>
> Most orphan modules in the stdlib aren't like that - yes, their APIs stagnate (because nobody feels they have the authority and/or expertise to make potentially controversial decisions), but for many of them, that's not a particularly bad thing.
If I'm not wrong, it was the case for StringIO, pickle, distutils, wsgiref and optparse even if each of these packages has its own story. -- Florent Xicluna From eliben at gmail.com Fri Feb 10 15:06:15 2012 From: eliben at gmail.com (Eli Bendersky) Date: Fri, 10 Feb 2012 16:06:15 +0200 Subject: [Python-Dev] PEP 411: Provisional packages in the Python standard library Message-ID: Hi all, Following the intensive and fruitful discussion of the (now rejected) PEP 408 (http://mail.python.org/pipermail/python-dev/2012-January/115850.html), we've drafted PEP 411 to summarize the conclusions with regards to the process of marking packages provisional. Note that this is an informational PEP, and that for the sake of completeness it duplicates some of the contents of PEP 408. It is pasted below, as well as online at http://www.python.org/dev/peps/pep-0411/. Comments are welcome. Eli ------------------------------------------------ PEP: 411 Title: Provisional packages in the Python standard library Version: $Revision$ Last-Modified: $Date$ Author: Nick Coghlan , Eli Bendersky Status: Draft Type: Informational Content-Type: text/x-rst Created: 2012-02-10 Python-Version: 3.3 Post-History: 2012-02-10 Abstract ======== The process of including a new package into the Python standard library is hindered by the API lock-in and promise of backward compatibility implied by a package being formally part of Python. This PEP describes a methodology for marking a standard library package "provisional" for the period of a single minor release. A provisional package may have its API modified prior to "graduating" into a "stable" state. On one hand, this state provides the package with the benefits of being formally part of the Python distribution. On the other hand, the core development team explicitly states that no promises are made with regards to the the stability of the package's API, which may change for the next release. While it is considered an unlikely outcome, such packages may even be removed from the standard library without a deprecation period if the concerns regarding their API or maintenante prove well-founded. Proposal - a documented provisional state ========================================= Whenever the Python core development team decides that a new package should be included into the standard library, but isn't entirely sure about whether the package's API is optimal, the package can be included and marked as "provisional". In the next minor release, the package may either be "graduated" into a normal "stable" state in the standard library, or be rejected and removed entirely from the Python source tree. If the package ends up graduating into the stable state after being provisional for a minor release, its API may be changed according to accumulated feedback. The core development team explicitly makes no guarantees about API stability and backward compatibility of provisional packages. Marking a package provisional ----------------------------- A package will be marked provisional by including the following paragraph as a note at the top of its documentation page: The package has been included in the standard library on a provisional basis. While major changes are not anticipated, as long as this notice remains in place, backwards incompatible changes are permitted if deemed necessary by the standard library developers. Such changes will not be made gratuitously - they will occur only if serious API flaws are uncovered that were missed prior to inclusion of the package. 
Moving a package from the provisional to the stable state simply implies removing this note from its documentation page.

Which packages should go through the provisional state
------------------------------------------------------

We expect most packages proposed for addition into the Python standard library to go through a minor release in the provisional state. There may, however, be some exceptions, such as packages that use a pre-defined API (for example ``lzma``, which generally follows the API of the existing ``bz2`` package), or packages with an API that has wide acceptance in the Python development community.

In any case, packages that are proposed to be added to the standard library, whether via the provisional state or directly, must fulfill the acceptance conditions set by PEP 2.

Criteria for "graduation"
-------------------------

In principle, most provisional packages should eventually graduate to the stable standard library. Some reasons for not graduating are:

* The package may prove to be unstable or fragile, without sufficient developer support to maintain it.
* A much better alternative package may be found during the preview release.

Essentially, the decision will be made by the core developers on a per-case basis. The point to emphasize here is that a package's inclusion in the standard library as "provisional" in some release does not guarantee it will continue being part of Python in the next release.

Rationale
=========

Benefits for the core development team
--------------------------------------

Currently, the core developers are really reluctant to add new interfaces to the standard library. This is because as soon as they're published in a release, API design mistakes get locked in due to backward compatibility concerns.

By gating all major API additions through some kind of a provisional mechanism for a full release, we get one full release cycle of community feedback before we lock in the APIs with our standard backward compatibility guarantee.

We can also start integrating provisional packages with the rest of the standard library early, so long as we make it clear to packagers that the provisional packages should not be considered optional. The only difference between provisional APIs and the rest of the standard library is that provisional APIs are explicitly exempted from the usual backward compatibility guarantees.

Benefits for end users
----------------------

For future end users, the broadest benefit lies in a better "out-of-the-box" experience - rather than being told "oh, the standard library tools for task X are horrible, download this 3rd party library instead", those superior tools are more likely to be just an import away.

For environments where developers are required to conduct due diligence on their upstream dependencies (severely harming the cost-effectiveness of, or even ruling out entirely, much of the material on PyPI), the key benefit lies in ensuring that all packages in the provisional state are clearly under python-dev's aegis from at least the following perspectives:

* Licensing: Redistributed by the PSF under a Contributor Licensing Agreement.
* Documentation: The documentation of the package is published and organized via the standard Python documentation tools (i.e. ReST source, output generated with Sphinx and published on http://docs.python.org).
* Testing: The package test suites are run on the python.org buildbot fleet and results published via http://www.python.org/dev/buildbot.
* Issue management: Bugs and feature requests are handled on http://bugs.python.org
* Source control: The master repository for the software is published on http://hg.python.org.

Candidates for provisional inclusion into the standard library
===============================================================

For Python 3.3, there are a number of clear current candidates:

* ``regex`` (http://pypi.python.org/pypi/regex) - approved by Guido [#]_.
* ``daemon`` (PEP 3143)
* ``ipaddr`` (PEP 3144)

Other possible future use cases include:

* Improved HTTP modules (e.g. ``requests``)
* HTML 5 parsing support (e.g. ``html5lib``)
* Improved URL/URI/IRI parsing
* A standard image API (PEP 368)
* Encapsulation of the import state (PEP 406)
* Standard event loop API (PEP 3153)
* A binary version of WSGI for Python 3 (e.g. PEP 444)
* Generic function support (e.g. ``simplegeneric``)

Rejected alternatives and variations
====================================

See PEP 408.

References
==========

.. [#] http://mail.python.org/pipermail/python-dev/2012-January/115962.html

Copyright
=========

This document has been placed in the public domain.

..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End:

From ncoghlan at gmail.com Fri Feb 10 15:35:59 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 11 Feb 2012 00:35:59 +1000
Subject: [Python-Dev] Fwd: maintenance of the ElementTree / cElementTree packages in the Python standard library
In-Reply-To: 
References: 
Message-ID: 

On Fri, Feb 10, 2012 at 9:26 PM, Eli Bendersky wrote:
> ---------- Forwarded message ----------
> From: Fredrik Lundh
> Date: Fri, Feb 10, 2012 at 13:16
> Subject: Re: maintenance of the ElementTree / cElementTree packages in the Python standard library
> To: Eli Bendersky
>
> Hi Eli, thanks for reaching out. I'll get back to you with a more "formal" reply later, but yeah, that sounds like a plan -- I have very limited time for core Python work these days anyway (as you guys have probably noticed :-).

I've updated PEP 360 accordingly (including a link back to the archived version of Fredrik's reply). Since ElementTree was the last Python module referenced from that PEP that hadn't been converted to python-dev maintenance, I flagged the PEP so it now appears in the Historical PEPs section rather than near the top of the PEP index. Technically the reference from there to the Expat XML parser being externally maintained is still valid, but the same could be said of various 3rd party libraries we ship with the Windows binaries.

I also updated the headers on several old PEPs (mostly ones related specifically to the 3.0 process and the migration to Hg) to move them down into the Historical section, and fixed the PEP 0 generator so that Draft process PEPs (i.e. the PEP 407 proposal to change the release schedule) appear in the Open PEPs section along with all the other Draft PEPs.

(At time of writing, the PEP pages hadn't regenerated to show the updated status of any of the PEPs I moved around, but I figure it will sort itself out eventually.)

Cheers,
Nick.

-- 
Nick Coghlan | ncoghlan at gmail.com |
Brisbane, Australia

From eliben at gmail.com Fri Feb 10 15:59:48 2012
From: eliben at gmail.com (Eli Bendersky)
Date: Fri, 10 Feb 2012 16:59:48 +0200
Subject: [Python-Dev] folding cElementTree behind ElementTree in 3.3
In-Reply-To: 
References: 
Message-ID: 

> """A common pattern in Python 2.x is to have one version of a module implemented in pure Python, with an optional accelerated version implemented as a C extension; for example, pickle and cPickle. This places the burden of importing the accelerated version and falling back on the pure Python version on each user of these modules. In Python 3.0, the accelerated versions are considered implementation details of the pure Python versions. Users should always import the standard version, which attempts to import the accelerated version and falls back to the pure Python version. The pickle / cPickle pair received this treatment. The profile module is on the list for 3.1. The StringIO module has been turned into a class in the io module."""
>
> Is there a good reason why xml.etree.ElementTree / xml.etree.cElementTree did not "receive this treatment"?

Since there appeared to be an overall positive response for making this change in Python 3.3, and since there is no longer any doubt about the ownership of the package *in Python's stdlib* (see http://mail.python.org/pipermail/python-dev/2012-February/116389.html), I've opened issue 13988 on the bug tracker to follow the implementation.

Eli

From status at bugs.python.org Fri Feb 10 18:07:36 2012
From: status at bugs.python.org (Python tracker)
Date: Fri, 10 Feb 2012 18:07:36 +0100 (CET)
Subject: [Python-Dev] Summary of Python tracker Issues
Message-ID: <20120210170736.8AC4A1DEEF@psf.upfronthosting.co.za>

ACTIVITY SUMMARY (2012-02-03 - 2012-02-10)
Python tracker at http://bugs.python.org/

To view or respond to any of the issues listed below, click on the issue.
Do NOT respond to this message.

Issues counts and deltas:
  open    3246 ( -2)
  closed 22523 (+57)
  total  25769 (+55)

Open issues with patches: 1394

Issues opened (36)
==================

#13929: fnmatch to support escape characters
http://bugs.python.org/issue13929 reopened by terry.reedy

#13934: sqlite3 test typo
http://bugs.python.org/issue13934 opened by poq

#13938: 2to3 fails to convert types.StringTypes appropriately
http://bugs.python.org/issue13938 opened by mhammond

#13940: imaplib: Mailbox names are not quoted
http://bugs.python.org/issue13940 opened by joebauer

#13942: ssl.wrap_socket does not work on socket.socketpair()'s
http://bugs.python.org/issue13942 opened by weary
build_py fails when package string is unicode http://bugs.python.org/issue13943 opened by patrick.andrew #13946: readline completer could return an iterable http://bugs.python.org/issue13946 opened by nicolas_49 #13948: rm needless use of set function http://bugs.python.org/issue13948 opened by tshepang #13949: rm needless use of pass statement http://bugs.python.org/issue13949 opened by tshepang #13950: rm commented-out code http://bugs.python.org/issue13950 opened by tshepang #13951: Seg Fault in .so called by ctypes causes the interpreter to Se http://bugs.python.org/issue13951 opened by graemeglass #13952: mimetypes doesn't recognize .csv http://bugs.python.org/issue13952 opened by iwd32900 #13953: Get rid of doctests in packaging.tests.test_version http://bugs.python.org/issue13953 opened by tshepang #13954: Add regrtest option to record test results to a file http://bugs.python.org/issue13954 opened by brett.cannon #13959: Re-implement parts of imp in pure Python http://bugs.python.org/issue13959 opened by brett.cannon #13960: Handling of broken comments in HTMLParser http://bugs.python.org/issue13960 opened by ezio.melotti #13961: Have importlib use os.replace() http://bugs.python.org/issue13961 opened by brett.cannon #13962: multiple lib and include directories on Linux http://bugs.python.org/issue13962 opened by rpq #13963: dev guide has no mention of mechanics of patch review http://bugs.python.org/issue13963 opened by dmalcolm #13964: os.utimensat() and os.futimes() should accept Decimal, drop os http://bugs.python.org/issue13964 opened by haypo #13966: Add disable_interspersed_args() to argparse.ArgumentParser http://bugs.python.org/issue13966 opened by Laszlo.Attila.Toth #13967: also test for an empty pathname http://bugs.python.org/issue13967 opened by tshepang #13968: Support recursive globs http://bugs.python.org/issue13968 opened by ubershmekel #13969: path name must always be string (or None) http://bugs.python.org/issue13969 opened by tshepang #13970: frameobject should not have f_yieldfrom attribute http://bugs.python.org/issue13970 opened by Mark.Shannon #13972: set and frozenset constructors don't accept multiple iterables http://bugs.python.org/issue13972 opened by petri.lehtinen #13973: urllib.parse is imported twice in xmlrpc.client http://bugs.python.org/issue13973 opened by tshepang #13974: packaging: test for set_platform() http://bugs.python.org/issue13974 opened by tshepang #13977: importlib simplification http://bugs.python.org/issue13977 opened by Jim.Jewett #13978: OSError exception in multiprocessing module when using os.remo http://bugs.python.org/issue13978 opened by jjardon #13979: Automatic *libc.so loading behaviour http://bugs.python.org/issue13979 opened by dgoulet #13981: time.sleep() should use nanosleep() if available http://bugs.python.org/issue13981 opened by haypo #13985: Menu.tk_popup : menu doesn't disapear when main window is ico http://bugs.python.org/issue13985 opened by marc.dechico #13986: ValueError: cannot convert float NaN to integer http://bugs.python.org/issue13986 opened by shivam_python_issues #13987: Handling of broken markup in HTMLParser on 2.7 http://bugs.python.org/issue13987 opened by ezio.melotti #13988: Expose the C implementation of ElementTree by default when imp http://bugs.python.org/issue13988 opened by eli.bendersky Most recent 15 issues with no replies (15) ========================================== #13987: Handling of broken markup in HTMLParser on 2.7 http://bugs.python.org/issue13987 #13985: Menu.tk_popup : menu 
doesn't disapear when main window is ico http://bugs.python.org/issue13985 #13981: time.sleep() should use nanosleep() if available http://bugs.python.org/issue13981 #13979: Automatic *libc.so loading behaviour http://bugs.python.org/issue13979 #13978: OSError exception in multiprocessing module when using os.remo http://bugs.python.org/issue13978 #13977: importlib simplification http://bugs.python.org/issue13977 #13973: urllib.parse is imported twice in xmlrpc.client http://bugs.python.org/issue13973 #13972: set and frozenset constructors don't accept multiple iterables http://bugs.python.org/issue13972 #13963: dev guide has no mention of mechanics of patch review http://bugs.python.org/issue13963 #13961: Have importlib use os.replace() http://bugs.python.org/issue13961 #13959: Re-implement parts of imp in pure Python http://bugs.python.org/issue13959 #13954: Add regrtest option to record test results to a file http://bugs.python.org/issue13954 #13946: readline completer could return an iterable http://bugs.python.org/issue13946 #13940: imaplib: Mailbox names are not quoted http://bugs.python.org/issue13940 #13938: 2to3 fails to convert types.StringTypes appropriately http://bugs.python.org/issue13938 Most recent 15 issues waiting for review (15) ============================================= #13987: Handling of broken markup in HTMLParser on 2.7 http://bugs.python.org/issue13987 #13974: packaging: test for set_platform() http://bugs.python.org/issue13974 #13973: urllib.parse is imported twice in xmlrpc.client http://bugs.python.org/issue13973 #13970: frameobject should not have f_yieldfrom attribute http://bugs.python.org/issue13970 #13969: path name must always be string (or None) http://bugs.python.org/issue13969 #13968: Support recursive globs http://bugs.python.org/issue13968 #13967: also test for an empty pathname http://bugs.python.org/issue13967 #13966: Add disable_interspersed_args() to argparse.ArgumentParser http://bugs.python.org/issue13966 #13961: Have importlib use os.replace() http://bugs.python.org/issue13961 #13960: Handling of broken comments in HTMLParser http://bugs.python.org/issue13960 #13953: Get rid of doctests in packaging.tests.test_version http://bugs.python.org/issue13953 #13950: rm commented-out code http://bugs.python.org/issue13950 #13949: rm needless use of pass statement http://bugs.python.org/issue13949 #13948: rm needless use of set function http://bugs.python.org/issue13948 #13938: 2to3 fails to convert types.StringTypes appropriately http://bugs.python.org/issue13938 Top 10 most discussed issues (10) ================================= #13968: Support recursive globs http://bugs.python.org/issue13968 53 msgs #13703: Hash collision security issue http://bugs.python.org/issue13703 27 msgs #13988: Expose the C implementation of ElementTree by default when imp http://bugs.python.org/issue13988 11 msgs #1559549: ImportError needs attributes for module and file name http://bugs.python.org/issue1559549 11 msgs #13882: PEP 410: Use decimal.Decimal type for timestamps http://bugs.python.org/issue13882 10 msgs #13370: test_ctypes fails when building python with clang http://bugs.python.org/issue13370 9 msgs #13964: os.utimensat() and os.futimes() should accept Decimal, drop os http://bugs.python.org/issue13964 9 msgs #13590: extension module builds fail with python.org OS X installers o http://bugs.python.org/issue13590 8 msgs #2377: Replace __import__ w/ importlib.__import__ http://bugs.python.org/issue2377 5 msgs #4709: Mingw-w64 and python on windows x64 
http://bugs.python.org/issue4709 5 msgs Issues closed (54) ================== #1975: signals not always delivered to main thread, since other threa http://bugs.python.org/issue1975 closed by neologix #5218: Check for tp_iter in ceval:ext_do_call before overriding excep http://bugs.python.org/issue5218 closed by terry.reedy #6005: Bug in socket example http://bugs.python.org/issue6005 closed by orsenthil #6617: During compiling python 3.1 getting error Undefined symbol lib http://bugs.python.org/issue6617 closed by skrah #7433: MemoryView memory_getbuf causes segfaults, double call to tp_r http://bugs.python.org/issue7433 closed by skrah #7827: recv_into() argument 1 must be pinned buffer, not bytearray http://bugs.python.org/issue7827 closed by dalke #8305: memoview[0] creates an invalid view if ndim != 1 http://bugs.python.org/issue8305 closed by skrah #9021: no copy.copy problem description http://bugs.python.org/issue9021 closed by orsenthil #9990: PyMemoryView_FromObject alters the Py_buffer after calling PyO http://bugs.python.org/issue9990 closed by skrah #11805: package_data only allows one glob per-package http://bugs.python.org/issue11805 closed by eric.araujo #11944: Function call with * and generator hide exception raised by ge http://bugs.python.org/issue11944 closed by terry.reedy #12410: Create a new helper function that enable to test that an opera http://bugs.python.org/issue12410 closed by neologix #12993: prepared statements in sqlite3 module http://bugs.python.org/issue12993 closed by georg.brandl #13286: PEP 3151 breaks backward compatibility: it should be documente http://bugs.python.org/issue13286 closed by haypo #13588: Change name of internal closure functions in importlib http://bugs.python.org/issue13588 closed by brett.cannon #13609: Add "os.get_terminal_size()" function http://bugs.python.org/issue13609 closed by pitrou #13712: pysetup create should not convert package_data to extra_files http://bugs.python.org/issue13712 closed by eric.araujo #13734: Add a generic directory walker method to avoid symlink attacks http://bugs.python.org/issue13734 closed by neologix #13845: Use GetSystemTimeAsFileTime() to get a resolution of 100 ns on http://bugs.python.org/issue13845 closed by haypo #13846: Add time.monotonic() function http://bugs.python.org/issue13846 closed by haypo #13861: test_pydoc failure http://bugs.python.org/issue13861 closed by ned.deily #13865: distutils documentation says Extension has "optional" argument http://bugs.python.org/issue13865 closed by eric.araujo #13879: Argparse does not support subparser aliases in 2.7 http://bugs.python.org/issue13879 closed by eric.araujo #13880: pydoc -k throws "AssertionError: distutils has already been pa http://bugs.python.org/issue13880 closed by ned.deily #13893: Make CGIHTTPServer capable of redirects (and status other than http://bugs.python.org/issue13893 closed by eric.araujo #13904: Generator as *args: TypeError replaced http://bugs.python.org/issue13904 closed by terry.reedy #13910: test_packaging is dependent on dict ordering. 
http://bugs.python.org/issue13910 closed by eric.araujo #13911: test_trace depends on dict repr() ordering http://bugs.python.org/issue13911 closed by Mark.Shannon #13921: sqlite3: OptimizedUnicode obsolete in Py3k http://bugs.python.org/issue13921 closed by python-dev #13926: pydoc - stall when requesting a list of available modules in t http://bugs.python.org/issue13926 closed by ned.deily #13928: bug in asyncore.dispatcher_with_send http://bugs.python.org/issue13928 closed by adamhj #13932: If some test module fails to import another module unittest re http://bugs.python.org/issue13932 closed by michael.foord #13933: IDLE:not able to complete the hashlib module http://bugs.python.org/issue13933 closed by ned.deily #13935: Tarfile - Fixed GNU tar header base-256 handling http://bugs.python.org/issue13935 closed by lars.gustaebel #13936: datetime.time(0,0,0) evaluates to False despite being a valid http://bugs.python.org/issue13936 closed by tim_one #13937: multiprocessing.ThreadPool.join() blocks indefinitely. http://bugs.python.org/issue13937 closed by neologix #13939: excessive cpu usage http://bugs.python.org/issue13939 closed by sandro.tosi #13941: Your Python may not be configured for Tk http://bugs.python.org/issue13941 closed by amaury.forgeotdarc #13944: HMAC object called hmac http://bugs.python.org/issue13944 closed by python-dev #13945: Mistake in the text for PEP-383 http://bugs.python.org/issue13945 closed by georg.brandl #13947: gdbm reorganize() leaves hanging file descriptor http://bugs.python.org/issue13947 closed by jcea #13955: email: RFC 2822 has been obsoleted by RFC 5322 http://bugs.python.org/issue13955 closed by r.david.murray #13956: add a note regarding building on recent versions of Debian and http://bugs.python.org/issue13956 closed by eric.araujo #13957: parsedate_tz doesn't distinguish -0000 from +0000 http://bugs.python.org/issue13957 closed by r.david.murray #13958: Comment _PyUnicode_FromId http://bugs.python.org/issue13958 closed by Jim.Jewett #13965: Windows 64-bit installer actually installing a 32-bit version http://bugs.python.org/issue13965 closed by loewis #13971: format() doesn't parse str. http://bugs.python.org/issue13971 closed by eric.smith #13975: packaging: change_root() test for os2 http://bugs.python.org/issue13975 closed by eric.araujo #13976: threading.local doesn't support super() http://bugs.python.org/issue13976 closed by Dima.Tisnek #13980: getcwd problem does not return cwd http://bugs.python.org/issue13980 closed by r.david.murray #13982: python returning errorneous value for sqrt http://bugs.python.org/issue13982 closed by loewis #13983: make test giving bus error http://bugs.python.org/issue13983 closed by loewis #13984: Python2.6 compilation breaking on mips64 bit machine http://bugs.python.org/issue13984 closed by loewis #964437: idle help is modal http://bugs.python.org/issue964437 closed by terry.reedy From brett at python.org Fri Feb 10 19:05:30 2012 From: brett at python.org (Brett Cannon) Date: Fri, 10 Feb 2012 13:05:30 -0500 Subject: [Python-Dev] requirements for moving __import__ over to importlib? 
In-Reply-To: References: <20120209115302.2ce94321@bhuda.mired.org> Message-ID: On Thu, Feb 9, 2012 at 17:00, PJ Eby wrote: > On Thu, Feb 9, 2012 at 2:53 PM, Mike Meyer wrote: > >> For those of you not watching -ideas, or ignoring the "Python TIOBE >> -3%" discussion, this would seem to be relevant to any discussion of >> reworking the import mechanism: >> >> http://mail.scipy.org/pipermail/numpy-discussion/2012-January/059801.html >> >> Interesting. This gives me an idea for a way to cut stat calls per > sys.path entry per import by roughly 4x, at the cost of a one-time > directory read per sys.path entry. > > That is, an importer created for a particular directory could, upon first > use, cache a frozenset(listdir()), and the stat().st_mtime of the > directory. All the filename checks could then be performed against the > frozenset, and the st_mtime of the directory only checked once per import, > to verify whether the frozenset() needed refreshing. > I actually contemplated this back in 2006 when I first began importlib for use at Google to get around NFS's crappy stat performance. Never got around to it as compatibility with import.c turned out to be a little tricky. =) Your solution below, PJE, is more-or-less what I was considering (although I also considered variants that didn't stat the directory when you knew your code wasn't changing stuff behind your back). > > Since a failed module lookup takes at least 5 stat checks (pyc, pyo, py, > directory, and compiled extension (pyd/so)), this cuts it down to only 1, > at the price of a listdir(). The big question is how long does a listdir() > take, compared to a stat() or failed open()? That would tell us whether > the tradeoff is worth making. > Actually it's pyc OR pyo, py, directory (which can lead to another set for __init__.py and __pycache__), .so, module.so (or whatever your platform uses for extensions). > > I did some crude timeit tests on frozenset(listdir()) and trapping failed > stat calls. It looks like, for a Windows directory the size of the 2.7 > stdlib, you need about four *failed* import attempts to overcome the > initial caching cost, or about 8 successful bytecode imports. (For Linux, > you might need to double these numbers; my tests showed a different ratio > there, perhaps due to the Linux stdib I tested having nearly twice as many > directory entries as the directory I tested on Windows!) > > However, the numbers are much better for application directories than for > the stdlib, since they are located earlier on sys.path. Every successful > stdlib import in an application is equal to one failed import attempt for > every preceding directory on sys.path, so as long as the average directory > on sys.path isn't vastly larger than the stdlib, and the average > application imports at least four modules from the stdlib (on Windows, or 8 > on Linux), there would be a net performance gain for the application as a > whole. (That is, there'd be an improved per-sys.path entry import time for > stdlib modules, even if not for any application modules.) > Does this comment take into account the number of modules required to load the interpreter to begin with? That's already like 48 modules loaded by Python 3.2 as it is. > > For smaller directories, the tradeoff actually gets better. A directory > one seventh the size of the 2.7 Windows stdlib has a listdir() that's > proportionately faster, but failed stats() in that directory are *not* > proportionately faster; they're only somewhat faster. 
This means that it > takes fewer failed module lookups to make caching a win - about 2 in this > case, vs. 4 for the stdlib. > > Now, these numbers are with actual disk or network access abstracted away, > because the data's in the operating system cache when I run the tests. > It's possible that this strategy could backfire if you used, say, an NFS > directory with ten thousand files in it as your first sys.path entry. > Without knowing the timings for listdir/stat/failed stat in that setup, > it's hard to say how many stdlib imports you need before you come out > ahead. When I tried a directory about 7 times larger than the stdlib, > creating the frozenset took 10 times as long, but the cost of a failed stat > didn't go up by very much. > > This suggests that there's probably an optimal directory size cutoff for > this trick; if only there were some way to check the size of a directory > without reading it, we could turn off the caching for oversize directories, > and get a major speed boost for everything else. On most platforms, the > stat().st_size of the directory itself will give you some idea, but on > Windows that's always zero. On Windows, we could work around that by using > a lower-level API than listdir() and simply stop reading the directory if > we hit the maximum number of entries we're willing to build a cache for, > and then call it off. > > (Another possibility would be to explicitly enable caching by putting a > flag file in the directory, or perhaps by putting a special prefix on the > sys.path entry, setting the cutoff in an environment variable, etc.) > > In any case, this seems really worth a closer look: in non-pathological > cases, it could make directory-based importing as fast as zip imports are. > I'd be especially interested in knowing how the listdir/stat/failed stat > ratios work on NFS - ISTM that they might be even *more* conducive to this > approach, if setup latency dominates the cost of individual system calls. > > If this works out, it'd be a good example of why importlib is a good idea; > i.e., allowing us to play with ideas like this. Brett, wouldn't you love > to be able to say importlib is *faster* than the old C-based importing? ;-) > Yes, that woud be nice. =) Now there are a couple things to clarify/question here. First is that if this were used on Windows or OS X (i.e. the OSs we support that typically have case-insensitive filesystems), then this approach would be a massive gain as we already call os.listdir() when PYTHONCASEOK isn't defined to check case-sensitivity; take your 5 stat calls and add in 5 listdir() calls and that's what you get on Windows and OS X right now. Linux doesn't have this check so you would still be potentially paying a penalty there. Second is variance in filesystems. Are we guaranteed that the stat of a directory is updated before a file change is made? Else there is a small race condition there which would suck. We also have the issue of granularity; Antoine has already had to add the source file size to .pyc files in Python 3.3 to combat crappy mtime granularity when generating bytecode. If we get file mod -> import -> file mod -> import, are we guaranteed that the second import will know there was a modification if the first three steps occur fast enough to fit within the granularity of an mtime value? I was going to say something about __pycache__, but it actually doesn't affect this. Since you would have to stat the directory anyway, you might as well just stat directory for the file you want to keep it simple. 
Only if you consider __pycache__ to be immutable except for what the interpreter puts in that directory during execution could you optimize that step (in which case you can stat the directory once and never care again as the set would be just updated by import whenever a new .pyc file was written). Having said all of this, implementing this idea would be trivial using importlib if you don't try to optimize the __pycache__ case. It's just a question of whether people are comfortable with the semantic change to import. This could also be made into something that was in importlib for people to use when desired if we are too worried about semantic changes. -------------- next part -------------- An HTML attachment was scrubbed... URL: From brett at python.org Fri Feb 10 19:33:22 2012 From: brett at python.org (Brett Cannon) Date: Fri, 10 Feb 2012 13:33:22 -0500 Subject: [Python-Dev] PEP 411: Provisional packages in the Python standard library In-Reply-To: References: Message-ID: Other than the misspelling of "maintenante" instead of "maintenance", LGTM. On Fri, Feb 10, 2012 at 09:06, Eli Bendersky wrote: > Hi all, > > Following the intensive and fruitful discussion of the (now rejected) > PEP 408 ( > http://mail.python.org/pipermail/python-dev/2012-January/115850.html), > we've drafted PEP 411 to summarize the conclusions with regards to the > process of marking packages provisional. Note that this is an > informational PEP, and that for the sake of completeness it duplicates > some of the contents of PEP 408. > > It is pasted below, as well as online at > http://www.python.org/dev/peps/pep-0411/. > > Comments are welcome. > > Eli > > ------------------------------------------------ > > > PEP: 411 > Title: Provisional packages in the Python standard library > Version: $Revision$ > Last-Modified: $Date$ > Author: Nick Coghlan , > Eli Bendersky > Status: Draft > Type: Informational > Content-Type: text/x-rst > Created: 2012-02-10 > Python-Version: 3.3 > Post-History: 2012-02-10 > > > Abstract > ======== > > The process of including a new package into the Python standard library is > hindered by the API lock-in and promise of backward compatibility implied > by > a package being formally part of Python. This PEP describes a methodology > for marking a standard library package "provisional" for the period of a > single > minor release. A provisional package may have its API modified prior to > "graduating" into a "stable" state. On one hand, this state provides the > package with the benefits of being formally part of the Python > distribution. > On the other hand, the core development team explicitly states that no > promises > are made with regards to the the stability of the package's API, which may > change for the next release. While it is considered an unlikely outcome, > such packages may even be removed from the standard library without a > deprecation period if the concerns regarding their API or maintenante prove > well-founded. > > > Proposal - a documented provisional state > ========================================= > > Whenever the Python core development team decides that a new package > should be > included into the standard library, but isn't entirely sure about whether > the > package's API is optimal, the package can be included and marked as > "provisional". > > In the next minor release, the package may either be "graduated" into a > normal > "stable" state in the standard library, or be rejected and removed entirely > from the Python source tree. 
If the package ends up graduating into the > stable state after being provisional for a minor release, its API may be > changed according to accumulated feedback. The core development team > explicitly makes no guarantees about API stability and backward > compatibility > of provisional packages. > > > Marking a package provisional > ----------------------------- > > A package will be marked provisional by including the following paragraph > as > a note at the top of its documentation page: > > The package has been included in the standard library on a > provisional basis. While major changes are not anticipated, as long as > this notice remains in place, backwards incompatible changes are > permitted if deemed necessary by the standard library developers. Such > changes will not be made gratuitously - they will occur only if > serious API flaws are uncovered that were missed prior to inclusion of > the package. > > Moving a package from the provisional to the stable state simply implies > removing this note from its documentation page. > > > Which packages should go through the provisional state > ------------------------------------------------------ > > We expect most packages proposed for addition into the Python standard > library > to go through a minor release in the provisional state. There may, however, > be some exceptions, such as packages that use a pre-defined API (for > example > ``lzma``, which generally follows the API of the existing ``bz2`` package), > or packages with an API that has wide acceptance in the Python development > community. > > In any case, packages that are proposed to be added to the standard > library, > whether via the provisional state or directly, must fulfill the acceptance > conditions set by PEP 2. > > Criteria for "graduation" > ------------------------- > > In principle, most provisional packages should eventually graduate to the > stable standard library. Some reasons for not graduating are: > > * The package may prove to be unstable or fragile, without sufficient > developer > support to maintain it. > * A much better alternative package may be found during the preview > release. > > Essentially, the decision will be made by the core developers on a per-case > basis. The point to emphasize here is that a packages's inclusion in the > standard library as "provisional" in some release does not guarantee it > will > continue being part of Python in the next release. > > > Rationale > ========= > > Benefits for the core development team > -------------------------------------- > > Currently, the core developers are really reluctant to add new interfaces > to > the standard library. This is because as soon as they're published in a > release, API design mistakes get locked in due to backward compatibility > concerns. > > By gating all major API additions through some kind of a provisional > mechanism > for a full release, we get one full release cycle of community feedback > before we lock in the APIs with our standard backward compatibility > guarantee. > > We can also start integrating provisional packages with the rest of the > standard > library early, so long as we make it clear to packagers that the > provisional > packages should not be considered optional. The only difference between > provisional APIs and the rest of the standard library is that provisional > APIs > are explicitly exempted from the usual backward compatibility guarantees. 
> > Benefits for end users > ---------------------- > > For future end users, the broadest benefit lies in a better > "out-of-the-box" > experience - rather than being told "oh, the standard library tools for > task X > are horrible, download this 3rd party library instead", those superior > tools > are more likely to be just be an import away. > > For environments where developers are required to conduct due diligence on > their upstream dependencies (severely harming the cost-effectiveness of, or > even ruling out entirely, much of the material on PyPI), the key benefit > lies > in ensuring that all packages in the provisional state are clearly under > python-dev's aegis from at least the following perspectives: > > * Licensing: Redistributed by the PSF under a Contributor Licensing > Agreement. > * Documentation: The documentation of the package is published and > organized via > the standard Python documentation tools (i.e. ReST source, output > generated > with Sphinx and published on http://docs.python.org). > * Testing: The package test suites are run on the python.org buildbot > fleet > and results published via http://www.python.org/dev/buildbot. > * Issue management: Bugs and feature requests are handled on > http://bugs.python.org > * Source control: The master repository for the software is published > on http://hg.python.org. > > > Candidates for provisional inclusion into the standard library > ============================================================== > > For Python 3.3, there are a number of clear current candidates: > > * ``regex`` (http://pypi.python.org/pypi/regex) - approved by Guido [#]_. > * ``daemon`` (PEP 3143) > * ``ipaddr`` (PEP 3144) > > Other possible future use cases include: > > * Improved HTTP modules (e.g. ``requests``) > * HTML 5 parsing support (e.g. ``html5lib``) > * Improved URL/URI/IRI parsing > * A standard image API (PEP 368) > * Encapsulation of the import state (PEP 368) > * Standard event loop API (PEP 3153) > * A binary version of WSGI for Python 3 (e.g. PEP 444) > * Generic function support (e.g. ``simplegeneric``) > > > Rejected alternatives and variations > ==================================== > > See PEP 408. > > > References > ========== > > .. [#] > http://mail.python.org/pipermail/python-dev/2012-January/115962.html > > Copyright > ========= > > This document has been placed in the public domain. > > > .. > Local Variables: > mode: indented-text > indent-tabs-mode: nil > sentence-end-double-space: t > fill-column: 70 > coding: utf-8 > End: > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > http://mail.python.org/mailman/options/python-dev/brett%40python.org > -------------- next part -------------- An HTML attachment was scrubbed... URL: From pje at telecommunity.com Fri Feb 10 21:07:16 2012 From: pje at telecommunity.com (PJ Eby) Date: Fri, 10 Feb 2012 15:07:16 -0500 Subject: [Python-Dev] requirements for moving __import__ over to importlib? In-Reply-To: References: <20120209115302.2ce94321@bhuda.mired.org> Message-ID: On Fri, Feb 10, 2012 at 1:05 PM, Brett Cannon wrote: > > > On Thu, Feb 9, 2012 at 17:00, PJ Eby wrote: > >> I did some crude timeit tests on frozenset(listdir()) and trapping failed >> stat calls. It looks like, for a Windows directory the size of the 2.7 >> stdlib, you need about four *failed* import attempts to overcome the >> initial caching cost, or about 8 successful bytecode imports. 
(For Linux, >> you might need to double these numbers; my tests showed a different ratio >> there, perhaps due to the Linux stdib I tested having nearly twice as many >> directory entries as the directory I tested on Windows!) >> > >> However, the numbers are much better for application directories than for >> the stdlib, since they are located earlier on sys.path. Every successful >> stdlib import in an application is equal to one failed import attempt for >> every preceding directory on sys.path, so as long as the average directory >> on sys.path isn't vastly larger than the stdlib, and the average >> application imports at least four modules from the stdlib (on Windows, or 8 >> on Linux), there would be a net performance gain for the application as a >> whole. (That is, there'd be an improved per-sys.path entry import time for >> stdlib modules, even if not for any application modules.) >> > > Does this comment take into account the number of modules required to load > the interpreter to begin with? That's already like 48 modules loaded by > Python 3.2 as it is. > I didn't count those, no. So, if they're loaded from disk *after* importlib is initialized, then they should pay off the cost of caching even fairly large directories that appear earlier on sys.path than the stdlib. We still need to know about NFS and other ratios, though... I still worry that people with more extreme directory sizes or slow-access situations will run into even worse trouble than they have now. > First is that if this were used on Windows or OS X (i.e. the OSs we > support that typically have case-insensitive filesystems), then this > approach would be a massive gain as we already call os.listdir() when > PYTHONCASEOK isn't defined to check case-sensitivity; take your 5 stat > calls and add in 5 listdir() calls and that's what you get on Windows and > OS X right now. Linux doesn't have this check so you would still be > potentially paying a penalty there. > Wow. That means it'd always be a win for pre-stdlib sys.path entries, because any successful stdlib import equals a failed pre-stdlib lookup. (Of course, that's just saving some of the overhead that's been *added* by importlib, not a new gain, but still...) Second is variance in filesystems. Are we guaranteed that the stat of a > directory is updated before a file change is made? > Not quite sure what you mean here. The directory stat is used to ensure that new files haven't been added, old ones removed, or existing ones renamed. Changes to the files themselves shouldn't factor in, should they? > Else there is a small race condition there which would suck. We also have > the issue of granularity; Antoine has already had to add the source file > size to .pyc files in Python 3.3 to combat crappy mtime granularity when > generating bytecode. If we get file mod -> import -> file mod -> import, > are we guaranteed that the second import will know there was a modification > if the first three steps occur fast enough to fit within the granularity of > an mtime value? > Again, I'm not sure how this relates. Automatic code reloaders monitor individual files that have been previously imported, so the directory timestamps aren't relevant. Of course, I could be confused here. Are you saying that if somebody makes a new .py file and saves it, that it'll be possible to import it before it's finished being written? If so, that could happen already, and again caching the directory doesn't make any difference. 
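To keep the mechanism concrete: the core of the cache we're talking about is only a few lines of Python. This is an untested sketch with made-up helper names (cached_listing, may_contain_module), but it shows where the savings come from - one stat() plus set membership tests instead of a series of failed stat() calls:

    import os

    _dir_cache = {}  # directory path -> (st_mtime, frozenset of entry names)

    def cached_listing(path):
        # Re-read the directory only when its mtime says an entry was
        # added, removed, or renamed.
        mtime = os.stat(path).st_mtime
        cached = _dir_cache.get(path)
        if cached is not None and cached[0] == mtime:
            return cached[1]
        names = frozenset(os.listdir(path))
        _dir_cache[path] = (mtime, names)
        return names

    def may_contain_module(path, modname,
                           suffixes=('.py', '.pyc', '.pyo', '.pyd', '.so')):
        # Cheap pre-check: if no candidate filename (or package
        # subdirectory) is present, the importer can skip this
        # sys.path entry without any further filesystem access.
        names = cached_listing(path)
        return modname in names or any(modname + s in names for s in suffixes)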
Alternately, you could have a situation where the file is deleted after we load the listdir(), but in that case the open will fail and we can fall back... heck, we can even force resetting the cache in that event. I was going to say something about __pycache__, but it actually doesn't > affect this. Since you would have to stat the directory anyway, you might > as well just stat directory for the file you want to keep it simple. Only > if you consider __pycache__ to be immutable except for what the interpreter > puts in that directory during execution could you optimize that step (in > which case you can stat the directory once and never care again as the set > would be just updated by import whenever a new .pyc file was written). > > Having said all of this, implementing this idea would be trivial using > importlib if you don't try to optimize the __pycache__ case. It's just a > question of whether people are comfortable with the semantic change to > import. This could also be made into something that was in importlib for > people to use when desired if we are too worried about semantic changes. > Yep. I was actually thinking this could be backported to 2.x, even without importlib, as a module to be imported in sitecustomize or via a .pth file. All it needs is a path hook, after all, and a subclass of the pkgutil importer to test it. And if we can get some people with huge NFS libraries and/or zillions of .egg directories on sys.path to test it, we could find out whether it's a win, lose, or draw for those scenarios. -------------- next part -------------- An HTML attachment was scrubbed... URL: From jimjjewett at gmail.com Fri Feb 10 21:13:18 2012 From: jimjjewett at gmail.com (Jim J. Jewett) Date: Fri, 10 Feb 2012 12:13:18 -0800 (PST) Subject: [Python-Dev] PEP 411: Provisional packages in the Python standard library In-Reply-To: References: Message-ID: <4f357a5e.308eec0a.6343.ffffbe10@mx.google.com> Eli Bendersky wrote (in http://mail.python.org/pipermail/python-dev/2012-February/116393.html ): > A package will be marked provisional by including the > following paragraph as a note at the top of its > documentation page: I really would like some marker available from within Python itself. Use cases: (1) During development, the documentation I normally read first is whatever results from import module; help(module), or possibly dir(module). (2) At BigCorp, there were scheduled times to move as much as possible to the current (or current-1) version. Regardless of policy, full regression test suites don't generally exist. If Python were viewed as part of the infrastructure (rather than as part of a specific application), or if I were responsible for maintaining an internal application built on python, that would be the time to upgrade python -- and I would want an easy way to figure out which applications and libraries I should concentrate on for testing. > * Encapsulation of the import state (PEP 368) Wrong PEP number. I'm guessing that you meant 406. -- If there are still threading problems with my replies, please email me with details, so that I can try to resolve them. -jJ From brett at python.org Fri Feb 10 21:38:02 2012 From: brett at python.org (Brett Cannon) Date: Fri, 10 Feb 2012 15:38:02 -0500 Subject: [Python-Dev] requirements for moving __import__ over to importlib?
In-Reply-To: References: <20120209115302.2ce94321@bhuda.mired.org> Message-ID: On Fri, Feb 10, 2012 at 15:07, PJ Eby wrote: > On Fri, Feb 10, 2012 at 1:05 PM, Brett Cannon wrote: > >> >> >> On Thu, Feb 9, 2012 at 17:00, PJ Eby wrote: >> >>> I did some crude timeit tests on frozenset(listdir()) and trapping >>> failed stat calls. It looks like, for a Windows directory the size of the >>> 2.7 stdlib, you need about four *failed* import attempts to overcome the >>> initial caching cost, or about 8 successful bytecode imports. (For Linux, >>> you might need to double these numbers; my tests showed a different ratio >>> there, perhaps due to the Linux stdib I tested having nearly twice as many >>> directory entries as the directory I tested on Windows!) >>> >> >>> However, the numbers are much better for application directories than >>> for the stdlib, since they are located earlier on sys.path. Every >>> successful stdlib import in an application is equal to one failed import >>> attempt for every preceding directory on sys.path, so as long as the >>> average directory on sys.path isn't vastly larger than the stdlib, and the >>> average application imports at least four modules from the stdlib (on >>> Windows, or 8 on Linux), there would be a net performance gain for the >>> application as a whole. (That is, there'd be an improved per-sys.path >>> entry import time for stdlib modules, even if not for any application >>> modules.) >>> >> >> Does this comment take into account the number of modules required to >> load the interpreter to begin with? That's already like 48 modules loaded >> by Python 3.2 as it is. >> > > I didn't count those, no. So, if they're loaded from disk *after* > importlib is initialized, then they should pay off the cost of caching even > fairly large directories that appear earlier on sys.path than the stdlib. > We still need to know about NFS and other ratios, though... I still worry > that people with more extreme directory sizes or slow-access situations > will run into even worse trouble than they have now. > It's possible. No way to make it work for everyone. This is why I didn't worry about some crazy perf optimization. > > > >> First is that if this were used on Windows or OS X (i.e. the OSs we >> support that typically have case-insensitive filesystems), then this >> approach would be a massive gain as we already call os.listdir() when >> PYTHONCASEOK isn't defined to check case-sensitivity; take your 5 stat >> calls and add in 5 listdir() calls and that's what you get on Windows and >> OS X right now. Linux doesn't have this check so you would still be >> potentially paying a penalty there. >> > > Wow. That means it'd always be a win for pre-stdlib sys.path entries, > because any successful stdlib import equals a failed pre-stdlib lookup. > (Of course, that's just saving some of the overhead that's been *added* by > importlib, not a new gain, but still...) > How so? import.c does a listdir() as well (this is not special to importlib). > > > Second is variance in filesystems. Are we guaranteed that the stat of a >> directory is updated before a file change is made? >> > > Not quite sure what you mean here. The directory stat is used to ensure > that new files haven't been added, old ones removed, or existing ones > renamed. Changes to the files themselves shouldn't factor in, should they? > Changes in any fashion to the directory. Do filesystems atomically update the mtime of a directory when they commit a change? Otherwise we have a potential race condition. 
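One way to get a feel for a particular filesystem - a rough sketch, not a proper test - is to watch the parent directory's stat() across a link/unlink:

    import os, tempfile

    d = tempfile.mkdtemp()
    before = os.stat(d).st_mtime
    fd, name = tempfile.mkstemp(dir=d)  # links a new entry into d
    os.close(fd)
    os.unlink(name)                     # and removes it again
    after = os.stat(d).st_mtime
    # With coarse mtime granularity this can print False even though the
    # directory *was* modified twice -- which is exactly the worry here.
    print(after != before)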
> > > >> Else there is a small race condition there which would suck. We also have >> the issue of granularity; Antoine has already had to add the source file >> size to .pyc files in Python 3.3 to combat crappy mtime granularity when >> generating bytecode. If we get file mod -> import -> file mod -> import, >> are we guaranteed that the second import will know there was a modification >> if the first three steps occur fast enough to fit within the granularity of >> an mtime value? >> > > Again, I'm not sure how this relates. Automatic code reloaders monitor > individual files that have been previously imported, so the directory > timestamps aren't relevant. > > Don't care about automatic reloaders. I'm just asking about the case where the mtime granularity is coarse enough to allow for a directory change, an import to execute, and then another directory change to occur all within a single mtime increment. That would lead to the set cache to be out of date. > Of course, I could be confused here. Are you saying that if somebody > makes a new .py file and saves it, that it'll be possible to import it > before it's finished being written? If so, that could happen already, and > again caching the directory doesn't make any difference. > > Alternately, you could have a situation where the file is deleted after we > load the listdir(), but in that case the open will fail and we can fall > back... heck, we can even force resetting the cache in that event. > > > I was going to say something about __pycache__, but it actually doesn't >> affect this. Since you would have to stat the directory anyway, you might >> as well just stat directory for the file you want to keep it simple. Only >> if you consider __pycache__ to be immutable except for what the interpreter >> puts in that directory during execution could you optimize that step (in >> which case you can stat the directory once and never care again as the set >> would be just updated by import whenever a new .pyc file was written). >> >> Having said all of this, implementing this idea would be trivial using >> importlib if you don't try to optimize the __pycache__ case. It's just a >> question of whether people are comfortable with the semantic change to >> import. This could also be made into something that was in importlib for >> people to use when desired if we are too worried about semantic changes. >> > > Yep. I was actually thinking this could be backported to 2.x, even > without importlib, as a module to be imported in sitecustomize or via a > .pth file. All it needs is a path hook, after all, and a subclass of the > pkgutil importer to test it. And if we can get some people with huge NFS > libraries and/or zillions of .egg directories on sys.path to test it, we > could find out whether it's a win, lose, or draw for those scenarios. > You can do that if you want, obviously I don't want to bother since it won't make it into Python 2.7. > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From tseaver at palladion.com Fri Feb 10 22:29:31 2012 From: tseaver at palladion.com (Tres Seaver) Date: Fri, 10 Feb 2012 16:29:31 -0500 Subject: [Python-Dev] requirements for moving __import__ over to importlib? In-Reply-To: References: <20120209115302.2ce94321@bhuda.mired.org> Message-ID: -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 02/10/2012 03:38 PM, Brett Cannon wrote: > Changes in any fashion to the directory. Do filesystems atomically > update the mtime of a directory when they commit a change? 
Otherwise > we have a potential race condition. Hmm, maybe I misunderstand you. In POSIX land, the only thing which changes the mtime of a directory is linking / unlinking / renaming a file: changes to individual files aren't detectable by examining their containing directory's stat(). Tres. - -- =================================================================== Tres Seaver +1 540-429-0999 tseaver at palladion.com Palladion Software "Excellence by Design" http://palladion.com -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAk81jDsACgkQ+gerLs4ltQ7YRwCePFEQA7E74dD9/j8ILuRMHLlA xbkAn1vTYGrEn4VOnVpygGafkGgnm42e =rJGg -----END PGP SIGNATURE----- From brett at python.org Fri Feb 10 22:42:28 2012 From: brett at python.org (Brett Cannon) Date: Fri, 10 Feb 2012 16:42:28 -0500 Subject: [Python-Dev] requirements for moving __import__ over to importlib? In-Reply-To: References: <20120209115302.2ce94321@bhuda.mired.org> Message-ID: On Fri, Feb 10, 2012 at 16:29, Tres Seaver wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > On 02/10/2012 03:38 PM, Brett Cannon wrote: > > Changes in any fashion to the directory. Do filesystems atomically > > update the mtime of a directory when they commit a change? Otherwise > > we have a potential race condition. > > Hmm, maybe I misunderstand you. In POSIX land, the only thing which > changes the mtime of a directory is linking / unlinking / renaming a > file: changes to individual files aren't detectable by examining their > containing directory's stat(). > Individual file changes are not important; either the module is already in sys.modules so no attempt is made to detect a change or it hasn't been loaded and so it will have to be read regardless. All I'm asking is whether filesystems typically update the filesystem for, e.g., a file deletion atomically with the mtime for the containing directory or not. -------------- next part -------------- An HTML attachment was scrubbed... URL: From tseaver at palladion.com Fri Feb 10 22:46:10 2012 From: tseaver at palladion.com (Tres Seaver) Date: Fri, 10 Feb 2012 16:46:10 -0500 Subject: [Python-Dev] requirements for moving __import__ over to importlib? In-Reply-To: References: <20120209115302.2ce94321@bhuda.mired.org> Message-ID: <4F359022.5000607@palladion.com> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 02/10/2012 04:42 PM, Brett Cannon wrote: > On Fri, Feb 10, 2012 at 16:29, Tres Seaver > wrote: > >> On 02/10/2012 03:38 PM, Brett Cannon wrote: >>> Changes in any fashion to the directory. Do filesystems >>> atomically update the mtime of a directory when they commit a >>> change? Otherwise we have a potential race condition. >> >> Hmm, maybe I misunderstand you. In POSIX land, the only thing which >> changes the mtime of a directory is linking / unlinking / renaming >> a file: changes to individual files aren't detectable by examining >> their containing directory's stat(). >> > > Individual file changes are not important; either the module is > already in sys.modules so no attempt is made to detect a change or it > hasn't been loaded and so it will have to be read regardless. All I'm > asking is whether filesystems typically update the filesystem for, > e.g., a file deletion atomically with the mtime for the containing > directory or not. In POSIX land, most certainly. Tres.
- -- =================================================================== Tres Seaver +1 540-429-0999 tseaver at palladion.com Palladion Software "Excellence by Design" http://palladion.com -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAk81kCIACgkQ+gerLs4ltQ5MogCfQwP2n4gl9PfsNXuP3c5al8EX TgwAn2EoGz1vk0OQAh5n3Tl9oze1CSSC =3iuR -----END PGP SIGNATURE----- From tjreedy at udel.edu Fri Feb 10 22:56:37 2012 From: tjreedy at udel.edu (Terry Reedy) Date: Fri, 10 Feb 2012 16:56:37 -0500 Subject: [Python-Dev] PEP 411: Provisional packages in the Python standard library In-Reply-To: References: Message-ID: On 2/10/2012 9:06 AM, Eli Bendersky wrote: > Whenever the Python core development team decides that a new package should be > included into the standard library, but isn't entirely sure about whether the > package's API is optimal, the package can be included and marked as > "provisional". > > In the next minor release, the package may either be "graduated" into a normal > "stable" state in the standard library, or be rejected and removed entirely > from the Python source tree. This could be interpreted as limiting provisional status to one release cycle. I suggest that you add 'or continued as provisional'. In particular, if the api *is* changed, another provisional period might be advisable. > The package has been included in the standard library on a > provisional basis. While major changes are not anticipated, as long as > this notice remains in place, backwards incompatible changes are > permitted if deemed necessary by the standard library developers. Such 'as long as' implies no particular limit.
-- Terry Jan Reedy From victor.stinner at haypocalc.com Fri Feb 10 23:22:28 2012 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Fri, 10 Feb 2012 23:22:28 +0100 Subject: [Python-Dev] Add a new "locale" codec? In-Reply-To: <4F34E86E.6030505@v.loewis.de> References: <4F3314D5.8090907@pearwood.info> <20120209134254.6a7cb62c@pitrou.net> <4F34E86E.6030505@v.loewis.de> Message-ID: 2012/2/10 "Martin v. Löwis" : >> As And pointed out, this is already the behaviour of the "mbcs" codec >> under Windows. "locale" would be the moral (*) equivalent of that under >> Unix. > > Indeed, and that precedent should be enough reason *not* to include a > "locale" encoding. The "mbcs" encoding has caused much user confusion > over the years, and it is less useful than people typically think. For > example, for some time, people thought that names in zip files ought to > be encoded in "mbcs", only to find out that this is incorrect years > later. With a "locale" encoding, the risk for confusion and untestable > code is too high (just consider the ongoing saga of the Turkish dotless > i (ı)). Well, I expected this answer and I agree that there are more drawbacks than advantages. I will close the issue as wontfix. The current locale can already be read using locale.getpreferredencoding(False) and I already fixed functions using the current locale encoding. Victor From ericsnowcurrently at gmail.com Fri Feb 10 23:38:58 2012 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Fri, 10 Feb 2012 15:38:58 -0700 Subject: [Python-Dev] PEP 411: Provisional packages in the Python standard library In-Reply-To: <4f357a5e.308eec0a.6343.ffffbe10@mx.google.com> References: <4f357a5e.308eec0a.6343.ffffbe10@mx.google.com> Message-ID: On Fri, Feb 10, 2012 at 1:13 PM, Jim J. Jewett wrote: > > Eli Bendersky wrote (in > http://mail.python.org/pipermail/python-dev/2012-February/116393.html ): > >> A package will be marked provisional by including the >> following paragraph as a note at the top of its >> documentation page: > > I really would like some marker available from within Python > itself. > > Use cases: > > (1) During development, the documentation I normally read > first is whatever results from import module; help(module), > or possibly dir(module). > > (2) At BigCorp, there were scheduled times to move as much > as possible to the current (or current-1) version. > Regardless of policy, full regression test suites don't > generally exist. If Python were viewed as part of the > infrastructure (rather than as part of a specific > application), or if I were responsible for maintaining an > internal application built on python, that would be the time > to upgrade python -- and I would want an easy way to figure > out which applications and libraries I should concentrate on > for testing. +1 on both -eric From pje at telecommunity.com Sat Feb 11 01:23:07 2012 From: pje at telecommunity.com (PJ Eby) Date: Fri, 10 Feb 2012 19:23:07 -0500 Subject: [Python-Dev] requirements for moving __import__ over to importlib? Message-ID: On Feb 10, 2012 3:38 PM, "Brett Cannon" wrote: > On Fri, Feb 10, 2012 at 15:07, PJ Eby wrote: >> On Fri, Feb 10, 2012 at 1:05 PM, Brett Cannon wrote: >>> First is that if this were used on Windows or OS X (i.e.
the OSs we support that typically have case-insensitive filesystems), then this approach would be a massive gain as we already call os.listdir() when PYTHONCASEOK isn't defined to check case-sensitivity; take your 5 stat calls and add in 5 listdir() calls and that's what you get on Windows and OS X right now. Linux doesn't have this check so you would still be potentially paying a penalty there. >> >> >> Wow. That means it'd always be a win for pre-stdlib sys.path entries, because any successful stdlib import equals a failed pre-stdlib lookup. (Of course, that's just saving some of the overhead that's been *added* by importlib, not a new gain, but still...) > > > How so? import.c does a listdir() as well (this is not special to importlib). IIRC, it does a FindFirstFile on Windows, which is not the same thing. That's one system call into a preallocated buffer, not a series of system calls and creation of Python string objects. > Don't care about automatic reloaders. I'm just asking about the case where the mtime granularity is coarse enough to allow for a directory change, an import to execute, and then another directory change to occur all within a single mtime increment. That would lead to the set cache to be out of date. Ah. Good point. Well, if there's any way to know what the mtime granularity is, we can avoid the race condition by never performing the listdir when the current clock time is too close to the stat(). In effect, we can bypass the optimization if the directory was just modified. Something like:

    mtime = stat(dir).st_mtime
    if abs(time.time() - mtime) > unsafe_window:
        old_mtime, files = cache.get(dir, (-1, ()))
        if mtime != old_mtime:
            files = frozenset(listdir(dir))
            cache[dir] = mtime, files
        # code to check for possibility of importing
        # and shortcut if found, or
        # exit with failure if no matching files
    # fallthrough to direct filesystem checking

The "unsafe window" is presumably filesystem and platform dependent, but ISTR that even FAT filesystems have 2-second accuracy. The other catch is the relationship between st_mtime and time.time(); I assume they'd be the same in any sane system, but what if you're working across a network and there's clock skew? Ugh. Worst case example would be say, accessing a FAT device that's been shared over a Windows network from a machine whose clock is several hours off. So it always looks safe to read, even if it's just been changed.

What's the downside in that case? You're trying to import something that just changed in the last fraction of a second... why? I mean, sure, the directory listing will be wrong, no question. But it only matters that it was wrong if you added, removed, or renamed importable files. Why are you trying to import one of them?

Ah, here's a use case: you're starting up IDLE, and while it's loading, you save some .py files you plan to import later. Your editor saves them all at once, but IDLE does the listdir() midway through. You then do an import from the IDLE prompt, and it fails because the listdir() didn't catch everything.

Okay, now I know how to fix this. The problem isn't that there's a race condition per se, the problem is that the race results in a broken cache later. After all, it could just as easily have been the case that the import failed due to timing. The problem is that all *future* imports would fail in this circumstance. So the fix is a time-to-live recheck: if TTL seconds have passed since the last use of the cached frozenset, reload it, and reset the TTL to infinity.
In other words:

    mtime = stat(dir).st_mtime
    now = time.time()
    if abs(now-mtime)>unsafe_window:
        old_mtime, then, files = cache.get(dir, (-1, now, ()))
        if mtime!=old_mtime or (then is not None and now-then>TTL):
            files = frozenset(listdir(dir))
            cache[dir] = mtime, (now if mtime!=old_mtime else None), files
        # code to check for possibility of importing
        # and shortcut if found, or
        # exit with failure if no matching files
    # fallthrough to direct filesystem checking

What this does (or should do) is handle clock-skew race condition stale caches by reloading the listdir even if mtime hasn't changed, as soon as TTL seconds have passed since the last snapshot was taken. However, if the mtime stays the same, no subsequent listdirs will occur. As long as the TTL is set high enough that a full startup of Python can occur, but low enough that it resets by the time a human can notice something's wrong, it should be golden. ;-) The TTL approach could be used in place of the unsafe_window, actually; there's probably not much need for both. The pure unsafe_window approach has the advantage of elegance: it slows down only when you've just written to the directory, and only briefly. It doesn't load the directory twice, either. I suppose ideally, we'd set unsafe_window fairly low, and TTL fairly high, so that for command-line apps and such you'd be done with your entire script (or at least all its importing) before reaching the TTL value. But interactive apps and servers wouldn't end up with a permanently stale cache just because something was changed during startup. Feh. Screw it, just use a fairly high TTL and forget trying to tune the unsafe_window, since if you're using a TTL you have to do the listdir() a second time if there are any imports later. It's also a single tunable parameter at that point. How high a TTL? It's got to be at least as high as the worst-case mtime granularity... which is how high? Yet, low enough that the human who goes, "huh, that import should've worked", checks the directory listing and tries it again will have it go through. Hopefully, the worst-case mtime granularity is shorter than that. ;-) >> Yep. I was actually thinking this could be backported to 2.x, even without importlib, as a module to be imported in sitecustomize or via a .pth file. All it needs is a path hook, after all, and a subclass of the pkgutil importer to test it. And if we can get some people with huge NFS libraries and/or zillions of .egg directories on sys.path to test it, we could find out whether it's a win, lose, or draw for those scenarios. > > > You can do that if you want, obviously I don't want to bother since it won't make it into Python 2.7. Of course. My thought wasn't to do this with a full version of importlib, just to make a proof-of-concept importer. All it really needs to do is skip over the normal import processing in the case where the cache says there's no way for the import to succeed; that's where all the speedup really comes from. From rosuav at gmail.com Sat Feb 11 01:49:53 2012 From: rosuav at gmail.com (Chris Angelico) Date: Sat, 11 Feb 2012 11:49:53 +1100 Subject: [Python-Dev] requirements for moving __import__ over to importlib? In-Reply-To: References: Message-ID: On Sat, Feb 11, 2012 at 11:23 AM, PJ Eby wrote: > What's the downside in that case? You're trying to import something that > just changed in the last fraction of a second... why?
I don't know if it's normal in the Python world, but these sorts of race conditions occur most annoyingly when a single process changes a file, then attempts to import it. If you open a file, write to it, explicitly close it, and then load it, you would expect to read back what you wrote, not the version that was there previously. Chris Angelico From breamoreboy at yahoo.co.uk Sat Feb 11 02:27:32 2012 From: breamoreboy at yahoo.co.uk (Mark Lawrence) Date: Sat, 11 Feb 2012 01:27:32 +0000 Subject: [Python-Dev] http://pythonmentors.com/ Message-ID: Hi all, I'd never heard of this until some Dutch geezer whose name I've now forgotten pointed me to it. Had I known about it a couple of years ago it would have saved a lot of people a lot of grief. Please could it be given a bit of publicity. -- Cheers. Mark Lawrence. p.s. The Dutch geezer in question competes with Dr. Who for having the best time travelling machine :) From jnoller at gmail.com Sat Feb 11 02:38:36 2012 From: jnoller at gmail.com (Jesse Noller) Date: Fri, 10 Feb 2012 20:38:36 -0500 Subject: [Python-Dev] http://pythonmentors.com/ In-Reply-To: References: Message-ID: <3DC2DAB56C4C469380525FFB08E3A2D9@gmail.com> I've been trying to publicize it on twitter, my blog, google plus and elsewhere. help welcome. On Friday, February 10, 2012 at 8:27 PM, Mark Lawrence wrote: > Hi all, > > I'd never heard of this until some Dutch geezer whose name I've now > forgotten pointed me to it. Had I known about it a couple of years ago > it would have saved a lot of people a lot of grief. Please could it be > given a bit of publicity. > > -- > Cheers. > > Mark Lawrence. > > p.s. The Dutch geezer in question competes with Dr. Who for having the > best time travelling machine :) From steve at pearwood.info Sat Feb 11 03:00:57 2012 From: steve at pearwood.info (Steven D'Aprano) Date: Sat, 11 Feb 2012 13:00:57 +1100 Subject: [Python-Dev] PEP 411: Provisional packages in the Python standard library In-Reply-To: <4f357a5e.308eec0a.6343.ffffbe10@mx.google.com> References: <4f357a5e.308eec0a.6343.ffffbe10@mx.google.com> Message-ID: <4F35CBD9.1000200@pearwood.info> Jim J. Jewett wrote: > Eli Bendersky wrote (in > http://mail.python.org/pipermail/python-dev/2012-February/116393.html ): > >> A package will be marked provisional by including the >> following paragraph as a note at the top of its >> documentation page: > > I really would like some marker available from within Python > itself. +1 -- Steven From stephen at xemacs.org Sat Feb 11 03:44:20 2012 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Sat, 11 Feb 2012 11:44:20 +0900 Subject: [Python-Dev] Fwd: maintenance of the ElementTree / cElementTree packages in the Python standard library In-Reply-To: References: Message-ID: <87wr7uw0l7.fsf@uwakimon.sk.tsukuba.ac.jp> A protagonist writes: > ---------- Forwarded message ---------- > From: Fredrik Lundh As a not-directly-concerned third party who's been lurking, it seems to me like people are in way too much of a rush to "get things done". Sending direct mail, addressing the precise question[1] seems to have produced immediate results. (And yes, I've been on this list long enough to know that maintenance of ET/cET has been an issue for years. Nevertheless!)
While actually maintaining the code is important, continuity in the community is important, too, and probably more so. In this case, continuity evidently will be achieved by a handoff rather than reactivation of Fredrik. It's a shame that this result had to be achieved by MvL putting his foot down. I'm glad he did. Footnotes: [1] Ie, not "when is this or that code issue going to be addressed", but "in view of your apparent schedule constraints, would it be OK to maintain the package in the stdlib with python-dev taking responsibility". From eliben at gmail.com Sat Feb 11 04:04:58 2012 From: eliben at gmail.com (Eli Bendersky) Date: Sat, 11 Feb 2012 05:04:58 +0200 Subject: [Python-Dev] PEP 411: Provisional packages in the Python standard library In-Reply-To: References: Message-ID: On Fri, Feb 10, 2012 at 20:33, Brett Cannon wrote: > Other than the misspelling of "maintenante" instead of "maintenance", LGTM. > Fixed that and another typo (thanks 'aspell' :-] ) Eli From eliben at gmail.com Sat Feb 11 04:10:26 2012 From: eliben at gmail.com (Eli Bendersky) Date: Sat, 11 Feb 2012 05:10:26 +0200 Subject: [Python-Dev] PEP 411: Provisional packages in the Python standard library In-Reply-To: <4f357a5e.308eec0a.6343.ffffbe10@mx.google.com> References: <4f357a5e.308eec0a.6343.ffffbe10@mx.google.com> Message-ID: On Fri, Feb 10, 2012 at 22:13, Jim J. Jewett wrote: > > Eli Bendersky wrote (in > http://mail.python.org/pipermail/python-dev/2012-February/116393.html ): > >> A package will be marked provisional by including the >> following paragraph as a note at the top of its >> documentation page: > > I really would like some marker available from within Python > itself. > > Use cases: > > (1) During development, the documentation I normally read > first is whatever results from import module; help(module), > or possibly dir(module). > > (2) At BigCorp, there were scheduled times to move as much > as possible to the current (or current-1) version. > Regardless of policy, full regression test suites don't > generally exist. If Python were viewed as part of the > infrastructure (rather than as part of a specific > application), or if I were responsible for maintaining an > internal application built on python, that would be the time > to upgrade python -- and I would want an easy way to figure > out which applications and libraries I should concentrate on > for testing. >
>> >> In the next minor release, the package may either be "graduated" into a >> normal >> "stable" state in the standard library, or be rejected and removed >> entirely >> from the Python source tree. > > > This could be interpreted as limiting provisional status to one release > cycle. I suggest that you add 'or continued as provisional'. In particular, > if the api *is* changed, another provisional period might be advisable. > I think this was agreed upon when PEP 408 was discussed. Keeping a package provisional for too long is detrimental. Isn't a single release enough to decide that we want something or not? Keep in mind that many users won't touch the provisional packages in production code - we would like to make new parts of the stdlib functional as soon as possible. > >> ? ? The ?package has been included in the standard library on a >> ? ? provisional basis. While major changes are not anticipated, as long as >> ? ? this notice remains in place, backwards incompatible changes are >> ? ? permitted if deemed necessary by the standard library developers. Such > > > 'as long as' implies no particular limit. Perhaps it should also? Eli From eliben at gmail.com Sat Feb 11 04:14:58 2012 From: eliben at gmail.com (Eli Bendersky) Date: Sat, 11 Feb 2012 05:14:58 +0200 Subject: [Python-Dev] http://pythonmentors.com/ In-Reply-To: <3DC2DAB56C4C469380525FFB08E3A2D9@gmail.com> References: <3DC2DAB56C4C469380525FFB08E3A2D9@gmail.com> Message-ID: On Sat, Feb 11, 2012 at 03:38, Jesse Noller wrote: > I've been trying to publicize it on twitter, my blog, google plus and elsewhere. > > help welcome. > It also appears in the first paragraph of "Contributing" in the dev guide - which is pointed to by the main page at python.org (Core Development link). Mark, do you have a concrete idea of how it can be made more prominent? Eli From ericsnowcurrently at gmail.com Sat Feb 11 04:39:12 2012 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Fri, 10 Feb 2012 20:39:12 -0700 Subject: [Python-Dev] PEP 411: Provisional packages in the Python standard library In-Reply-To: References: <4f357a5e.308eec0a.6343.ffffbe10@mx.google.com> Message-ID: On Fri, Feb 10, 2012 at 8:10 PM, Eli Bendersky wrote: > On Fri, Feb 10, 2012 at 22:13, Jim J. Jewett wrote: >> >> Eli Bendersky wrote (in >> http://mail.python.org/pipermail/python-dev/2012-February/116393.html ): >> >>> A package will be marked provisional by including the >>> following paragraph as a note at the top of its >>> documentation page: >> >> I really would like some marker available from within Python >> itself. >> > > The big problem with this is that it's something that will have to be > maintained, so it adds some additional burden (I suppose it will have > to be tested as well). > > An easy way for (2) would be just grepping on the Python docs for the > provisional note and seeing which modules have it. > > Anyhow, I'm not against the idea. I just think it has to be discussed > in more detail so all the implications are understood. Is there more to it than having a simple __provisional__ attribute on the module and/or a list at sys.provisional_modules? -eric From ncoghlan at gmail.com Sat Feb 11 07:27:31 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 11 Feb 2012 16:27:31 +1000 Subject: [Python-Dev] http://pythonmentors.com/ In-Reply-To: References: <3DC2DAB56C4C469380525FFB08E3A2D9@gmail.com> Message-ID: On Sat, Feb 11, 2012 at 1:14 PM, Eli Bendersky wrote: > Mark, do you have a concrete idea of how it can be made more prominent? 
Mark didn't know about it because the core-mentorship list didn't exist yet in the timeframe he's talking about :) Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Sat Feb 11 07:32:56 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 11 Feb 2012 16:32:56 +1000 Subject: [Python-Dev] PEP 411: Provisional packages in the Python standard library In-Reply-To: References: <4f357a5e.308eec0a.6343.ffffbe10@mx.google.com> Message-ID: On Sat, Feb 11, 2012 at 1:39 PM, Eric Snow wrote: > Is there more to it than having a simple __provisional__ attribute on > the module and/or a list at sys.provisional_modules? Yes. As soon as we touch functional code, it becomes something to be tested and the process overhead on our end is noticeably higher. However, I'd be fine with requiring that a short form for the notice appear at the start of the module docstring. For example: "The API of this module is currently provisional. Refer to the documentation for details." This would then be seen by pydoc and help(), as well as being amenable to programmatic inspection.
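For example, a checking tool could get away with something as simple as this (an untested sketch, which assumes the exact notice wording above gets standardised):

    def is_provisional(module):
        # relies on the agreed short notice leading the docstring
        doc = module.__doc__ or ""
        return doc.lstrip().startswith(
            "The API of this module is currently provisional")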
Also, with a documented provisional status, I think keeping things provisional for as long as it takes us to make up our minds they're right is fine. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From steve at pearwood.info Sat Feb 11 08:29:44 2012 From: steve at pearwood.info (Steven D'Aprano) Date: Sat, 11 Feb 2012 18:29:44 +1100 Subject: [Python-Dev] PEP 411: Provisional packages in the Python standard library In-Reply-To: References: <4f357a5e.308eec0a.6343.ffffbe10@mx.google.com> Message-ID: <4F3618E8.2030706@pearwood.info> Eric Snow wrote: > On Fri, Feb 10, 2012 at 8:10 PM, Eli Bendersky wrote: >> On Fri, Feb 10, 2012 at 22:13, Jim J. Jewett wrote: >>> Eli Bendersky wrote (in >>> http://mail.python.org/pipermail/python-dev/2012-February/116393.html ): >>> >>>> A package will be marked provisional by including the >>>> following paragraph as a note at the top of its >>>> documentation page: >>> I really would like some marker available from within Python >>> itself. >>> > >> The big problem with this is that it's something that will have to be >> maintained, so it adds some additional burden (I suppose it will have >> to be tested as well). "Big problem"? Maintenance of bsddb3 has been a big problem. Maintenance of a single module-level name for provisional packages is a small problem. The PEP already gives boilerplate which is required to go into the documentation of provisional packages. Requiring a top level name, and a test for that, is no harder than what's already expected, and it is a constant difficulty regardless of package. In fact, we could (should?) have a single test that applies to all packages in the std lib:

    for package in packages:
        if isprovisional(package):
            assert hasattr(package, '__provisional__')
            assert package documentation includes boilerplate
        else:
            assert not hasattr(package, '__provisional__')
            assert package documentation does not include boilerplate

Arguably, the canonical test for whether a package is provisional or not should be the existence of __provisional__:

    for package in packages:
        if hasattr(package, '__provisional__'):
            assert package documentation includes boilerplate
        else:
            assert package documentation does not include boilerplate

>> An easy way for (2) would be just grepping on the Python docs for the >> provisional note and seeing which modules have it. Not all OSes include grep. Not all Python installations include the docs. -- Steven From eliben at gmail.com Sat Feb 11 08:31:47 2012 From: eliben at gmail.com (Eli Bendersky) Date: Sat, 11 Feb 2012 09:31:47 +0200 Subject: [Python-Dev] http://pythonmentors.com/ In-Reply-To: References: <3DC2DAB56C4C469380525FFB08E3A2D9@gmail.com> Message-ID: On Sat, Feb 11, 2012 at 08:27, Nick Coghlan wrote: > On Sat, Feb 11, 2012 at 1:14 PM, Eli Bendersky wrote: >> Mark, do you have a concrete idea of how it can be made more prominent? > > Mark didn't know about it because the core-mentorship list didn't > exist yet in the timeframe he's talking about :) > Yes, but he *now* asks to give it more publicity. Hence my question. Eli From breamoreboy at yahoo.co.uk Sat Feb 11 11:55:36 2012 From: breamoreboy at yahoo.co.uk (Mark Lawrence) Date: Sat, 11 Feb 2012 10:55:36 +0000 Subject: [Python-Dev] http://pythonmentors.com/ In-Reply-To: References: <3DC2DAB56C4C469380525FFB08E3A2D9@gmail.com> Message-ID: On 11/02/2012 03:14, Eli Bendersky wrote: > On Sat, Feb 11, 2012 at 03:38, Jesse Noller wrote: >> I've been trying to publicize it on twitter, my blog, google plus and elsewhere. >> >> help welcome. >> > > It also appears in the first paragraph of "Contributing" in the dev > guide - which is pointed to by the main page at python.org (Core > Development link). > > Mark, do you have a concrete idea of how it can be made more prominent? > > Eli Eli, quite frankly no :( The stock answer "put it on the main page at python.org" if actually followed up in all cases would result in something unreadable, as the page would be too noisy and displayed in something like Palatino size 1 (if there is such a thing). I'm just crossing my fingers and hoping that someone with far more than my own minuscule imagination can come up with a sensible suggestion. -- Cheers. Mark Lawrence. From eliben at gmail.com Sat Feb 11 12:00:40 2012 From: eliben at gmail.com (Eli Bendersky) Date: Sat, 11 Feb 2012 13:00:40 +0200 Subject: [Python-Dev] http://pythonmentors.com/ In-Reply-To: References: <3DC2DAB56C4C469380525FFB08E3A2D9@gmail.com> Message-ID: > Eli, quite frankly no :( > > The stock answer "put it on the main page at python.org" if actually > followed up in all cases would result in something unreadable, as the page > would be too noisy and displayed in something like Palatino size 1 (if there > is such a thing). > > I'm just crossing my fingers and hoping that someone with far more than my > own minuscule imagination can come up with a sensible suggestion. > Well, I think the situation is pretty good now. If one goes to python.org and is interested in contributing, clicking on the "Core Development" link is a sensible step, right? It then leads to the Devguide, which is a relatively new thing. Reading that opening page of the devguide which is linked to from "Core Development" should give aspiring contributors all the information they need, including a link to the mentorship site.
Eli From solipsis at pitrou.net Sat Feb 11 16:20:17 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sat, 11 Feb 2012 16:20:17 +0100 Subject: [Python-Dev] PEP 411: Provisional packages in the Python standard library References: Message-ID: <20120211162017.24d0c3c4@pitrou.net> On Fri, 10 Feb 2012 16:06:15 +0200 Eli Bendersky wrote: > > Following the intensive and fruitful discussion of the (now rejected) > PEP 408 (http://mail.python.org/pipermail/python-dev/2012-January/115850.html), > we've drafted PEP 411 to summarize the conclusions with regards to the > process of marking packages provisional. Note that this is an > informational PEP, and that for the sake of completeness it duplicates > some of the contents of PEP 408. I think the word "provisional" doesn't mean anything to many (non-native English speaking) people. I would like to suggest something clearer, e.g. "experimental" or "unstable" - which have the benefit of *already* having a meaning in other software-related contexts. > The package has been included in the standard library on a > provisional basis. While major changes are not anticipated, as long as > this notice remains in place, backwards incompatible changes are > permitted if deemed necessary by the standard library developers. Such > changes will not be made gratuitously - they will occur only if > serious API flaws are uncovered that were missed prior to inclusion of > the package. That's too wordy. Let's stay clear and to the point: "This package is unstable/experimental. Its API may change in the next release." (and put a link to the relevant FAQ section if necessary) Regards Antoine. From a.badger at gmail.com Sat Feb 11 17:22:21 2012 From: a.badger at gmail.com (Toshio Kuratomi) Date: Sat, 11 Feb 2012 08:22:21 -0800 Subject: [Python-Dev] PEP 411: Provisional packages in the Python standard library In-Reply-To: References: <4f357a5e.308eec0a.6343.ffffbe10@mx.google.com> Message-ID: <20120211162221.GD19713@unaka.lan> On Sat, Feb 11, 2012 at 04:32:56PM +1000, Nick Coghlan wrote: > > This would then be seen by pydoc and help(), as well as being amenable > to programmatic inspection. > Would using

    warnings.warn('This is a provisional API and may change radically from'
                  ' release to release', ProvisionalWarning)

where ProvisionalWarning is a new exception/warning category (a subclass of FutureWarning?) be considered too intrusive?
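A minimal sketch of what that could look like, using just the names floated above (none of this exists anywhere yet):

    import warnings

    class ProvisionalWarning(FutureWarning):
        """Category for warnings about provisional stdlib APIs."""

    # emitted at import time by a provisional package:
    warnings.warn('This is a provisional API and may change radically'
                  ' from release to release', ProvisionalWarning,
                  stacklevel=2)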
-Toshio From ijmorlan at uwaterloo.ca Sat Feb 11 17:45:05 2012 From: ijmorlan at uwaterloo.ca (Isaac Morland) Date: Sat, 11 Feb 2012 11:45:05 -0500 (EST) Subject: [Python-Dev] PEP 411: Provisional packages in the Python standard library In-Reply-To: <4F3618E8.2030706@pearwood.info> References: <4f357a5e.308eec0a.6343.ffffbe10@mx.google.com> <4F3618E8.2030706@pearwood.info> Message-ID: On Sat, 11 Feb 2012, Steven D'Aprano wrote: > Arguably, the canonical test for whether a package is provisional or not > should be the existence of __provisional__: > > for package in packages: > if hasattr(package, '__provisional__'): > assert package documentation includes boilerplate > else: > assert package documentation does not include boilerplate Could the documentation generator simply insert the boilerplate if and only if the package has the __provisional__ attribute? I'm not an expert in Python documentation but isn't it generated from properly-formatted comments within the Python source? Isaac Morland CSCF Web Guru DC 2554C, x36650 WWW Software Specialist From solipsis at pitrou.net Sat Feb 11 21:17:26 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sat, 11 Feb 2012 21:17:26 +0100 Subject: [Python-Dev] PEP for new dictionary implementation References: <4F32CA76.5040307@hotpy.org> Message-ID: <20120211211726.0fdf086d@pitrou.net> Hello Mark, I think the PEP should explain what happens when a keys table needs resizing when setting an object's attribute. Reading the implementation, it seems the sharing can disappear definitively, which seems a bit worrying. Regards Antoine. On Wed, 08 Feb 2012 19:18:14 +0000 Mark Shannon wrote: > Proposed PEP for new dictionary implementation, PEP 410 > is attached. > > Cheers, > Mark. > From ericsnowcurrently at gmail.com Sat Feb 11 22:14:10 2012 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Sat, 11 Feb 2012 14:14:10 -0700 Subject: [Python-Dev] PEP 411: Provisional packages in the Python standard library In-Reply-To: References: <4f357a5e.308eec0a.6343.ffffbe10@mx.google.com> Message-ID: On Fri, Feb 10, 2012 at 11:32 PM, Nick Coghlan wrote: > On Sat, Feb 11, 2012 at 1:39 PM, Eric Snow wrote: >> Is there more to it than having a simple __provisional__ attribute on >> the module and/or a list at sys.provisional_modules? > > Yes. As soon as we touch functional code, it becomes something to be > tested and the process overhead on our end is noticeably higher. > > However, I'd be fine with requiring that a short form for the notice > appear at the start of the module docstring. For example: > > "The API of this module is currently provisional. Refer to the > documentation for details." > > This would then be seen by pydoc and help(), as well as being amenable > to programmatic inspection. Sounds good enough to me. Realistically, the utility of getting provisional modules distributed with the stdlib far outweighs the theoretical use cases of programmatic inspection. If something like "__provisional__" turns out to be really useful, that bridge can be crossed later. I certainly don't want this PEP bogged down by a relatively superfluous point. :) -eric From mark at hotpy.org Sat Feb 11 22:22:01 2012 From: mark at hotpy.org (Mark Shannon) Date: Sat, 11 Feb 2012 21:22:01 +0000 Subject: [Python-Dev] PEP for new dictionary implementation In-Reply-To: <20120211211726.0fdf086d@pitrou.net> References: <4F32CA76.5040307@hotpy.org> <20120211211726.0fdf086d@pitrou.net> Message-ID: <4F36DBF9.5050901@hotpy.org> Antoine Pitrou wrote: > Hello Mark, > > I think the PEP should explain what happens when a keys table needs > resizing when setting an object's attribute. If the object is the only instance of a class, it remains split, otherwise the table is combined. Most OO code will set attributes in the __init__ method so all attributes are set before a second instance is created. For more complex use patterns, it is impossible to know what is the best approach, so the implementation allows extra insertions up to the point of a resize, when it reverts to the combined table (non-shared keys). (This may not be the case in the bitbucket repository, I'll push the newer version tomorrow). > Reading the implementation, it seems the sharing can disappear > definitively, which seems a bit worrying. It is immediately re-split (to allow sharing) when only one instance of the class exists.
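If it helps to picture the two layouts, here is a rough Python toy model of key-sharing -- purely illustrative, nothing like the actual C structures:

    class SharedKeys:
        # One per class: maps attribute name -> slot index.
        def __init__(self):
            self.slots = {}
        def slot_for(self, key):
            if key not in self.slots:
                self.slots[key] = len(self.slots)
            return self.slots[key]

    class SplitDict:
        # Per-instance part: only a values array; the keys live in
        # the SharedKeys table shared by all instances of the class.
        def __init__(self, shared):
            self.shared = shared
            self.values = []
        def set(self, key, value):
            i = self.shared.slot_for(key)
            if i >= len(self.values):
                self.values.extend([None] * (i + 1 - len(self.values)))
            self.values[i] = value
        def get(self, key):
            i = self.shared.slots[key]   # KeyError if key never inserted
            value = self.values[i] if i < len(self.values) else None
            if value is None:            # toy sentinel for "not set here"
                raise KeyError(key)
            return value

A combined table is just the ordinary layout, where each dict stores its own keys and values together.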
I've implemented it that way (resize->combined then re-split) as most resizes (999 out of 1000) will be of combined tables, and I don't want to complicate the fast path. Cheers, Mark. From ncoghlan at gmail.com Sun Feb 12 04:59:16 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 12 Feb 2012 13:59:16 +1000 Subject: [Python-Dev] PEP 411: Provisional packages in the Python standard library In-Reply-To: <20120211162017.24d0c3c4@pitrou.net> References: <20120211162017.24d0c3c4@pitrou.net> Message-ID: On Sun, Feb 12, 2012 at 1:20 AM, Antoine Pitrou wrote: > On Fri, 10 Feb 2012 16:06:15 +0200 > Eli Bendersky wrote: >> >> Following the intensive and fruitful discussion of the (now rejected) >> PEP 408 (http://mail.python.org/pipermail/python-dev/2012-January/115850.html), >> we've drafted PEP 411 to summarize the conclusions with regards to the >> process of marking packages provisional. Note that this is an >> informational PEP, and that for the sake of completeness it duplicates >> some of the contents of PEP 408. > > I think the word "provisional" doesn't mean anything to many > (non-native English speaking) people. I would like to suggest something > clearer, e.g. "experimental" or "unstable" - which have the benefit of > *already* having a meaning in other software-related contexts. No, those words are far too strong. "Provisional" is exactly the right word for the meaning we want to convey (see http://dictionary.reference.com/browse/provisional, especially the second sense). Radical changes to provisional modules are highly unlikely (if we're that wrong about a module, we're more likely to just drop it than we are to change it significantly), but tweaks to eliminate rough edges are a real possibility. Truly experimental, unstable code has no place in the standard library in the first place (that's one of the things PyPI is for). For the record (and the PEP should probably mention this), the end user facing explanation of provisional modules is closely modelled on the one Red Hat use in explaining their Tech Preview concept: https://access.redhat.com/support/offerings/techpreview/ (Unfortunately, Google don't appear to have a clear user facing explanation of the meaning of "experimental" disclaimers in Google App Engine APIs, although it appears such APIs really are closer in nature to being truly experimental than what we're proposing for the stdlib) Now, what would make sense is to have a definition of "provisional API" in the glossary, and link to that from the module docs, rather than explaining the concept separately in each provisional module. The notice in each package could then be shortened to: The API has been included in the standard library on a provisional basis. Backwards incompatible changes (up to and including removal of the API) may occur if deemed necessary by the standard library developers. The phrase "provisional basis" would then be a link to the glossary term "provisional API", defined as: A provisional API is one which has been deliberately excluded from the standard library's normal backwards compatibility guarantees. While major changes to such APIs are not expected, as long as they are marked as provisional, backwards incompatible changes (up to and including removal of the API) may occur if deemed necessary by the standard library developers. Such changes will not be made gratuitously - they will occur only if serious flaws are uncovered that were missed prior to inclusion of the API. 
This process allows the standard library to continue to evolve over time, without locking in problematic design errors for extended periods of time. See PEP 411 for more details. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Sun Feb 12 05:07:25 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 12 Feb 2012 14:07:25 +1000 Subject: [Python-Dev] PEP 411: Provisional packages in the Python standard library In-Reply-To: References: <4f357a5e.308eec0a.6343.ffffbe10@mx.google.com> <4F3618E8.2030706@pearwood.info> Message-ID: On Sun, Feb 12, 2012 at 2:45 AM, Isaac Morland wrote: > Could the documentation generator simply insert the boilerplate if and only > if the package has the __provisional__ attribute? I'm not an expert in > Python documentation but isn't it generated from properly-formatted comments > within the Python source? It can be, but, in the case of the standard library, generally isn't. While there's a certain attraction to using a __provisional__ attribute (checked by both Sphinx and pydoc) to standardise the boilerplate for provisional APIs, I don't think PEP 411 should be conditional on deciding *how* we implement the notices. It should just say "provisional APIs permit well justified changes that would otherwise be disallowed by backwards compatibility concerns, and these are the notices that must be put in place to indicate to users that an API is provisional". Whether we do that via copy-and-paste in the docs and docstring or by a flag in the source code is really an implementation detail separate from the process definition. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From merwok at netwok.org Sun Feb 12 07:51:46 2012 From: merwok at netwok.org (Éric Araujo) Date: Sun, 12 Feb 2012 07:51:46 +0100 Subject: [Python-Dev] requirements for moving __import__ over to importlib? In-Reply-To: References: <20120207152435.379ac6f4@resist.wooz.org> Message-ID: <4F376182.9090208@netwok.org> On 07/02/2012 23:21, Brett Cannon wrote: > On Tue, Feb 7, 2012 at 15:28, Dirkjan Ochtman wrote: >> Yeah, startup performance getting worse kinda sucks for command-line >> apps. And IIRC it's been getting worse over the past few releases... >> >> Anyway, I think there was enough of a python3 port for Mercurial (from >> various GSoC students) that you can probably run some of the very >> simple commands (like hg parents or hg id), which should be enough for >> your purposes, right? > Possibly. Where is the code?

    hg clone http://selenic.com/repo/hg/
    cd hg
    python3 contrib/setup3k.py build

From ncoghlan at gmail.com Sun Feb 12 10:04:30 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 12 Feb 2012 19:04:30 +1000 Subject: [Python-Dev] PEP 394 request for pronouncement (python2 symlink in *nix systems) Message-ID: PEP 394 [1] aims to document our collective recommendation for allowing shebang lines to specifically request some version of 2.x, without requiring that it be exactly 2.7 (or 2.6, etc). I'd let this drift for a while, but the imminent release of 2.7.3 makes it necessary to push for a final pronouncement. Kerrick has the necessary Makefile.pre.in patch up on the tracker [2] to add the hard link for the python2 name. We could, of course, make the recommendation to distributions without updating "make install" and "make bininstall" to follow our own advice, but that seems needlessly inconsistent.
[1] http://www.python.org/dev/peps/pep-0394/ [2] http://bugs.python.org/issue12627 Regards, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From solipsis at pitrou.net Sun Feb 12 16:21:53 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sun, 12 Feb 2012 16:21:53 +0100 Subject: [Python-Dev] PEP 394 request for pronouncement (python2 symlink in *nix systems) References: Message-ID: <20120212162153.69833c77@pitrou.net> On Sun, 12 Feb 2012 19:04:30 +1000 Nick Coghlan wrote: > PEP 394 [1] aims to document our collective recommendation for > allowing shebang lines to specifically request some version of 2.x, > without requiring that it be exactly 2.7 (or 2.6, etc). > > I'd let this drift for a while, but the imminent release of 2.7.3 > makes it necessary to push for a final pronouncement. Kerrick has the > necessary Makefile.pre.in patch up on the tracker [2] to add the hard > link for the python2 name. Why hard links? Symlinks are much more introspectable. When looking at a hard link I have no easy way to know it's the same as whatever other file in the same directory. I also don't understand this mention: "The make install command in the CPython 3.x series will similarly install the python3.x, idle3.x, pydoc3.x, and python3.x-config binaries (with appropriate x), and python3, idle3, pydoc3, and python3-config as hard links. This feature will first appear in CPython 3.3." This feature actually exists in 3.2 (but with a symlink, fortunately):

    $ ls -la ~/opt/bin/pydoc3
    lrwxrwxrwx 1 antoine antoine 8 oct. 15 21:24 /home/antoine/opt/bin/pydoc3 -> pydoc3.2*

Regards Antoine. From martin at v.loewis.de Sun Feb 12 16:52:56 2012 From: martin at v.loewis.de ("Martin v. Löwis") Date: Sun, 12 Feb 2012 16:52:56 +0100 Subject: [Python-Dev] PEP 394 request for pronouncement (python2 symlink in *nix systems) In-Reply-To: <20120212162153.69833c77@pitrou.net> References: <20120212162153.69833c77@pitrou.net> Message-ID: <4F37E058.6080600@v.loewis.de> > Why hard links? Symlinks are much more introspectable.
When looking at > a hard link I have no easy way to know it's the same as whatever other > file in the same directory. There actually *is* an easy way, in regular ls: look at the link count. It comes out of ls -l by default, and if it's >1, there will be an identical file. I agree with the question, though: this needs to be justified (but there may well be a justification). > I also don't understand this mention: > > "The make install command in the CPython 3.x series will similarly > install the python3.x, idle3.x, pydoc3.x, and python3.x-config binaries > (with appropriate x), and python3, idle3, pydoc3, and python3-config as > hard links. This feature will first appear in CPython 3.3." > > This feature actually exists in 3.2 (but with a symlink, fortunately): If you look at the patch, you'll notice that the only change is to make the links hard links. Regards, Martin From solipsis at pitrou.net Sun Feb 12 17:04:39 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sun, 12 Feb 2012 17:04:39 +0100 Subject: [Python-Dev] PEP 394 request for pronouncement (python2 symlink in *nix systems) In-Reply-To: <4F37E058.6080600@v.loewis.de> References: <20120212162153.69833c77@pitrou.net> <4F37E058.6080600@v.loewis.de> Message-ID: <1329062679.3456.11.camel@localhost.localdomain> On Sunday 12 February 2012 at 16:52 +0100, "Martin v. Löwis" wrote: > > Why hard links? Symlinks are much more introspectable.
>> >> There actually *is* an easy way, in regular ls: look at the link count. >> It comes out of ls -l by default, and if it's >1, there will be an >> identical file. > > This doesn't tell me which file it is Well, you didn't ask for that, it does "to know it's the same as whatever other file" nicely :-) As Charles-Fran?ois explains, you can use ls -i for that, which isn't that easy, but still straight-forward. Regards, Martin From solipsis at pitrou.net Sun Feb 12 21:17:17 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sun, 12 Feb 2012 21:17:17 +0100 Subject: [Python-Dev] Adding a maximum element count to parse_qs? Message-ID: <20120212211717.13205ef3@pitrou.net> Hello, Given the randomization fix will ship disabled, I thought it would be nice to add a maximum element count argument to urlparse.parse_qs, with a default value of e.g. 1000 (including in bugfix releases). What do you think? Regards Antoine. From cs at zip.com.au Sun Feb 12 21:30:43 2012 From: cs at zip.com.au (Cameron Simpson) Date: Mon, 13 Feb 2012 07:30:43 +1100 Subject: [Python-Dev] PEP 394 request for pronouncement (python2 symlink in *nix systems) In-Reply-To: <4F37FD96.2010603@v.loewis.de> References: <4F37FD96.2010603@v.loewis.de> Message-ID: <20120212203043.GA10257@cskk.homeip.net> On 12Feb2012 18:57, "Martin v. L?wis" wrote: | Am 12.02.2012 17:04, schrieb Antoine Pitrou: | > Le dimanche 12 f?vrier 2012 ? 16:52 +0100, "Martin v. L?wis" a ?crit : | >>> Why hard links? Symlinks are much more introspectable. When looking at | >>> a hard link I have no easy way to know it's the same as whatever other | >>> file in the same directory. | >> | >> There actually *is* an easy way, in regular ls: look at the link count. | >> It comes out of ls -l by default, and if it's >1, there will be an | >> identical file. Yeah! Somewhere... :-( | > This doesn't tell me which file it is | | Well, you didn't ask for that, it does "to know it's the same as | whatever other file" nicely :-) Sure, at the OS level. Not much use for _inspection_. | As Charles-Fran?ois explains, you can use ls -i for that, which isn't | that easy, but still straight-forward. If the hardlink is nearby. Of course in this example it (almost certainly?) is, but it needn't be. A symlink is a much better solution to this problem because: - usability - "ls -l" shows it to the user by default - practicality: With a symlink, to find out what it attaches to you examine the symlink. With a hardlink you first examine a fairly opaque numeric attribute of "python2", and _then_ you examine every other filename on the system! Admittedly starting with "python2.*" in the same directory, but in principle in other places. Arbitrary other places. IMO a symlink is far and away the better choice in this situation. Cheers, -- Cameron Simpson DoD#743 http://www.cskk.ezoshosting.com/cs/ I need your clothes, your boots, and your motorcycle. - Arnold Schwarzenegger, Terminator 2 From martin at v.loewis.de Sun Feb 12 21:42:53 2012 From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=) Date: Sun, 12 Feb 2012 21:42:53 +0100 Subject: [Python-Dev] PEP 394 request for pronouncement (python2 symlink in *nix systems) In-Reply-To: <20120212203043.GA10257@cskk.homeip.net> References: <4F37FD96.2010603@v.loewis.de> <20120212203043.GA10257@cskk.homeip.net> Message-ID: <4F38244D.1000908@v.loewis.de> > IMO a symlink is far and away the better choice in this situation. Please wait with that judgment until you see the rationale of the PEP author. 
Thanks, Martin From martin at v.loewis.de Sun Feb 12 21:44:22 2012 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 12 Feb 2012 21:44:22 +0100 Subject: [Python-Dev] Adding a maximum element count to parse_qs? In-Reply-To: <20120212211717.13205ef3@pitrou.net> References: <20120212211717.13205ef3@pitrou.net> Message-ID: <4F3824A6.4010507@v.loewis.de> > Given the randomization fix will ship disabled, I thought it would be > nice to add a maximum element count argument to urlparse.parse_qs, with > a default value of e.g. 1000 (including in bugfix releases). What do > you think? It's an API change, so it is a) in violation with current practice for bug fix releases, and b) of limited use for existing installations which won't use the API. Regards, Martin From solipsis at pitrou.net Sun Feb 12 22:55:33 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sun, 12 Feb 2012 22:55:33 +0100 Subject: [Python-Dev] Adding a maximum element count to parse_qs? In-Reply-To: <4F3824A6.4010507@v.loewis.de> References: <20120212211717.13205ef3@pitrou.net> <4F3824A6.4010507@v.loewis.de> Message-ID: <20120212225533.31392cf9@pitrou.net> On Sun, 12 Feb 2012 21:44:22 +0100 "Martin v. L?wis" wrote: > > Given the randomization fix will ship disabled, I thought it would be > > nice to add a maximum element count argument to urlparse.parse_qs, with > > a default value of e.g. 1000 (including in bugfix releases). What do > > you think? > > It's an API change, so it is > a) in violation with current practice for bug fix releases, and We are already violating a lot of things in order to fix this issue. > b) of limited use for existing installations which won't use the API. Obviously it won't fix vulnerabilities due to some other API. If you propose other APIs we can also fix them. Regards Antoine. From greg at krypto.org Sun Feb 12 23:53:12 2012 From: greg at krypto.org (Gregory P. Smith) Date: Sun, 12 Feb 2012 14:53:12 -0800 Subject: [Python-Dev] peps: Update with bugfix releases. In-Reply-To: References: Message-ID: On Sun, Feb 5, 2012 at 11:23 AM, Ned Deily wrote: > In article , > ?georg.brandl wrote: >> +Bugfix Releases >> +=============== >> + >> +- 3.2.1: released July 10, 2011 >> +- 3.2.2: released September 4, 2011 >> + >> +- 3.2.3: planned February 10-17, 2012 > > I would like to propose that we plan for 3.2.3 and 2.7.3 immediately > after PyCon, so approximately March 17, if that works for all involved. I also like this idea because we tend to get a lot of bug fixing done during the PyCon sprints. -gps From martin at v.loewis.de Mon Feb 13 00:08:45 2012 From: martin at v.loewis.de (martin at v.loewis.de) Date: Mon, 13 Feb 2012 00:08:45 +0100 Subject: [Python-Dev] Adding a maximum element count to parse_qs? In-Reply-To: <20120212225533.31392cf9@pitrou.net> References: <20120212211717.13205ef3@pitrou.net> <4F3824A6.4010507@v.loewis.de> <20120212225533.31392cf9@pitrou.net> Message-ID: <20120213000845.Horde.d4TvI7uWis5POEZ9DuGlG5A@webmail.df.eu> >> It's an API change, so it is >> a) in violation with current practice for bug fix releases, and > > We are already violating a lot of things in order to fix this issue. Not really. There isn't any significant API change in the proposed patch (the ones that are there are safe to ignore in applications). There is, of course, a major behavior change, but that is deliberately opt-in. >> b) of limited use for existing installations which won't use the API. > > Obviously it won't fix vulnerabilities due to some other API. 
If you > propose other APIs we can also fix them. No, you are missing my point. I assume you proposed (even though you didn't say so explicitly) that parse_qs gets an opt-in API change to limit the number of parameters. If that is added, it will have no effect on any existing applications, as they will all currently not pass that parameter. Regards, Martin From solipsis at pitrou.net Mon Feb 13 00:19:16 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 13 Feb 2012 00:19:16 +0100 Subject: [Python-Dev] Adding a maximum element count to parse_qs? References: <20120212211717.13205ef3@pitrou.net> <4F3824A6.4010507@v.loewis.de> <20120212225533.31392cf9@pitrou.net> <20120213000845.Horde.d4TvI7uWis5POEZ9DuGlG5A@webmail.df.eu> Message-ID: <20120213001916.1e9c4899@pitrou.net> On Mon, 13 Feb 2012 00:08:45 +0100 martin at v.loewis.de wrote: > > >> b) of limited use for existing installations which won't use the API. > > > > Obviously it won't fix vulnerabilities due to some other API. If you > > propose other APIs we can also fix them. > > No, you are missing my point. I assume you proposed (even though you > didn't say so explicitly) that parse_qs gets an opt-in API change to > limit the number of parameters. If that is added, it will have no > effect on any existing applications, as they will all currently not > pass that parameter. No, I said it would include a default value of (say) 1000 parameters. That default value would be applied to anyone doesn't use the new API. (the reason I'm proposing a new API is to allow people to change or disable the limit, in case they really want to pass a large number of parameters) Regards Antoine. From martin at v.loewis.de Mon Feb 13 00:32:07 2012 From: martin at v.loewis.de (martin at v.loewis.de) Date: Mon, 13 Feb 2012 00:32:07 +0100 Subject: [Python-Dev] Adding a maximum element count to parse_qs? In-Reply-To: <20120213001916.1e9c4899@pitrou.net> References: <20120212211717.13205ef3@pitrou.net> <4F3824A6.4010507@v.loewis.de> <20120212225533.31392cf9@pitrou.net> <20120213000845.Horde.d4TvI7uWis5POEZ9DuGlG5A@webmail.df.eu> <20120213001916.1e9c4899@pitrou.net> Message-ID: <20120213003207.Horde.i7EwTbuWis5POEv35UBFOfA@webmail.df.eu> >> No, you are missing my point. I assume you proposed (even though you >> didn't say so explicitly) that parse_qs gets an opt-in API change to >> limit the number of parameters. If that is added, it will have no >> effect on any existing applications, as they will all currently not >> pass that parameter. > > No, I said it would include a default value of (say) 1000 parameters. > That default value would be applied to anyone doesn't use the new API. > (the reason I'm proposing a new API is to allow people to change or > disable the limit, in case they really want to pass a large number of > parameters) I see. -1 on that proposal, then: there are certainly applications that will break with that. I don't find 1000 POST parameters a lot, and I'm sure that people use that in a programmatic fashion (e.g. to mass-upload stuff). If you really think that kind of change is necessary, develop a separate patch that people who are worried can apply. Regards, Martin From solipsis at pitrou.net Mon Feb 13 00:39:46 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 13 Feb 2012 00:39:46 +0100 Subject: [Python-Dev] Adding a maximum element count to parse_qs? 
References: <20120212211717.13205ef3@pitrou.net> <4F3824A6.4010507@v.loewis.de> <20120212225533.31392cf9@pitrou.net> <20120213000845.Horde.d4TvI7uWis5POEZ9DuGlG5A@webmail.df.eu> <20120213001916.1e9c4899@pitrou.net> <20120213003207.Horde.i7EwTbuWis5POEv35UBFOfA@webmail.df.eu> Message-ID: <20120213003946.53acae68@pitrou.net> On Mon, 13 Feb 2012 00:32:07 +0100 martin at v.loewis.de wrote: > >> No, you are missing my point. I assume you proposed (even though you > >> didn't say so explicitly) that parse_qs gets an opt-in API change to > >> limit the number of parameters. If that is added, it will have no > >> effect on any existing applications, as they will all currently not > >> pass that parameter. > > > > No, I said it would include a default value of (say) 1000 parameters. > > That default value would be applied to anyone who doesn't use the new API. > > (the reason I'm proposing a new API is to allow people to change or > > disable the limit, in case they really want to pass a large number of > > parameters) > > I see. -1 on that proposal, then: there are certainly applications that will > break with that. I don't find 1000 POST parameters a lot, and I'm sure that > people use that in a programmatic fashion (e.g. to mass-upload stuff). > > If you really think that kind of change is necessary, develop a separate patch > that people who are worried can apply. Fair enough. Actually, people can simply call parse_qsl and check the len() of the returned list before stuffing the params into a dict. That said, we can still do the change (without any limiting default value) for 3.3.
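Something along these lines, at the parse_qs level (just a sketch to make the idea concrete -- the parameter name is invented, it is not in any patch):

    def parse_qs(qs, keep_blank_values=False, strict_parsing=False,
                 max_num_fields=None):
        pairs = parse_qsl(qs, keep_blank_values, strict_parsing)
        if max_num_fields is not None and len(pairs) > max_num_fields:
            raise ValueError('too many query fields')
        parsed_result = {}
        for name, value in pairs:
            parsed_result.setdefault(name, []).append(value)
        return parsed_result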
Regards Antoine. From victor.stinner at gmail.com Mon Feb 13 01:28:48 2012 From: victor.stinner at gmail.com (Victor Stinner) Date: Mon, 13 Feb 2012 01:28:48 +0100 Subject: [Python-Dev] PEP 410 (Decimal timestamp): the implementation is ready for a review Message-ID: Hi, I finished the implementation of PEP 410 ("Use decimal.Decimal type for timestamps"). The PEP: http://www.python.org/dev/peps/pep-0410/ The implementation: http://bugs.python.org/issue13882 Rietveld code review tool for this issue: http://bugs.python.org/review/13882/show The patch is huge because it changes many modules, but I plan to split the patch into small commits. I'm still waiting on Nick Coghlan and Guido van Rossum for their decision on the PEP. Victor From ncoghlan at gmail.com Mon Feb 13 03:31:45 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 13 Feb 2012 12:31:45 +1000 Subject: [Python-Dev] PEP 394 request for pronouncement (python2 symlink in *nix systems) In-Reply-To: <4F38244D.1000908@v.loewis.de> References: <4F37FD96.2010603@v.loewis.de> <20120212203043.GA10257@cskk.homeip.net> <4F38244D.1000908@v.loewis.de> Message-ID: On Mon, Feb 13, 2012 at 6:42 AM, "Martin v. Löwis" wrote: >> IMO a symlink is far and away the better choice in this situation. > > Please wait with that judgment until you see the rationale of the PEP > author. Kerrick did post a rationale in the last thread [1], but it never made it into the PEP itself. The relevant comment: ========== Also, I updated the PEP with the clarification that commands like python3 should be hard links (because they'll be invoked from code and are more efficient; also, hard links are just as flexible as symlinks here), while commands like python should be soft links (because this makes it clear to sysadmins that they can be "switched", and it's needed for flexibility if python3 changes). This really doesn't matter, but can we keep it this way unless there are serious objections? ========== I think Antoine makes a good point about ease of introspection when you have multiple versions in the same series installed, so I'd be fine with:

- updating the PEP recommendation to say that either form of link is fine (with hard links marginally faster, but harder to introspect)
- noting that python.org releases will consistently use symlinks for easier introspection via "ls -l"
- updating Makefile.pre.in to ensure that we really do consistently use symlinks

This does mean that launching Python may involve a slightly longer symlink chain in some cases (python -> python2 -> python2.7), but the impact of that is always going to be utterly dwarfed by other startup costs. [1] http://mail.python.org/pipermail/python-dev/2011-July/112322.html Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Mon Feb 13 04:00:47 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 13 Feb 2012 13:00:47 +1000 Subject: [Python-Dev] PEP 410 (Decimal timestamp): the implementation is ready for a review In-Reply-To: References: Message-ID: On Mon, Feb 13, 2012 at 10:28 AM, Victor Stinner wrote: > Hi, > > I finished the implementation of PEP 410 ("Use decimal.Decimal > type for timestamps"). The PEP: > http://www.python.org/dev/peps/pep-0410/ > > The implementation: > http://bugs.python.org/issue13882 > > Rietveld code review tool for this issue: > http://bugs.python.org/review/13882/show > > The patch is huge because it changes many modules, but I plan to split > the patch into small commits. > > I'm still waiting on Nick Coghlan and Guido van Rossum for their > decision on the PEP. Only Guido, really. I'm already on record as being happy with the API design as documented in the latest version of the PEP. (I haven't reviewed the full patch yet, but that's the next step before checking things in, it isn't needed to mark the PEP as Accepted). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From stephen at xemacs.org Mon Feb 13 04:22:41 2012 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Mon, 13 Feb 2012 12:22:41 +0900 Subject: [Python-Dev] http://pythonmentors.com/ In-Reply-To: References: <3DC2DAB56C4C469380525FFB08E3A2D9@gmail.com> Message-ID: <87d39jfmda.fsf@uwakimon.sk.tsukuba.ac.jp> Eli Bendersky writes: > Well, I think the situation is pretty good now. I agree, but improvement is always possible. > If one goes to python.org and is interested in contributing, > clicking on the "Core Development" link is a sensible step, right? Maybe. But a lot of the "Core Dev" links I've seen are maintained by and for incumbent core devs, and are somewhat intimidating for new users and developers. How about moving the About | Getting Started link to the top level and giving it a set of links like

    - Standalone Scripting
    - Extending Apps Using Python as Extension Language
    - Developing Apps in Python
      (this is what I think of when I read the current Getting Started
      with Python page, FWIW YMMV)
    - Developing Library Modules to Extend Python
    - Developing Python (there's work you can do, too!)

This is similar to the existing About | Getting Started page, but more accessible (in the sense of being an index, rather than a more verbose description containing scattered links). I think most of them can go to existing pages. From stephen at xemacs.org Mon Feb 13 04:36:37 2012 From: stephen at xemacs.org (Stephen J.
Turnbull) Date: Mon, 13 Feb 2012 12:36:37 +0900 Subject: [Python-Dev] PEP 411: Provisional packages in the Python standard library In-Reply-To: <20120211162017.24d0c3c4@pitrou.net> References: <20120211162017.24d0c3c4@pitrou.net> Message-ID: <87bop3flq2.fsf@uwakimon.sk.tsukuba.ac.jp> Antoine Pitrou writes: > I think the word "provisional" doesn't mean anything to many > (non-native English speaking) people. I would like to suggest something > clearer, e.g. "experimental" or "unstable" - which have the benefit of > *already* having a meaning in other software-related contexts. I sympathize, but unfortunately, as Nick points out, those words have *different* and *inappropriate* meanings, which will definitely mislead and confuse native speakers. Nor is "provisional" a difficult concept, as I understand it. At the very least, it has an exact translation into Japanese, and Japanese is about as hard a language to find exact translations for as I can imagine! > > The package has been included in the standard library on a > > provisional basis. While major changes are not anticipated, as long as > > this notice remains in place, backwards incompatible changes are > > permitted if deemed necessary by the standard library developers. Such > > changes will not be made gratuitously - they will occur only if > > serious API flaws are uncovered that were missed prior to inclusion of > > the package. > > That's too wordy. Let's stay clear and to the point: > > "This package is unstable/experimental. Its API may change in the next > release." > > (and put a link to the relevant FAQ section if necessary) How about this? This is a `provisional package`__. Its API may change in the next release. __ faq#provisional_package and the linked FAQ will also link to the full PEP and a dictionary definition. From eliben at gmail.com Mon Feb 13 12:35:33 2012 From: eliben at gmail.com (Eli Bendersky) Date: Mon, 13 Feb 2012 13:35:33 +0200 Subject: [Python-Dev] folding cElementTree behind ElementTree in 3.3 In-Reply-To: References: Message-ID: > Since there appeared to be an overall positive response for making > this change in Python 3.3, and since there is no longer any doubt > about the ownership of the package *in Python's stdlib* (see > http://mail.python.org/pipermail/python-dev/2012-February/116389.html), > I've opened issue 13988 on the bug tracker to follow the > implementation. > > The change was committed to the default branch. In 3.3, "import xml.etree.ElementTree" will automatically use the _elementtree accelerator, if available, and will fall back to a Python implementation otherwise. The documentation of ElementTree has also been updated to reflect this fact. Thanks a lot to Florent Xicluna for the great co-operation, and all the others who submitted opinions in the issue. For more details see http://bugs.python.org/issue13988 Eli
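(For reference, the fall-back described above boils down to a guarded import at the end of the pure-Python module; this is a rough sketch, not the verbatim stdlib code:)

    # Sketch of the accelerator fall-back in xml/etree/ElementTree.py:
    # if the C extension is importable, its names overwrite the
    # pure-Python definitions above; otherwise nothing changes.
    try:
        from _elementtree import *
    except ImportError:
        pass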
From mark at hotpy.org Mon Feb 13 13:31:38 2012 From: mark at hotpy.org (Mark Shannon) Date: Mon, 13 Feb 2012 12:31:38 +0000 Subject: [Python-Dev] A new dictionary implementation In-Reply-To: <4F33B343.1050801@voidspace.org.uk> References: <4F252014.3080900@hotpy.org> <20120129160841.2343b62f@pitrou.net> <4F256EDC.70707@hotpy.org> <4F25D686.9070907@pearwood.info> <4F2AE13C.6010900@hotpy.org> <4F3291C9.9070305@hotpy.org> <4F33B343.1050801@voidspace.org.uk> Message-ID: <4F3902AA.3080300@hotpy.org> Hi All, My new dictionary implementation has been further updated in light of various comments and questions. The code is here: https://bitbucket.org/markshannon/cpython_new_dict The resizing code has been streamlined. This means that resizing by doubling does not suffer any real performance penalty versus quadrupling. Doubling uses less memory for most benchmarks (it never uses more memory). Michael Foord wrote: >> 2to3, which seems to be the only "realistic" benchmark that runs on Py3, >> shows no change in speed and uses 10% less memory. > In your first version 2to3 used 28% less memory. Do you know why it's > worse in this version? > The answer is that the second version used quadrupling rather than doubling for resizing. All tests pass. test_sys has been altered to allow for the different size of the dictionary. One test in test_pprint has been disabled. This test is broken anyway; see http://bugs.python.org/issue13907. In general, for the new dictionary implementation, with doubling at a resize, speed is unchanged and memory usage is reduced. On "average": ~1% slowdown, ~10% reduction in memory use. Full set of benchmarks for new dict with doubling and quadrupling attached. Unfortunately the benchmarking program introduces systematic errors for timings, but it's the best we have at the moment. Note that the json benchmark is unstable and should be ignored. The GC benchmark might be unstable as well; I haven't experimented. The memory usage numbers seem to be more reliable. Revised PEP to follow. Cheers, Mark. From mark at hotpy.org Mon Feb 13 13:46:51 2012 From: mark at hotpy.org (Mark Shannon) Date: Mon, 13 Feb 2012 12:46:51 +0000 Subject: [Python-Dev] PEP for new dictionary implementation In-Reply-To: <4F32CA76.5040307@hotpy.org> References: <4F32CA76.5040307@hotpy.org> Message-ID: <4F39063B.6010803@hotpy.org> Revised PEP for new dictionary implementation, PEP 412, is attached. Cheers, Mark. From victor.stinner at gmail.com Mon Feb 13 13:59:18 2012 From: victor.stinner at gmail.com (Victor Stinner) Date: Mon, 13 Feb 2012 13:59:18 +0100 Subject: [Python-Dev] How to round timestamps and durations? Message-ID: Hi, My work on PEP 410 tries to unify the code to manipulate timestamps. The problem is that I'm unable to decide how to round these numbers. Functions using a resolution of 1 second (e.g. time.mktime) expect rounding towards zero (ROUND_DOWN), as does int(float). Example: >>> time.mktime(time.localtime(-1.9)), time.mktime(time.localtime(1.9)) (-1.0, 1.0) datetime.datetime.fromtimestamp() rounds to nearest with ties going away from zero (ROUND_HALF_UP).
Example: >>> datetime.datetime.fromtimestamp(-1.1e-6), datetime.datetime.fromtimestamp(1.1e-6) (datetime.datetime(1970, 1, 1, 0, 59, 59, 999999), datetime.datetime(1970, 1, 1, 1, 0, 0, 1)) >>> datetime.datetime.fromtimestamp(-1.9e-6), datetime.datetime.fromtimestamp(1.9e-6) (datetime.datetime(1970, 1, 1, 0, 59, 59, 999998), datetime.datetime(1970, 1, 1, 1, 0, 0, 2)) datetime.timedelta * float and datetime.timedelta / float round to nearest with ties going to the nearest even integer (ROUND_HALF_EVEN), as does round(). Example: >>> [(datetime.timedelta(microseconds=x) / 2.0).microseconds for x in range(6)] [0, 0, 1, 2, 2, 2] Should I also support multiple rounding methods depending on the operation and the Python function? Should we always use the same rounding method? Antoine pointed out to me that ROUND_HALF_UP can produce timestamps "in the future", which is especially visible when using a resolution of 1 second. I like this rounding method because it limits the loss of precision to half a unit: abs(rounded - timestamp) <= 0.5. But it can be "surprising". The rounding method should maybe be the same as int(float) (so ROUND_DOWN) to avoid surprising results for applications using int(time.time()), for example (I had such a problem with rotated logs and test_logging). -- There is an issue on rounding timedelta: http://bugs.python.org/issue8860 Victor From stefan_ml at behnel.de Mon Feb 13 14:35:04 2012 From: stefan_ml at behnel.de (Stefan Behnel) Date: Mon, 13 Feb 2012 14:35:04 +0100 Subject: [Python-Dev] folding cElementTree behind ElementTree in 3.3 In-Reply-To: References: Message-ID: Eli Bendersky, 13.02.2012 12:35: >> Since there appeared to be an overall positive response for making >> this change in Python 3.3, and since there is no longer any doubt >> about the ownership of the package *in Python's stdlib* (see >> http://mail.python.org/pipermail/python-dev/2012-February/116389.html), >> I've opened issue 13988 on the bug tracker to follow the >> implementation. > > The change was committed to the default branch. In 3.3, "import > xml.etree.ElementTree" will automatically use the _elementtree accelerator, > if available, and will fall back to a Python implementation otherwise. The > documentation of ElementTree has also been updated to reflect this fact. Thanks! Stefan From barry at python.org Mon Feb 13 16:26:47 2012 From: barry at python.org (Barry Warsaw) Date: Mon, 13 Feb 2012 10:26:47 -0500 Subject: [Python-Dev] PEP 410 (Decimal timestamp): the implementation is ready for a review References: Message-ID: <20120213102647.5a143b07@resist.wooz.org> On Feb 13, 2012, at 01:28 AM, Victor Stinner wrote: >I'm still waiting for Nick Coghlan and Guido van Rossum to make their >decision on the PEP. Thanks for continuing to work on this, Victor. I agree with the general motivation behind the PEP, and appreciate your enthusiasm for improving Python here. However, I am still -1 on the solution proposed by the PEP. I still think that migrating to datetime use is a better way to go, rather than a proliferation of the data types used to represent timestamps, along with an API to specify the type of data returned. Let's look at each item in the PEP's rationale for discarding the use of datetimes: * datetime.datetime only supports microsecond resolution, but can be enhanced to support nanosecond. Great! JFDI! * datetime.datetime has issues with timezone. For example, a datetime object without a timezone and a datetime with a timezone cannot be compared.
This may be so, but I don't think it causes fatal problems with this approach. The APIs returning high resolution datetimes should return naive (i.e. timezone-less) datetimes. If this is done consistently, then most math on such datetimes (along with high resolution timedeltas) should Just Work. I'm looking at a use case from my flufl.lock library: return datetime.datetime.fromtimestamp( os.stat(self._lockfile).st_mtime) and later, this value is compared: datetime.datetime.now() > release_time So with higher resolution naive datetimes, this would still work. If the user wants to mix naive and timezone-ful datetimes, they will have to resolve the compatibility issues themselves, but I think that will be the minority of cases. So this issue should not be a blocker for high resolution datetimes (and timedeltas). * datetime.datetime has ordering issues with daylight saving time (DST) in the duplicate hour of switching from DST to normal time. Sure, but only for timezone-ful datetimes, right? I can live with that, since I have to live with that for all non-naive datetimes anyway, and as I mentioned, I don't think in general it will be a practical problem when using high resolution datetimes. * datetime.datetime is not as well integrated as Epoch timestamps; some functions don't accept this type as input. For example, os.utime() expects a tuple of Epoch timestamps. So, by implication, Decimal is better integrated by virtue of its ability to be coerced to floats and other numeric stack types? Will users ever have to explicitly convert Decimal types to use other APIs? I don't think this one is insurmountable either. We could certainly improve the compatibility of datetimes with other APIs, and in fact, I think we should regardless of which direction this PEP takes. You could even argue for EIBTI in converting from datetimes to types acceptable to those other APIs, many of which derive their argument types by virtue of the C APIs underneath. It bothers me that the PEP is proposing that users will now have to be prepared to handle yet another (and potentially *many* more) data types coming from what are essentially datetime-like APIs. If it really is impossible or suboptimal to build high resolution datetimes and timedeltas, and to use them in these APIs, then at the very least, the PEP needs a stronger rationale for why this is. But I think ultimately, it would be better for Python to improve the resolution of, and API support for, datetimes and timestamps. In any case, thanks for your work in this (and so many other!) areas. Cheers, -Barry From victor.stinner at gmail.com Mon Feb 13 19:33:41 2012 From: victor.stinner at gmail.com (Victor Stinner) Date: Mon, 13 Feb 2012 19:33:41 +0100 Subject: [Python-Dev] PEP 410 (Decimal timestamp): the implementation is ready for a review In-Reply-To: References: <20120213102647.5a143b07@resist.wooz.org> Message-ID: > However, I am still -1 on the solution proposed by the PEP. I still think > that migrating to datetime use is a better way to go, rather than a > proliferation of the data types used to represent timestamps, along with an > API to specify the type of data returned. > > Let's look at each item in the PEP's rationale for discarding the use of > datetimes:
> > Let's look at each item in the PEPs rationale for discarding the use of > datetimes: Oh, I forgot to mention my main concern about datetime: many functions returning timestamp have an undefined starting point (an no timezone information ), and so cannot be converted to datetime: - time.clock(), time.wallclock(), time.monotonic(), time.clock_gettime() (except for CLOCK_REALTIME) - time.clock_getres() - signal.get/setitimer() - os.wait3(), os.wait4(), resource.getrusage() - etc. Allowing datetime.datetime type just for few functions (like datetime.datetime or time.time) but not the others (raise an exception) is not an acceptable solution. > I'm looking at a use case from my flufl.lock library: > > return datetime.datetime.fromtimestamp( > os.stat(self._lockfile).st_mtime) Keep your code but just add timestamp=decimal.Decimal argument to os.stat() to get high-resolution timestamps! (well, you would at least avoid loss of precision loss if datetime is not improved to support nanosecond.) > * datetime.datetime has ordering issues with daylight saving time (DST) in > the duplicate hour of switching from DST to normal time. > > Sure, but only for timezone-ful datetimes, right? I don't know enough this topic to answer. Martin von Loewis should answer to this question! > * datetime.datetime is not as well integrated than Epoch timestamps, some > functions don't accept this type as input. For example, os.utime() expects > a tuple of Epoch timestamps. > > So, by implication, Decimal is better integrated by virtue of its ability to > be coerced to floats and other numeric stack types? Yes. decimal.Decimal is already supported by all functions accepting float (all functions expecting timestamps). > Will users ever have to explicitly convert Decimal types to use other APIs? Sorry, I don't understand. What do you mean? > It bothers me that the PEP is proposing that users will now have to be > prepared to handle yet another (and potentially *many* more) data types coming > from what are essentially datetime-like APIs. Users only get decimal.Decimal if they ask explicitly for decimal.Decimal. By default, they will still get float. Most users don't care of nanoseconds :-) If a library choose to return Decimal instead of float, it's a change in the library API unrelated to the PEP. > If it really is impossible or suboptimal to build high resolution datetimes > and timedeltas, and to use them in these APIs, then at the very least, the PEP > needs a stronger rationale for why this is. IMO supporting nanosecond in datetime and timedelta is an orthogonal issue. And yes, the PEP should maybe give better arguments against datetime :-) I will update the PEP to mention the starting point issue. > In any case, thanks for your work in this (and so many other!) areas. 
You're welcome :) From martin at v.loewis.de Mon Feb 13 22:07:54 2012 From: martin at v.loewis.de ("Martin v. Löwis") Date: Mon, 13 Feb 2012 22:07:54 +0100 Subject: [Python-Dev] PEP 394 request for pronouncement (python2 symlink in *nix systems) In-Reply-To: References: <4F37FD96.2010603@v.loewis.de> <20120212203043.GA10257@cskk.homeip.net> <4F38244D.1000908@v.loewis.de> Message-ID: <4F397BAA.1060604@v.loewis.de> > I think Antoine makes a good point about ease of introspection when > you have multiple versions in the same series installed, so I'd be > fine with: > - updating the PEP recommendation to say that either form of link is > fine (with hard links marginally faster, but harder to introspect) > - noting that python.org releases will consistently use symlinks for > easier introspection via "ls -l" > - updating Makefile.pre.in to ensure that we really do consistently use symlinks Sounds fine to me as well. When you update the PEP, please also update the mark with the actual issue number (or add it to the References). For the patch, it seems that one open issue is OSX support, although I'm unsure what exactly the issue is. Regards, Martin From victor.stinner at gmail.com Mon Feb 13 22:47:09 2012 From: victor.stinner at gmail.com (Victor Stinner) Date: Mon, 13 Feb 2012 22:47:09 +0100 Subject: [Python-Dev] PEP 410 (Decimal timestamp): the implementation is ready for a review In-Reply-To: References: Message-ID: Antoine Pitrou convinced me to simply drop the int type: float and Decimal are just enough. Use an explicit int() cast to get an int. os.stat_float_times() is still deprecated by the PEP. Victor From barry at python.org Mon Feb 13 23:08:45 2012 From: barry at python.org (Barry Warsaw) Date: Mon, 13 Feb 2012 17:08:45 -0500 Subject: [Python-Dev] PEP 394 request for pronouncement (python2 symlink in *nix systems) In-Reply-To: References: <4F37FD96.2010603@v.loewis.de> <20120212203043.GA10257@cskk.homeip.net> <4F38244D.1000908@v.loewis.de> Message-ID: <20120213170845.3ee5d4b4@resist.wooz.org> On Feb 13, 2012, at 12:31 PM, Nick Coghlan wrote: >I think Antoine makes a good point about ease of introspection when >you have multiple versions in the same series installed, so I'd be >fine with: >- updating the PEP recommendation to say that either form of link is >fine (with hard links marginally faster, but harder to introspect) >- noting that python.org releases will consistently use symlinks for >easier introspection via "ls -l" >- updating Makefile.pre.in to ensure that we really do consistently use symlinks +1, and +1 for the PEP to be accepted. >This does mean that launching Python may involve a slightly longer >symlink chain in some cases (python -> python2 -> python2.7), but the >impact of that is always going to be utterly dwarfed by other startup >costs. Agreed about startup times. However, does the symlink chain have to go in this order? Couldn't python -> python2.7 and python2 -> python2.7? OTOH, I seriously doubt removing one level of symlink chasing will have any noticeable effect on startup times. One other thing I'd like to see the PEP address is a possible migration strategy to python->python3. Even if that strategy is "don't do it, man!". IOW, can a distribution change the 'python' symlink once it's pointed to python2? What are the criteria for that? Is it up to a distribution? Will the PEP get updated when our collective wisdom says it's time to change the default? etc.
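(Sketching the two layouts under discussion with os.symlink, purely for illustration:)

    import os

    # Chained layout, as proposed in the PEP: python -> python2 -> python2.7
    os.symlink("python2.7", "python2")
    os.symlink("python2", "python")

    # Flat alternative: both names point straight at the versioned binary
    # os.symlink("python2.7", "python2")
    # os.symlink("python2.7", "python")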
Also, if Python 2.7 is being changed to add this feature, why can't Python 3.2 also be changed? (And if there's a good reason for not doing it there, that should be added to the PEP.) Cheers, -Barry From rowen at uw.edu Mon Feb 13 23:52:18 2012 From: rowen at uw.edu (Russell E. Owen) Date: Mon, 13 Feb 2012 14:52:18 -0800 Subject: [Python-Dev] peps: Update with bugfix releases. References: <20120205204551.Horde.NCdeYVNNcXdPLtxvnkzi1lA@webmail.df.eu> <4F32DF1E.40205@v.loewis.de> Message-ID: In article , Ned Deily wrote: > In article , > "Russell E. Owen" wrote: > > One problem I've run into is that the 64-bit Mac python 2.7 does not > > work properly with ActiveState Tcl/Tk. One symptom shows up when > > building matplotlib: the build fails -- both versions of Tcl/Tk somehow > > get linked in. > > The 64-bit OS X installer is built on and tested on systems with A/S > Tcl/Tk 8.5.x and we explicitly recommend its use when possible. > > http://www.python.org/download/mac/tcltk/ > > Please open a python bug for this and any other issues you know of > regarding the use with current A/S Tcl/Tk 8.5.x with current 2.7.x or > 3.2.x installers on OS X 10.6 or 10.7. Yes. I apologize. See the discussion in the Mac python mailing list (I replied to your email there). I was trying to build a matplotlib binary installer and ran into problems. I don't know where the problem comes from, and it may well not have anything to do with the python build. -- Russell From ncoghlan at gmail.com Tue Feb 14 03:14:32 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 14 Feb 2012 12:14:32 +1000 Subject: [Python-Dev] PEP 410 (Decimal timestamp): the implementation is ready for a review In-Reply-To: References: <20120213102647.5a143b07@resist.wooz.org> Message-ID: On Tue, Feb 14, 2012 at 4:33 AM, Victor Stinner wrote: >> However, I am still -1 on the solution proposed by the PEP. I still think >> that migrating to datetime use is a better way to go, rather than a >> proliferation of the data types used to represent timestamps, along with an >> API to specify the type of data returned. >> >> Let's look at each item in the PEP's rationale for discarding the use of >> datetimes: > > Oh, I forgot to mention my main concern about datetime: many functions > returning timestamps have an undefined starting point (and no timezone > information), and so cannot be converted to datetime: > - time.clock(), time.wallclock(), time.monotonic(), > time.clock_gettime() (except for CLOCK_REALTIME) > - time.clock_getres() > - signal.get/setitimer() > - os.wait3(), os.wait4(), resource.getrusage() > - etc. A datetime module based approach would need to either use a mix of datetime.datetime() (when returning an absolute time) and datetime.timedelta() (when returning a time relative to an unknown starting point), or else just always return datetime.timedelta (even when we know the epoch and could theoretically make the time absolute). In the former case, it may be appropriate to adopt a boolean flag API design and the "I want high precision time" request marker would just be "datetime=True". You'd then get back either datetime.datetime() or datetime.timedelta() as appropriate for the specific API. In the latter case, the design would be identical to the current PEP, only with "datetime.timedelta" in place of "decimal.Decimal".
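(A purely hypothetical sketch of that first, mixed variant; the datetime flag does not exist anywhere, it only illustrates the shape of the API:)

    import os
    import time

    # Hypothetical flag: known epoch, so an absolute datetime.datetime comes back
    mtime = os.stat("setup.py", datetime=True).st_mtime

    # Hypothetical flag: unknown starting point, so a datetime.timedelta comes back
    elapsed = time.clock(datetime=True)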
The challenge relative to the current PEP is that any APIs that wanted to *accept* either of these as a timestamp would need to do some specific work to avoid failing with a TypeError. For timedelta values, we'd have to define a way to easily extract the full precision timestamp as a number (total_seconds() currently returns a float, and hence can't handle nanosecond resolutions), as well as improving interoperability with algorithms that expected a floating point value. If handed a datetime value, you need to know the correct epoch value, do the subtraction, then extract the full precision timestamp from the resulting timedelta object. To make a datetime module based counter-proposal acceptable, it would need to be something along the following lines: - to avoid roundtripping problems, only return timedelta() (even for cases where we know the epoch and could theoretically return datetime instead) - implement __int__ and __float__ on timedelta (where the latter is just "self.total_seconds()" and the former "int(self.total_seconds())") It may also take some fancy footwork to avoid a circular dependency between time and datetime while supporting this (Victor allowed this in an earlier version of his patch, but he did it by accepting datetime.datetime and datetime.timedelta directly as arguments to the affected APIs). That's a relatively minor implementation concern, though (at worst it would require factoring out a support module used by both datetime and time). The big problem is that datetime and timedelta create serious compatibility headaches for existing third party APIs that accept timestamp values. This is in stark contrast to what happens with decimal.Decimal: coercion to float() or int() will potentially lose precision, but still basically works. While addition and subtraction of floats will fail, addition and subtraction of integers works fine. To avoid losing precision, it's sufficient to just avoid the coercion. I think the outline above really illustrates why the *raw* data type for timestamps should just be a number, not a higher level semantic type like timedelta or datetime. Eventually, you want to be able to express a timestamp as a number of seconds relative to a particular epoch. To do that, you want a number. Originally, we used ints; then, to support microsecond resolution, we used floats. The natural progression to support arbitrary resolutions is to decimal.Decimal. Then, the higher level APIs can be defined in *terms* of that high precision number. Would it be nice if there was a PyPI module that provided APIs that converted the raw timestamps in stat objects and other OS level APIs into datetime() and timedelta() objects as appropriate? Perhaps, although I'm not sure it's necessary. But are those types low-level enough to be suitable for the *OS* interface definition? I don't think so - we really just want a number to express "seconds since a particular time" that plays fairly nicely with other numbers, not anything fancier than that. Notice that PEP 410 as it stands can be used to *solve* the problem of how to extract the full precision timestamp from a timedelta object as a number: timedelta.total_seconds() can be updated to accept a "timestamp" argument, just like the other time related APIs already mentioned in the PEP. Then "delta.total_seconds(timestamp=decimal.Decimal)" will get you a full precision timestamp. If PEP 410 was instead defined in *terms* of timedelta, it would need to come up with a *different* solution for this.
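(For concreteness, that Decimal interoperability looks like this in an interactive session; the timestamp value is illustrative:)

    >>> from decimal import Decimal
    >>> ts = Decimal("1328006975.681211000")
    >>> float(ts)       # potentially lossy, but it works
    1328006975.681211
    >>> int(ts)
    1328006975
    >>> ts + 1          # integer arithmetic works fine
    Decimal('1328006976.681211000')
    >>> ts + 1.5        # float arithmetic fails
    Traceback (most recent call last):
      ...
    TypeError: unsupported operand type(s) for +: 'Decimal' and 'float'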
Also, by using decimal.Decimal, we open up the possibility of, at some point in the future, switching to returning high precision values by default (there are at least two prerequisites for that, though: incorporation of cdecimal into CPython and implicit promotion of floats to decimal values in binary operations without losing data. We've already started down that path by accepting floating point values directly in the Decimal constructor). No such migration path for the default behaviour presents itself for an API based on datetime or timedelta (unless we consider making timedelta behave a *lot* more like a number than it does now). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Tue Feb 14 03:28:40 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 14 Feb 2012 12:28:40 +1000 Subject: [Python-Dev] PEP 394 request for pronouncement (python2 symlink in *nix systems) In-Reply-To: <4F397BAA.1060604@v.loewis.de> References: <4F37FD96.2010603@v.loewis.de> <20120212203043.GA10257@cskk.homeip.net> <4F38244D.1000908@v.loewis.de> <4F397BAA.1060604@v.loewis.de> Message-ID: On Tue, Feb 14, 2012 at 7:07 AM, "Martin v. Löwis" wrote: >> I think Antoine makes a good point about ease of introspection when >> you have multiple versions in the same series installed, so I'd be >> fine with: >> - updating the PEP recommendation to say that either form of link is >> fine (with hard links marginally faster, but harder to introspect) >> - noting that python.org releases will consistently use symlinks for >> easier introspection via "ls -l" >> - updating Makefile.pre.in to ensure that we really do consistently use symlinks > > Sounds fine to me as well. When you update the PEP, please also update > the mark with the actual issue number (or add it to the References). Hmm, the PEP builder on python.org may need a kick. I added the tracker reference before starting this thread (http://hg.python.org/peps/rev/78b94f8648fa), but didn't comment on it since I expected the site to update in fairly short order. > For the patch, it seems that one open issue is OSX support, although > I'm unsure what exactly the issue is. I don't know either, but I'll take Ned's word for it if he says there's something more he needs to do to make it work in the OS X binaries. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Tue Feb 14 03:38:38 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 14 Feb 2012 12:38:38 +1000 Subject: [Python-Dev] PEP 394 request for pronouncement (python2 symlink in *nix systems) In-Reply-To: <20120213170845.3ee5d4b4@resist.wooz.org> References: <4F37FD96.2010603@v.loewis.de> <20120212203043.GA10257@cskk.homeip.net> <4F38244D.1000908@v.loewis.de> <20120213170845.3ee5d4b4@resist.wooz.org> Message-ID: On Tue, Feb 14, 2012 at 8:08 AM, Barry Warsaw wrote: > On Feb 13, 2012, at 12:31 PM, Nick Coghlan wrote: > >>I think Antoine makes a good point about ease of introspection when >>you have multiple versions in the same series installed, so I'd be >>fine with: >>- updating the PEP recommendation to say that either form of link is >>fine (with hard links marginally faster, but harder to introspect) >>- noting that python.org releases will consistently use symlinks for >>easier introspection via "ls -l" >>- updating Makefile.pre.in to ensure that we really do consistently use symlinks > > +1, and +1 for the PEP to be accepted.
> >>This does mean that launching Python may involve a slightly longer >>symlink chain in some cases (python -> python2 -> python2.7), but the >>impact of that is always going to be utterly dwarfed by other startup >>costs. > > Agreed about startup times. However, does the symlink chain have to go in > this order? Couldn't python -> python2.7 and python2 -> python2.7? OTOH, I > seriously doubt removing one level of symlink chasing will have any noticeable > effect on startup times. I considered that, but thought it would be odd to make people double-key a manual default version change within a series. It seemed more logical to have "python" as a binary "python2/3" switch and then have the python2/3 symlinks choose which version is the default for that series. (I'll add that rationale to the PEP, though) > One other thing I'd like to see the PEP address is a possible migration > strategy to python->python3. Even if that strategy is "don't do it, man!". > IOW, can a distribution change the 'python' symlink once it's pointed to > python2? What are the criteria for that? Is it up to a distribution? Will > the PEP get updated when our collective wisdom says it's time to change the > default? etc. I have no idea, and I'm not going to open that can of worms for this PEP. We need to say something about the executable aliases so that people can eventually write cross-platform python2 shebang lines, but how particular distros actually manage the transition is going to depend more on their infrastructure and community than it is anything to do with us. > Also, if Python 2.7 is being changed to add this feature, why can't Python 3.2 > also be changed? (And if there's a good reason for not doing it there, that > should be added to the PEP.) Because Python 3.2 already installs itself as python3 and doesn't touch the python symlink. Aside from potentially cleaning up the choice of symlinks vs hardlinks in a couple of cases, the PEP really doesn't alter Python 3 deployment at all. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From eliben at gmail.com Tue Feb 14 04:42:29 2012 From: eliben at gmail.com (Eli Bendersky) Date: Tue, 14 Feb 2012 05:42:29 +0200 Subject: [Python-Dev] folding cElementTree behind ElementTree in 3.3 In-Reply-To: References: Message-ID: > The change was committed to the default branch. In 3.3, "import > xml.etree.ElementTree" will automatically use the _elementtree accelerator, > if available, and will fall back to a Python implementation otherwise. The > documentation of ElementTree has also been updated to reflect this fact. > An open question remains on whether to deprecate cElementTree, now that this change is in place. Currently in 3.3 the whole cElementTree module is: # Deprecated alias for xml.etree.ElementTree from xml.etree.ElementTree import * Would it be alright to issue a DeprecationWarning if this module is imported? Then hopefully a couple of releases after 3.3 we can just dump it. Eli From ncoghlan at gmail.com Tue Feb 14 05:16:59 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 14 Feb 2012 14:16:59 +1000 Subject: [Python-Dev] folding cElementTree behind ElementTree in 3.3 In-Reply-To: References: Message-ID: On Tue, Feb 14, 2012 at 1:42 PM, Eli Bendersky wrote: > An open question remains on whether to deprecate cElementTree, now that this > change is in place. > > Currently in 3.3 the whole cElementTree module is: > > # Deprecated alias for xml.etree.ElementTree > > from xml.etree.ElementTree import * > > Would it be alright to issue a DeprecationWarning if this module is > imported? Then hopefully a couple of releases after 3.3 we can just dump it. What do we really gain by dumping it, though? Just add a CPython specific test that ensures: for key, value in xml.etree.ElementTree.__dict__.items(): self.assertIs(getattr(xml.etree.cElementTree, key), value) and then ignore it for the next decade or so. Programmatic deprecation is a significant imposition on third party developers and should really be reserved for APIs that actively encourage writing broken code (e.g. contextlib.nested) or are seriously problematic for python-dev to maintain. For cleanup stuff, documented deprecation is sufficient. Something that might be worth doing (although it would likely scare the peanut gallery) is to create a PEP 4000 to record the various cleanup tasks (like dropping cElementTree) that are being deliberately excluded from the 3.x series. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From eliben at gmail.com Tue Feb 14 05:25:58 2012 From: eliben at gmail.com (Eli Bendersky) Date: Tue, 14 Feb 2012 06:25:58 +0200 Subject: [Python-Dev] folding cElementTree behind ElementTree in 3.3 In-Reply-To: References: Message-ID: > Currently in 3.3 the whole cElementTree module is: > > > > # Deprecated alias for xml.etree.ElementTree > > > > from xml.etree.ElementTree import * > > > > Would it be alright to issue a DeprecationWarning if this module is > > imported? Then hopefully a couple of releases after 3.3 we can just dump > it. > > What do we really gain by dumping it, though? Just add a CPython > specific test that ensures: > > for key, value in xml.etree.ElementTree.__dict__.items(): > self.assertIs(getattr(xml.etree.cElementTree, key), value) > > and then ignore it for the next decade or so. > >
# Deprecated alias for xml.etree.ElementTree > > from xml.etree.ElementTree import * > > Would it be alright to issue a DeprecationWarning if this module is > imported? Then hopefully a couple of releases after 3.3 we can just dump > it. > > What do we really gain by dumping it, though? Just add a CPython > specific test that ensures: > > for key, value in xml.etree.ElementTree.__dict__.items(): > self.assertIs(getattr(xml.etree.cElementTree, key), value) > > and then ignore it for the next decade or so. > > With the deprecation warning being silent, is there much to lose, though? Cleanups help lower the clutter and mental burden on maintainers in the long run. If nothing is ever cleaned up, don't we end up with PHP :-) ? > Programmatic deprecation is a significant imposition on third party > developers and should really be reserved for APIs that actively > encourage writing broken code (e.g. contextlib.nested) or are > seriously problematic for python-dev to maintain. For cleanup stuff, > documented deprecation is sufficient. > > A quick search of the sources for DeprecationWarning shows that it's being used much more liberally than solely for stuff that encourages writing broken code. Has there been a recent policy change with regards to what's considered deprecated? Eli From ncoghlan at gmail.com Tue Feb 14 05:44:31 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 14 Feb 2012 14:44:31 +1000 Subject: [Python-Dev] folding cElementTree behind ElementTree in 3.3 In-Reply-To: References: Message-ID: On Tue, Feb 14, 2012 at 2:25 PM, Eli Bendersky wrote: > With the deprecation warning being silent, is there much to lose, though? Yes, it creates problems for anyone who deliberately converts all warnings to errors when running their test suites. This forces them to spend time switching over to a Python version dependent import of either cElementTree or ElementTree, time that could have been spent doing something actually productive instead of mere busywork.
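(The kind of version-dependent import being talked about here, as a sketch:)

    import sys

    # Pick the accelerated module without tripping a DeprecationWarning
    # on 3.3+, while keeping the speed-up on 3.2 and earlier.
    if sys.version_info >= (3, 3):
        import xml.etree.ElementTree as ET   # accelerated automatically
    else:
        import xml.etree.cElementTree as ET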
And, of course, even people who *don't* convert warnings to errors when running tests will have to make the same switch when the module is eventually removed. > Cleanups help lower the clutter and mental burden on maintainers in the long > run. If nothing is ever cleaned up, don't we end up with PHP :-) ? It's a balancing act, sure. But when the maintenance burden for us is low and the cost to third parties is clear, documented deprecation for eventual removal in the next release series is the better choice. >> Programmatic deprecation is a significant imposition on third party >> developers and should really be reserved for APIs that actively >> encourage writing broken code (e.g. contextlib.nested) or are >> seriously problematic for python-dev to maintain. For cleanup stuff, >> documented deprecation is sufficient. > > A quick search of the sources for DeprecationWarning shows that it's being > used much more liberally than solely for stuff that encourages writing > broken code. Has there been a recent policy change with regards to what's > considered deprecated? It's always been judged on a case-by-case basis, but yes, there's been a deliberate push in favour of purely documented deprecations in recent years (initially mostly from Raymond Hettinger, more recently from me as well, as I came to appreciate the merit of Raymond's point of view). It mainly started with the decision to leave optparse alone (aside from a deprecation note in the docs) even after argparse was added to the standard library. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From stefan_ml at behnel.de Tue Feb 14 08:58:06 2012 From: stefan_ml at behnel.de (Stefan Behnel) Date: Tue, 14 Feb 2012 08:58:06 +0100 Subject: [Python-Dev] folding cElementTree behind ElementTree in 3.3 In-Reply-To: References: Message-ID: Nick Coghlan, 14.02.2012 05:44: > On Tue, Feb 14, 2012 at 2:25 PM, Eli Bendersky wrote: >> With the deprecation warning being silent, is there much to lose, though? > > Yes, it creates problems for anyone who deliberately converts all > warnings to errors when running their test suites. This forces them to > spend time switching over to a Python version dependent import of > either cElementTree or ElementTree, time that could have been spent doing > something actually productive instead of mere busywork. > > And, of course, even people who *don't* convert warnings to errors > when running tests will have to make the same switch when the module > is eventually removed. I'm -1 on emitting a deprecation warning just because cElementTree is being replaced by a bare import. That's an implementation detail, just like cElementTree should have been an implementation detail in the first place. In all currently maintained CPython releases, importing cElementTree is the right thing to do for users. These days, other Python implementations already provide the cElementTree module as a bare alias for ElementTree.py anyway, without emitting any warnings. Why should CPython be the only one that shouts at users for importing it?
Stefan From python-dev at masklinn.net Tue Feb 14 09:01:55 2012 From: python-dev at masklinn.net (Xavier Morel) Date: Tue, 14 Feb 2012 09:01:55 +0100 Subject: [Python-Dev] folding cElementTree behind ElementTree in 3.3 In-Reply-To: References: Message-ID: <3C0FC85F-504B-4C38-AC42-8013E9DE8A7E@masklinn.net> On 2012-02-14, at 08:58 , Stefan Behnel wrote: > > These days, other Python implementations already provide the cElementTree > module as a bare alias for ElementTree.py anyway, without emitting any > warnings. Why should CPython be the only one that shouts at users for > importing it? Since all warnings are now silent by default (including DeprecationWarning), it's less of a shout and more of an eyebrow-frown and a tut-tuting really. From victor.stinner at gmail.com Tue Feb 14 13:55:23 2012 From: victor.stinner at gmail.com (Victor Stinner) Date: Tue, 14 Feb 2012 13:55:23 +0100 Subject: [Python-Dev] PEP 410 (Decimal timestamp): the implementation is ready for a review In-Reply-To: References: <20120213102647.5a143b07@resist.wooz.org> Message-ID: > A datetime module based approach would need to either use a mix of > datetime.datetime() (when returning an absolute time) and > datetime.timedelta() (when returning a time relative to an unknown > starting point), Returning a different type depending on the function would be surprising and confusing. time.clock_gettime(CLOCK_REALTIME) would return datetime.datetime, whereas time.clock_gettime(CLOCK_MONOTONIC) would return datetime.timedelta? Or time.clock_gettime(CLOCK_REALTIME) would return datetime.timedelta whereas time.time() would return datetime.datetime? What would be the logic? > or else just always return datetime.timedelta (even > when we know the epoch and could theoretically make the time > absolute). datetime.timedelta is similar to decimal.Decimal, but I don't want to support both; one is enough. I prefer Decimal because it is simpler and "compatible" with float. > In the former case, it may be appropriate to adopt a boolean flag API > design and the "I want high precision time" request marker would just > be "datetime=True". You'd then get back either datetime.datetime() or > datetime.timedelta() as appropriate for the specific API. A boolean flag has a problem with the import of the decimal module: time.time(decimal=True) would need an implicit ("hidden") import of the decimal module. Another argument present in the PEP: "The boolean argument API was rejected because it is not "pythonic". Changing the return type with a parameter value is preferred over a boolean parameter (a flag)." http://www.python.org/dev/peps/pep-0410/#add-a-boolean-argument > If handed a datetime value, you need to know the correct epoch value, > do the subtraction, then extract the full precision timestamp from the > resulting timedelta object. datetime.datetime doesn't have a .totimestamp() method. If I remember correctly, time.mktime(datetime.datetime.timetuple()) has issues with timezones and DST. > - implement __int__ and __float__ on timedelta (where the latter is > just "self.total_seconds()" and the former > "int(self.total_seconds())") It looks like a hack. Why would float(timedelta) return seconds? Why not minutes or nanoseconds? I prefer an unambiguous and explicit .toseconds() method. > The big problem is that datetime and > timedelta create serious compatibility headaches for existing third > party APIs that accept timestamp values. I just think that datetime and timedelta are overkill and have more drawbacks than advantages.
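(The explicit conversions in question are one-liners; a sketch with a plain float timestamp:)

    import datetime
    import time

    ts = time.time()                          # plain float timestamp
    dt = datetime.datetime.fromtimestamp(ts)  # absolute point in time
    td = datetime.timedelta(seconds=ts)       # duration since the epoch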
FYI when I implemented datetime, it was just implemented by calling datetime.datetime.fromtimestamp(). The user can do an explicit call to this function, and datetime.timedelta(seconds=ts) for timedelta. > This is in stark contrast to what happens with decimal.Decimal: > coercion to float() or int() will potentially lose precision, but > still basically works. While addition and subtraction of floats will > fail, addition and subtraction of integers works fine. To avoid losing > precision, it's sufficient to just avoid the coercion. Why would you like to mix Decimal and float? If you ask explicitly to get Decimal timestamps, you should use Decimal everywhere or you lose the advantages of this type (and may get TypeError). > I think the outline above really illustrates why the *raw* data type > for timestamps should just be a number, not a higher level semantic > type like timedelta or datetime. Eventually, you want to be able to > express a timestamp as a number of seconds relative to a particular > epoch. To do that, you want a number. Originally, we used ints; then, > to support microsecond resolution, we used floats. The natural > progression to support arbitrary resolutions is to decimal.Decimal. Yep. > Then, the higher level APIs can be defined in *terms* of that high > precision number. Would it be nice if there was a PyPI module that > provided APIs that converted the raw timestamps in stat objects and > other OS level APIs into datetime() and timedelta() objects as > appropriate? Do you really need a module to call datetime.datetime.fromtimestamp(ts) and datetime.timedelta(seconds=ts)? > timedelta.total_seconds() can be updated to accept a "timestamp" argument Yes, it would be consistent with the other changes introduced by the PEP. > Also, by using decimal.Decimal, we open up the possibility of, at some > point in the future, switching to returning high precision values by > default I don't think that it is necessary. Few people need this precision, and float will always be faster than Decimal because float is implemented in *hardware* (FPU). I read somewhere that IBM plans to implement decimal float in their CPU, but I suppose that it will also have a "small" size like 64 bits, whereas 64 bits is not enough for a nanosecond resolution (same issue as binary float). > implicit promotion of floats to decimal values in binary > operations without losing data I don't think that such a change would be accepted. You should ask Stefan Krah or Mark Dickinson :-) -- I completed the datetime, timedelta and boolean flag sections of PEP 410. Victor From dirkjan at ochtman.nl Tue Feb 14 14:11:01 2012 From: dirkjan at ochtman.nl (Dirkjan Ochtman) Date: Tue, 14 Feb 2012 14:11:01 +0100 Subject: [Python-Dev] PEP 410 (Decimal timestamp): the implementation is ready for a review In-Reply-To: References: <20120213102647.5a143b07@resist.wooz.org> Message-ID: FWIW, I'm with Barry on this; doing more with the datetime types seems preferable to introducing yet more different stuff to date/time handling. On Mon, Feb 13, 2012 at 19:33, Victor Stinner wrote: > Oh, I forgot to mention my main concern about datetime: many functions > returning timestamps have an undefined starting point (and no timezone > information), and so cannot be converted to datetime: > - time.clock(), time.wallclock(), time.monotonic(), > time.clock_gettime() (except for CLOCK_REALTIME) > - time.clock_getres() > - signal.get/setitimer() > - os.wait3(), os.wait4(), resource.getrusage() > - etc.
> > Allowing the datetime.datetime type just for a few functions (like > datetime.datetime or time.time) but not the others (raise an > exception) is not an acceptable solution. It seems fairly simple to suggest that the functions with an undefined starting point could return a timedelta instead of a datetime? >> * datetime.datetime has ordering issues with daylight saving time (DST) in >> the duplicate hour of switching from DST to normal time. >> >> Sure, but only for timezone-ful datetimes, right? > > I don't know enough about this topic to answer. Martin von Löwis should > answer this question! Yes, this should only be an issue for dates with timezones. >> * datetime.datetime is not as well integrated as Epoch timestamps; some >> functions don't accept this type as input. For example, os.utime() expects >> a tuple of Epoch timestamps. >> >> So, by implication, Decimal is better integrated by virtue of its ability to >> be coerced to floats and other numeric stack types? > > Yes. decimal.Decimal is already supported by all functions accepting > float (all functions expecting timestamps). I suppose something like os.utime() could be changed to also accept datetimes. >> If it really is impossible or suboptimal to build high resolution datetimes >> and timedeltas, and to use them in these APIs, then at the very least, the PEP >> needs a stronger rationale for why this is. > > IMO supporting nanoseconds in datetime and timedelta is an orthogonal issue. Not if you use it to cast them aside for this issue. ;) Cheers, Dirkjan From victor.stinner at gmail.com Tue Feb 14 14:26:49 2012 From: victor.stinner at gmail.com (Victor Stinner) Date: Tue, 14 Feb 2012 14:26:49 +0100 Subject: [Python-Dev] PEP 410 (Decimal timestamp): the implementation is ready for a review In-Reply-To: References: <20120213102647.5a143b07@resist.wooz.org> Message-ID: >> IMO supporting nanoseconds in datetime and timedelta is an orthogonal issue. > > Not if you use it to cast them aside for this issue. ;) Hum yes, I wanted to say that even if we don't keep datetime as a supported type for time.time(), we can still patch the type to make it support nanosecond resolution. Victor From barry at python.org Tue Feb 14 15:44:35 2012 From: barry at python.org (Barry Warsaw) Date: Tue, 14 Feb 2012 09:44:35 -0500 Subject: [Python-Dev] PEP 394 request for pronouncement (python2 symlink in *nix systems) In-Reply-To: References: <4F37FD96.2010603@v.loewis.de> <20120212203043.GA10257@cskk.homeip.net> <4F38244D.1000908@v.loewis.de> <20120213170845.3ee5d4b4@resist.wooz.org> Message-ID: <20120214094435.745d06e6@limelight.wooz.org> On Feb 14, 2012, at 12:38 PM, Nick Coghlan wrote: >> One other thing I'd like to see the PEP address is a possible migration >> strategy to python->python3. Even if that strategy is "don't do it, man!". >> IOW, can a distribution change the 'python' symlink once it's pointed to >> python2? What are the criteria for that? Is it up to a distribution? Will >> the PEP get updated when our collective wisdom says it's time to change the >> default? etc. > >I have no idea, and I'm not going to open that can of worms for this >PEP. We need to say something about the executable aliases so that >people can eventually write cross-platform python2 shebang lines, but >how particular distros actually manage the transition is going to >depend more on their infrastructure and community than it is anything >to do with us.
Then I think all the PEP needs to say is that it is explicitly up to the distros to determine if, when, where, and how they transition. I.e. take it off python-dev's plate. Cheers, -Barry From brett at python.org Tue Feb 14 16:38:15 2012 From: brett at python.org (Brett Cannon) Date: Tue, 14 Feb 2012 10:38:15 -0500 Subject: [Python-Dev] folding cElementTree behind ElementTree in 3.3 In-Reply-To: References: Message-ID: On Mon, Feb 13, 2012 at 23:16, Nick Coghlan wrote: > On Tue, Feb 14, 2012 at 1:42 PM, Eli Bendersky wrote: > > An open question remains on whether to deprecate cElementTree, now that > this > > change is in place. > > > > Currently in 3.3 the whole cElementTree module is: > > > > # Deprecated alias for xml.etree.ElementTree > > > > from xml.etree.ElementTree import * > > > > Would it be alright to issue a DeprecationWarning if this module is > > imported? Then hopefully a couple of releases after 3.3 we can just dump > it. > > What do we really gain by dumping it, though? Just add a CPython > specific test that ensures: > > for key, value in xml.etree.ElementTree.__dict__.items(): > self.assertIs(getattr(xml.etree.cElementTree, key), value) > > and then ignore it for the next decade or so. > > Programmatic deprecation is a significant imposition on third party > developers and should really be reserved for APIs that actively > encourage writing broken code (e.g. contextlib.nested) or are > seriously problematic for python-dev to maintain. For cleanup stuff, > documented deprecation is sufficient. > > Something that might be worth doing (although it would likely scare > the peanut gallery) is to create a PEP 4000 to record the various > cleanup tasks (like dropping cElementTree) that are being deliberately > excluded from the 3.x series. I honestly think a PEP 4000 is a good idea simply to document stuff that we are allowing to exist in Python 3 but don't think people should necessarily be using in order to follow best practices (e.g. this, ditching optparse, no more % string formatting, etc.). From merwok at netwok.org Tue Feb 14 17:12:04 2012 From: merwok at netwok.org (Éric Araujo) Date: Tue, 14 Feb 2012 17:12:04 +0100 Subject: [Python-Dev] folding cElementTree behind ElementTree in 3.3 In-Reply-To: References: Message-ID: <4F3A87D4.4010802@netwok.org> Le 14/02/2012 08:58, Stefan Behnel a écrit : > I'm -1 on emitting a deprecation warning just because cElementTree is being > replaced by a bare import. That's an implementation detail, just like > cElementTree should have been an implementation detail in the first place. > In all currently maintained CPython releases, importing cElementTree is the > right thing to do for users. +1! From chris at simplistix.co.uk Tue Feb 14 19:55:05 2012 From: chris at simplistix.co.uk (Chris Withers) Date: Tue, 14 Feb 2012 18:55:05 +0000 Subject: [Python-Dev] PyPy 1.8 released In-Reply-To: References: Message-ID: <4F3AAE09.8090808@simplistix.co.uk> On 10/02/2012 09:44, Maciej Fijalkowski wrote: > you can download the PyPy 1.8 release here: > > http://pypy.org/download.html Why no Windows 64-bit build :'( Is the 32-bit build safe to use on 64-bit Windows?
Chris -- Simplistix - Content Management, Batch Processing & Python Consulting - http://www.simplistix.co.uk From amauryfa at gmail.com Tue Feb 14 20:00:45 2012 From: amauryfa at gmail.com (Amaury Forgeot d'Arc) Date: Tue, 14 Feb 2012 20:00:45 +0100 Subject: [Python-Dev] PyPy 1.8 released In-Reply-To: <4F3AAE09.8090808@simplistix.co.uk> References: <4F3AAE09.8090808@simplistix.co.uk> Message-ID: 2012/2/14 Chris Withers > On 10/02/2012 09:44, Maciej Fijalkowski wrote: >> you can download the PyPy 1.8 release here: >> >> http://pypy.org/download.html >> > > Why no Windows 64-bit build :'( > The win64 port was not finished. This platform is different from the others mostly because a pointer (64 bits) is larger than a long (32 bits on all Windows flavors). Is the 32-bit build safe to use on 64-bit Windows? Yes, like many other 32-bit programs, pypy for win32 works on Windows 64-bit. It will be limited to 3 GB of memory, of course. -- Amaury Forgeot d'Arc From barry at python.org Tue Feb 14 22:27:27 2012 From: barry at python.org (Barry Warsaw) Date: Tue, 14 Feb 2012 16:27:27 -0500 Subject: [Python-Dev] PEP 410 (Decimal timestamp): the implementation is ready for a review References: <20120213102647.5a143b07@resist.wooz.org> Message-ID: <20120214162727.2f9752ad@resist.wooz.org> On Feb 13, 2012, at 07:33 PM, Victor Stinner wrote: >Oh, I forgot to mention my main concern about datetime: many functions >returning timestamps have an undefined starting point (and no timezone >information), and so cannot be converted to datetime: > - time.clock(), time.wallclock(), time.monotonic(), >time.clock_gettime() (except for CLOCK_REALTIME) > - time.clock_getres() > - signal.get/setitimer() > - os.wait3(), os.wait4(), resource.getrusage() > - etc. That's not strictly true though, is it? E.g. clock_gettime() returns the number of seconds since the Epoch, which is a well-defined start time at least on *nix systems.
I mentioned the exception: time.clock_gettime(CLOCK_REALTIME) returns an Epoch timestamp, but all other clocks supported by clock_gettime() have an unspecified starting point:
- CLOCK_MONOTONIC
- CLOCK_MONOTONIC_RAW
- CLOCK_PROCESS_CPUTIME_ID
- CLOCK_THREAD_CPUTIME_ID
> So clearly those types of functions could return datetimes. What? What would be the starting point for all these functions? It would be surprising to get a datetime for CLOCK_PROCESS_CPUTIME_ID for example. > I'm fairly certain that between those types of functions and timedeltas you > could have most of the bases covered. Ah, timedelta case is different. But I already replied to Nick in this thread about timedelta. You can also From victor.stinner at gmail.com Tue Feb 14 22:59:27 2012 From: victor.stinner at gmail.com (Victor Stinner) Date: Tue, 14 Feb 2012 22:59:27 +0100 Subject: [Python-Dev] PEP 410 (Decimal timestamp): the implementation is ready for a review In-Reply-To: References: <20120213102647.5a143b07@resist.wooz.org> <20120214162727.2f9752ad@resist.wooz.org> Message-ID: (Oops, I sent my email by mistake, here is the end of my email) > (...) Ah, timedelta case is different. But I already replied to Nick in this > thread about timedelta. You can also see arguments against timedelta in the PEP 410. Victor From barry at python.org Tue Feb 14 23:29:20 2012 From: barry at python.org (Barry Warsaw) Date: Tue, 14 Feb 2012 17:29:20 -0500 Subject: [Python-Dev] PEP 410 (Decimal timestamp): the implementation is ready for a review In-Reply-To: References: <20120213102647.5a143b07@resist.wooz.org> <20120214162727.2f9752ad@resist.wooz.org> Message-ID: <20120214172920.0a1da837@resist.wooz.org> I think I will just state my reasoning one last time and then leave it to the BDFL or BDFOP to make the final decision. Victor on IRC says that there is not much difference between Decimal and timedelta, and this may be true from an implementation point of view. From a cognitive point of view, I think they're miles apart. Ultimately, I wish ints and floats weren't used for time-y things, and only datetimes (for values with well-defined starting points, including the epoch) and timedeltas (for values with no starting point) were used. We obviously can't eliminate the APIs that return and accept ints and floats, most of which we inherited from C, but we can avoid making it worse by extending them to also accept Decimals. I think it would be valuable work to correct any deficiencies in datetimes and timedeltas so that they can be used in all time-y APIs, with whatever resolution is necessary. My primary concern with the PEP is adding to users confusion when they have to handle (at least) 5 different types[*] that represent time in Python. Cheers, -Barry [*] int, float, Decimal, datetime, timedelta; are there others? -------------- next part -------------- A non-text attachment was scrubbed...
Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: not available URL: From ncoghlan at gmail.com Wed Feb 15 01:14:35 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 15 Feb 2012 10:14:35 +1000 Subject: [Python-Dev] PEP 394 request for pronouncement (python2 symlink in *nix systems) In-Reply-To: <20120214094435.745d06e6@limelight.wooz.org> References: <4F37FD96.2010603@v.loewis.de> <20120212203043.GA10257@cskk.homeip.net> <4F38244D.1000908@v.loewis.de> <20120213170845.3ee5d4b4@resist.wooz.org> <20120214094435.745d06e6@limelight.wooz.org> Message-ID: On Wed, Feb 15, 2012 at 12:44 AM, Barry Warsaw wrote: > On Feb 14, 2012, at 12:38 PM, Nick Coghlan wrote: > >>> One other thing I'd like to see the PEP address is a possible migration >>> strategy to python->python3. Even if that strategy is "don't do it, man!". >>> IOW, can a distribution change the 'python' symlink once it's pointed to >>> python2? What are the criteria for that? Is it up to a distribution? Will >>> the PEP get updated when our collective wisdom says it's time to change the >>> default? etc. >> >>I have no idea, and I'm not going to open that can of worms for this >>PEP. We need to say something about the executable aliases so that >>people can eventually write cross-platform python2 shebang lines, but >>how particular distros actually manage the transition is going to >>depend more on their infrastructure and community than it is anything >>to do with us. > > Then I think all the PEP needs to say is that it is explicitly up to the > distros to determine if, when, where, and how they transition. I.e. take it > off of python-dev's plate. Yeah, good idea. I'll also add an explicit link to the announcement of the Arch Linux transition [1] that precipitated this PEP. [1] https://www.archlinux.org/news/python-is-now-python-3/ -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Wed Feb 15 01:23:17 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 15 Feb 2012 10:23:17 +1000 Subject: [Python-Dev] PEP 410 (Decimal timestamp): the implementation is ready for a review In-Reply-To: <20120214172920.0a1da837@resist.wooz.org> References: <20120213102647.5a143b07@resist.wooz.org> <20120214162727.2f9752ad@resist.wooz.org> <20120214172920.0a1da837@resist.wooz.org> Message-ID: On Wed, Feb 15, 2012 at 8:29 AM, Barry Warsaw wrote: > My primary concern with the PEP is adding to users confusion when they have to > handle (at least) 5 different types[*] that represent time in Python. My key question to those advocating the use of timedelta instead of Decimal: What should timedelta.total_seconds() return to avoid losing nanosecond precision? How should this be requested when calling the API? The core "timestamp" abstraction is "just a number" that (in context) represents a certain number of seconds. decimal.Decimal qualifies. datetime.timedelta doesn't - it's a higher level construct that makes the semantic context explicit (and currently refuses to interoperate with other values that are just numbers). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From greg at krypto.org Wed Feb 15 02:00:40 2012 From: greg at krypto.org (Gregory P. Smith) Date: Tue, 14 Feb 2012 17:00:40 -0800 Subject: [Python-Dev] How to round timestamps and durations? In-Reply-To: References: Message-ID: On Mon, Feb 13, 2012 at 4:59 AM, Victor Stinner wrote: > Hi, > > My work on the PEP 410 tries to unify the code to manipulate > timestamps.
The problem is that I'm unable to decide how to round > these numbers. > > Functions using a resolution of 1 second (e.g. time.mktime) expects > rounding towards zero (ROUND_HALF_DOWN), as does int(float). Example:
>
> >>> time.mktime(time.localtime(-1.9)), time.mktime(time.localtime(1.9))
> (-1.0, 1.0)
Otherwise known as truncation and the same behavior as C when assigning a float to an int. > > datetime.datetime.fromtimestamp() rounds to nearest with ties going > away from zero (ROUND_HALF_UP). Example:
>
> >>> datetime.datetime.fromtimestamp(-1.1e-6), datetime.datetime.fromtimestamp(1.1e-6)
> (datetime.datetime(1970, 1, 1, 0, 59, 59, 999999),
> datetime.datetime(1970, 1, 1, 1, 0, 0, 1))
> >>> datetime.datetime.fromtimestamp(-1.9e-6), datetime.datetime.fromtimestamp(1.9e-6)
> (datetime.datetime(1970, 1, 1, 0, 59, 59, 999998),
> datetime.datetime(1970, 1, 1, 1, 0, 0, 2))
> > datetime.timedelta * float and datetime.timedelta / float rounds to > nearest with ties going to nearest even integer (ROUND_HALF_EVEN), as > does round(). Example:
>
> >>> [(datetime.timedelta(microseconds=x) / 2.0).microseconds for x in range(6)]
> [0, 0, 1, 2, 2, 2]
> > Should I also support multiple rounding methods depending on the > operation and the Python function? Should we always use the same > rounding method? > > Antoine pointed out to me that ROUND_HALF_UP can produce timestamps "in the > future", which is especially visible when using a resolution of 1 > second. I like this rounding method because it limits the loss of > precision to half a unit: abs(rounded - timestamp) <= 0.5. But it can > be "surprising". I didn't know the other APIs ever rounded up, so I find them a bit surprising. Realistically, outside of the int(time.time()) case I don't expect much code out there to actually care one way or the other, so I'd stick with that style of truncation (towards zero) rounding in all situations myself. -gps > The rounding method should maybe be the same as int(float) (so > ROUND_HALF_DOWN) to avoid surprising results for applications using > int(time.time()) for example (I had such a problem with rotated logs and > test_logging). From greg at krypto.org Wed Feb 15 02:10:20 2012 From: greg at krypto.org (Gregory P. Smith) Date: Tue, 14 Feb 2012 17:10:20 -0800 Subject: [Python-Dev] PEP 410 (Decimal timestamp): the implementation is ready for a review In-Reply-To: <20120214172920.0a1da837@resist.wooz.org> References: <20120213102647.5a143b07@resist.wooz.org> <20120214162727.2f9752ad@resist.wooz.org> <20120214172920.0a1da837@resist.wooz.org> Message-ID: On Tue, Feb 14, 2012 at 2:29 PM, Barry Warsaw wrote: > I think I will just state my reasoning one last time and then leave it to the > BDFL or BDFOP to make the final decision. > > Victor on IRC says that there is not much difference between Decimal and > timedelta, and this may be true from an implementation point of view. From a > cognitive point of view, I think they're miles apart. Ultimately, I wish ints > and floats weren't used for time-y things, and only datetimes (for values with > well-defined starting points, including the epoch) and timedeltas (for values > with no starting point) were used. > > We obviously can't eliminate the APIs that return and accept ints and floats, > most of which we inherited from C, but we can avoid making it worse by > extending them to also accept Decimals.
I think it would be valuable work to > correct any deficiencies in datetimes and timedeltas so that they can be used > in all time-y APIs, with whatever resolution is necessary. > > My primary concern with the PEP is adding to users confusion when they have to > handle (at least) 5 different types[*] that represent time in Python. +1 From greg at krypto.org Wed Feb 15 02:13:16 2012 From: greg at krypto.org (Gregory P. Smith) Date: Tue, 14 Feb 2012 17:13:16 -0800 Subject: [Python-Dev] PEP 410 (Decimal timestamp): the implementation is ready for a review In-Reply-To: References: <20120213102647.5a143b07@resist.wooz.org> <20120214162727.2f9752ad@resist.wooz.org> <20120214172920.0a1da837@resist.wooz.org> Message-ID: On Tue, Feb 14, 2012 at 4:23 PM, Nick Coghlan wrote: > On Wed, Feb 15, 2012 at 8:29 AM, Barry Warsaw wrote: >> My primary concern with the PEP is adding to users confusion when they have to >> handle (at least) 5 different types[*] that represent time in Python. > > My key question to those advocating the use of timedelta instead of Decimal: > > What should timedelta.total_seconds() return to avoid losing > nanosecond precision? > How should this be requested when calling the API? It should return a float as it does today. Add a timedelta.total_nanoseconds() call for people wanting high precision as a raw number and remind people of the precision limits of total_seconds() in the docs. -gps From greg at krypto.org Wed Feb 15 02:13:58 2012 From: greg at krypto.org (Gregory P. Smith) Date: Tue, 14 Feb 2012 17:13:58 -0800 Subject: [Python-Dev] PEP 410 (Decimal timestamp): the implementation is ready for a review In-Reply-To: References: <20120213102647.5a143b07@resist.wooz.org> <20120214162727.2f9752ad@resist.wooz.org> <20120214172920.0a1da837@resist.wooz.org> Message-ID: On Tue, Feb 14, 2012 at 5:13 PM, Gregory P. Smith wrote: > On Tue, Feb 14, 2012 at 4:23 PM, Nick Coghlan wrote: >> On Wed, Feb 15, 2012 at 8:29 AM, Barry Warsaw wrote: >>> My primary concern with the PEP is adding to users confusion when they have to >>> handle (at least) 5 different types[*] that represent time in Python. >> >> My key question to those advocating the use of timedelta instead of Decimal: >> >> What should timedelta.total_seconds() return to avoid losing >> nanosecond precision? >> How should this be requested when calling the API? > > It should return a float as it does today. Add a > timedelta.total_nanoseconds() call for people wanting high precision > as a raw number and remind people of the precision limits of > total_seconds() in the docs. total_nanoseconds() would return an int() in case that wasn't obvious. > > -gps From g.brandl at gmx.net Wed Feb 15 08:42:02 2012 From: g.brandl at gmx.net (Georg Brandl) Date: Wed, 15 Feb 2012 08:42:02 +0100 Subject: [Python-Dev] How to round timestamps and durations? In-Reply-To: References: Message-ID: On 13.02.2012 13:59, Victor Stinner wrote: > Hi, > > My work on the PEP 410 tries to unify the code to manipulate > timestamps. The problem is that I'm unable to decide how to round > these numbers. > > Functions using a resolution of 1 second (e.g. time.mktime) expects > rounding towards zero (ROUND_HALF_DOWN), as does int(float). Example: FWIW, that's ROUND_DOWN. ROUND_HALF_DOWN rounds up from > x.5.
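To see the difference concretely, here is a quick check with the decimal module (an illustrative sketch, not part of the original message):

>>> from decimal import Decimal, ROUND_DOWN, ROUND_HALF_DOWN
>>> Decimal("1.6").quantize(Decimal("1"), rounding=ROUND_DOWN)
Decimal('1')
>>> Decimal("1.6").quantize(Decimal("1"), rounding=ROUND_HALF_DOWN)
Decimal('2')
>>> Decimal("1.5").quantize(Decimal("1"), rounding=ROUND_HALF_DOWN)
Decimal('1')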
Georg From martin at v.loewis.de Wed Feb 15 10:11:57 2012 From: martin at v.loewis.de (=?ISO-8859-15?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 15 Feb 2012 10:11:57 +0100 Subject: [Python-Dev] PEP 410 (Decimal timestamp): the implementation is ready for a review In-Reply-To: <20120214172920.0a1da837@resist.wooz.org> References: <20120213102647.5a143b07@resist.wooz.org> <20120214162727.2f9752ad@resist.wooz.org> <20120214172920.0a1da837@resist.wooz.org> Message-ID: <4F3B76DD.6080308@v.loewis.de> On 14.02.2012 23:29, Barry Warsaw wrote: > I think I will just state my reasoning one last time and then leave it to the > BDFL or BDFOP to make the final decision. I'd like to remind people what the original point of the PEP process was: to avoid going in cycles in discussions. To achieve this, the PEP author is supposed to record all objections in the PEP, even if he disagrees (and may state rebuttals for each objection that people brought up). So, Victor: please record all objections in a separate section of the PEP, rather than just rebutting them in the PEP (as is currently the case). > My primary concern with the PEP is adding to users confusion when they have to > handle (at least) 5 different types[*] that represent time in Python. I agree with Barry here (despite having voiced support for using Decimal before): datetime.datetime *is* the right data type to represent time stamps. If it means that it needs to be improved before it can be used in practice, then so be it - improve it. I think improving datetime needs to go in two directions: a) arbitrary-precision second fractions. My motivation for proposing/supporting Decimal was that it can support arbitrary precision, unlike any of the alternatives (except for using numerator/denominator pairs). So just adding nanosecond resolution to datetime is not enough: it needs to support arbitrary decimal fractions (it doesn't need to support non-decimal fractions, IMO). b) distinction between universal time and local time. This distinction is currently blurred; there should be prominent API to determine whether a point-in-time is meant as universal time or local time. In terminology of the datetime documentation, there needs to be builtin support for "aware" (rather than "naive") UTC time, even if that's the only timezone that comes with Python. Regards, Martin From dirkjan at ochtman.nl Wed Feb 15 10:38:47 2012 From: dirkjan at ochtman.nl (Dirkjan Ochtman) Date: Wed, 15 Feb 2012 10:38:47 +0100 Subject: [Python-Dev] PEP 410 (Decimal timestamp): the implementation is ready for a review In-Reply-To: <4F3B76DD.6080308@v.loewis.de> References: <20120213102647.5a143b07@resist.wooz.org> <20120214162727.2f9752ad@resist.wooz.org> <20120214172920.0a1da837@resist.wooz.org> <4F3B76DD.6080308@v.loewis.de> Message-ID: On Wed, Feb 15, 2012 at 10:11, "Martin v. Löwis" wrote: >> My primary concern with the PEP is adding to users confusion when they have to >> handle (at least) 5 different types[*] that represent time in Python. > > I agree with Barry here (despite having voiced support for using Decimal > before): datetime.datetime *is* the right data type to represent time > stamps. If it means that it needs to be improved before it can be used > in practice, then so be it - improve it. > > I think improving datetime needs to go in two directions: > a) arbitrary-precision second fractions. My motivation for > proposing/supporting Decimal was that it can support arbitrary > precision, unlike any of the alternatives (except for using >
numerator/denominator pairs). So just adding nanosecond resolution > to datetime is not enough: it needs to support arbitrary decimal > fractions (it doesn't need to support non-decimal fractions, IMO). > b) distinction between universal time and local time. This distinction > is currently blurred; there should be prominent API to determine > whether a point-in-time is meant as universal time or local time. > In terminology of the datetime documentation, there needs to be > builtin support for "aware" (rather than "naive") UTC time, even > if that's the only timezone that comes with Python. +1. And adding stuff to datetime to make it easier to get a unix timestamp out (as proposed by Victor before, IIRC) would also be a good thing in my book. I really want to be able to handle all my date+time needs without ever importing time or calendar. Cheers, Dirkjan From ncoghlan at gmail.com Wed Feb 15 12:43:17 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 15 Feb 2012 21:43:17 +1000 Subject: [Python-Dev] PEP 410 (Decimal timestamp): the implementation is ready for a review In-Reply-To: <4F3B76DD.6080308@v.loewis.de> References: <20120213102647.5a143b07@resist.wooz.org> <20120214162727.2f9752ad@resist.wooz.org> <20120214172920.0a1da837@resist.wooz.org> <4F3B76DD.6080308@v.loewis.de> Message-ID: On Wed, Feb 15, 2012 at 7:11 PM, "Martin v. Löwis" wrote: > I agree with Barry here (despite having voiced support for using Decimal > before): datetime.datetime *is* the right data type to represent time > stamps. If it means that it needs to be improved before it can be used > in practice, then so be it - improve it. By contrast, I think the only remotely viable choices for arbitrary precision low level timestamp APIs are decimal.Decimal and datetime.timedelta. The "unknown epoch" problem makes it impossible to consistently produce datetime.datetime objects, and an API that inconsistently returned either datetime.datetime or datetime.timedelta for operations that currently consistently return float objects would just be annoying. However, I still think that decimal.Decimal is the right choice. There's nothing wrong with layering APIs, and the core concept of a timestamp is simply a number representing a certain number of seconds. We already have a data type that lets us represent a numeric value to arbitrary precision: decimal.Decimal. Instead of trying to hoist all those APIs up to a higher semantic level, I'd prefer to just leave them as they are now: dealing with numbers (originally ints, then floats to support microseconds, now decimal.Decimal to support nanoseconds and any future increases in precision). If the higher level semantic API is incomplete, then we should *complete it* instead of trying to mash the two different layers together indiscriminately. > I think improving datetime needs to go in two directions: > a) arbitrary-precision second fractions. My motivation for > proposing/supporting Decimal was that it can support arbitrary > precision, unlike any of the alternatives (except for using > numerator/denominator pairs). So just adding nanosecond resolution > to datetime is not enough: it needs to support arbitrary decimal > fractions (it doesn't need to support non-decimal fractions, IMO). If our core timestamp representation is decimal.Decimal, this is trivial to implement for both datetime and timedelta - just store the seconds component as a decimal.Decimal instance.
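For illustration, a minimal sketch of that idea (the class name and API here are hypothetical, not a real proposal):

from decimal import Decimal

class DecimalDelta:
    """Hypothetical timedelta-like type: keeping the seconds as a
    Decimal preserves any decimal fraction (nanoseconds and finer)."""
    def __init__(self, seconds):
        self.seconds = Decimal(seconds)

    def total_seconds(self):
        return self.seconds  # exact, no float rounding

d = DecimalDelta("1.000000001")  # one second and one nanosecond
print(d.total_seconds())         # Decimal('1.000000001')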
If not, we'd have to come up with some other way of obtaining arbitrary precision numeric storage (which seems rather wasteful). Even if we end up going down the datetime.timedelta path for the os module APIs, that's still the way I would want to go - arranging for timedelta.total_seconds() to return a Decimal value, rather than some other clumsy alternative like having a separate total_nanoseconds() function that returned a large integer. > b) distinction between universal time and local time. This distinction > is currently blurred; there should be prominent API to determine > whether a point-in-time is meant as universal time or local time. > In terminology of the datetime documentation, there needs to be > builtin support for "aware" (rather than "naive") UTC time, even > if that's the only timezone that comes with Python. As of 3.2, the datetime module already has full support for arbitrary fixed offsets from UTC, including datetime.timezone.utc (i.e. UTC+0), which allows timezone-aware UTC. For 3.2+, you should only need a third party library like pytz if you want to support named timezones (including daylight savings changes). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From victor.stinner at gmail.com Wed Feb 15 14:01:32 2012 From: victor.stinner at gmail.com (Victor Stinner) Date: Wed, 15 Feb 2012 14:01:32 +0100 Subject: [Python-Dev] PEP 410 (Decimal timestamp): the implementation is ready for a review In-Reply-To: <4F3B76DD.6080308@v.loewis.de> References: <20120213102647.5a143b07@resist.wooz.org> <20120214162727.2f9752ad@resist.wooz.org> <20120214172920.0a1da837@resist.wooz.org> <4F3B76DD.6080308@v.loewis.de> Message-ID: > I agree with Barry here (despite having voiced support for using Decimal > before): datetime.datetime *is* the right data type to represent time > stamps. If it means that it needs to be improved before it can be used > in practice, then so be it - improve it. Maybe I missed the answer, but how do you handle a timestamp with an unspecified starting point like os.times() or time.clock()? Should we leave these functions unchanged? My motivation for the PEP 410 is to provide nanosecond resolution for time.clock_gettime(time.CLOCK_MONOTONIC) and time.clock_gettime(time.CLOCK_REALTIME). Victor From victor.stinner at gmail.com Wed Feb 15 14:14:54 2012 From: victor.stinner at gmail.com (Victor Stinner) Date: Wed, 15 Feb 2012 14:14:54 +0100 Subject: [Python-Dev] PEP 410 (Decimal timestamp): the implementation is ready for a review In-Reply-To: <4F3B76DD.6080308@v.loewis.de> References: <20120213102647.5a143b07@resist.wooz.org> <20120214162727.2f9752ad@resist.wooz.org> <20120214172920.0a1da837@resist.wooz.org> <4F3B76DD.6080308@v.loewis.de> Message-ID: 2012/2/15 "Martin v. Löwis" : > I agree with Barry here (despite having voiced support for using Decimal > before): datetime.datetime *is* the right data type to represent time > stamps. If it means that it needs to be improved before it can be used > in practice, then so be it - improve it. Decimal and datetime.datetime are not necessarily exclusive options. Using the API proposed in the PEP, we can add the Decimal type today, then improve the datetime.datetime API, and finally also add the datetime.datetime type. Such a compromise would solve the unspecified starting date issue: an exception would be raised if the timestamp has an unspecified starting point. In such a case, you can still get the timestamp as a Decimal object with nanosecond resolution.
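As a toy illustration of the proposed func(timestamp=type) convention (a sketch only: the real time.time() takes no such argument, and sample_time() is made up here):

import time
from decimal import Decimal

def sample_time(timestamp=float):
    # pretend the OS handed us an integer count of nanoseconds
    seconds, nanos = divmod(int(time.time() * 10**9), 10**9)
    if timestamp is float:
        return seconds + nanos / 1e9                         # may round
    if timestamp is Decimal:
        return Decimal(seconds) + Decimal(nanos).scaleb(-9)  # exact
    raise ValueError("unsupported timestamp type: %r" % (timestamp,))

print(sample_time())         # a float, as today
print(sample_time(Decimal))  # a Decimal keeping all fractional digits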
Or we may add support for datetime and Decimal today, even if datetime only supports microseconds, and improve datetime later to support nanoseconds. It looks like there are use cases for Decimal and datetime, both are useful. At least, datetime has a nice object API related to time, whereas Decimal requires functions from other modules. I don't know yet if one type is enough to handle all use cases. I wrote a patch to demonstrate that my internal API can be extended (store more information for new types like datetime.datetime) to add new types later, without touching the public API (func(timestamp=type)). See timestamp_datetime.patch attached to the issue #13882 (the patch is now outdated, I can update it if you would like to). For example:
- time.time() would support float, Decimal and datetime
- os.times() would support float and Decimal (but not datetime)
Victor From victor.stinner at gmail.com Wed Feb 15 14:21:37 2012 From: victor.stinner at gmail.com (Victor Stinner) Date: Wed, 15 Feb 2012 14:21:37 +0100 Subject: [Python-Dev] PEP 410 (Decimal timestamp): the implementation is ready for a review In-Reply-To: <4F3B76DD.6080308@v.loewis.de> References: <20120213102647.5a143b07@resist.wooz.org> <20120214162727.2f9752ad@resist.wooz.org> <20120214172920.0a1da837@resist.wooz.org> <4F3B76DD.6080308@v.loewis.de> Message-ID: > I'd like to remind people what the original point of the PEP process > was: to avoid going in cycles in discussions. To achieve this, the PEP > author is supposed to record all objections in the PEP, even if he > disagrees (and may state rebuttals for each objection that people > brought up). > > So, Victor: please record all objections in a separate section of the > PEP, rather than just rebutting them in the PEP (as is currently the > case). Ok, I will try to list alternatives differently, e.g. by also listing advantages. I didn't know what a PEP is supposed to contain. Victor From barry at python.org Wed Feb 15 14:36:06 2012 From: barry at python.org (Barry Warsaw) Date: Wed, 15 Feb 2012 08:36:06 -0500 Subject: [Python-Dev] PEP 410 (Decimal timestamp): the implementation is ready for a review In-Reply-To: References: <20120213102647.5a143b07@resist.wooz.org> <20120214162727.2f9752ad@resist.wooz.org> <20120214172920.0a1da837@resist.wooz.org> Message-ID: <20120215083606.74a0fba6@resist.wooz.org> On Feb 15, 2012, at 10:23 AM, Nick Coghlan wrote: >What should timedelta.total_seconds() return to avoid losing nanosecond >precision? How should this be requested when calling the API? See, I have no problem having this method return a Decimal for high precision values. This preserves the valuable abstraction of timedeltas, but also provides a useful method for interoperability. >The core "timestamp" abstraction is "just a number" that (in context) >represents a certain number of seconds. decimal.Decimal qualifies. >datetime.timedelta doesn't - it's a higher level construct that makes >the semantic context explicit (and currently refuses to interoperate >with other values that are just numbers). Right, but I think Python should promote the abstraction as the way to manipulate time-y data. Interoperability is an important principle to maintain, but IMO the right way to do that is to improve datetime and timedelta so that lower-level values can be extracted from, and added to, the higher-level abstract types. I think there are quite a few opportunities for improving the interoperability of datetime and timedelta, but that shouldn't be confused with bypassing them.
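For instance, exact extraction of a Decimal value from an existing timedelta takes only a small helper (a sketch, not an existing method):

import datetime
from decimal import Decimal

def total_seconds_exact(td):
    # days/seconds/microseconds are integers, so this is lossless
    return (Decimal(td.days) * 86400 + td.seconds
            + Decimal(td.microseconds).scaleb(-6))

td = datetime.timedelta(days=1, seconds=2, microseconds=3)
print(total_seconds_exact(td))  # Decimal('86402.000003')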
Cheers, -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: not available URL: From barry at python.org Wed Feb 15 14:48:39 2012 From: barry at python.org (Barry Warsaw) Date: Wed, 15 Feb 2012 08:48:39 -0500 Subject: [Python-Dev] PEP 410 (Decimal timestamp): the implementation is ready for a review In-Reply-To: <4F3B76DD.6080308@v.loewis.de> References: <20120213102647.5a143b07@resist.wooz.org> <20120214162727.2f9752ad@resist.wooz.org> <20120214172920.0a1da837@resist.wooz.org> <4F3B76DD.6080308@v.loewis.de> Message-ID: <20120215084839.6116361a@resist.wooz.org> On Feb 15, 2012, at 10:11 AM, Martin v. Löwis wrote: >I think improving datetime needs to go in two directions: >a) arbitrary-precision second fractions. My motivation for > proposing/supporting Decimal was that it can support arbitrary > precision, unlike any of the alternatives (except for using > numerator/denominator pairs). So just adding nanosecond resolution > to datetime is not enough: it needs to support arbitrary decimal > fractions (it doesn't need to support non-decimal fractions, IMO). >b) distinction between universal time and local time. This distinction > is currently blurred; there should be prominent API to determine > whether a point-in-time is meant as universal time or local time. > In terminology of the datetime documentation, there needs to be > builtin support for "aware" (rather than "naive") UTC time, even > if that's the only timezone that comes with Python. +1 -Barry -------------- next part -------------- A non-text attachment was scrubbed...
(Hey, we don't even admit the existence of leap seconds in most places -- not that I mind. :-) What purpose is there to recording timestamps in nanoseconds? For clocks that start when the process starts running, float *is* (basically) good enough. For measuring e.g. file access times, there is no way that the actual time is know with anything like that precision (even if it is *recorded* as a number of milliseconds -- that's a different issue). Maybe it's okay to wait a few years on this, until either 128-bit floats are more common or cDecimal becomes the default floating point type? In the mean time for clock freaks we can have a few specialized APIs that return times in nanoseconds as a (long) integer. -- --Guido van Rossum (python.org/~guido) From solipsis at pitrou.net Wed Feb 15 17:47:11 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 15 Feb 2012 17:47:11 +0100 Subject: [Python-Dev] PEP 410 (Decimal timestamp): the implementation is ready for a review References: <4f3bc4c3.a54ab60a.65e2.2f66@mx.google.com> Message-ID: <20120215174711.06d75a67@pitrou.net> On Wed, 15 Feb 2012 08:39:45 -0800 Guido van Rossum wrote: > > What purpose is there to recording timestamps in nanoseconds? For > clocks that start when the process starts running, float *is* > (basically) good enough. For measuring e.g. file access times, there > is no way that the actual time is know with anything like that > precision (even if it is *recorded* as a number of milliseconds -- > that's a different issue). The number one use case, as far as I understand, is to have bit-identical file modification timestamps where it can matter. I agree that the rest is anecdotical. Regards Antoine. From guido at python.org Wed Feb 15 18:13:13 2012 From: guido at python.org (Guido van Rossum) Date: Wed, 15 Feb 2012 09:13:13 -0800 Subject: [Python-Dev] PEP 410 (Decimal timestamp): the implementation is ready for a review In-Reply-To: <20120215174711.06d75a67@pitrou.net> References: <4f3bc4c3.a54ab60a.65e2.2f66@mx.google.com> <20120215174711.06d75a67@pitrou.net> Message-ID: On Wed, Feb 15, 2012 at 8:47 AM, Antoine Pitrou wrote: > On Wed, 15 Feb 2012 08:39:45 -0800 > Guido van Rossum wrote: >> >> What purpose is there to recording timestamps in nanoseconds? For >> clocks that start when the process starts running, float *is* >> (basically) good enough. For measuring e.g. file access times, there >> is no way that the actual time is know with anything like that >> precision (even if it is *recorded* as a number of milliseconds -- >> that's a different issue). > > The number one use case, as far as I understand, is to have > bit-identical file modification timestamps where it can matter. So that can be solved by adding extra fields st_{a,c,m}time_ns and an extra os.utime_ns() call. Only the rare tool for making 100% faithful backups of filesystems and the like would care. -- --Guido van Rossum (python.org/~guido) From guido at python.org Wed Feb 15 18:20:31 2012 From: guido at python.org (Guido van Rossum) Date: Wed, 15 Feb 2012 09:20:31 -0800 Subject: [Python-Dev] PEP 394 request for pronouncement (python2 symlink in *nix systems) In-Reply-To: References: <4F37FD96.2010603@v.loewis.de> <20120212203043.GA10257@cskk.homeip.net> <4F38244D.1000908@v.loewis.de> <20120213170845.3ee5d4b4@resist.wooz.org> <20120214094435.745d06e6@limelight.wooz.org> Message-ID: Does this need a pronouncement? 
Worrying about the speed of symlinks seems silly, and exactly how the links are created (hard or soft, chaining or direct) should be up to the distro; our own Makefile should create chaining symlinks just so the mechanism is clear. -- --Guido van Rossum (python.org/~guido) From victor.stinner at gmail.com Wed Feb 15 18:23:55 2012 From: victor.stinner at gmail.com (Victor Stinner) Date: Wed, 15 Feb 2012 18:23:55 +0100 Subject: [Python-Dev] PEP 410 (Decimal timestamp): the implementation is ready for a review In-Reply-To: References: <4f3bc4c3.a54ab60a.65e2.2f66@mx.google.com> Message-ID: 2012/2/15 Guido van Rossum : > I just came to this thread. Having read the good arguments on both > sides, I keep wondering why anybody would care about nanosecond > precision in timestamps. Python 3.3 exposes C functions that return timespec structure. This structure contains a timestamp with a resolution of 1 nanosecond, whereas the timeval structure has only a resolution of 1 microsecond. Examples of C functions -> Python functions: - timeval: gettimeofday() -> time.time() - timespec: clock_gettime() -> time.clock_gettime() - timespec: stat() -> os.stat() - etc. If we keep float, Python would have has worse precision than C just because it uses an inappropriate type (C uses two integers in timeval). Linux supports nanosecond timestamps since Linux 2.6, Windows supports 100 ns resolution since Windows 2000 or maybe before. It doesn't mean that Windows system clock is accurate: in practical, it's hard to get something better than 1 ms :-) But you may use QueryPerformanceCounter() is you need a bettre precision, it is used by time.clock() for example. > For measuring e.g. file access times, there > is no way that the actual time is know with anything like that > precision (even if it is *recorded* as a number of milliseconds -- > that's a different issue). If you need a real world example, here is an extract of http://en.wikipedia.org/wiki/Ext4: "Improved timestamps As computers become faster in general and as Linux becomes used more for mission-critical applications, the granularity of second-based timestamps becomes insufficient. To solve this, ext4 provides timestamps measured in nanoseconds. (...)" So nanosecond resolution is needed to check if a file is newer than another. Such test is common in build programs like make or scons. Filesystems resolution: - ext4: 1 ns - btrfs: 1 ns - NTFS: 100 ns - FAT32: 2 sec (yeah!) Victor From barry at python.org Wed Feb 15 18:28:06 2012 From: barry at python.org (Barry Warsaw) Date: Wed, 15 Feb 2012 12:28:06 -0500 Subject: [Python-Dev] PEP 394 request for pronouncement (python2 symlink in *nix systems) In-Reply-To: References: <4F37FD96.2010603@v.loewis.de> <20120212203043.GA10257@cskk.homeip.net> <4F38244D.1000908@v.loewis.de> <20120213170845.3ee5d4b4@resist.wooz.org> <20120214094435.745d06e6@limelight.wooz.org> Message-ID: <20120215122806.3a57b5a7@resist.wooz.org> On Feb 15, 2012, at 09:20 AM, Guido van Rossum wrote: >Does this need a pronouncement? Worrying about the speed of symlinks >seems silly, and exactly how the links are created (hard or soft, >chaining or direct) should be up to the distro; our own Makefile >should create chaining symlinks just so the mechanism is clear. Works for me. -Barry -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: not available URL: From solipsis at pitrou.net Wed Feb 15 18:38:28 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 15 Feb 2012 18:38:28 +0100 Subject: [Python-Dev] PEP 410 (Decimal timestamp): the implementation is ready for a review References: <4f3bc4c3.a54ab60a.65e2.2f66@mx.google.com> Message-ID: <20120215183828.1141883f@pitrou.net> On Wed, 15 Feb 2012 18:23:55 +0100 Victor Stinner wrote: > > Linux supports nanosecond timestamps since Linux 2.6, Windows supports > 100 ns resolution since Windows 2000 or maybe before. It doesn't mean > that Windows system clock is accurate: in practical, it's hard to get > something better than 1 ms :-) Well, do you think the Linux system clock is nanosecond-accurate? A nanosecond is what it takes to execute a couple of CPU instructions. Even on a real-time operating system, your nanosecond-precise measurement is already obsolete when it starts being processed by the higher-level application. A single cache miss in the CPU will make the precision worthless. And in a higher-level language like Python, the execution times of individual instructions are not specified or stable, so the resolution brings you nothing. > "Improved timestamps > As computers become faster in general and as Linux becomes used > more for mission-critical applications, the granularity of > second-based timestamps becomes insufficient. To solve this, ext4 > provides timestamps measured in nanoseconds. (...)" This is a fallacy. Just because ext4 is able to *store* nanoseconds timestamps doesn't mean the timestamps are accurate up to that point. > Such test is common in build programs like make or scons. scons is written in Python and its authors have not complained, AFAIK, about timestamp precision. Regards Antoine. From guido at python.org Wed Feb 15 18:43:49 2012 From: guido at python.org (Guido van Rossum) Date: Wed, 15 Feb 2012 09:43:49 -0800 Subject: [Python-Dev] PEP 410 (Decimal timestamp): the implementation is ready for a review In-Reply-To: References: <4f3bc4c3.a54ab60a.65e2.2f66@mx.google.com> Message-ID: On Wed, Feb 15, 2012 at 9:23 AM, Victor Stinner wrote: > 2012/2/15 Guido van Rossum : >> I just came to this thread. Having read the good arguments on both >> sides, I keep wondering why anybody would care about nanosecond >> precision in timestamps. > > Python 3.3 exposes C functions that return timespec structure. This > structure contains a timestamp with a resolution of 1 nanosecond, > whereas the timeval structure has only a resolution of 1 microsecond. > Examples of C functions -> Python functions: > > ?- timeval: gettimeofday() -> time.time() > ?- timespec: clock_gettime() -> time.clock_gettime() > ?- timespec: stat() -> os.stat() > ?- etc. > > If we keep float, Python would have has worse precision than C just > because it uses an inappropriate type (C uses two integers in > timeval). > > Linux supports nanosecond timestamps since Linux 2.6, Windows supports > 100 ns resolution since Windows 2000 or maybe before. It doesn't mean > that Windows system clock is accurate: in practical, it's hard to get > something better than 1 ms :-) But you may use > QueryPerformanceCounter() is you need a bettre precision, it is used > by time.clock() for example. > >> For measuring e.g. file access times, there >> is no way that the actual time is know with anything like that >> precision (even if it is *recorded* as a number of milliseconds -- >> that's a different issue). 
> If you need a real world example, here is an extract of > http://en.wikipedia.org/wiki/Ext4: > > "Improved timestamps > As computers become faster in general and as Linux becomes used > more for mission-critical applications, the granularity of > second-based timestamps becomes insufficient. To solve this, ext4 > provides timestamps measured in nanoseconds. (...)" > > So nanosecond resolution is needed to check if a file is newer than > another. Such a test is common in build programs like make or scons. > > Filesystems resolution:
> - ext4: 1 ns
> - btrfs: 1 ns
> - NTFS: 100 ns
> - FAT32: 2 sec (yeah!)
This does not explain why microseconds aren't good enough. It seems none of the clocks involved can actually measure even relative time intervals more accurately than 100ns, and I expect that kernels don't actually keep their clock more accurate than milliseconds. (They may increment it by 1 microsecond approximately every microsecond, or even by 1 ns roughly every ns, but that doesn't fool me into believing all those digits of precision.) I betcha that over say an hour even time deltas aren't more accurate than a microsecond, due to inevitable fluctuations in clock speed. It seems the argument goes simply "because Linux chose to go all the way to nanoseconds we must support nanoseconds" -- and Linux probably chose nanoseconds because that's what fits in 32 bits and there wasn't anything else to do with those bits. *Apart* from the specific use case of making an exact copy of a directory tree that can be verified by other tools that simply compare the nanosecond times for equality, I don't see any reason for complicating so many APIs to preserve the fake precision. As far as simply comparing whether one file is newer than another for tools like make/scons, I bet that it's in practice impossible to read a file and create another in less than a microsecond. (I actually doubt that you can do it faster than a millisecond, but for my argument I don't need that.) -- --Guido van Rossum (python.org/~guido) From mark at hotpy.org Wed Feb 15 18:58:49 2012 From: mark at hotpy.org (Mark Shannon) Date: Wed, 15 Feb 2012 17:58:49 +0000 Subject: [Python-Dev] A new dictionary implementation In-Reply-To: <4F3902AA.3080300@hotpy.org> References: <4F252014.3080900@hotpy.org> <20120129160841.2343b62f@pitrou.net> <4F256EDC.70707@hotpy.org> <4F25D686.9070907@pearwood.info> <4F2AE13C.6010900@hotpy.org> <4F3291C9.9070305@hotpy.org> <4F33B343.1050801@voidspace.org.uk> <4F3902AA.3080300@hotpy.org> Message-ID: <4F3BF259.50502@hotpy.org> Any opinions on my new dictionary implementation? I'm happy to take silence on the PEP as tacit approval, but the code definitely needs reviewing. Issue: http://bugs.python.org/issue13903 PEP: https://bitbucket.org/markshannon/cpython_new_dict/src/6c4d5d9dfc6d/pep-new-dict.txt Repository https://bitbucket.org/markshannon/cpython_new_dict Cheers, Mark. From victor.stinner at gmail.com Wed Feb 15 18:58:54 2012 From: victor.stinner at gmail.com (Victor Stinner) Date: Wed, 15 Feb 2012 18:58:54 +0100 Subject: [Python-Dev] PEP 410 (Decimal timestamp): the implementation is ready for a review In-Reply-To: <20120215183828.1141883f@pitrou.net> References: <4f3bc4c3.a54ab60a.65e2.2f66@mx.google.com> <20120215183828.1141883f@pitrou.net> Message-ID: >> Linux supports nanosecond timestamps since Linux 2.6, Windows supports >> 100 ns resolution since Windows 2000 or maybe before.
It doesn't mean >> that Windows system clock is accurate: in practice, it's hard to get >> something better than 1 ms :-) > > Well, do you think the Linux system clock is nanosecond-accurate? Test the following C program:
------------
#include <stdio.h>
#include <time.h>

int main(int argc, char **argv, char **arge) {
  struct timespec tps, tpe;
  if ((clock_gettime(CLOCK_REALTIME, &tps) != 0)
  || (clock_gettime(CLOCK_REALTIME, &tpe) != 0)) {
    perror("clock_gettime");
    return -1;
  }
  printf("%lu s, %lu ns\n", tpe.tv_sec-tps.tv_sec,
    tpe.tv_nsec-tps.tv_nsec);
  return 0;
}
------------
Compile it using gcc time.c -o time -lrt. It gives me differences smaller than 1000 ns on Ubuntu 11.10 and an Intel Core i5 @ 3.33GHz:
$ ./a.out
0 s, 781 ns
$ ./a.out
0 s, 785 ns
$ ./a.out
0 s, 798 ns
$ ./a.out
0 s, 818 ns
$ ./a.out
0 s, 270 ns
Victor From guido at python.org Wed Feb 15 19:11:42 2012 From: guido at python.org (Guido van Rossum) Date: Wed, 15 Feb 2012 10:11:42 -0800 Subject: [Python-Dev] PEP 410 (Decimal timestamp): the implementation is ready for a review In-Reply-To: References: <4f3bc4c3.a54ab60a.65e2.2f66@mx.google.com> <20120215183828.1141883f@pitrou.net> Message-ID: So using floats we can match 100ns precision, right? On Wed, Feb 15, 2012 at 9:58 AM, Victor Stinner wrote: >>> Linux supports nanosecond timestamps since Linux 2.6, Windows supports >>> 100 ns resolution since Windows 2000 or maybe before. It doesn't mean >>> that Windows system clock is accurate: in practice, it's hard to get >>> something better than 1 ms :-) >> >> Well, do you think the Linux system clock is nanosecond-accurate? > > Test the following C program:
> ------------
> #include <stdio.h>
> #include <time.h>
>
> int main(int argc, char **argv, char **arge) {
>   struct timespec tps, tpe;
>   if ((clock_gettime(CLOCK_REALTIME, &tps) != 0)
>   || (clock_gettime(CLOCK_REALTIME, &tpe) != 0)) {
>     perror("clock_gettime");
>     return -1;
>   }
>   printf("%lu s, %lu ns\n", tpe.tv_sec-tps.tv_sec,
>     tpe.tv_nsec-tps.tv_nsec);
>   return 0;
> }
> ------------
> Compile it using gcc time.c -o time -lrt. > > It gives me differences smaller than 1000 ns on Ubuntu 11.10 and an > Intel Core i5 @ 3.33GHz:
>
> $ ./a.out
> 0 s, 781 ns
> $ ./a.out
> 0 s, 785 ns
> $ ./a.out
> 0 s, 798 ns
> $ ./a.out
> 0 s, 818 ns
> $ ./a.out
> 0 s, 270 ns
>
> Victor -- --Guido van Rossum (python.org/~guido) From solipsis at pitrou.net Wed Feb 15 19:10:13 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 15 Feb 2012 19:10:13 +0100 Subject: [Python-Dev] PEP 410 (Decimal timestamp): the implementation is ready for a review In-Reply-To: References: <4f3bc4c3.a54ab60a.65e2.2f66@mx.google.com> <20120215183828.1141883f@pitrou.net> Message-ID: <1329329413.3389.9.camel@localhost.localdomain> On Wednesday 15 February 2012 at 18:58 +0100, Victor Stinner wrote: >> It gives me differences smaller than 1000 ns on Ubuntu 11.10 and an >> Intel Core i5 @ 3.33GHz:
>>
>> $ ./a.out
>> 0 s, 781 ns
>> $ ./a.out
>> 0 s, 785 ns
>> $ ./a.out
>> 0 s, 798 ns
>> $ ./a.out
>> 0 s, 818 ns
>> $ ./a.out
>> 0 s, 270 ns
What is it supposed to prove exactly? There is a difference between being able to *represent* nanoseconds and being able to *measure* them; let alone give a precise meaning to them.
(and ironically, floating-point numbers are precise enough to represent these numbers unambiguously) Regards Antoine. From solipsis at pitrou.net Wed Feb 15 19:13:28 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 15 Feb 2012 19:13:28 +0100 Subject: [Python-Dev] A new dictionary implementation References: <4F252014.3080900@hotpy.org> <20120129160841.2343b62f@pitrou.net> <4F256EDC.70707@hotpy.org> <4F25D686.9070907@pearwood.info> <4F2AE13C.6010900@hotpy.org> <4F3291C9.9070305@hotpy.org> <4F33B343.1050801@voidspace.org.uk> <4F3902AA.3080300@hotpy.org> Message-ID: <20120215191328.6344dbba@pitrou.net> On Mon, 13 Feb 2012 12:31:38 +0000 Mark Shannon wrote: > Note that the json benchmark is unstable and should be ignored. Can you elaborate? If it's unstable it should be fixed, not ignored :) Also, there are two different mako results in your message, which one is the right one? Thanks Antoine. From martin at v.loewis.de Wed Feb 15 20:38:17 2012 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 15 Feb 2012 20:38:17 +0100 Subject: [Python-Dev] PEP 410 (Decimal timestamp): the implementation is ready for a review In-Reply-To: References: <4f3bc4c3.a54ab60a.65e2.2f66@mx.google.com> Message-ID: <4F3C09A9.80009@v.loewis.de> > *Apart* from the specific use case of making an exact copy of a > directory tree that can be verified by other tools that simply compare > the nanosecond times for equality, I don't see any reason for > complicating so many APIs to preserve the fake precision. As far as > simply comparing whether one file is newer than another for tools like > make/scons, I bet that it's in practice impossible to read a file and > create another in less than a microsecond. (I actually doubt that you > can do it faster than a millisecond, but for my argument I don't need > that.) But this leads to the issue with specialized APIs just for nanoseconds (such as the one you just proposed): people will use them *just because they are there*. It's like the byte-oriented APIs for file names: most applications won't need them, either because the file names convert into character strings just fine, or because the emulation that we (now) provide will fall back to some nearly-accurate representation. Still, just because we have the byte APIs, people use them, to then find out that they don't work on Windows, so they will write very complicated code to make their code 100% correct. The same will happen with specialized APIs for nanosecond time stamps: people will be told to use them because it might matter, and not knowing for sure that it won't matter to them, they will use them. Therefore, I feel that we must not introduce such specialized APIs. Not supporting ns timestamps is something I can readily agree to. However, contributors won't agree to that, and will insist that these be added (and keep writing patches to do so) until it does get added.
Some of them are core contributors, so there is no easy way to stop them :-) Regards, Martin From martin at v.loewis.de Wed Feb 15 20:56:26 2012 From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=) Date: Wed, 15 Feb 2012 20:56:26 +0100 Subject: [Python-Dev] PEP 410 (Decimal timestamp): the implementation is ready for a review In-Reply-To: <1329329413.3389.9.camel@localhost.localdomain> References: <4f3bc4c3.a54ab60a.65e2.2f66@mx.google.com> <20120215183828.1141883f@pitrou.net> <1329329413.3389.9.camel@localhost.localdomain> Message-ID: <4F3C0DEA.3010702@v.loewis.de> On 15.02.2012 19:10, Antoine Pitrou wrote: > > On Wednesday 15 February 2012 at 18:58 +0100, Victor Stinner wrote: >> It gives me differences smaller than 1000 ns on Ubuntu 11.10 and an >> Intel Core i5 @ 3.33GHz:
>>
>> $ ./a.out
>> 0 s, 781 ns
>> $ ./a.out
>> 0 s, 785 ns
>> $ ./a.out
>> 0 s, 798 ns
>> $ ./a.out
>> 0 s, 818 ns
>> $ ./a.out
>> 0 s, 270 ns
> > What is it supposed to prove exactly? There is a difference between > being able to *represent* nanoseconds and being able to *measure* them; > let alone give a precise meaning to them. Linux *actually* is able to measure time with nanosecond precision, even though it is not able to keep its clock synchronized to UTC with nanosecond accuracy. The way Linux does that is to use the time-stamping counter of the processor (the rdtsc instructions), which (originally) counts one unit per CPU clock. I believe current processors use slightly different counting schemes (e.g. through the APIC), but still: you get a resolution within the clock frequency of the CPU quartz. With the quartz in Victor's machine, a single clock takes 0.3ns, so three of them make a nanosecond. As the quartz may not be entirely accurate (and also as the CPU frequency may change) you have to measure the clock rate against an external time source, but Linux has implemented algorithms for that. On my system, dmesg shows
[    2.236894] Refined TSC clocksource calibration: 2793.000 MHz.
[    2.236900] Switching to clocksource tsc
Regards, Martin From solipsis at pitrou.net Wed Feb 15 21:06:43 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 15 Feb 2012 21:06:43 +0100 Subject: [Python-Dev] PEP 410 (Decimal timestamp): the implementation is ready for a review In-Reply-To: <4F3C0DEA.3010702@v.loewis.de> References: <4f3bc4c3.a54ab60a.65e2.2f66@mx.google.com> <20120215183828.1141883f@pitrou.net> <1329329413.3389.9.camel@localhost.localdomain> <4F3C0DEA.3010702@v.loewis.de> Message-ID: <20120215210643.323a935e@pitrou.net> On Wed, 15 Feb 2012 20:56:26 +0100 "Martin v. Löwis" wrote: > > With the quartz in Victor's machine, a single clock takes 0.3ns, so > three of them make a nanosecond. As the quartz may not be entirely > accurate (and also as the CPU frequency may change) you have to measure > the clock rate against an external time source, but Linux has > implemented algorithms for that. On my system, dmesg shows
>
> [    2.236894] Refined TSC clocksource calibration: 2793.000 MHz.
> [    2.236900] Switching to clocksource tsc
But that's still not meaningful. By the time clock_gettime() returns, an unpredictable number of nanoseconds have elapsed, and even more when returning to the Python evaluation loop. So the nanosecond precision is just an illusion, and a float should really be enough to represent durations for any task where Python is suitable as a language. Regards Antoine.
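Antoine's point is easy to check from Python itself: with Python 3.3's time.clock_gettime() the back-to-back probe of Victor's C program can be reproduced directly (a sketch; the numbers vary per machine and say nothing about absolute accuracy):

import time

a = time.clock_gettime(time.CLOCK_REALTIME)
b = time.clock_gettime(time.CLOCK_REALTIME)
print("%.0f ns between calls" % ((b - a) * 1e9))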
From mark at hotpy.org Wed Feb 15 21:15:52 2012 From: mark at hotpy.org (Mark Shannon) Date: Wed, 15 Feb 2012 20:15:52 +0000 Subject: [Python-Dev] PEP 410 (Decimal timestamp): the implementation is ready for a review In-Reply-To: <20120215210643.323a935e@pitrou.net> References: <4f3bc4c3.a54ab60a.65e2.2f66@mx.google.com> <20120215183828.1141883f@pitrou.net> <1329329413.3389.9.camel@localhost.localdomain> <4F3C0DEA.3010702@v.loewis.de> <20120215210643.323a935e@pitrou.net> Message-ID: <4F3C1278.2030409@hotpy.org> Antoine Pitrou wrote: > On Wed, 15 Feb 2012 20:56:26 +0100 > "Martin v. Löwis" wrote: >> With the quartz in Victor's machine, a single clock takes 0.3ns, so >> three of them make a nanosecond. As the quartz may not be entirely >> accurate (and also as the CPU frequency may change) you have to measure >> the clock rate against an external time source, but Linux has >> implemented algorithms for that. On my system, dmesg shows
>>
>> [    2.236894] Refined TSC clocksource calibration: 2793.000 MHz.
>> [    2.236900] Switching to clocksource tsc
> > But that's still not meaningful. By the time clock_gettime() returns, > an unpredictable number of nanoseconds have elapsed, and even more when > returning to the Python evaluation loop. > > So the nanosecond precision is just an illusion, and a float should > really be enough to represent durations for any task where Python is > suitable as a language. I reckon PyPy might be able to call clock_gettime() in a tight loop almost as frequently as the C program (although not with the overhead of converting to a decimal). Cheers, Mark. From benjamin at python.org Wed Feb 15 21:35:04 2012 From: benjamin at python.org (Benjamin Peterson) Date: Wed, 15 Feb 2012 15:35:04 -0500 Subject: [Python-Dev] PEP 410 (Decimal timestamp): the implementation is ready for a review In-Reply-To: <4F3C1278.2030409@hotpy.org> References: <4f3bc4c3.a54ab60a.65e2.2f66@mx.google.com> <20120215183828.1141883f@pitrou.net> <1329329413.3389.9.camel@localhost.localdomain> <4F3C0DEA.3010702@v.loewis.de> <20120215210643.323a935e@pitrou.net> <4F3C1278.2030409@hotpy.org> Message-ID: 2012/2/15 Mark Shannon : > > I reckon PyPy might be able to call clock_gettime() in a tight loop > almost as frequently as the C program (although not with the overhead > of converting to a decimal). The nanosecond resolution is just as meaningless in C. -- Regards, Benjamin From yselivanov.ml at gmail.com Wed Feb 15 22:09:17 2012 From: yselivanov.ml at gmail.com (Yury Selivanov) Date: Wed, 15 Feb 2012 16:09:17 -0500 Subject: [Python-Dev] A new dictionary implementation In-Reply-To: <4F3BF259.50502@hotpy.org> References: <4F252014.3080900@hotpy.org> <20120129160841.2343b62f@pitrou.net> <4F256EDC.70707@hotpy.org> <4F25D686.9070907@pearwood.info> <4F2AE13C.6010900@hotpy.org> <4F3291C9.9070305@hotpy.org> <4F33B343.1050801@voidspace.org.uk> <4F3902AA.3080300@hotpy.org> <4F3BF259.50502@hotpy.org> Message-ID: <3E3ED4B6-FAF4-48AE-A6A9-FEDB1C659315@gmail.com> Hello Mark, First, I've back-ported your patch to python 3.2.2 (which was relatively easy). Almost all tests pass, and those that don't are always failing on my machine anyway, if I remember correctly.
The patch can be found here: http://goo.gl/nSzzY

Then, I compared the memory footprint of one of our applications (300,000 LOC) and saw it about 6% lower than on vanilla Python 3.2.2 (660 MB of reserved process memory compared to 702 MB; Linux Gentoo 64bit). The application is written in a heavy OOP style (for instance, ~1000 classes are generated by our ORM on the fly, and there are approximately the same number of hand-written ones), so I had hoped for a much bigger saving.

As for the patch itself, I found one use case where Python with the patch behaves differently::

    class Foo:
        def __init__(self, msg):
            self.msg = msg

    f = Foo('123')

    class _str(str):
        pass

    print(f.msg)
    print(getattr(f, _str('msg')))

The above snippet works perfectly on vanilla py3.2, but fails on the patched one (even on 3.3 compiled from your 'cpython_new_dict' branch). I'm not sure that it's valid code, though. If not, then we need to fix some Python internals to add an exact type check in 'getattr', in 'operator.getattr', etc. And if it is valid - your patch needs to be fixed. In any case, I propose to add the above code to the Python test suite, with either an expected result or an expected exception.

Cheers,
Yury

On 2012-02-15, at 12:58 PM, Mark Shannon wrote:
> Any opinions on my new dictionary implementation?
>
> I'm happy to take silence on the PEP as tacit approval, but the code definitely needs reviewing.
>
> Issue:
> http://bugs.python.org/issue13903
>
> PEP:
> https://bitbucket.org/markshannon/cpython_new_dict/src/6c4d5d9dfc6d/pep-new-dict.txt
>
> Repository
> https://bitbucket.org/markshannon/cpython_new_dict
>
> Cheers,
> Mark.

From neologix at free.fr  Wed Feb 15 22:16:13 2012
From: neologix at free.fr (Charles-François Natali)
Date: Wed, 15 Feb 2012 22:16:13 +0100
Subject: [Python-Dev] best place for an atomic file API
Message-ID: 

Hi,

Issue #8604 aims at adding an atomic file API to make it easier to create/update files atomically, using rename() on POSIX systems and MoveFileEx() on Windows (which are now available through os.replace()). It would also use fsync() on POSIX to make sure data is committed to disk. For example, it could be used by importlib to avoid races when writing bytecode files (issues #13392, #13003, #13146), or more generally by any application that wants to make sure to end up with a consistent file even in the face of a crash (e.g. it seems that Mercurial implemented their own version).

Basically the usage would be, e.g.:

    with AtomicFile('foo') as f:
        pickle.dump(obj, f)

or

    with AtomicFile('foo') as f:
        chunk = heavyCrunch()
        f.write(chunk)
        chunk = CrunchSomeMore()
        f.write(chunk)

What would be the best place for such a class? _pyio, tempfile, or a new atomicfile?

Cheers,

cf

From guido at python.org  Wed Feb 15 22:32:51 2012
From: guido at python.org (Guido van Rossum)
Date: Wed, 15 Feb 2012 13:32:51 -0800
Subject: [Python-Dev] PEP 410 (Decimal timestamp): the implementation is ready for a review
In-Reply-To: <4F3C09A9.80009@v.loewis.de>
References: <4f3bc4c3.a54ab60a.65e2.2f66@mx.google.com> <4F3C09A9.80009@v.loewis.de>
Message-ID: 

On Wed, Feb 15, 2012 at 11:38 AM, "Martin v. Löwis" wrote:
L?wis" wrote: >> *Apart* from the specific use case of making an exact copy of a >> directory tree that can be verified by other tools that simply compare >> the nanosecond times for equality, I don't see any reason for >> complicating so many APIs to preserve the fake precision. As far as >> simply comparing whether one file is newer than another for tools like >> make/scons, I bet that it's in practice impossible to read a file and >> create another in less than a microsecond. (I actually doubt that you >> can do it faster than a millisecond, but for my argument I don't need >> that.) > > But this leads to the issue with specialized APIs just for nanoseconds > (as the one you just proposed): people will use them *just because they > are there*. > > It's like the byte-oriented APIs to do file names: most applications > won't need them, either because the file names convert into character > strings just fine, or because the emulation that we (now) provide will > fall back to some nearly-accurate representation. Still, just because we > have the byte APIs, people use them, to then find out that they don't > work on Windows, so they will write very complicated code to make their > code 100% correct. > > The same will happen with specialized API for nanosecond time stamps: > people will be told to use them because it might matter, and not knowing > for sure that it won't matter to them, they will use them. > > Therefore, I feel that we must not introduced such specialized APIs. You have a point, but applies just as much to the proposal in the PEP -- floats and Decimal are often not quite compatible, but people will pass type=Decimal to the clock and stat functions just because they can. The problems with mixing floats and Decimal are probably just as nasty as those with mixing byte and str. At least if people are mixing nanoseconds (integers) and seconds (floats) they will quickly notice results that are a billion times off. > Not supporting ns timestamps is something I can readily agree to. Me too. > However, contributors won't agree to that, and will insist that these > be added (and keep writing patches to do so) until it does get added. > Some of them are core contributors, so there is no easy way to stop > them :-) Actually I think a rejected PEP would be an excellent way to stop this. Maybe an alternative PEP could be written that supports the filesystem copying use case only, using some specialized ns APIs? I really think that all you need is st_{a,c,m}time_ns fields and os.utime_ns(). -- --Guido van Rossum (python.org/~guido) From steve at pearwood.info Thu Feb 16 00:29:59 2012 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 16 Feb 2012 10:29:59 +1100 Subject: [Python-Dev] best place for an atomic file API In-Reply-To: References: Message-ID: <4F3C3FF7.3030007@pearwood.info> Charles-Fran?ois Natali wrote: > Hi, > > Issue #8604 aims at adding an atomic file API to make it easier to > create/update files atomically, using rename() on POSIX systems and > MoveFileEx() on Windows (which are now available through > os.replace()). It would also use fsync() on POSIX to make sure data is > committed to disk. [...] > What would be the best place for a such a class? > _pyio, tempfile, or a new atomicfile shutil perhaps? As a user, that's the third place I look for file utilities, after builtin functions and os module. 
-- 
Steven

From ncoghlan at gmail.com  Thu Feb 16 01:21:06 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 16 Feb 2012 10:21:06 +1000
Subject: [Python-Dev] best place for an atomic file API
In-Reply-To: <4F3C3FF7.3030007@pearwood.info>
References: <4F3C3FF7.3030007@pearwood.info>
Message-ID: 

On Thu, Feb 16, 2012 at 9:29 AM, Steven D'Aprano wrote:
> Charles-François Natali wrote:
>> What would be the best place for such a class? _pyio, tempfile, or a new atomicfile
>
> shutil perhaps?
>
> As a user, that's the third place I look for file utilities, after builtin functions and the os module.

+1 for shutil from me.

Cheers,
Nick.

-- 
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From breamoreboy at yahoo.co.uk  Thu Feb 16 01:28:54 2012
From: breamoreboy at yahoo.co.uk (Mark Lawrence)
Date: Thu, 16 Feb 2012 00:28:54 +0000
Subject: [Python-Dev] cpython (3.2): remove unused import
In-Reply-To: 
References: 
Message-ID: 

On 06/02/2012 17:57, Brett Cannon wrote:
> On Sun, Feb 5, 2012 at 19:53, Christian Heimes wrote:
>> On 06.02.2012 01:39, Brett Cannon wrote:
>>> I'm going to assume pylint or pyflakes would throw too many warnings on the stdlib, but would it be worth someone's time to write a simple unused import checker to run over the stdlib on occasion? I bet even one that did nothing more than a regex search for matched import statements would be good enough.
>>
>> Zope 3 has an import checker that uses the compiler package and AST tree to check for unused imports. It seems like a better approach than a simple regex search.
>>
>> http://svn.zope.org/Zope3/trunk/utilities/importchecker.py?rev=25177&view=auto
>>
>> The importorder tool uses the tokenizer module to order import statements.
>>
>> http://svn.zope.org/Zope3/trunk/utilities/importorder.py?rev=25177&view=auto
>>
>> Both are written by Jim Fulton.
>
> Ah, but does it run against Python 3? If so then this is something to suggest on python-mentor for someone to get their feet wet contributing.

A possible alternative is the sfood-checker tool given here http://furius.ca/snakefood/ which I stumbled across whilst looking for something completely different.

-- 
Cheers.
Mark Lawrence.

From ben+python at benfinney.id.au  Thu Feb 16 02:19:39 2012
From: ben+python at benfinney.id.au (Ben Finney)
Date: Thu, 16 Feb 2012 12:19:39 +1100
Subject: [Python-Dev] best place for an atomic file API
References: 
Message-ID: <87zkcjy3pw.fsf@benfinney.id.au>

Charles-François Natali writes:

> Issue #8604 aims at adding an atomic file API to make it easier to create/update files atomically, using rename() on POSIX systems and MoveFileEx() on Windows (which are now available through os.replace()). It would also use fsync() on POSIX to make sure data is committed to disk.

These make it quite OS-specific.

[…]

> What would be the best place for such a class? _pyio, tempfile, or a new atomicfile

I would expect to find it within “os” or submodules of “os”.

-- 
 \     “We should be less concerned about adding years to life, and |
  `\      more about adding life to years.” —Arthur C.
Clarke, 2001 |
_o__)
Ben Finney

From brian at python.org  Thu Feb 16 02:25:19 2012
From: brian at python.org (Brian Curtin)
Date: Wed, 15 Feb 2012 19:25:19 -0600
Subject: [Python-Dev] best place for an atomic file API
In-Reply-To: <87zkcjy3pw.fsf@benfinney.id.au>
References: <87zkcjy3pw.fsf@benfinney.id.au>
Message-ID: 

On Wed, Feb 15, 2012 at 19:19, Ben Finney wrote:
> Charles-François Natali writes:
>
>> Issue #8604 aims at adding an atomic file API to make it easier to create/update files atomically, using rename() on POSIX systems and MoveFileEx() on Windows (which are now available through os.replace()). It would also use fsync() on POSIX to make sure data is committed to disk.
>
> These make it quite OS-specific.

That'll happen when solving problems on different OSes. Do you propose a more platform-agnostic solution?

From nas at arctrix.com  Thu Feb 16 02:26:33 2012
From: nas at arctrix.com (Neil Schemenauer)
Date: Thu, 16 Feb 2012 01:26:33 +0000 (UTC)
Subject: [Python-Dev] PEP 394 request for pronouncement (python2 symlink in *nix systems)
References: <4F37FD96.2010603@v.loewis.de> <20120212203043.GA10257@cskk.homeip.net> <4F38244D.1000908@v.loewis.de> <20120213170845.3ee5d4b4@resist.wooz.org> <20120214094435.745d06e6@limelight.wooz.org>
Message-ID: 

Guido van Rossum wrote:
> Does this need a pronouncement? Worrying about the speed of symlinks seems silly

I agree. I wonder if a hard link was used for legacy reasons. Some very old versions of Unix didn't have symlinks. It looks like the symlink was introduced in BSD 4.2, released in 1983. That seems a long time before the birth of Python, but perhaps some SysV systems were around that didn't have it. Also, maybe speed was more of a concern at that time. In any case, those days are long, long gone.

Neil

From ben+python at benfinney.id.au  Thu Feb 16 02:49:54 2012
From: ben+python at benfinney.id.au (Ben Finney)
Date: Thu, 16 Feb 2012 12:49:54 +1100
Subject: [Python-Dev] best place for an atomic file API
References: <87zkcjy3pw.fsf@benfinney.id.au>
Message-ID: <87mx8jy2bh.fsf@benfinney.id.au>

Brian Curtin writes:

> On Wed, Feb 15, 2012 at 19:19, Ben Finney wrote:
>> Charles-François Natali writes:
>>
>>> […] using rename() on POSIX systems and MoveFileEx() on Windows (which are now available through os.replace()). It would also use fsync() on POSIX to make sure data is committed to disk.
>>
>> These make it quite OS-specific.
>
> That'll happen when solving problems on different OSes. Do you propose a more platform-agnostic solution?

No, I have no objection to that implementation. I'm pointing that out only because the nature of the functionality implies I'd expect to find it within the “os” module hierarchy.

-- 
 \     “The man who is denied the opportunity of taking decisions of |
  `\      importance begins to regard as important the decisions he is |
_o__)     allowed to take.” —C. Northcote Parkinson
Ben Finney

From anacrolix at gmail.com  Thu Feb 16 02:55:29 2012
From: anacrolix at gmail.com (Matt Joiner)
Date: Thu, 16 Feb 2012 09:55:29 +0800
Subject: [Python-Dev] PEP 394 request for pronouncement (python2 symlink in *nix systems)
In-Reply-To: 
References: <4F37FD96.2010603@v.loewis.de> <20120212203043.GA10257@cskk.homeip.net> <4F38244D.1000908@v.loewis.de> <20120213170845.3ee5d4b4@resist.wooz.org> <20120214094435.745d06e6@limelight.wooz.org>
Message-ID: 

+1 for using symlinks where possible.
In deploying Python to different operating systems and filesystems, I've often had to run a script to "fix" the hard-linking done by make install, because the deployment mechanism or system couldn't be trusted to do the right thing with respect to minimising installation size. Symlinks are a total win when disk use is a concern, and they make intent clear. I'm not aware of any mainstream systems that don't support them.

From greg.ewing at canterbury.ac.nz  Thu Feb 16 03:06:50 2012
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 16 Feb 2012 15:06:50 +1300
Subject: [Python-Dev] PEP 410 (Decimal timestamp): the implementation is ready for a review
In-Reply-To: 
References: <4f3bc4c3.a54ab60a.65e2.2f66@mx.google.com>
Message-ID: <4F3C64BA.5000702@canterbury.ac.nz>

On 16/02/12 06:43, Guido van Rossum wrote:
> This does not explain why microseconds aren't good enough. It seems none of the clocks involved can actually measure even relative time intervals more accurately than 100 ns, and I expect that kernels don't actually keep their clocks more accurate than milliseconds.

I gather that modern x86 CPUs have a counter that keeps track of time down to a nanosecond or so by counting clock cycles. In principle it seems like a kernel should be able to make use of it, in conjunction with other timekeeping hardware, to produce nanosecond-resolution timestamps.

Whether any existing kernel actually does that is another matter. It probably isn't worth the bother for things like file timestamps, where the time taken to execute the system call that modifies the file is likely to be several orders of magnitude larger.

Until we have computers with terahertz clocks and gigahertz disk drives, it seems like a rather theoretical issue. And it doesn't look like Mr. Moore is going to give us anything like that any time soon.

-- 
Greg

From guido at python.org  Thu Feb 16 03:06:36 2012
From: guido at python.org (Guido van Rossum)
Date: Wed, 15 Feb 2012 18:06:36 -0800
Subject: [Python-Dev] PEP 394 request for pronouncement (python2 symlink in *nix systems)
In-Reply-To: 
References: <4F37FD96.2010603@v.loewis.de> <20120212203043.GA10257@cskk.homeip.net> <4F38244D.1000908@v.loewis.de> <20120213170845.3ee5d4b4@resist.wooz.org> <20120214094435.745d06e6@limelight.wooz.org>
Message-ID: 

On Wed, Feb 15, 2012 at 5:26 PM, Neil Schemenauer wrote:
> Guido van Rossum wrote:
>> Does this need a pronouncement? Worrying about the speed of symlinks seems silly
>
> I agree. I wonder if a hard link was used for legacy reasons. Some very old versions of Unix didn't have symlinks. It looks like the symlink was introduced in BSD 4.2, released in 1983. That seems a long time before the birth of Python, but perhaps some SysV systems were around that didn't have it. Also, maybe speed was more of a concern at that time. In any case, those days are long, long gone.

Actually, I remember what my motivation was at the time (somewhere between 1995 and 1999, I think) when I decided to use a hard link. It was some trick whereby if you ran "make install", the target binary, e.g. "python1.3", was removed and then overwritten in such a way that code which was running it via "python" (a hard link to python1.3) would not be disturbed. Then a new hard link would be created atomically. But it was too clever, and it's long been replaced with a symlink.

Anyway, I don't think anyone is objecting against the PEP allowing symlinks now.
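The atomic part of that trick survives in the symlink approach: build the new link under a temporary name and rename() it over the old one, so "python" always points at a complete binary. A minimal sketch, with the helper name and paths purely illustrative:

    import os

    def relink_atomically(target, link_name):
        # Create the replacement link under a temporary name in the same
        # directory, then rename() it into place; on POSIX, rename()
        # replaces the destination atomically.
        tmp = link_name + ".new"
        try:
            os.remove(tmp)
        except OSError:
            pass
        os.symlink(target, tmp)
        os.rename(tmp, link_name)

    # e.g. relink_atomically("python2.7", "/usr/local/bin/python")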
-- 
--Guido van Rossum (python.org/~guido)

From guido at python.org  Thu Feb 16 03:39:18 2012
From: guido at python.org (Guido van Rossum)
Date: Wed, 15 Feb 2012 18:39:18 -0800
Subject: [Python-Dev] PEP 410 (Decimal timestamp): the implementation is ready for a review
In-Reply-To: <4F3C64BA.5000702@canterbury.ac.nz>
References: <4f3bc4c3.a54ab60a.65e2.2f66@mx.google.com> <4F3C64BA.5000702@canterbury.ac.nz>
Message-ID: 

On Wed, Feb 15, 2012 at 6:06 PM, Greg Ewing wrote:
> On 16/02/12 06:43, Guido van Rossum wrote:
>> This does not explain why microseconds aren't good enough. It seems none of the clocks involved can actually measure even relative time intervals more accurately than 100 ns, and I expect that kernels don't actually keep their clocks more accurate than milliseconds.
>
> I gather that modern x86 CPUs have a counter that keeps track of time down to a nanosecond or so by counting clock cycles. In principle it seems like a kernel should be able to make use of it, in conjunction with other timekeeping hardware, to produce nanosecond-resolution timestamps.
>
> Whether any existing kernel actually does that is another matter. It probably isn't worth the bother for things like file timestamps, where the time taken to execute the system call that modifies the file is likely to be several orders of magnitude larger.

Ironically, file timestamps are likely the only place where it matters. Read the rest of the thread.

-- 
--Guido van Rossum (python.org/~guido)

From ncoghlan at gmail.com  Thu Feb 16 03:45:55 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 16 Feb 2012 12:45:55 +1000
Subject: [Python-Dev] best place for an atomic file API
In-Reply-To: <87mx8jy2bh.fsf@benfinney.id.au>
References: <87zkcjy3pw.fsf@benfinney.id.au> <87mx8jy2bh.fsf@benfinney.id.au>
Message-ID: 

On Thu, Feb 16, 2012 at 11:49 AM, Ben Finney wrote:
> No, I have no objection to that implementation. I'm pointing that out only because the nature of the functionality implies I'd expect to find it within the “os” module hierarchy.

The (very) rough rule of thumb is that the os module handles abstracting away cross-platform differences in implementation details, while the higher-level shutil algorithms can be largely platform-independent by using the shared abstractions in the os module layer. In this case, os.replace() is the cross-platform abstraction, while the atomic file context manager is just a particular use case for that new feature.

(MvL complained in the tracker issue about a lack of concrete use cases, but I think fixing race conditions when overwriting bytecode files in importlib, plus the existing distutils/packaging use cases, cover that.)

Cheers,
Nick.

-- 
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From ncoghlan at gmail.com  Thu Feb 16 03:51:34 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 16 Feb 2012 12:51:34 +1000
Subject: [Python-Dev] PEP 394 request for pronouncement (python2 symlink in *nix systems)
In-Reply-To: 
References: <4F37FD96.2010603@v.loewis.de> <20120212203043.GA10257@cskk.homeip.net> <4F38244D.1000908@v.loewis.de> <20120213170845.3ee5d4b4@resist.wooz.org> <20120214094435.745d06e6@limelight.wooz.org>
Message-ID: 

On Thu, Feb 16, 2012 at 12:06 PM, Guido van Rossum wrote:
> Anyway, I don't think anyone is objecting against the PEP allowing symlinks now.

Yeah, the onus is just back on me to do the final updates to the PEP and patch based on the discussion in this thread.
Unless life unexpectedly intervenes, I expect that to happen on Saturday (my time). After that, the only further work is for Ned to supply whatever updates he needs to bring the 2.7 Mac OS X installers into line with the new naming scheme.

Cheers,
Nick.

-- 
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From ben+python at benfinney.id.au  Thu Feb 16 04:12:41 2012
From: ben+python at benfinney.id.au (Ben Finney)
Date: Thu, 16 Feb 2012 14:12:41 +1100
Subject: [Python-Dev] PEP 410 (Decimal timestamp): the implementation is ready for a review
References: <4f3bc4c3.a54ab60a.65e2.2f66@mx.google.com> <4F3C64BA.5000702@canterbury.ac.nz>
Message-ID: <87ipj7xyhi.fsf@benfinney.id.au>

Guido van Rossum writes:

> On Wed, Feb 15, 2012 at 6:06 PM, Greg Ewing wrote:
>> It probably isn't worth the bother for things like file timestamps, where the time taken to execute the system call that modifies the file is likely to be several orders of magnitude larger.
>
> Ironically, file timestamps are likely the only place where it matters. Read the rest of the thread.

And log message timestamps. The *two* only places where it matters: file timestamps and log messages. And communication protocols. The *three* only places… I'll come in again.

-- 
 \     “Why should I care about posterity? What's posterity ever done |
  `\      for me?” —Groucho Marx |
_o__)
Ben Finney

From larry at hastings.org  Thu Feb 16 04:28:01 2012
From: larry at hastings.org (Larry Hastings)
Date: Wed, 15 Feb 2012 19:28:01 -0800
Subject: [Python-Dev] PEP 410 (Decimal timestamp): the implementation is ready for a review
In-Reply-To: 
References: <4f3bc4c3.a54ab60a.65e2.2f66@mx.google.com>
Message-ID: <4F3C77C1.7070706@hastings.org>

On 02/15/2012 09:43 AM, Guido van Rossum wrote:
> *Apart* from the specific use case of making an exact copy of a directory tree that can be verified by other tools that simply compare the nanosecond times for equality,

A data point on this specific use case. The following code throws its assert ~90% of the time in Python 3.2.2 on a modern Linux machine (assuming "foo" exists and "bar" does not):

    import shutil
    import os
    shutil.copy2("foo", "bar")
    assert os.stat("foo").st_mtime == os.stat("bar").st_mtime

The problem is with os.utime. IIUC stat() on Linux added nanosecond atime/mtime support back in 2.5. But the corresponding utime() functions to write nanosecond atime/mtime didn't appear until relatively recently--and Python 3.2 doesn't use them. With stat_float_times turned on, os.stat effectively reads with ~100-nanosecond precision, but os.utime still only writes with microsecond precision. I fixed this in trunk last September (issue 12904); os.utime now preserves all the precision that Python currently conveys.

One way of looking at it: in Python 3.2 it's already pretty bad and almost nobody is complaining. (There's me, I guess, but I scratched my itch.)

/arry

From pjenvey at underboss.org  Thu Feb 16 04:40:12 2012
From: pjenvey at underboss.org (Philip Jenvey)
Date: Wed, 15 Feb 2012 19:40:12 -0800
Subject: [Python-Dev] folding cElementTree behind ElementTree in 3.3
In-Reply-To: 
References: 
Message-ID: <6CB20D34-C17A-4889-8470-D63F577BEB15@underboss.org>

On Feb 13, 2012, at 8:44 PM, Nick Coghlan wrote:

> On Tue, Feb 14, 2012 at 2:25 PM, Eli Bendersky wrote:
>> With the deprecation warning being silent, is there much to lose, though?
>
> Yes, it creates problems for anyone that deliberately converts all warnings to errors when running their test suites.
> This forces them to spend time switching over to a Python-version-dependent import of either cElementTree or ElementTree that could have been spent doing something actually productive instead of mere busywork.
>
> And, of course, even people that *don't* convert warnings to errors when running tests will have to make the same switch when the module is eventually removed.

What about a PendingDeprecationWarning? I think you're usually only going to convert DeprecationWarnings to errors (with python -W error::DeprecationWarning or warnings.simplefilter('error', DeprecationWarning)).

-- 
Philip Jenvey

From ncoghlan at gmail.com  Thu Feb 16 05:05:38 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 16 Feb 2012 14:05:38 +1000
Subject: [Python-Dev] folding cElementTree behind ElementTree in 3.3
In-Reply-To: <6CB20D34-C17A-4889-8470-D63F577BEB15@underboss.org>
References: <6CB20D34-C17A-4889-8470-D63F577BEB15@underboss.org>
Message-ID: 

On Thu, Feb 16, 2012 at 1:40 PM, Philip Jenvey wrote:
> What about a PendingDeprecationWarning? I think you're usually only going to convert DeprecationWarnings to errors (with python -W error::DeprecationWarning or warnings.simplefilter('error', DeprecationWarning))

Think "-Wall" for strict testing regimes :)

If you trawl back in the archives a few years, you'll find I've historically been on the *other* side of this kind of argument. I've since come to recognise that programmatic deprecation really is a big hammer that hits the wider Python community - it needs to be treated with appropriate respect.

Cheers,
Nick.

-- 
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From guido at python.org  Thu Feb 16 05:12:12 2012
From: guido at python.org (Guido van Rossum)
Date: Wed, 15 Feb 2012 20:12:12 -0800
Subject: [Python-Dev] PEP 410 (Decimal timestamp): the implementation is ready for a review
In-Reply-To: <4F3C77C1.7070706@hastings.org>
References: <4f3bc4c3.a54ab60a.65e2.2f66@mx.google.com> <4F3C77C1.7070706@hastings.org>
Message-ID: 

On Wed, Feb 15, 2012 at 7:28 PM, Larry Hastings wrote:
> On 02/15/2012 09:43 AM, Guido van Rossum wrote:
>> *Apart* from the specific use case of making an exact copy of a directory tree that can be verified by other tools that simply compare the nanosecond times for equality,
>
> A data point on this specific use case. The following code throws its assert ~90% of the time in Python 3.2.2 on a modern Linux machine (assuming "foo" exists and "bar" does not):
>
>     import shutil
>     import os
>     shutil.copy2("foo", "bar")
>     assert os.stat("foo").st_mtime == os.stat("bar").st_mtime
>
> The problem is with os.utime. IIUC stat() on Linux added nanosecond atime/mtime support back in 2.5. But the corresponding utime() functions to write nanosecond atime/mtime didn't appear until relatively recently--and Python 3.2 doesn't use them. With stat_float_times turned on, os.stat effectively reads with ~100-nanosecond precision, but os.utime still only writes with microsecond precision. I fixed this in trunk last September (issue 12904); os.utime now preserves all the precision that Python currently conveys.
>
> One way of looking at it: in Python 3.2 it's already pretty bad and almost nobody is complaining. (There's me, I guess, but I scratched my itch.)

So, essentially you fixed this particular issue without having to do anything as drastic as the proposed PEP...
-- 
--Guido van Rossum (python.org/~guido)

From martin at v.loewis.de  Thu Feb 16 09:02:33 2012
From: martin at v.loewis.de (Martin v. Löwis)
Date: Thu, 16 Feb 2012 09:02:33 +0100
Subject: [Python-Dev] PEP 410 (Decimal timestamp): the implementation is ready for a review
In-Reply-To: <20120215210643.323a935e@pitrou.net>
References: <4f3bc4c3.a54ab60a.65e2.2f66@mx.google.com> <20120215183828.1141883f@pitrou.net> <1329329413.3389.9.camel@localhost.localdomain> <4F3C0DEA.3010702@v.loewis.de> <20120215210643.323a935e@pitrou.net>
Message-ID: <4F3CB819.8050003@v.loewis.de>

On 15.02.2012 21:06, Antoine Pitrou wrote:
> On Wed, 15 Feb 2012 20:56:26 +0100 "Martin v. Löwis" wrote:
>> With the quartz in Victor's machine, a single clock cycle takes about 0.3 ns, so three of them make a nanosecond. As the quartz may not be entirely accurate (and also as the CPU frequency may change), you have to measure the clock rate against an external time source, but Linux has implemented algorithms for that. On my system, dmesg shows
>>
>>     [    2.236894] Refined TSC clocksource calibration: 2793.000 MHz.
>>     [    2.236900] Switching to clocksource tsc
>
> But that's still not meaningful. By the time clock_gettime() returns, an unpredictable number of nanoseconds have elapsed, and even more by the time control returns to the Python evaluation loop.

This is not exactly true: while the current time won't be what was returned by the time it is used, it is certainly possible to predict how long it takes to return from a system call. So the result is not accurate, but it is meaningful. If you are formally arguing that uncertain events may happen, such as the scheduler interrupting the thread: this is true for any clock reading; the actual time may be many milliseconds off by the time it is used. That is no reason to fall back to second resolution.

> So the nanosecond precision is just an illusion, and a float should really be enough to represent durations for any task where Python is suitable as a language.

I agree with that statement - I was just refuting your claim that Linux cannot do nanosecond measurements.

Please do recognize the point I made to Guido: despite us three agreeing that a float is good enough for time stamps, people will continue to submit patches and ask for new features until we give in. One way to delay that by several years could be to reject the PEP in a way that makes it clear that not only the specific approach is rejected, but any approach using anything other than floats.

Regards,
Martin

From martin at v.loewis.de  Thu Feb 16 09:04:13 2012
From: martin at v.loewis.de (Martin v. Löwis)
Date: Thu, 16 Feb 2012 09:04:13 +0100
Subject: [Python-Dev] PEP 410 (Decimal timestamp): the implementation is ready for a review
In-Reply-To: 
References: <4f3bc4c3.a54ab60a.65e2.2f66@mx.google.com> <4F3C09A9.80009@v.loewis.de>
Message-ID: <4F3CB87D.2060808@v.loewis.de>

> Maybe an alternative PEP could be written that supports the filesystem copying use case only, using some specialized ns APIs? I really think that all you need is st_{a,c,m}time_ns fields and os.utime_ns().

I'm -1 on that, because it will make people write complicated code.
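To illustrate the concern: with ns-only variants bolted on next to the float APIs, every program that wants full fidelity has to branch on their availability and keep the two representations in sync. A sketch of that kind of code, using the st_*time_ns fields and os.utime_ns() from Guido's suggestion - which are hypothetical at this point, not existing APIs:

    import os

    def copy_times(src, dst):
        st = os.stat(src)
        if hasattr(st, "st_mtime_ns"):
            # Hypothetical ns API from the proposal: exact integer nanoseconds.
            os.utime_ns(dst, (st.st_atime_ns, st.st_mtime_ns))
        else:
            # Float fallback: silently loses sub-microsecond digits.
            os.utime(dst, (st.st_atime, st.st_mtime))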
Regards,
Martin

From martin at v.loewis.de  Thu Feb 16 10:08:30 2012
From: martin at v.loewis.de (Martin v. Löwis)
Date: Thu, 16 Feb 2012 10:08:30 +0100
Subject: [Python-Dev] best place for an atomic file API
In-Reply-To: 
References: <87zkcjy3pw.fsf@benfinney.id.au> <87mx8jy2bh.fsf@benfinney.id.au>
Message-ID: <4F3CC78E.7010309@v.loewis.de>

> (MvL complained in the tracker issue about a lack of concrete use cases, but I think fixing race conditions when overwriting bytecode files in importlib and the existing distutils/packaging use cases cover that)

I certainly agree that there are applications of "atomic replace", and that the os module should expose the relevant platform APIs where available.

I'm not so sure that "atomic writes" is a useful concept. I haven't seen a proposed implementation yet, but I'm doubtful that truly ACID writes are possible unless the operating system supports transactions (which only Windows 7 does). Even if you are ignoring Isolation, Atomicity already is a challenge: if you first write to a tempfile, then rename it, you may end up with a stale tempfile (e.g. if the process is killed), and no rollback operation.

So "atomic write" to me promises something that it likely can't deliver. OTOH, I still think that the promise isn't actually asked for in practice (not even when overwriting bytecode files).

Regards,
Martin

From martin at v.loewis.de  Thu Feb 16 10:13:39 2012
From: martin at v.loewis.de (Martin v. Löwis)
Date: Thu, 16 Feb 2012 10:13:39 +0100
Subject: [Python-Dev] folding cElementTree behind ElementTree in 3.3
In-Reply-To: 
References: <4F34E554.7090600@v.loewis.de>
Message-ID: <4F3CC8C3.8070103@v.loewis.de>

> So, getting back to the topic again, is there any reason why you would oppose backing the ElementTree module in the stdlib by cElementTree's accelerator module? Or can we just consider this part of the discussion settled and start getting work done?

I'd still like to know who is in charge of the etree package now. I know that I'm not, so I just don't have any opinion on the technical question of using the accelerator module (it sounds like a reasonable idea, but it also sounds like something that may break existing code). If the maintainer of the etree package would pronounce that it is ok to make this change, I'd have no objection at all. Lacking a maintainer, I feel responsible for any bad consequences of that change, which makes me feel uneasy about it.

Regards,
Martin

From martin at v.loewis.de  Thu Feb 16 10:17:15 2012
From: martin at v.loewis.de (Martin v. Löwis)
Date: Thu, 16 Feb 2012 10:17:15 +0100
Subject: [Python-Dev] folding cElementTree behind ElementTree in 3.3
In-Reply-To: 
References: <4F34E6BC.3080603@v.loewis.de> <4F34F251.6070407@v.loewis.de>
Message-ID: <4F3CC99B.3060204@v.loewis.de>

> Does this imply that each and every package in the stdlib currently has a dedicated maintainer who promised to be dedicated to it? Or otherwise, should those packages that *don't* have a maintainer be removed from the standard library?

That is my opinion, yes. Some people (including myself) are willing to act as maintainers for large sets of modules, covering even code that they don't ever use themselves.

> Isn't that a bit harsh? ElementTree is an overall functional library and AFAIK the preferred stdlib tool for processing XML for many developers.
> It currently needs some attention to fix a few issues, expose the fast C implementation by default when ElementTree is imported, and improve the documentation. At this point, I'm interested enough to work on these - given that the political issue with Fredrik Lundh is resolved. However, I can't *honestly* say I promise to maintain the package until 2017. So, what's next?

If you feel qualified to make changes, go ahead and make them. Take the praise if they are good changes, take the blame if they backfire. Please do try to stay around until either has happened. It would also be good if you would declare "I will maintain the etree package".

Regards,
Martin

From eliben at gmail.com  Thu Feb 16 10:23:35 2012
From: eliben at gmail.com (Eli Bendersky)
Date: Thu, 16 Feb 2012 11:23:35 +0200
Subject: [Python-Dev] folding cElementTree behind ElementTree in 3.3
In-Reply-To: <4F3CC8C3.8070103@v.loewis.de>
References: <4F34E554.7090600@v.loewis.de> <4F3CC8C3.8070103@v.loewis.de>
Message-ID: 

> I'd still like to know who is in charge of the etree package now. I know that I'm not, so I just don't have any opinion on the technical question of using the accelerator module (it sounds like a reasonable idea, but it also sounds like something that may break existing code). If the maintainer of the etree package would pronounce that it is ok to make this change, I'd have no objection at all. Lacking a maintainer, I feel responsible for any bad consequences of that change, which makes me feel uneasy about it.

Martin, as you've seen, Fredrik Lundh finally officially ceded the maintenance of the ElementTree code to the Python developers: http://mail.python.org/pipermail/python-dev/2012-February/116389.html

The change of backing ElementTree by cElementTree has already been implemented in the default branch (3.3) by Florent Xicluna, with careful review from me and others. etree has an extensive (albeit a bit clumsy) set of tests which keep passing successfully after the change. The bots are also happy.

In the past couple of years Florent has been the de-facto maintainer of etree in the standard library, although I don't think he ever "committed" to keep maintaining it for years to come. Neither can I make this commitment; however, I do declare that I will do my best to keep the library functional, and I also plan to work on improving its documentation and cleaning up some of the accumulated cruft in its implementation. I also have every intention of taking the blame if something breaks. That said, Florent is probably the one most familiar with the code at this point, and although his help will be most appreciated, I can't expect or demand that he stick around for a few years. We're all volunteers here, after all.

Eli

From nad at acm.org  Thu Feb 16 10:32:24 2012
From: nad at acm.org (Ned Deily)
Date: Thu, 16 Feb 2012 10:32:24 +0100
Subject: [Python-Dev] PEP 394 request for pronouncement (python2 symlink in *nix systems)
References: <4F37FD96.2010603@v.loewis.de> <20120212203043.GA10257@cskk.homeip.net> <4F38244D.1000908@v.loewis.de> <20120213170845.3ee5d4b4@resist.wooz.org> <20120214094435.745d06e6@limelight.wooz.org>
Message-ID: 

In article , Nick Coghlan wrote:
> On Thu, Feb 16, 2012 at 12:06 PM, Guido van Rossum wrote:
>> Anyway, I don't think anyone is objecting against the PEP allowing symlinks now.
> Yeah, the onus is just back on me to do the final updates to the PEP and patch based on the discussion in this thread. Unless life unexpectedly intervenes, I expect that to happen on Saturday (my time).
>
> After that, the only further work is for Ned to supply whatever updates he needs to bring the 2.7 Mac OS X installers into line with the new naming scheme.

There are two issues that I know of for OS X. One is just getting a python2 symlink into the bin directory of a framework build. That's easy. The other is managing symlinks (python, python2, and python3) across framework bin directories; currently there's no infrastructure for that. That part will probably have to wait until PyCon.

-- 
Ned Deily, nad at acm.org

From victor.stinner at gmail.com  Thu Feb 16 10:51:05 2012
From: victor.stinner at gmail.com (Victor Stinner)
Date: Thu, 16 Feb 2012 10:51:05 +0100
Subject: [Python-Dev] PEP 410 (Decimal timestamp): the implementation is ready for a review
In-Reply-To: <4F3CB87D.2060808@v.loewis.de>
References: <4f3bc4c3.a54ab60a.65e2.2f66@mx.google.com> <4F3C09A9.80009@v.loewis.de> <4F3CB87D.2060808@v.loewis.de>
Message-ID: 

2012/2/16 "Martin v. Löwis":
>> Maybe an alternative PEP could be written that supports the filesystem copying use case only, using some specialized ns APIs? I really think that all you need is st_{a,c,m}time_ns fields and os.utime_ns().
>
> I'm -1 on that, because it will make people write complicated code.

Python 3.3 *already has* APIs for nanosecond timestamps: os.utimensat(), os.futimens(), signal.sigtimedwait(), etc. These functions expect a (seconds: int, nanoseconds: int) tuple. We have to decide before the Python 3.3 release if this API is just fine, or if it should be changed. After the release, it will be more difficult to change the API.

If os.utimensat() expects a tuple, it would be nice to have a function getting the time as a tuple, like the C language has the clock_gettime() function to get a timestamp as a timespec structure.

During the discussion, many developers wanted a type allowing arithmetic operations like t2-t1 to compute a delta, or t+delta to "set" a timezone. It is possible to do arithmetic on a tuple, but it is not practical, and I don't like a type with a fixed resolution (in some cases you need millisecond, microsecond or 100 ns resolution).

If you consider that the float loss of precision is not an issue for nanoseconds, we should use float for os.utimensat(), os.futimens() and signal.sigtimedwait(), just for consistency.

Victor

From victor.stinner at gmail.com  Thu Feb 16 10:54:34 2012
From: victor.stinner at gmail.com (Victor Stinner)
Date: Thu, 16 Feb 2012 10:54:34 +0100
Subject: [Python-Dev] best place for an atomic file API
In-Reply-To: <4F3CC78E.7010309@v.loewis.de>
References: <87zkcjy3pw.fsf@benfinney.id.au> <87mx8jy2bh.fsf@benfinney.id.au> <4F3CC78E.7010309@v.loewis.de>
Message-ID: 

Most users don't need a truly ACID write, but implement their own best-effort function. Instead of having a different implementation in each project, Python can provide something better, especially when the OS provides low-level functions to implement such a feature.

Victor

2012/2/16 "Martin v. Löwis":
> I'm not so sure that "atomic writes" is a useful concept. I haven't seen a proposed implementation yet, but I'm doubtful that truly ACID writes are possible unless the operating system supports transactions (which only Windows 7 does).
> Even if you are ignoring Isolation, Atomicity already is a challenge: if you first write to a tempfile, then rename it, you may end up with a stale tempfile (e.g. if the process is killed), and no rollback operation.
>
> So "atomic write" to me promises something that it likely can't deliver. OTOH, I still think that the promise isn't actually asked for in practice (not even when overwriting bytecode files)

From martin at v.loewis.de  Thu Feb 16 11:01:39 2012
From: martin at v.loewis.de (Martin v. Löwis)
Date: Thu, 16 Feb 2012 11:01:39 +0100
Subject: [Python-Dev] PEP 394 request for pronouncement (python2 symlink in *nix systems)
In-Reply-To: 
References: <4F37FD96.2010603@v.loewis.de> <20120212203043.GA10257@cskk.homeip.net> <4F38244D.1000908@v.loewis.de> <20120213170845.3ee5d4b4@resist.wooz.org> <20120214094435.745d06e6@limelight.wooz.org>
Message-ID: <4F3CD403.7070102@v.loewis.de>

> There are two issues that I know of for OS X. One is just getting a python2 symlink into the bin directory of a framework build. That's easy.

Where exactly in the Makefile is that reflected? ISTM that the current patch already covers that, since the framework* targets are not concerned with the bin directory.

> The other is managing symlinks (python, python2, and python3) across framework bin directories; currently there's no infrastructure for that. That part will probably have to wait until PyCon.

What is the "framework bin directory"? The links are proposed for /usr/local/bin resp. /usr/bin. The proposed patch already manages these links across releases (the most recent install wins).

If you are concerned about multiple feature releases: this is not an issue, since the links are only proposed for Python 2.7 (distributions may also add them for 2.6 and earlier, but we are not going to make a release in that direction).

It may be that the PEP becomes irrelevant before it is widely accepted: if the sole remaining Python 2 version is 2.7, users may just as well refer to python2.7 as to python2.

Regards,
Martin

From martin at v.loewis.de  Thu Feb 16 11:08:26 2012
From: martin at v.loewis.de (Martin v. Löwis)
Date: Thu, 16 Feb 2012 11:08:26 +0100
Subject: [Python-Dev] best place for an atomic file API
In-Reply-To: 
References: <87zkcjy3pw.fsf@benfinney.id.au> <87mx8jy2bh.fsf@benfinney.id.au> <4F3CC78E.7010309@v.loewis.de>
Message-ID: <4F3CD59A.9070707@v.loewis.de>

On 16.02.2012 10:54, Victor Stinner wrote:
> Most users don't need a truly ACID write, but implement their own best-effort function. Instead of having a different implementation in each project, Python can provide something better, especially when the OS provides low-level functions to implement such a feature.

It's then critical how this is named, IMO (and exactly what semantics it comprises). Calling it "atomic" when it is not is a mistake.

Also notice that one user commented that he had already implemented something like this, and left out the issue of *permissions*. I found that interesting, since preserving permissions might indeed be a requirement in a lot of "in-place update" use cases, but it hasn't been considered in this discussion yet.

So rather than providing a mechanism for atomic writes, I think providing a mechanism to update a file is what people might need. One way of providing this might be a "u" mode for open, which updates an existing file on close (unlike "a", which appends, and unlike "w", which truncates first).
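To make the intended semantics concrete, here is a rough sketch of what such an update-on-close helper could look like with today's pieces (tempfile plus the new os.replace()); the class name and behavior are illustrative only. Note that it sidesteps the permissions question entirely, since mkstemp() creates the tempfile with mode 0600:

    import os
    import tempfile

    class UpdateFile:
        """Illustrative sketch: stage writes in a tempfile, swap in on close."""

        def __init__(self, path):
            self.path = path
            fd, self.tmp = tempfile.mkstemp(dir=os.path.dirname(path) or ".")
            self.file = os.fdopen(fd, "w")

        def __enter__(self):
            return self.file

        def __exit__(self, exc_type, exc, tb):
            if exc_type is None:
                self.file.flush()
                os.fsync(self.file.fileno())     # commit data before the swap
                self.file.close()
                os.replace(self.tmp, self.path)  # atomic rename over the original
            else:
                self.file.close()
                os.unlink(self.tmp)              # leave the original untouched

Readers then see either the complete old file or the complete new one - but, as noted above, a crash between mkstemp() and replace() still leaves a stray tempfile behind, and the original's permissions are not preserved.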
Regards,
Martin

From martin at v.loewis.de  Thu Feb 16 11:14:57 2012
From: martin at v.loewis.de (Martin v. Löwis)
Date: Thu, 16 Feb 2012 11:14:57 +0100
Subject: [Python-Dev] PEP 410 (Decimal timestamp): the implementation is ready for a review
In-Reply-To: 
References: <4f3bc4c3.a54ab60a.65e2.2f66@mx.google.com> <4F3C09A9.80009@v.loewis.de> <4F3CB87D.2060808@v.loewis.de>
Message-ID: <4F3CD721.7080602@v.loewis.de>

On 16.02.2012 10:51, Victor Stinner wrote:
> 2012/2/16 "Martin v. Löwis":
>>> Maybe an alternative PEP could be written that supports the filesystem copying use case only, using some specialized ns APIs? I really think that all you need is st_{a,c,m}time_ns fields and os.utime_ns().
>>
>> I'm -1 on that, because it will make people write complicated code.
>
> Python 3.3 *already has* APIs for nanosecond timestamps: os.utimensat(), os.futimens(), signal.sigtimedwait(), etc. These functions expect a (seconds: int, nanoseconds: int) tuple.

I'm -1 on adding these APIs, also. Since Python 3.3 is not released yet, it's not too late to revert them.

> If you consider that the float loss of precision is not an issue for nanoseconds, we should use float for os.utimensat(), os.futimens() and signal.sigtimedwait(), just for consistency.

I'm wondering what use cases utimensat and futimens have that are not covered by utime/utimes (except for the higher resolution). Keeping the "ns" in the name but not doing nanoseconds would be bad, IMO. For sigtimedwait, accepting float is indeed the right thing to do. In the long run, we should see whether using 128-bit floats is feasible.

Regards,
Martin

From vinay_sajip at yahoo.co.uk  Thu Feb 16 11:20:34 2012
From: vinay_sajip at yahoo.co.uk (Vinay Sajip)
Date: Thu, 16 Feb 2012 10:20:34 +0000 (UTC)
Subject: [Python-Dev] best place for an atomic file API
References: <87zkcjy3pw.fsf@benfinney.id.au> <87mx8jy2bh.fsf@benfinney.id.au> <4F3CC78E.7010309@v.loewis.de> <4F3CD59A.9070707@v.loewis.de>
Message-ID: 

Martin v. Löwis <martin at v.loewis.de> writes:

> One way of providing this might be a "u" mode for open, which updates an existing file on close (unlike "a", which appends, and unlike "w", which truncates first).

Doesn't "r+" cover this?

Regards,

Vinay Sajip

From ncoghlan at gmail.com  Thu Feb 16 11:40:15 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 16 Feb 2012 20:40:15 +1000
Subject: [Python-Dev] PEP 394 request for pronouncement (python2 symlink in *nix systems)
In-Reply-To: <4F3CD403.7070102@v.loewis.de>
References: <4F37FD96.2010603@v.loewis.de> <20120212203043.GA10257@cskk.homeip.net> <4F38244D.1000908@v.loewis.de> <20120213170845.3ee5d4b4@resist.wooz.org> <20120214094435.745d06e6@limelight.wooz.org> <4F3CD403.7070102@v.loewis.de>
Message-ID: 

On Thu, Feb 16, 2012 at 8:01 PM, "Martin v. Löwis" wrote:
> It may be that the PEP becomes irrelevant before it is widely accepted: if the sole remaining Python 2 version is 2.7, users may just as well refer to python2.7 as to python2.

My hope is that a clear signal from us supporting a python2 symlink for cross-distro compatibility will encourage the commercial distros to add such a link to their 2.6-based variants (e.g. anything with an explicit python2.7 reference won't run by default on RHEL6, or rebuilds based on that).

Cheers,
Nick.

-- 
Nick Coghlan | ncoghlan at gmail.com |
Brisbane, Australia

From ncoghlan at gmail.com  Thu Feb 16 12:54:08 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 16 Feb 2012 21:54:08 +1000
Subject: [Python-Dev] PEP 394 request for pronouncement (python2 symlink in *nix systems)
In-Reply-To: <20120214094435.745d06e6@limelight.wooz.org>
References: <4F37FD96.2010603@v.loewis.de> <20120212203043.GA10257@cskk.homeip.net> <4F38244D.1000908@v.loewis.de> <20120213170845.3ee5d4b4@resist.wooz.org> <20120214094435.745d06e6@limelight.wooz.org>
Message-ID: 

On Wed, Feb 15, 2012 at 12:44 AM, Barry Warsaw wrote:
> On Feb 14, 2012, at 12:38 PM, Nick Coghlan wrote:
>> I have no idea, and I'm not going to open that can of worms for this PEP. We need to say something about the executable aliases so that people can eventually write cross-platform python2 shebang lines, but how particular distros actually manage the transition is going to depend more on their infrastructure and community than it is anything to do with us.
>
> Then I think all the PEP needs to say is that it is explicitly up to the distros to determine if, when, where, and how they transition. I.e. take it off of python-dev's plate.

It turns out I'd forgotten what was in the PEP - the Notes section already contained a lot of suggestions along those lines. I changed the title of the section to "Migration Notes", but tried to make it clear that those *aren't* consensus recommendations, just ideas distros may want to think about when considering making the switch.

The updated version is live on python.org: http://www.python.org/dev/peps/pep-0394/

I didn't end up giving an explicit rationale for the choice to use a symlink chain, since it really isn't that important to the main purpose of the PEP (i.e. encouraging distros to make sure "python2" is on the system path somewhere). Once MvL or Guido gives the nod to the latest version, I'll bump it up to Approved.

Cheers,
Nick.

-- 
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From victor.stinner at gmail.com  Thu Feb 16 13:15:08 2012
From: victor.stinner at gmail.com (Victor Stinner)
Date: Thu, 16 Feb 2012 13:15:08 +0100
Subject: [Python-Dev] PEP 410 (Decimal timestamp): the implementation is ready for a review
In-Reply-To: 
References: <4f3bc4c3.a54ab60a.65e2.2f66@mx.google.com> <20120215183828.1141883f@pitrou.net>
Message-ID: 

2012/2/15 Guido van Rossum:
> So using floats we can match 100ns precision, right?

Nope, not to store an Epoch timestamp newer than January 1987:

    >>> import datetime
    >>> x = 2**29; (x + 1e-7) != x   # no loss of precision
    True
    >>> x = 2**30; (x + 1e-7) != x   # loses precision
    False
    >>> print(datetime.timedelta(seconds=2**29))
    6213 days, 18:48:32
    >>> print(datetime.datetime.fromtimestamp(2**29))
    1987-01-05 19:48:32

Victor

From victor.stinner at gmail.com  Thu Feb 16 13:46:18 2012
From: victor.stinner at gmail.com (Victor Stinner)
Date: Thu, 16 Feb 2012 13:46:18 +0100
Subject: [Python-Dev] PEP 410 (Decimal timestamp): the implementation is ready for a review
In-Reply-To: <4F3C77C1.7070706@hastings.org>
References: <4f3bc4c3.a54ab60a.65e2.2f66@mx.google.com> <4F3C77C1.7070706@hastings.org>
Message-ID: 

> A data point on this specific use case. The following code throws its assert ~90% of the time in Python 3.2.2 on a modern Linux machine (assuming "foo" exists and "bar" does not):
>
>     import shutil
>     import os
>     shutil.copy2("foo", "bar")
>     assert os.stat("foo").st_mtime == os.stat("bar").st_mtime

It works because Python uses float for utime() and for stat().
But this assertion may fail if another program checks file timestamps without losing precision (because of float), e.g. a program written in C that compares the st_*time and st_*time_ns fields.

> I fixed this in trunk last September (issue 12904); os.utime now preserves all the precision that Python currently conveys.

Let's try in an ext4 filesystem:

    $ ~/prog/python/timestamp/python
    Python 3.3.0a0 (default:35d6cc531800+, Feb 16 2012, 13:32:56)
    >>> import decimal, os, shutil, time
    >>> open("test", "x").close()
    >>> shutil.copy2("test", "test2")
    >>> os.stat("test", timestamp=decimal.Decimal).st_mtime
    Decimal('1329395871.874886224')
    >>> os.stat("test2", timestamp=decimal.Decimal).st_mtime
    Decimal('1329395871.873350282')
    >>> os.stat("test2", timestamp=decimal.Decimal).st_mtime - os.stat("test", timestamp=decimal.Decimal).st_mtime
    Decimal('-0.001535942')

So shutil.copy2() failed to copy the timestamp: test2 is 1 ms older than test...

Let's try with a program not written in Python: GNU make. The makefile:

    test2: test
    	@echo "Copy test into test2"
    	@~/prog/python/default/python -c 'import shutil; shutil.copy2("test", "test2")'

    test:
    	@echo "Create test"
    	@touch test

    clean:
    	rm -f test test2

First try:

    $ make clean
    rm -f test test2
    $ make
    Create test
    Copy test into test2
    $ make
    Copy test into test2

=> test2 is always older than test and so is always "regenerated".

Second try:

    $ make clean
    rm -f test test2
    $ make
    Create test
    Copy test into test2
    $ make
    make: `test2' is up to date.

=> oh, here test2 is newer or has exactly the same modification time, so there is no need to rebuild it.

Victor

From victor.stinner at gmail.com  Thu Feb 16 13:53:57 2012
From: victor.stinner at gmail.com (Victor Stinner)
Date: Thu, 16 Feb 2012 13:53:57 +0100
Subject: [Python-Dev] PEP 410 (Decimal timestamp): the implementation is ready for a review
In-Reply-To: <4f3bc4c3.a54ab60a.65e2.2f66@mx.google.com>
References: <4f3bc4c3.a54ab60a.65e2.2f66@mx.google.com>
Message-ID: 

> PEP author Victor asked (in http://mail.python.org/pipermail/python-dev/2012-February/116499.html):
>
>> Maybe I missed the answer, but how do you handle a timestamp with an unspecified starting point like os.times() or time.clock()? Should we leave these functions unchanged?
>
> If *all* you know is that it is monotonic, then you can't -- but then you don't really have resolution either, as the clock may well speed up or slow down.
>
> If you do have resolution, and the only problem is that you don't know what the epoch was, then you can figure that out well enough by (once per type per process) comparing it to something that does have an epoch, like time.gmtime().

Hum, I suppose that you can expect that time.time() - time.monotonic() is constant or evolves very slowly.

time.monotonic() should return a number of seconds. But you are right, usually monotonic clocks are less accurate. On Windows, QueryPerformanceCounter() is less accurate than GetSystemTimeAsFileTime(), for example: http://msdn.microsoft.com/en-us/magazine/cc163996.aspx (read the "The Issue of Frequency" section).

The documentation of time.monotonic() (a function added to Python 3.3) should maybe mention the unit (seconds) and the accuracy issue.
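A rough sketch of that calibration idea, using the time.monotonic() planned for 3.3 (the helper is illustrative only, and the offset is only as trustworthy as the moment it was sampled):

    import time

    # Sample both clocks once; the difference estimates where the monotonic
    # clock's arbitrary origin sits relative to the Unix epoch.
    _offset = time.time() - time.monotonic()

    def monotonic_as_epoch():
        # Approximate wall-clock seconds; drifts if the system clock is
        # stepped, or if the two clocks tick at slightly different rates.
        return time.monotonic() + _offset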
Victor

From solipsis at pitrou.net  Thu Feb 16 13:56:41 2012
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Thu, 16 Feb 2012 13:56:41 +0100
Subject: [Python-Dev] PEP 410 (Decimal timestamp): the implementation is ready for a review
References: <4f3bc4c3.a54ab60a.65e2.2f66@mx.google.com> <4F3C77C1.7070706@hastings.org>
Message-ID: <20120216135641.5ef37c64@pitrou.net>

On Thu, 16 Feb 2012 13:46:18 +0100 Victor Stinner wrote:
> Let's try in an ext4 filesystem:
>
>     $ ~/prog/python/timestamp/python
>     Python 3.3.0a0 (default:35d6cc531800+, Feb 16 2012, 13:32:56)
>     >>> import decimal, os, shutil, time
>     >>> open("test", "x").close()
>     >>> shutil.copy2("test", "test2")
>     >>> os.stat("test", timestamp=decimal.Decimal).st_mtime
>     Decimal('1329395871.874886224')
>     >>> os.stat("test2", timestamp=decimal.Decimal).st_mtime
>     Decimal('1329395871.873350282')

This looks fishy. Floating-point numbers are precise enough to represent the difference between these two numbers:

    >>> f = 1329395871.874886224
    >>> f.hex()
    '0x1.3cf3e27f7fe23p+30'
    >>> g = 1329395871.873350282
    >>> g.hex()
    '0x1.3cf3e27f7e4f9p+30'

If I run your snippet and inspect modification times using `stat`, the difference is much smaller (around 10 ns, not 1 ms):

    $ stat test | \grep Modify
    Modify: 2012-02-16 13:51:25.643597139 +0100
    $ stat test2 | \grep Modify
    Modify: 2012-02-16 13:51:25.643597126 +0100

In other words, you should check your PEP implementation for bugs.

Regards

Antoine.

From victor.stinner at gmail.com  Thu Feb 16 14:07:27 2012
From: victor.stinner at gmail.com (Victor Stinner)
Date: Thu, 16 Feb 2012 14:07:27 +0100
Subject: [Python-Dev] PEP 410 (Decimal timestamp): the implementation is ready for a review
In-Reply-To: <4F3C0DEA.3010702@v.loewis.de>
References: <4f3bc4c3.a54ab60a.65e2.2f66@mx.google.com> <20120215183828.1141883f@pitrou.net> <1329329413.3389.9.camel@localhost.localdomain> <4F3C0DEA.3010702@v.loewis.de>
Message-ID: 

> The way Linux does that is to use the time-stamp counter of the processor (the rdtsc instruction), which (originally) counts one unit per CPU clock cycle. I believe current processors count slightly differently (e.g. through the APIC), but still: you get a resolution within the clock frequency of the CPU quartz.

Linux has an internal clocksource API supporting different kinds of hardware:

- PIT (Intel 8253 chipset): configurable frequency between 8.2 Hz and 1.2 MHz
- PMTMR (power management timer): ACPI clock with a frequency of 3.5 MHz
- TSC (Time Stamp Counter): frequency of your CPU
- HPET (High Precision Event Timer): frequency of at least 10 MHz (14.3 MHz on my computer)

Linux has an algorithm to choose the best clock depending on its performance and accuracy. Most clocks have a frequency higher than 1 MHz, and so a resolution smaller than 1 µs, even if the clock is not really accurate. I suppose that you can plug in specialized hardware, like an atomic clock or a GPS receiver, for better accuracy.
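To query what the chosen clocksource advertises from Python, a quick sketch using the clock_getres() wrapper added to the time module in 3.3 (on a Linux box with high-resolution timers this typically reports 1e-09 for CLOCK_MONOTONIC):

    import time

    # Report the advertised resolution of the clocks the OS exposes
    # (time.clock_getres() and the CLOCK_* constants are new in 3.3).
    for name in ("CLOCK_REALTIME", "CLOCK_MONOTONIC"):
        clk = getattr(time, name, None)
        if clk is not None:
            print(name, time.clock_getres(clk))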
Victor

From victor.stinner at gmail.com  Thu Feb 16 14:20:35 2012
From: victor.stinner at gmail.com (Victor Stinner)
Date: Thu, 16 Feb 2012 14:20:35 +0100
Subject: [Python-Dev] PEP 410 (Decimal timestamp): the implementation is ready for a review
In-Reply-To: <20120216135641.5ef37c64@pitrou.net>
References: <4f3bc4c3.a54ab60a.65e2.2f66@mx.google.com> <4F3C77C1.7070706@hastings.org> <20120216135641.5ef37c64@pitrou.net>
Message-ID: 

> If I run your snippet and inspect modification times using `stat`, the
> difference is much smaller (around 10 ns, not 1 ms):
>
> $ stat test | \grep Modify
> Modify: 2012-02-16 13:51:25.643597139 +0100
> $ stat test2 | \grep Modify
> Modify: 2012-02-16 13:51:25.643597126 +0100

The loss of precision is not constant: it depends on the timestamp
value. Another example using the stat program:
------------
import decimal, os, shutil, time

try:
    os.unlink("test")
except OSError:
    pass
try:
    os.unlink("test2")
except OSError:
    pass

open("test", "x").close()
shutil.copy2("test", "test2")
print(os.stat("test", timestamp=decimal.Decimal).st_mtime)
print(os.stat("test2", timestamp=decimal.Decimal).st_mtime)
print(os.stat("test2", timestamp=decimal.Decimal).st_mtime
      - os.stat("test", timestamp=decimal.Decimal).st_mtime)
os.system("stat test|grep ^Mod")
os.system("stat test2|grep ^Mod")
------------

Outputs:
------------
$ ./python x.py
1329398229.918858600
1329398229.918208829
-0.000649771
Modify: 2012-02-16 14:17:09.918858600 +0100
Modify: 2012-02-16 14:17:09.918208829 +0100

$ ./python x.py
1329398230.862858588
1329398230.861343658
-0.001514930
Modify: 2012-02-16 14:17:10.862858588 +0100
Modify: 2012-02-16 14:17:10.861343658 +0100

$ ./python x.py
1329398232.450858570
1329398232.450067044
-0.000791526
Modify: 2012-02-16 14:17:12.450858570 +0100
Modify: 2012-02-16 14:17:12.450067044 +0100

$ ./python x.py
1329398233.090858561
1329398233.090853761
-0.000004800
Modify: 2012-02-16 14:17:13.090858561 +0100
Modify: 2012-02-16 14:17:13.090853761 +0100
------------

The loss of precision is between 1 ms and 4 us. Decimal timestamps
display exactly the same values as the stat program: I don't see any
bug in this example.

Victor

PS: Don't try os.utime(Decimal) with my patch, the conversion from
Decimal to _PyTime_t still uses float internally (I know this issue,
it should be fixed in my patch) and so loses precision ;-)

From solipsis at pitrou.net  Thu Feb 16 14:26:39 2012
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Thu, 16 Feb 2012 14:26:39 +0100
Subject: [Python-Dev] PEP 410 (Decimal timestamp): the implementation is ready for a review
In-Reply-To: 
References: <4f3bc4c3.a54ab60a.65e2.2f66@mx.google.com> <4F3C77C1.7070706@hastings.org> <20120216135641.5ef37c64@pitrou.net>
Message-ID: <1329398799.3407.5.camel@localhost.localdomain>

On Thursday 16 February 2012 at 14:20 +0100, Victor Stinner wrote:
> > If I run your snippet and inspect modification times using `stat`, the
> > difference is much smaller (around 10 ns, not 1 ms):
> >
> > $ stat test | \grep Modify
> > Modify: 2012-02-16 13:51:25.643597139 +0100
> > $ stat test2 | \grep Modify
> > Modify: 2012-02-16 13:51:25.643597126 +0100
>
> The loss of precision is not constant: it depends on the timestamp value.

Well, I've tried several times and I can't reproduce a 1 ms difference.

> The loss of precision is between 1 ms and 4 us.

It still looks fishy to me. IEEE doubles have a 52-bit mantissa.
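A quick sketch of the spacing between adjacent doubles near such a
timestamp (using only stdlib math functions):

    import math

    t = 1329398229.918858                 # a 2012-era Unix timestamp
    mantissa, exponent = math.frexp(t)    # t == mantissa * 2**exponent
    ulp = math.ldexp(1.0, exponent - 53)  # gap to the next double
    print(ulp)                            # ~2.4e-07 s, well under 1 us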
Since the integral part of a timestamp takes 32 bits or less, there are
still 20 bits left for the fractional part, which allows for at least
1 µs precision (2**20 ~= 10**6). A 1 ms precision loss looks like a bug.

Regards

Antoine.

From nad at acm.org  Thu Feb 16 14:30:18 2012
From: nad at acm.org (Ned Deily)
Date: Thu, 16 Feb 2012 14:30:18 +0100
Subject: [Python-Dev] PEP 394 request for pronouncement (python2 symlink in *nix systems)
In-Reply-To: <4F3CD403.7070102@v.loewis.de>
References: <4F37FD96.2010603@v.loewis.de> <20120212203043.GA10257@cskk.homeip.net> <4F38244D.1000908@v.loewis.de> <20120213170845.3ee5d4b4@resist.wooz.org> <20120214094435.745d06e6@limelight.wooz.org> <4F3CD403.7070102@v.loewis.de>
Message-ID: <150-SnapperMsgB197556ECB62B56E@[192.168.1.97]>

I'm away from the source for the next 36 hours. I'll reply with patches
by Saturday morning.
___
Ned Deily
nad at acm.org -- []

..... Original Message .......
On Thu, 16 Feb 2012 11:01:39 +0100 "Martin v. Löwis" wrote:
>> There are two issues that I know of for OS X. One is just getting a
>> python2 symlink into the bin directory of a framework build. That's
>> easy.
>
>Where exactly in the Makefile is that reflected? ISTM that the current
>patch already covers that, since the framework* targets are not concerned
>with the bin directory.
>
>> The other is managing symlinks (python, python2, and python3)
>> across framework bin directories; currently there's no infrastructure
>> for that. That part will probably have to wait until PyCon.
>
>What is the "framework bin directory"? The links are proposed for
>/usr/local/bin resp. /usr/bin. The proposed patch already manages
>these links across releases (the most recent install wins).
>
>If you are concerned about multiple feature releases: this is not an
>issue, since the links are just proposed for Python 2.7 (distributions
>may also add them for 2.6 and earlier, but we are not going to make
>a release in that direction).
>
>It may be that the PEP becomes irrelevant before it is widely accepted:
>if the sole remaining Python 2 version is 2.7, users may just as well
>refer to python2 as python2.7.
>
>Regards,
>Martin

From storchaka at gmail.com  Thu Feb 16 14:42:11 2012
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Thu, 16 Feb 2012 15:42:11 +0200
Subject: [Python-Dev] best place for an atomic file API
In-Reply-To: 
References: 
Message-ID: 

On 15.02.12 23:16, Charles-François Natali wrote:
> Issue #8604 aims at adding an atomic file API to make it easier to
> create/update files atomically, using rename() on POSIX systems and
> MoveFileEx() on Windows (which are now available through
> os.replace()). It would also use fsync() on POSIX to make sure data is
> committed to disk.
> For example, it could be used by importlib to avoid races when
> writing bytecode files (issues #13392, #13003, #13146), or more
> generally by any application that wants to make sure to end up with a
> consistent file even in the face of a crash (e.g. it seems that
> mercurial implemented their own version).

What if the target file is a symlink?
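For reference, the pattern being discussed looks roughly like this (a
sketch of the idea behind #8604 built on os.replace(); the atomic_write
name is hypothetical, not the API proposed on the tracker):

    import os, tempfile

    def atomic_write(path, data):
        # Write to a temporary file in the same directory, flush it to
        # disk, then atomically swap it into place with os.replace()
        # (rename() on POSIX, MoveFileEx() on Windows).
        dirname = os.path.dirname(os.path.abspath(path))
        fd, tmp = tempfile.mkstemp(dir=dirname)
        try:
            with os.fdopen(fd, "wb") as f:
                f.write(data)
                f.flush()
                os.fsync(f.fileno())
            os.replace(tmp, path)
        except BaseException:
            os.unlink(tmp)
            raise

Note that with this pattern a symlink target would not be followed:
os.replace() swaps out the symlink itself, not the file it points to.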
From larry at hastings.org Thu Feb 16 16:29:40 2012 From: larry at hastings.org (Larry Hastings) Date: Thu, 16 Feb 2012 07:29:40 -0800 Subject: [Python-Dev] PEP 410 (Decimal timestamp): the implementation is ready for a review In-Reply-To: References: <4f3bc4c3.a54ab60a.65e2.2f66@mx.google.com> <4F3C77C1.7070706@hastings.org> Message-ID: <4F3D20E4.3060705@hastings.org> On 02/15/2012 08:12 PM, Guido van Rossum wrote: > On Wed, Feb 15, 2012 at 7:28 PM, Larry Hastings wrote: >> I fixed this in trunk last September >> (issue 12904); os.utime now preserves all the precision that Python >> currently conveys. > So, essentially you fixed this particular issue without having to do > anything as drastic as the proposed PEP... I wouldn't say that. The underlying representation is still nanoseconds, and Python only preserves roughly hundred-nanosecond precision. My patch only ensures that reading and writing atime/mtime looks consistent to Python programs using the os module. Any code that examined the nanosecond-precise values from stat()--written in Python or any other language--would notice the values didn't match. I'm definitely +1 for extending Python to represent nanosecond precision ctime/atime/mtime, but doing so in a way that permits seamlessly adding more precision down the road when the Linux kernel hackers get bored again and add femtosecond resolution. (And then presumably attosecond resolution four years later.) I haven't read 410 yet so I have no opinion on it. I wrote a patch last year that adds new Decimal ctime/mtime/atime fields to the output of stat, but it's a horrific performance regression (os.stat is 10x slower) and the reviewers were ambivalent so I've let it rot. Anyway I now agree that we should improve the precision of datetime objects and use those instead of Decimal. (But not timedeltas--ctime/mtime/atime are absolute times, not deltas.) /arry From barry at python.org Thu Feb 16 16:39:40 2012 From: barry at python.org (Barry Warsaw) Date: Thu, 16 Feb 2012 10:39:40 -0500 Subject: [Python-Dev] PEP 394 request for pronouncement (python2 symlink in *nix systems) In-Reply-To: References: <4F37FD96.2010603@v.loewis.de> <20120212203043.GA10257@cskk.homeip.net> <4F38244D.1000908@v.loewis.de> <20120213170845.3ee5d4b4@resist.wooz.org> <20120214094435.745d06e6@limelight.wooz.org> Message-ID: <20120216103940.6c21067d@resist.wooz.org> On Feb 16, 2012, at 09:54 PM, Nick Coghlan wrote: >It turns out I'd forgotten what was in the PEP - the Notes section >already contained a lot of suggestions along those lines. I changed >the title of the section to "Migration Notes", but tried to make it >clear that those *aren't* consensus recommendations, just ideas >distros may want to think about when considering making the switch. > >The updated version is live on python.org: >http://www.python.org/dev/peps/pep-0394/ That section looks great Nick, thanks. I have one very minor quibble left. In many places the PEP says something like: "For the time being, it is recommended that python should refer to python2 (however, some distributions have already chosen otherwise; see the Rationale and Migration Notes below)." which implies that we may change our recommendation, but never quite says what the mechanism is for us to do that. You could change the status of this PEP from Draft to Active, which perhaps implies a little more strongly that this PEP will be updated should our recommendation ever change. I suspect it won't though (or at least won't any time soon). 
If you mark the PEP as Final, we still have the option of updating the
PEP some time later to reflect new recommendations. It might be worth a
quick sentence to that effect in the PEP.

As I say though, this is a very minor quibble, so just DTRT. +1 and
thanks for your great work on it.

Cheers,
-Barry
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 836 bytes
Desc: not available
URL: 

From ezio.melotti at gmail.com  Thu Feb 16 18:32:24 2012
From: ezio.melotti at gmail.com (Ezio Melotti)
Date: Thu, 16 Feb 2012 19:32:24 +0200
Subject: [Python-Dev] folding cElementTree behind ElementTree in 3.3
In-Reply-To: 
References: 
Message-ID: <4F3D3DA8.4010704@gmail.com>

On 14/02/2012 9.58, Stefan Behnel wrote:
> Nick Coghlan, 14.02.2012 05:44:
>> On Tue, Feb 14, 2012 at 2:25 PM, Eli Bendersky wrote:
>>> With the deprecation warning being silent, is there much to lose, though?
>> Yes, it creates problems for anyone that deliberately converts all
>> warnings to errors when running their test suites. This forces them to
>> spend time switching over to a Python version dependent import of
>> either cElementTree or ElementTree that could have been spent doing
>> something actually productive instead of mere busywork.

If I'm writing code that imports cElementTree on 3.3+, and I explicitly
turn on DeprecationWarnings (that would otherwise be silenced) to check
if I'm doing something wrong, I would like Python to tell me "You don't
need to import that anymore, just use ElementTree.".
If I'm also converting all the warnings to errors, it's probably because
I really want my code to do the right thing, and spending a minute to
add/change two lines of code to fix this probably won't bother me too
much.
Regular users won't even notice the warning, unless they stumble upon
the note in the doc or enable the warnings (and eventually when the
module is removed).

>> And, of course, even people that *don't* convert warnings to errors
>> when running tests will have to make the same switch when the module
>> is eventually removed.

When the module is eventually removed and you didn't warn them in
advance, the situation is going to be much worse, because their code
will suddenly stop working once they upgrade to the newer version.
I don't mind keeping the module and the warning around for a few
versions and giving everyone enough time to update their imports, but
if the module is eventually removed I don't want all these developers
to come and say "why did you remove cElementTree without saying
anything and break all my code?".

>
> I'm -1 on emitting a deprecation warning just because cElementTree is being
> replaced by a bare import. That's an implementation detail, just like
> cElementTree should have been an implementation detail in the first place.
> In all currently maintained CPython releases, importing cElementTree is the
> right thing to do for users.

From 3.3 the right thing will be importing ElementTree, and at some
point in the future that will be the only way to do it.

> These days, other Python implementations already provide the cElementTree
> module as a bare alias for ElementTree.py anyway, without emitting any
> warnings. Why should CPython be the only one that shouts at users for
> importing it?

I would watch this from the opposite point of view. Why should the
other Python implementations have to keep around a dummy module due to
a CPython implementation detail?
If we all go through a deprecation process we will eventually be able
to get rid of this.

Best Regards,
Ezio Melotti

> Stefan

From solipsis at pitrou.net  Thu Feb 16 18:55:54 2012
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Thu, 16 Feb 2012 18:55:54 +0100
Subject: [Python-Dev] folding cElementTree behind ElementTree in 3.3
References: <4F3D3DA8.4010704@gmail.com>
Message-ID: <20120216185554.6f890376@pitrou.net>

On Thu, 16 Feb 2012 19:32:24 +0200
Ezio Melotti wrote:
>
> If I'm writing code that imports cElementTree on 3.3+, and I explicitly
> turn on DeprecationWarnings (that would otherwise be silenced) to check
> if I'm doing something wrong, I would like Python to tell me "You don't
> need to import that anymore, just use ElementTree.".
> If I'm also converting all the warnings to errors, it's probably because
> I really want my code to do the right thing, and spending a minute to
> add/change two lines of code to fix this probably won't bother me too
> much.

But then you're going from a cumbersome situation (where you have to
import cElementTree and then fall back on regular ElementTree) to an
even more cumbersome one (where you have to first check the Python
version, then conditionally import cElementTree, then fall back on
regular ElementTree).

> >> And, of course, even people that *don't* convert warnings to errors
> >> when running tests will have to make the same switch when the module
> >> is eventually removed.
>
> When the module is eventually removed and you didn't warn them in
> advance, the situation is going to be much worse, because their code
> will suddenly stop working once they upgrade to the newer version.

Why would we remove the module? It seems "supporting" it should be
mostly trivial (it's an alias).

> I would watch this from the opposite point of view. Why should the
> other Python implementations have to keep around a dummy module due to
> a CPython implementation detail?

I don't know, but they already have this module, and it certainly costs
them nothing to keep it.

Regards

Antoine.

From cf.natali at gmail.com  Thu Feb 16 19:23:42 2012
From: cf.natali at gmail.com (Charles-François Natali)
Date: Thu, 16 Feb 2012 19:23:42 +0100
Subject: [Python-Dev] folding cElementTree behind ElementTree in 3.3
In-Reply-To: <20120216185554.6f890376@pitrou.net>
References: <4F3D3DA8.4010704@gmail.com> <20120216185554.6f890376@pitrou.net>
Message-ID: 

I personally don't see any reason to drop a module that isn't
terminally broken or unmaintainable, apart from scaring users away by
making them think that we don't care about backward compatibility.

From jimjjewett at gmail.com  Thu Feb 16 19:24:22 2012
From: jimjjewett at gmail.com (Jim J. Jewett)
Date: Thu, 16 Feb 2012 10:24:22 -0800 (PST)
Subject: [Python-Dev] PEP for new dictionary implementation
In-Reply-To: <4F32CA76.5040307@hotpy.org>
Message-ID: <4f3d49d6.ec77ec0a.7988.ffffcebb@mx.google.com>

PEP author Mark Shannon wrote
(in http://mail.python.org/pipermail/python-dev/attachments/20120208/05be469a/attachment.txt):

> ... allows ... (the ``__dict__`` attribute of an object) to share
> keys with other attribute dictionaries of instances of the same class.

Is "the same class" a deliberate restriction, or just a convenience
of implementation?

I have often created subclasses (or even families of subclasses) where
instances (as opposed to the type) aren't likely to have additional
attributes.
These would benefit from key-sharing across classes, but I grant that
it is a minority use case that isn't worth optimizing if it complicates
the implementation.

> By separating the keys (and hashes) from the values it is possible
> to share the keys between multiple dictionaries and improve memory use.

Have you timed not storing the hash (in the dict) at all, at least for
(unicode) str-only dicts? Going to the string for its own cached hash
breaks locality a bit more, but saves 1/3 of the memory for combined
tables, and may make a big difference for classes that have relatively
few instances.

> Reduction in memory use is directly related to the number of dictionaries
> with shared keys in existence at any time. These dictionaries are typically
> half the size of the current dictionary implementation.

How do you measure that? The limit for huge N across huge numbers
of dicts should be 1/3 (because both hashes and keys are shared); I
assume that gets swamped by object overhead in typical small dicts.

> If a table is split the values in the keys table are ignored,
> instead the values are held in a separate array.

If they're just dead weight, then why not use them to hold indices
into the array, so that values arrays only have to be as long as
the number of keys, rather than rounding them up to a large-enough
power-of-two? (On average, this should save half the slots.)

> A combined-table dictionary never becomes a split-table dictionary.

I thought it did (at least temporarily) as part of resizing; are you
saying that it will be re-split by the time another thread is allowed
to see it, so that it is never observed as combined?

Given that this optimization is limited to class instances, I think
there should be some explanation of why you didn't just automatically
add slots for each variable assigned (by hard-coded name) within a
method; the keys would still be stored on the type, and array storage
could still be used for the values; the __dict__ slot could initially
be a NULL pointer, and instance dicts could be added exactly when they
were needed, covering only the oddball keys.

I would reword (or at least reformat) the Cons section; at the moment,
it looks like there are four separate objections, and seems to be a bit
dismissive towards backwards compatibility. Perhaps something like:

While this PEP does not change any documented APIs or invariants, it
does break some de facto invariants.

C extension modules may be relying on the current physical layout of
a dictionary. That said, extensions which rely on internals may
already need to be recompiled with each feature release; there are
already changes planned for both Unicode (for efficiency) and dicts
(for security) that would require authors of these extensions to at
least review their code.

Because iteration (and repr) order can depend on the order in which
keys are inserted, it will be possible to construct instances that
iterate in a different order than they would under the current
implementation. Note, however, that this will happen very rarely in
code which does not deliberately trigger the differences, and that
test cases which rely on a particular iteration order will already
need to be corrected in order to take advantage of the security
enhancements being discussed under hash randomization, or for use
with Jython and PyPy.

-jJ

--

If there are still threading problems with my replies, please
email me with details, so that I can try to resolve them.
-jJ

From ezio.melotti at gmail.com  Thu Feb 16 19:29:35 2012
From: ezio.melotti at gmail.com (Ezio Melotti)
Date: Thu, 16 Feb 2012 20:29:35 +0200
Subject: [Python-Dev] folding cElementTree behind ElementTree in 3.3
In-Reply-To: <20120216185554.6f890376@pitrou.net>
References: <4F3D3DA8.4010704@gmail.com> <20120216185554.6f890376@pitrou.net>
Message-ID: <4F3D4B0F.5040605@gmail.com>

On 16/02/2012 19.55, Antoine Pitrou wrote:
> On Thu, 16 Feb 2012 19:32:24 +0200
> Ezio Melotti wrote:
>> If I'm writing code that imports cElementTree on 3.3+, and I explicitly
>> turn on DeprecationWarnings (that would otherwise be silenced) to check
>> if I'm doing something wrong, I would like Python to tell me "You don't
>> need to import that anymore, just use ElementTree.".
>> If I'm also converting all the warnings to errors, it's probably because
>> I really want my code to do the right thing, and spending a minute to
>> add/change two lines of code to fix this probably won't bother me too
>> much.
> But then you're going from a cumbersome situation (where you have to
> import cElementTree and then fall back on regular ElementTree) to an
> even more cumbersome one (where you have to first check the Python
> version, then conditionally import cElementTree, then fall back on
> regular ElementTree).

This is true if you need to support Python <=3.2, but in the long run
this won't be needed anymore and a plain "import ElementTree" will be
enough.

>
>> When the module is eventually removed and you didn't warn them in
>> advance, the situation is going to be much worse, because their code
>> will suddenly stop working once they upgrade to the newer version.
> Why would we remove the module? It seems "supporting" it should be
> mostly trivial (it's an alias).

I'm assuming that eventually the module will be removed (maybe for
Python 4?), and I don't expect nor want to see it removed in the near
future. If something gets removed it should be deprecated first, and
it's usually better to deprecate it sooner so that the developers have
more time to update their code. As I proposed on the tracker though,
we could even delay the deprecation to 3.4 (by that time they might
not need to support 3.2 anymore).

>
>> I would watch this from the opposite point of view. Why should the
>> other Python implementations have to keep around a dummy module due to
>> a CPython implementation detail?
> I don't know, but they already have this module, and it certainly costs
> them nothing to keep it.

There will also be a cost if people keep importing cElementTree and
fall back on ElementTree on failure even when this won't be necessary
anymore. This also means that more people will have to fix their code
if/when the module is removed if they kept using cElementTree.
They can also find cElementTree in old code/tutorials, figure that
it's better to use the C one because it is faster, and keep doing so
because the only warning that would stop them is hidden in the doc.
I think the problem with the DeprecationWarnings being too noisy was
fixed by silencing them; if they are still too noisy then we need a
better mechanism to warn people who care (and going to check the doc
every once in a while to see if some new doc warning has been added
doesn't strike me as a valid solution).
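As a hypothetical sketch, the 3.3 alias module with such a warning
could be as small as this (the actual stdlib file may differ):

    # xml/etree/cElementTree.py -- illustrative sketch only.
    import warnings

    warnings.warn("cElementTree is deprecated; import "
                  "xml.etree.ElementTree instead (the C accelerator "
                  "is used automatically when available)",
                  DeprecationWarning, stacklevel=2)

    # Re-export the (accelerated) ElementTree API.
    from xml.etree.ElementTree import *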
Best Regards,
Ezio Melotti

From timothy.c.delaney at gmail.com  Thu Feb 16 19:44:22 2012
From: timothy.c.delaney at gmail.com (Tim Delaney)
Date: Fri, 17 Feb 2012 05:44:22 +1100
Subject: [Python-Dev] folding cElementTree behind ElementTree in 3.3
In-Reply-To: <20120216185554.6f890376@pitrou.net>
References: <4F3D3DA8.4010704@gmail.com> <20120216185554.6f890376@pitrou.net>
Message-ID: 

On 17 February 2012 04:55, Antoine Pitrou wrote:

> But then you're going from a cumbersome situation (where you have to
> import cElementTree and then fall back on regular ElementTree) to an
> even more cumbersome one (where you have to first check the Python
> version, then conditionally import cElementTree, then fall back on
> regular ElementTree).

Well, you can reverse the import so you're not relying on version
numbers:

import xml.etree.ElementTree as ElementTree

try:
    import xml.etree.cElementTree as ElementTree
except ImportError:
    pass

There is a slight cost compared to before (always importing the Python
version) and you'll still be using cElementTree directly until it's
removed, but if/when it is removed you won't notice it.

Tim Delaney
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From jimjjewett at gmail.com  Thu Feb 16 20:20:43 2012
From: jimjjewett at gmail.com (Jim J. Jewett)
Date: Thu, 16 Feb 2012 11:20:43 -0800 (PST)
Subject: [Python-Dev] Store timestamps as decimal.Decimal objects
In-Reply-To: 
Message-ID: <4f3d570b.0668640a.5cf4.ffffd3dc@mx.google.com>

In http://mail.python.org/pipermail/python-dev/2012-February/116073.html
Nick Coghlan wrote:

> Besides, float128 is a bad example - such a type could just be
> returned directly where we return float64 now. (The only reason we
> can't do that with Decimal is because we deliberately don't allow
> implicit conversion of float values to Decimal values in binary
> operations).

If we could really replace float with another type, then there is no
reason that type couldn't be a nearly trivial Decimal subclass which
simply flips the default value of the (never used by any caller)
allow_float parameter to internal function _convert_other.

Since decimal inherits straight from object, this subtype could even
be made to inherit from float as well, and to store the lower-precision
value there. It could even produce the decimal version lazily, so as
to minimize slowdown in cases that do not need the greater precision.

Of course, that still doesn't answer questions on whether the higher
precision is a good idea ...

-jJ

--

If there are still threading problems with my replies, please
email me with details, so that I can try to resolve them.

-jJ

From solipsis at pitrou.net  Thu Feb 16 21:45:47 2012
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Thu, 16 Feb 2012 21:45:47 +0100
Subject: [Python-Dev] PEP for new dictionary implementation
References: <4F32CA76.5040307@hotpy.org>
Message-ID: <20120216214547.4d7487cc@pitrou.net>

On Wed, 08 Feb 2012 19:18:14 +0000
Mark Shannon wrote:
> Proposed PEP for new dictionary implementation, PEP 410?
> is attached.

So, I'm running a few benchmarks using Twisted's test suite
(see https://bitbucket.org/pitrou/t3k/wiki/Home).

At the end of `python -i bin/trial twisted.internet.test`:
-> vanilla 3.3: RSS = 94 MB
-> new dict: RSS = 91 MB

At the end of `python -i bin/trial twisted.python.test`:
-> vanilla 3.3: RSS = 31.5 MB
-> new dict: RSS = 30 MB

At the end of `python -i bin/trial twisted.conch.test`:
-> vanilla 3.3: RSS = 68 MB
-> new dict: RSS = 42 MB (!)
At the end of `python -i bin/trial twisted.trial.test`:
-> vanilla 3.3: RSS = 32 MB
-> new dict: RSS = 30 MB

At the end of `python -i bin/trial twisted.test`:
-> vanilla 3.3: RSS = 62 MB
-> new dict: RSS = 78 MB (!)

Runtimes were mostly similar in these test runs.

Perspective broker benchmark (doc/core/benchmarks/tpclient.py and
doc/core/benchmarks/tpserver.py):
-> vanilla 3.3: 422 MB/sec
-> new dict: 402 MB/sec

Regards

Antoine.

From jimjjewett at gmail.com  Thu Feb 16 22:01:45 2012
From: jimjjewett at gmail.com (Jim J. Jewett)
Date: Thu, 16 Feb 2012 13:01:45 -0800 (PST)
Subject: [Python-Dev] plugging the hash attack
In-Reply-To: 
Message-ID: <4f3d6eb9.0268640a.18ce.fffff02a@mx.google.com>

In http://mail.python.org/pipermail/python-dev/2012-January/116003.html

>> > Benjamin Peterson wrote:
>> >> 2. It will be off by default in stable releases ... This will
>> >> prevent code breakage ...

>> 2012/1/27 Steven D'Aprano :
>> > ... it will become on by default in some future release?

> On Fri, Jan 27, 2012, Benjamin Peterson wrote:
>> Yes, 3.3. The solution in 3.3 could even be one of the more
>> sophisticated proposals we have today.

Brett Cannon (Mon Jan 30) wrote:
> I think that would be good. And I would even argue we remove support for
> turning it off to force people to no longer lean on dict ordering as a
> crutch (in 3.3 obviously).

Turning it on by default is fine. Removing the ability to turn it off
is bad. If regression tests fail with Python 3, the easiest thing to
do is just not to migrate to Python 3. Some decisions (certainly
around unittest, but I think even around hash codes) were settled
precisely because tests shouldn't break unless the functionality has
really changed. Python 3 isn't yet so dominant as to change that
tradeoff.

I would go so far as to add an extra step in the porting
recommendations; before porting to Python 3.x, run your test suite
several times with hash randomization turned on; any failures at this
point are relying on formally undefined behavior and should be fixed,
but can *probably* be fixed just by wrapping the results in sorted().

(I would offer a patch to the porting-to-py3 recommendation, except
that I couldn't find any not associated specifically with 3.0)

-jJ

--

If there are still threading problems with my replies, please
email me with details, so that I can try to resolve them.

-jJ

From martin at v.loewis.de  Thu Feb 16 22:21:47 2012
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Thu, 16 Feb 2012 22:21:47 +0100
Subject: [Python-Dev] PEP for new dictionary implementation
In-Reply-To: <4F36DBF9.5050901@hotpy.org>
References: <4F32CA76.5040307@hotpy.org> <20120211211726.0fdf086d@pitrou.net> <4F36DBF9.5050901@hotpy.org>
Message-ID: <4F3D736B.6090001@v.loewis.de>

On 11.02.2012 22:22, Mark Shannon wrote:
> Antoine Pitrou wrote:
>> Hello Mark,
>>
>> I think the PEP should explain what happens when a keys table needs
>> resizing when setting an object's attribute.
>
> If the object is the only instance of a class, it remains split,
> otherwise the table is combined.

Hi Mark,

Answering on-list is fine, but please do add such answers to the PEP
when requested. I also have a question: why does it provide storage
for the value slot in the keys array, where this slot is actually not
used?
Regards,
Martin

From martin at v.loewis.de  Thu Feb 16 22:23:01 2012
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Thu, 16 Feb 2012 22:23:01 +0100
Subject: [Python-Dev] PEP for new dictionary implementation
In-Reply-To: <4F39063B.6010803@hotpy.org>
References: <4F32CA76.5040307@hotpy.org> <4F39063B.6010803@hotpy.org>
Message-ID: <4F3D73B5.2000309@v.loewis.de>

On 13.02.2012 13:46, Mark Shannon wrote:
> Revised PEP for new dictionary implementation, PEP 412?
> is attached.

Committed as PEP 412.

Regards,
Martin

From martin at v.loewis.de  Thu Feb 16 22:34:00 2012
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Thu, 16 Feb 2012 22:34:00 +0100
Subject: [Python-Dev] PEP for new dictionary implementation
In-Reply-To: <4f3d49d6.ec77ec0a.7988.ffffcebb@mx.google.com>
References: <4f3d49d6.ec77ec0a.7988.ffffcebb@mx.google.com>
Message-ID: <4F3D7648.6040600@v.loewis.de>

On 16.02.2012 19:24, Jim J. Jewett wrote:
>
>
> PEP author Mark Shannon wrote
> (in http://mail.python.org/pipermail/python-dev/attachments/20120208/05be469a/attachment.txt):
>
>> ... allows ... (the ``__dict__`` attribute of an object) to share
>> keys with other attribute dictionaries of instances of the same class.
>
> Is "the same class" a deliberate restriction, or just a convenience
> of implementation?

It's about the implementation: the class keeps a pointer to the key set.
A subclass has a separate pointer for that.

> I have often created subclasses (or even families
> of subclasses) where instances (as opposed to the type) aren't likely
> to have additional attributes. These would benefit from key-sharing
> across classes, but I grant that it is a minority use case that isn't
> worth optimizing if it complicates the implementation.

In particular, the potential savings are small: the instances of the
subclass will share the key sets per-class. So if you have S
subclasses, you could save up to S keysets, whereas you are already
saving N-S-1 keysets (assuming you have a total of N objects across
all classes).

> Have you timed not storing the hash (in the dict) at all, at least for
> (unicode) str-only dicts? Going to the string for its own cached hash
> breaks locality a bit more, but saves 1/3 of the memory for combined
> tables, and may make a big difference for classes that have
> relatively few instances.

I'd be in favor of that, but it is actually an unrelated change:
whether or not you share key sets is unrelated to whether or not
str-only dicts drop the cached hash. Given a dict, it may be tricky to
determine whether or not it is str-only, i.e. what layout to use.

>> Reduction in memory use is directly related to the number of dictionaries
>> with shared keys in existence at any time. These dictionaries are typically
>> half the size of the current dictionary implementation.
>
> How do you measure that? The limit for huge N across huge numbers
> of dicts should be 1/3 (because both hashes and keys are shared); I
> assume that gets swamped by object overhead in typical small dicts.

It's more difficult than that. He also drops the smalltable (which I
think is a good idea), so accounting for how this all plays together
is tricky.

>> If a table is split the values in the keys table are ignored,
>> instead the values are held in a separate array.
>
> If they're just dead weight, then why not use them to hold indices
> into the array, so that values arrays only have to be as long as
> the number of keys, rather than rounding them up to a large-enough
> power-of-two?
> (On average, this should save half the slots.)

Good idea. However, how do you track per-dict how large the table is?

Regards,
Martin

From victor.stinner at gmail.com  Thu Feb 16 23:04:41 2012
From: victor.stinner at gmail.com (Victor Stinner)
Date: Thu, 16 Feb 2012 23:04:41 +0100
Subject: [Python-Dev] PEP 410 (Decimal timestamp): the implementation is ready for a review
In-Reply-To: <1329398799.3407.5.camel@localhost.localdomain>
References: <4f3bc4c3.a54ab60a.65e2.2f66@mx.google.com> <4F3C77C1.7070706@hastings.org> <20120216135641.5ef37c64@pitrou.net> <1329398799.3407.5.camel@localhost.localdomain>
Message-ID: 

>> > $ stat test | \grep Modify
>> > Modify: 2012-02-16 13:51:25.643597139 +0100
>> > $ stat test2 | \grep Modify
>> > Modify: 2012-02-16 13:51:25.643597126 +0100
>>
>> The loss of precision is not constant: it depends on the timestamp value.
>
> Well, I've tried several times and I can't reproduce a 1 ms difference.
>
>> The loss of precision is between 1 ms and 4 us.
>
> It still looks fishy to me. IEEE doubles have a 52-bit mantissa. Since
> the integral part of a timestamp takes 32 bits or less, there are still
> 20 bits left for the fractional part, which allows for at least 1 µs
> precision (2**20 ~= 10**6). A 1 ms precision loss looks like a bug.

Oh... It was an important bug in my function used to change the
denominator of a timestamp. I tried to work around integer overflow,
but I added a bug. I changed my patch to use PyLong, which has no
integer overflow issue.

Fixed example:

>>> open("test", "x").close()
>>> import shutil
>>> shutil.copy2("test", "test2")
[94386 refs]
>>> print(os.stat("test", datetime.datetime).st_mtime)
2012-02-16 21:58:30.835062+00:00
>>> print(os.stat("test2", datetime.datetime).st_mtime)
2012-02-16 21:58:30.835062+00:00
>>> print(os.stat("test", decimal.Decimal).st_mtime)
1329429510.835061686
>>> print(os.stat("test2", decimal.Decimal).st_mtime)
1329429510.835061789
>>> os.stat("test2", decimal.Decimal).st_mtime - os.stat("test", decimal.Decimal).st_mtime
Decimal('1.03E-7')

So the difference is only 0.1 us (100 ns).

It doesn't change anything about the Makefile issue: if timestamps
differ by a single nanosecond, they are seen as different by make (or
by any other program comparing the timestamps of two files with
nanosecond precision).

Victor

From jimjjewett at gmail.com  Thu Feb 16 23:10:51 2012
From: jimjjewett at gmail.com (Jim J. Jewett)
Date: Thu, 16 Feb 2012 14:10:51 -0800 (PST)
Subject: [Python-Dev] Counting collisions for the win
In-Reply-To: <4F195A49.6050508@sievertsen.de>
Message-ID: <4f3d7eeb.0bf1640a.3c0a.fffffea8@mx.google.com>

In http://mail.python.org/pipermail/python-dev/2012-January/115715.html
Frank Sievertsen wrote:

On 20.01.2012 13:08, Victor Stinner wrote:
>>> I'm surprised we haven't seen bug reports about it from users
>>> of 64-bit Pythons long ago
>> A Python dictionary only uses the lower bits of a hash value. If your
>> dictionary has less than 2**32 items, the dictionary order is exactly
>> the same on 32- and 64-bit systems: hash32(str)& mask == hash64(str)&
>> mask for mask<= 2**32-1.
> No, that's not true.
> Whenever a collision happens, other bits are mixed in very fast.
> Frank

Bits are mixed in quickly from a denial-of-service standpoint, but
Victor is correct from a "Why don't the tests already fail?"
standpoint.

A dict with 2**12 slots, holding over 2700 entries, will be far larger
than most test cases -- particularly those with visible output.
In a dict that size, 32-bit and 64-bit machines will still probe the
same first, second, third, fourth, fifth, and sixth slots. Even in the
rare cases when there are at least 6 collisions, the next slots may
well be either the same, or close enough that it doesn't show up in a
changed iteration order.

-jJ

--

If there are still threading problems with my replies, please
email me with details, so that I can try to resolve them.

-jJ

From guido at python.org  Thu Feb 16 23:38:13 2012
From: guido at python.org (Guido van Rossum)
Date: Thu, 16 Feb 2012 14:38:13 -0800
Subject: [Python-Dev] PEP 410 (Decimal timestamp): the implementation is ready for a review
In-Reply-To: 
References: <4f3bc4c3.a54ab60a.65e2.2f66@mx.google.com> <4F3C77C1.7070706@hastings.org> <20120216135641.5ef37c64@pitrou.net> <1329398799.3407.5.camel@localhost.localdomain>
Message-ID: 

On Thu, Feb 16, 2012 at 2:04 PM, Victor Stinner wrote:
> It doesn't change anything about the Makefile issue: if timestamps
> differ by a single nanosecond, they are seen as different by make (or
> by any other program comparing the timestamps of two files with
> nanosecond precision).

But make doesn't compare timestamps for equality -- it compares for
newer. That shouldn't be so critical, since if there is an *actual*
causal link between file A and B, the difference in timestamps should
always be much larger than 100 ns.

--
--Guido van Rossum (python.org/~guido)

From victor.stinner at gmail.com  Thu Feb 16 23:48:42 2012
From: victor.stinner at gmail.com (Victor Stinner)
Date: Thu, 16 Feb 2012 23:48:42 +0100
Subject: [Python-Dev] PEP 410 (Decimal timestamp): the implementation is ready for a review
In-Reply-To: 
References: <4f3bc4c3.a54ab60a.65e2.2f66@mx.google.com> <4F3C77C1.7070706@hastings.org> <20120216135641.5ef37c64@pitrou.net> <1329398799.3407.5.camel@localhost.localdomain>
Message-ID: 

2012/2/16 Guido van Rossum :
> On Thu, Feb 16, 2012 at 2:04 PM, Victor Stinner
> wrote:
>> It doesn't change anything about the Makefile issue: if timestamps
>> differ by a single nanosecond, they are seen as different by make (or
>> by any other program comparing the timestamps of two files with
>> nanosecond precision).
>
> But make doesn't compare timestamps for equality -- it compares for
> newer. That shouldn't be so critical, since if there is an *actual*
> causal link between file A and B, the difference in timestamps should
> always be much larger than 100 ns.

The problem is that shutil.copy2() sometimes produces an *older*
timestamp :-/ As shown in my previous email: in such a case, make will
always rebuild the second file instead of only building it once.
Example with two consecutive runs:

$ ./python diff.py
1329432426.650957952
1329432426.650958061
1.09E-7

$ ./python diff.py
1329432427.854957910
1329432427.854957819
-9.1E-8

Victor

From guido at python.org  Thu Feb 16 23:58:08 2012
From: guido at python.org (Guido van Rossum)
Date: Thu, 16 Feb 2012 14:58:08 -0800
Subject: [Python-Dev] PEP 410 (Decimal timestamp): the implementation is ready for a review
In-Reply-To: 
References: <4f3bc4c3.a54ab60a.65e2.2f66@mx.google.com> <4F3C77C1.7070706@hastings.org> <20120216135641.5ef37c64@pitrou.net> <1329398799.3407.5.camel@localhost.localdomain>
Message-ID: 

On Thu, Feb 16, 2012 at 2:48 PM, Victor Stinner wrote:
> 2012/2/16 Guido van Rossum :
>> On Thu, Feb 16, 2012 at 2:04 PM, Victor Stinner
>> wrote:
>>> It doesn't change anything about the Makefile issue: if timestamps
>>> differ by a single nanosecond, they are seen as different by make (or
>>> by any other program comparing the timestamps of two files with
>>> nanosecond precision).
>>
>> But make doesn't compare timestamps for equality -- it compares for
>> newer. That shouldn't be so critical, since if there is an *actual*
>> causal link between file A and B, the difference in timestamps should
>> always be much larger than 100 ns.
>
> The problem is that shutil.copy2() sometimes produces an *older*
> timestamp :-/ As shown in my previous email: in such a case, make will
> always rebuild the second file instead of only building it once.
>
> Example with two consecutive runs:
>
> $ ./python diff.py
> 1329432426.650957952
> 1329432426.650958061
> 1.09E-7
>
> $ ./python diff.py
> 1329432427.854957910
> 1329432427.854957819
> -9.1E-8

Have you been able to reproduce this with an actual Makefile? What's
the scenario? I'm thinking of a Makefile like this:

a:
        cp /dev/null a

b: a
        cp a b

Now say a doesn't exist and we run "make b". This will create a and
then b. I can't believe that the difference between the mtimes of a
and b is so small that if you copy the directory containing Makefile,
a and b using a Python tool that reproduces mtimes only with usec
accuracy you'll end up with a directory where a is newer than b. What
am I missing?

--
--Guido van Rossum (python.org/~guido)

From victor.stinner at gmail.com  Fri Feb 17 01:04:47 2012
From: victor.stinner at gmail.com (Victor Stinner)
Date: Fri, 17 Feb 2012 01:04:47 +0100
Subject: [Python-Dev] PEP 410 (Decimal timestamp): the implementation is ready for a review
In-Reply-To: 
References: <4f3bc4c3.a54ab60a.65e2.2f66@mx.google.com> <4F3C77C1.7070706@hastings.org> <20120216135641.5ef37c64@pitrou.net> <1329398799.3407.5.camel@localhost.localdomain>
Message-ID: 

>> The problem is that shutil.copy2() sometimes produces an *older*
>> timestamp :-/ (...)
>
> Have you been able to reproduce this with an actual Makefile? What's
> the scenario?

Hum. I asked the Internet who uses shutil.copy2() and I found an "old"
issue (Decimal('43462967.173053') seconds ago):

Python issue #10148: st_mtime differs after shutil.copy2 (October 2010)
"When copying a file with shutil.copy2() between two ext4 filesystems
on 64-bit Linux, the mtime of the destination file is different after
the copy. It appears as if the resolution is slightly different, so
the mtime is truncated slightly. (...)"

I don't know if it is a "theoretical" or "practical" issue.
Then I found:

Python issue #11941: Support st_atim, st_mtim and st_ctim attributes
in os.stat_result
"They would expose relevant functionality from libc's stat() and
provide better precision than floating-point-based st_atime, st_mtime
and st_ctime attributes."

Which is connected to the issue that motivated me to write the PEP:

Python issue #11457: os.stat(): add new fields to get timestamps as
Decimal objects with nanosecond resolution
"Support for such precision is available at the least on 2.6 Linux kernels."
"This is important for example with the tarfile module with the pax
tar format. The POSIX tar standard[3] mandates storing the mtime in
the extended header (if it is not an integer) with as much precision
as is available in the underlying file system, and likewise to restore
this time properly upon extraction. Currently this is not possible."
"The mailbox module would benefit from having this precision available."

For the tarfile use case, we need at least a way to get the
modification time with a nanosecond resolution *and* to set the
modification time with a nanosecond resolution. We just need to decide
which type is the best for this use case, which is the purpose of the
PEP 410 :-)

Another use case of nanosecond timestamps is profilers (and maybe
benchmark tools). The profiler itself may be implemented in a
different language than Python. For example, DTrace uses nanosecond
timestamps.

--

Other examples.

Debian bug #627460: (gcp) Expose nanoseconds in python (15 May 2011)
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=627460
Debian bug #626787: (gcp) gcp: timestamp is not always copied exact
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=626787
"When copying a (large) file from HDD to USB the files timestamp is
not copied exact. It seems to work fine with smaller files (up to
1Gig), I couldn't spot the time-diff on these files."
("gcp is a grid enabled version of the scp copy command.")

fuse-python supports nanosecond resolution: they chose to mimic the C
API using:

class Timespec(FuseStruct):
    """
    Cf. struct timespec in time.h:
    http://www.opengroup.org/onlinepubs/009695399/basedefs/time.h.html
    """
    def __init__(self, name=None, **kw):
        self.tv_sec  = None
        self.tv_nsec = None
        kw['name'] = name
        FuseStruct.__init__(self, **kw)

Python issue #9079: Make gettimeofday available in time module
"... exposes gettimeofday as time.gettimeofday() returning (sec, usec) pair"

The Oracle database supports timestamps with a nanosecond resolution.

A related article about Ruby:
http://marcricblog.blogspot.com/2010/04/who-cares-about-nanosecond.html
"Files are uploaded in groups (fifteen maximum). It was important to
know the order in which files have been uploaded. Depending on the
size of the files and users' internet broadband capacity, some files
could be uploaded in the same second."

And a last one for the fun:

"This Week in Python Stupidity: os.stat, os.utime and Sub-Second
Timestamps" (November 15, 2009)
http://ciaranm.wordpress.com/2009/11/15/this-week-in-python-stupidity-os-stat-os-utime-and-sub-second-timestamps/
"Yup, that's right, Python's underlying type for floats is an IEEE 754
double, which is only good for about sixteen decimal digits. With ten
digits before the decimal point, that leaves six for sub-second
resolutions, which is three short of the range required to preserve
POSIX nanosecond-resolution timestamps. With dates after the year 2300
or so, that leaves only five accurate digits, which isn't even enough
to deal with microseconds correctly. Brilliant."
"Python does have a half-assed fixed point type. Not sure why they
don't use it more."

Victor

From guido at python.org  Fri Feb 17 01:18:04 2012
From: guido at python.org (Guido van Rossum)
Date: Thu, 16 Feb 2012 16:18:04 -0800
Subject: [Python-Dev] PEP 410 (Decimal timestamp): the implementation is ready for a review
In-Reply-To: 
References: <4f3bc4c3.a54ab60a.65e2.2f66@mx.google.com> <4F3C77C1.7070706@hastings.org> <20120216135641.5ef37c64@pitrou.net> <1329398799.3407.5.camel@localhost.localdomain>
Message-ID: 

So, make is unaffected. In my first post on this subject I already
noted that the only real use case is making a directory or filesystem
copy and then verifying that the copy is identical using native tools
that compare times with nsec precision.

At least one of the bugs you quote is about the current 1-second
granularity, which is already addressed by using floats (up to ~usec
precision).

The fs copy use case should be pretty rare, and I would be okay with a
separate lower-level API that uses a long to represent nanoseconds
(though MvL doesn't like that either). Using (seconds, nsec) tuples is
silly though.

--Guido

On Thu, Feb 16, 2012 at 4:04 PM, Victor Stinner wrote:
> Hum. I asked the Internet who uses shutil.copy2() and I found an "old"
> issue (Decimal('43462967.173053') seconds ago):
> [...]

--
--Guido van Rossum (python.org/~guido)

From alexander.belopolsky at gmail.com  Fri Feb 17 03:15:47 2012
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Thu, 16 Feb 2012 21:15:47 -0500
Subject: [Python-Dev] PEP 410 (Decimal timestamp): the implementation is ready for a review
In-Reply-To: 
References: <4f3bc4c3.a54ab60a.65e2.2f66@mx.google.com>
Message-ID: 

On Wed, Feb 15, 2012 at 11:39 AM, Guido van Rossum wrote:
> Maybe it's okay to wait a few years on this, until either 128-bit
> floats are more common or cDecimal becomes the default floating point
> type?

+1

From jimjjewett at gmail.com  Fri Feb 17 04:32:51 2012
From: jimjjewett at gmail.com (Jim Jewett)
Date: Thu, 16 Feb 2012 22:32:51 -0500
Subject: [Python-Dev] PEP for new dictionary implementation
In-Reply-To: <4F3D7648.6040600@v.loewis.de>
References: <4f3d49d6.ec77ec0a.7988.ffffcebb@mx.google.com> <4F3D7648.6040600@v.loewis.de>
Message-ID: 

On Thu, Feb 16, 2012 at 4:34 PM, "Martin v. Löwis" wrote:
> On 16.02.2012 19:24, Jim J. Jewett wrote:
>> PEP author Mark Shannon wrote
>> (in http://mail.python.org/pipermail/python-dev/attachments/20120208/05be469a/attachment.txt):

>>> ... allows ... (the ``__dict__`` attribute of an object) to share
>>> keys with other attribute dictionaries of instances of the same class.

>> Is "the same class" a deliberate restriction, or just a convenience
>> of implementation?

> It's about the implementation: the class keeps a pointer to the key set.
> A subclass has a separate pointer for that.

I would prefer to see that reason in the PEP; after a few years, I
have trouble finding the email, even when I remember reading the
conversation.

>> Have you timed not storing the hash (in the dict) at all, at least for
>> (unicode) str-only dicts? Going to the string for its own cached hash
>> breaks locality a bit more, but saves 1/3 of the memory for combined
>> tables, and may make a big difference for classes that have
>> relatively few instances.

> I'd be in favor of that, but it is actually an unrelated change: whether
> or not you share key sets is unrelated to whether or not str-only dicts
> drop the cached hash.

Except that the biggest arguments against it are that it breaks cache
locality, and it changes the dictentry struct -- which this patch
already does anyway.

> Given a dict, it may be tricky to determine
> whether or not it is str-only, i.e. what layout to use.

Isn't that exactly the same determination needed when deciding whether
or not to use lookdict_unicode? (It would make the switch to the more
general lookdict more expensive, as that would involve a new
allocation.)

>>> Reduction in memory use is directly related to the number of dictionaries
>>> with shared keys in existence at any time. These dictionaries are typically
>>> half the size of the current dictionary implementation.

>> How do you measure that? The limit for huge N across huge numbers
>> of dicts should be 1/3 (because both hashes and keys are shared); I
>> assume that gets swamped by object overhead in typical small dicts.

> It's more difficult than that. He also drops the smalltable (which I
> think is a good idea), so accounting how this all plays together is tricky.

All the more reason to explain in the PEP how he measured or
approximated it.

>>> If a table is split the values in the keys table are ignored,
>>> instead the values are held in a separate array.

>> If they're just dead weight, then why not use them to hold indices
>> into the array, so that values arrays only have to be as long as
>> the number of keys, rather than rounding them up to a large-enough
>> power-of-two? (On average, this should save half the slots.)

> Good idea. However, how do you track per-dict how large the table is?

Why would you want to?

The per-instance array needs to be at least as large as the highest
index used by any key for which it has a value; if the keys table gets
far larger (or even shrinks), that doesn't really matter to the
instance. What does matter to the instance is getting a value of its
own for a new (to it) key -- and then the keys table can tell it which
index to use, which in turn tells it whether or not it needs to grow
the array.

Or are you thinking of len(o.__dict__), which will indeed be a bit
slower? That will happen with split dicts and potentially missing
values, regardless of how much memory is set aside (or not) for the
missing values.
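A toy model of the scheme sketched above (illustrative only -- the
real proposal is C structs inside dictobject.c, and these class names
are made up):

    # Shared keys table: maps each key to a slot index, shared per class.
    class SharedKeys:
        def __init__(self):
            self.indices = {}

        def slot(self, key):
            return self.indices.setdefault(key, len(self.indices))

    # Per-instance part: a values list that grows only up to the
    # highest slot index this instance actually uses.
    class SplitDict:
        def __init__(self, shared):
            self.shared = shared
            self.values = []

        def __setitem__(self, key, value):
            i = self.shared.slot(key)
            if i >= len(self.values):
                self.values.extend([None] * (i + 1 - len(self.values)))
            self.values[i] = value

        def __getitem__(self, key):
            i = self.shared.indices[key]      # KeyError if unknown key
            if i >= len(self.values) or self.values[i] is None:
                raise KeyError(key)
            return self.values[i]

        def __len__(self):                    # the "slightly slower" case
            return sum(1 for v in self.values if v is not None)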
-jJ From ncoghlan at gmail.com Fri Feb 17 04:50:22 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 17 Feb 2012 13:50:22 +1000 Subject: [Python-Dev] [Python-checkins] cpython: Disabling a test that fails on some bots. Will investigate the failure soon In-Reply-To: References: Message-ID: On Fri, Feb 17, 2012 at 2:09 AM, eli.bendersky wrote: > diff --git a/Lib/test/test_xml_etree_c.py b/Lib/test/test_xml_etree_c.py > --- a/Lib/test/test_xml_etree_c.py > +++ b/Lib/test/test_xml_etree_c.py > @@ -53,8 +53,8 @@ > ? ? ? ? # actual class. In the Python version it's a class. > ? ? ? ? self.assertNotIsInstance(cET.Element, type) > > - ? ?def test_correct_import_cET_alias(self): > - ? ? ? ?self.assertNotIsInstance(cET_alias.Element, type) > + ? ?#def test_correct_import_cET_alias(self): > + ? ? ? ?#self.assertNotIsInstance(cET_alias.Element, type) While this one was fixed quickly, *please* don't comment tests out without some kind of explanation in the code (not just in the checkin message). Even better is to use the expected_failure() decorator or the skip() decorator. Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From eliben at gmail.com Fri Feb 17 04:57:05 2012 From: eliben at gmail.com (Eli Bendersky) Date: Fri, 17 Feb 2012 05:57:05 +0200 Subject: [Python-Dev] [Python-checkins] cpython: Disabling a test that fails on some bots. Will investigate the failure soon In-Reply-To: References: Message-ID: On Fri, Feb 17, 2012 at 05:50, Nick Coghlan wrote: > On Fri, Feb 17, 2012 at 2:09 AM, eli.bendersky > wrote: > > diff --git a/Lib/test/test_xml_etree_c.py b/Lib/test/test_xml_etree_c.py > > --- a/Lib/test/test_xml_etree_c.py > > +++ b/Lib/test/test_xml_etree_c.py > > @@ -53,8 +53,8 @@ > > # actual class. In the Python version it's a class. > > self.assertNotIsInstance(cET.Element, type) > > > > - def test_correct_import_cET_alias(self): > > - self.assertNotIsInstance(cET_alias.Element, type) > > + #def test_correct_import_cET_alias(self): > > + #self.assertNotIsInstance(cET_alias.Element, type) > > While this one was fixed quickly, *please* don't comment tests out > without some kind of explanation in the code (not just in the checkin > message). > > Even better is to use the expected_failure() decorator or the skip() > decorator. > I just saw this test failing in some bots and wanted to fix it ASAP, without spending time on a real investigation. The follow-up fix came less than 2 hours later. But yes, I agree that commenting out wasn't a good choice - I should've just deleted it for the time I was working on a fix. By the way, I later discussed the failing test with Florent and http://bugs.python.org/issue14035 is the result. That failure had made no sense until Florent got deeper into import_fresh_module. Eli -------------- next part -------------- An HTML attachment was scrubbed... URL: From martin at v.loewis.de Fri Feb 17 07:50:09 2012 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Fri, 17 Feb 2012 07:50:09 +0100 Subject: [Python-Dev] PEP for new dictionary implementation In-Reply-To: References: <4f3d49d6.ec77ec0a.7988.ffffcebb@mx.google.com> <4F3D7648.6040600@v.loewis.de> Message-ID: <4F3DF8A1.20708@v.loewis.de> >> Good idea. However, how do you track per-dict how large the table is? > > Why would you want to? 
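For reference, a minimal sketch of the two decorators Nick mentions
(the test names, bodies and skip reason here are invented for the
example; only the decorators themselves come from unittest):

    import unittest

    class ExampleTests(unittest.TestCase):

        @unittest.skip("fails on some bots; see the tracker issue")
        def test_not_run_at_all(self):
            self.fail("never executed while the skip is in place")

        @unittest.expectedFailure
        def test_known_broken(self):
            self.assertEqual(1, 2)   # reported as an expected failure, not hidden

    if __name__ == "__main__":
        unittest.main()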
> > The per-instance array needs to be at least as large as the highest
> > index used by any key for which it has a value; if the keys table gets
> > far larger (or even shrinks), that doesn't really matter to the
> > instance. What does matter to the instance is getting a value of its
> > own for a new (to it) key -- and then the keys table can tell it which
> > index to use, which in turn tells it whether or not it needs to grow
> > the array.

To determine whether it needs to grow the array, it needs to find out
how large the array is, no? So: how do you do that?

Regards,
Martin

From martin at v.loewis.de  Fri Feb 17 07:57:49 2012
From: martin at v.loewis.de (=?ISO-8859-15?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Fri, 17 Feb 2012 07:57:49 +0100
Subject: [Python-Dev] PEP 394 accepted
Message-ID: <4F3DFA6D.6090403@v.loewis.de>

As the PEP czar for PEP 394, I have reviewed it and am happy to say that
I can accept it. I suppose that Nick will keep track of actually
implementing it in Python 2.7.

Regards,
Martin

From g.brandl at gmx.net  Fri Feb 17 08:51:57 2012
From: g.brandl at gmx.net (Georg Brandl)
Date: Fri, 17 Feb 2012 08:51:57 +0100
Subject: [Python-Dev] PEP 410 (Decimal timestamp): the implementation is ready for a review
In-Reply-To: <4F3CD721.7080602@v.loewis.de>
References: <4f3bc4c3.a54ab60a.65e2.2f66@mx.google.com> <4F3C09A9.80009@v.loewis.de> <4F3CB87D.2060808@v.loewis.de> <4F3CD721.7080602@v.loewis.de>
Message-ID: 

Am 16.02.2012 11:14, schrieb "Martin v. Löwis":
> Am 16.02.2012 10:51, schrieb Victor Stinner:
>> 2012/2/16 "Martin v. Löwis" :
>>>> Maybe an alternative PEP could be written that supports the filesystem
>>>> copying use case only, using some specialized ns APIs? I really think
>>>> that all you need is st_{a,c,m}time_ns fields and os.utime_ns().
>>>
>>> I'm -1 on that, because it will make people write complicated code.
>>
>> Python 3.3 *has already* APIs for nanosecond timestamps:
>> os.utimensat(), os.futimens(), signal.sigtimedwait(), etc. These
>> functions expect a (seconds: int, nanoseconds: int) tuple.
>
> I'm -1 on adding these APIs, also. Since Python 3.3 is not released
> yet, it's not too late to revert them.

+1.

Georg

From victor.stinner at gmail.com  Fri Feb 17 10:09:01 2012
From: victor.stinner at gmail.com (Victor Stinner)
Date: Fri, 17 Feb 2012 10:09:01 +0100
Subject: [Python-Dev] PEP 394 accepted
In-Reply-To: <4F3DFA6D.6090403@v.loewis.de>
References: <4F3DFA6D.6090403@v.loewis.de>
Message-ID: 

Congratulations to Kerrick Staley and Nick Coghlan, the authors of the
PEP! It's good to hear that the "python", "python2" and "python3"
symlinks are now standardized in a PEP. I hope that most Linux
distributions will follow this PEP :-)

Victor

From steve at pearwood.info  Fri Feb 17 10:28:29 2012
From: steve at pearwood.info (Steven D'Aprano)
Date: Fri, 17 Feb 2012 20:28:29 +1100
Subject: [Python-Dev] PEP 410 (Decimal timestamp): the implementation is ready for a review
In-Reply-To: 
References: <4f3bc4c3.a54ab60a.65e2.2f66@mx.google.com> <4F3C09A9.80009@v.loewis.de> <4F3CB87D.2060808@v.loewis.de> <4F3CD721.7080602@v.loewis.de>
Message-ID: <4F3E1DBD.2080202@pearwood.info>

Georg Brandl wrote:
> Am 16.02.2012 11:14, schrieb "Martin v. Löwis":
>> Am 16.02.2012 10:51, schrieb Victor Stinner:
>>> 2012/2/16 "Martin v. Löwis" :
>>>>> Maybe an alternative PEP could be written that supports the filesystem
>>>>> copying use case only, using some specialized ns APIs? I really think
>>>>> that all you need is st_{a,c,m}time_ns fields and os.utime_ns().
>>>> I'm -1 on that, because it will make people write complicated code.
>>>
>>> Python 3.3 *has already* APIs for nanosecond timestamps:
>>> os.utimensat(), os.futimens(), signal.sigtimedwait(), etc. These
>>> functions expect a (seconds: int, nanoseconds: int) tuple.
>>
>> I'm -1 on adding these APIs, also. Since Python 3.3 is not released
>> yet, it's not too late to revert them.
>
> +1.

Sorry, is that +1 on the revert, or +1 on the APIs?

-- 
Steven

From victor.stinner at gmail.com  Fri Feb 17 12:33:51 2012
From: victor.stinner at gmail.com (Victor Stinner)
Date: Fri, 17 Feb 2012 12:33:51 +0100
Subject: [Python-Dev] PEP 410 (Decimal timestamp): the implementation is ready for a review
In-Reply-To: 
References: <4f3bc4c3.a54ab60a.65e2.2f66@mx.google.com>
Message-ID: 

> Maybe it's okay to wait a few years on this, until either 128-bit
> floats are more common or cDecimal becomes the default floating point
> type? In the mean time for clock freaks we can have a few specialized
> APIs that return times in nanoseconds as a (long) integer.

I don't think that the default float type really matters here. If I
understood correctly, the major issue with Decimal is that Decimal is
not fully "compatible" with float: Decimal+float raises a TypeError.

Can't we improve the compatibility between Decimal and float, e.g. by
allowing Decimal+float? Decimal (base 10) + float (base 2) may lose
precision and this issue matters in some use cases. So we still need a
way to warn the user on loss of precision. We may add a global flag to
allow Decimal+float and turn it on by default. Developers concerned
about loss of precision can just turn the flag off at startup.
Something like what we did in Python 2: allow str+unicode, and only
switch to unicode when unicode was mature enough and well accepted :-)

--

I have some questions about 128-bit float and Decimal.

Currently, there is only one hardware platform supporting the "IEEE
754-2008 128-bit base-2" type: the IBM S/390, which is quite rare (at
least on the desktop :-)). Should we expect more CPUs supporting this
type in the (near) future?

GCC, ICC and Clang implement this type in software, but there are
license issues. At least with GCC, which uses MPFR: the library is
distributed under the GNU LGPL license, which is not compatible with
the Python license. I didn't check Clang and ICC.

I don't think that we can use 128-bit float by default before it is
commonly available in hardware, because arithmetic in software is
usually slower. We do also support platforms with a compiler not
supporting 128-bit float, e.g. Windows with Visual Studio 2008.

Floating point in base 2 also has an issue with timestamps using 10^k
resolution: such timestamps cannot be represented exactly in base 2
because 5 is coprime to 2 (10=2*5). The loss of precision is smaller
than 10^-9 (nanosecond) with 128-bit float (for Epoch timestamps), but
it would be more "natural" to use base 10.

System calls and functions of the C standard library use types with
10^k resolution:

- 1 (time_t): time(), mktime(), localtime(), sleep(), ...
- 10^-3 (int): poll()
- 10^-6 (timeval, useconds_t): select(), gettimeofday(), usleep(), ...
- 10^-9 (timespec): nanosleep(), utimensat(), clock_gettime(), ...

decimal and cdecimal (_decimal) have the same performance issue, so I
don't expect them to become the standard float type. But Decimal is
able to store exactly a timestamp with a resolution of 10^k.

There are also IEEE 754 floating point types in base 10: decimal
floating point (DFP), in 32, 64 and 128 bits.
IBM System z9, System z10 and POWER6 CPU support these types in hardware. We may support this format in a specific module, or maybe use it to speedup the Python decimal module. But same issue here, such hardware is also rare, so we cannot use them by default or rely on them. Victor From ncoghlan at gmail.com Fri Feb 17 13:27:15 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 17 Feb 2012 22:27:15 +1000 Subject: [Python-Dev] PEP 394 accepted In-Reply-To: <4F3DFA6D.6090403@v.loewis.de> References: <4F3DFA6D.6090403@v.loewis.de> Message-ID: On Fri, Feb 17, 2012 at 4:57 PM, "Martin v. L?wis" wrote: > As the PEP czar for PEP 394, I have reviewed it and am happy to say that > I can accept it. Excellent news, thanks! I've pushed an updated version promoting it to Active status, and also incorporating Barry's suggestion of making it explicit that we expect the recommendation to change *eventually*, we just don't know when. > I suppose that Nick will keep track of actually > implementing it in Python 2.7. Indeed I will (as well as the comparatively minor change of converting the 3.x hard link to a symlink as described in the PEP). Unfortunately, dinsdale appears to have fallen over again, so I can't push the change right now :( Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From ncoghlan at gmail.com Fri Feb 17 13:42:45 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 17 Feb 2012 22:42:45 +1000 Subject: [Python-Dev] PEP 410 (Decimal timestamp): the implementation is ready for a review In-Reply-To: References: <4f3bc4c3.a54ab60a.65e2.2f66@mx.google.com> Message-ID: On Fri, Feb 17, 2012 at 9:33 PM, Victor Stinner wrote: >> Maybe it's okay to wait a few years on this, until either 128-bit >> floats are more common or cDecimal becomes the default floating point >> type? In the mean time for clock freaks we can have a few specialized >> APIs that return times in nanoseconds as a (long) integer. > > I don't think that the default float type does really matter here. If > I understood correctly, the major issue with Decimal is that Decimal > is not fully "compatible" with float: Decimal+float raises a > TypeError. > > Can't we improve the compatibility between Decimal and float, e.g. by > allowing Decimal+float? Decimal (base 10) + float (base 2) may loss > precision and this issue matters in some use cases. So we still need a > way to warn the user on loss of precision. We may add a global flag to > allow Decimal+float and turn it on by default. Developers concerns by > loss of precision can just turn the flag off at startup. Something > like what we did in Python 2: allow str+unicode, and only switch to > unicode when unicode was mature enough and well accepted :-) Disallowing implicit binary float and Decimal interoperability was a deliberate design decision in the original Decimal PEP, in large part to discourage use of binary floats in applications where exact Decimal values are required. While this has been relaxed slightly to allow the exact explicit conversion of a binary float value to its full binary precision Decimal equivalent, the original rationale against implicit interoperability still seems valid (See http://www.python.org/dev/peps/pep-0327/#id17). 
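Concretely, both the relaxation and the remaining barrier look like
this in an interactive session (Python 3.2+; the value is illustrative):

    >>> from decimal import Decimal
    >>> Decimal(0.1)        # explicit conversion is exact, full binary precision
    Decimal('0.1000000000000000055511151231257827021181583404541015625')
    >>> Decimal('0.1') + 0.1    # implicit mixing is still refused
    Traceback (most recent call last):
      ...
    TypeError: unsupported operand type(s) for +: 'decimal.Decimal' and 'float'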
OTOH, people have long had to cope with the fact that integer+float interoperability runs the risk of triggering ValueError if the integer is too large - it seems to me that the signalling behaviour of implicit promotions from float to Decimal could be adequately controlled with the Inexact flag on the Decimal context. Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From ncoghlan at gmail.com Fri Feb 17 13:44:55 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 17 Feb 2012 22:44:55 +1000 Subject: [Python-Dev] PEP 394 accepted In-Reply-To: References: <4F3DFA6D.6090403@v.loewis.de> Message-ID: On Fri, Feb 17, 2012 at 10:27 PM, Nick Coghlan wrote: > Unfortunately, dinsdale appears to have fallen over again, so I can't > push the change right now :( It appears that was a temporary glitch - the 2.7 change is now in Mercurial. Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From smiwa.egon at googlemail.com Fri Feb 17 09:44:09 2012 From: smiwa.egon at googlemail.com (Egon Smiwa) Date: Fri, 17 Feb 2012 09:44:09 +0100 Subject: [Python-Dev] dll name for embedding? Message-ID: <4F3E1359.7030703@googlemail.com> Hi all, I'm an app developer with a CPython dll in the folder of that app. In general, are there strict requirements about the dll name (a preference would be "python.dll" (easy to update (simple replace) ). I successfully used "python.dll" and a few standard modules, then I tried to use the sympy library and its import fails with an AV exception, unless I rename the dll back to the original "python32.dll" Is there an intrinsic filename requirement inside the CPython dll, modules, or are name-restrictions to be presumed only in case of third-party libs? From stefan at bytereef.org Fri Feb 17 14:03:10 2012 From: stefan at bytereef.org (Stefan Krah) Date: Fri, 17 Feb 2012 14:03:10 +0100 Subject: [Python-Dev] PEP 410 (Decimal timestamp): the implementation is ready for a review In-Reply-To: References: <4f3bc4c3.a54ab60a.65e2.2f66@mx.google.com> Message-ID: <20120217130310.GA18821@sleipnir.bytereef.org> Victor Stinner wrote: > Can't we improve the compatibility between Decimal and float, e.g. by > allowing Decimal+float? Decimal (base 10) + float (base 2) may loss > precision and this issue matters in some use cases. So we still need a > way to warn the user on loss of precision. I think this should be discussed in a separate thread. It's getting slightly difficult to follow all the issues raised here. > decimal and cdecimal (_decimal) have the same performance issue, > don't expect them to become the standard float type. Well, _decimal in tight loops is about 2 times slower than float. 
There are areas where _decimal is actually faster than float, e.g in the cdecimal repository printing and formatting seems to be significantly faster: $ cat format.py import time from decimal import Decimal d = Decimal("7.928137192") f = 7.928137192 out = open("/dev/null", "w") start = time.time() for i in range(1000000): out.write("%s\n" % d) end = time.time() print("Decimal: ", end-start) start = time.time() for i in range(1000000): out.write("%s\n" % f) end = time.time() print("float: ", end-start) start = time.time() for i in range(1000000): out.write("{:020,.30}\n".format(d)) end = time.time() print("Decimal: ", end-start) start = time.time() for i in range(1000000): out.write("{:020,.30}\n".format(f)) end = time.time() print("float: ", end-start) $ ./python format.py Decimal: 0.8835508823394775 float: 1.3872010707855225 Decimal: 2.1346139907836914 float: 3.154278039932251 So it would make sense to profile the exact application in order to determine the suitability of _decimal for timestamps. > There are also IEEE 754 for floating point types in base 10: decimal > floating point (DFP), in 32, 64 and 128 bits. IBM System z9, System > z10 and POWER6 CPU support these types in hardware. We may support > this format in a specific module, or maybe use it to speedup the > Python decimal module. Apart from the rarity of these systems, decimal.py is arbitrary precision. If I restricted _decimal to DECIMAL64, I could probably speed it up further. All that said, personally I wouldn't have problems with a chunked representation that includes nanoseconds, thus avoiding the decimal/float discusion entirely. I'm also a happy user of: http://cr.yp.to/libtai/tai64.html#tai64n Stefan Krah From mark at hotpy.org Fri Feb 17 14:10:51 2012 From: mark at hotpy.org (Mark Shannon) Date: Fri, 17 Feb 2012 13:10:51 +0000 Subject: [Python-Dev] PEP for new dictionary implementation In-Reply-To: <20120216214547.4d7487cc@pitrou.net> References: <4F32CA76.5040307@hotpy.org> <20120216214547.4d7487cc@pitrou.net> Message-ID: <4F3E51DB.2060201@hotpy.org> On 16/02/12 20:45, Antoine Pitrou wrote: > > On Wed, 08 Feb 2012 19:18:14 +0000 > Mark Shannon wrote: >> Proposed PEP for new dictionary implementation, PEP 410? >> is attached. >> > > So, I'm running a few benchmarks using Twisted's test suite > (see https://bitbucket.org/pitrou/t3k/wiki/Home). > > At the end of `python -i bin/trial twisted.internet.test`: > -> vanilla 3.3: RSS = 94 MB > -> new dict: RSS = 91 MB > > At the end of `python -i bin/trial twisted.python.test`: > -> vanilla 3.3: RSS = 31.5 MB > -> new dict: RSS = 30 MB > > At the end of `python -i bin/trial twisted.conch.test`: > -> vanilla 3.3: RSS = 68 MB > -> new dict: RSS = 42 MB (!) > > At the end of `python -i bin/trial twisted.trial.test`: > -> vanilla 3.3: RSS = 32 MB > -> new dict: RSS = 30 MB > > At the end of `python -i bin/trial twisted.test`: > -> vanilla 3.3: RSS = 62 MB > -> new dict: RSS = 78 MB (!) In theory, new-dict should never use more a few kbs more than vanilla. That looks like a serious leak. I'll investigate as soon as I get a chance. Which revision of new-dict are you using? Cheers, Mark. > > Runtimes were mostly similar in these test runs. > > Perspective broker benchmark (doc/core/benchmarks/tpclient.py and > doc/core/benchmarks/tpserver.py): > -> vanilla 3.3: 422 MB/sec > -> new dict: 402 MB/sec > > Regards > > Antoine. 
> > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/mark%40hotpy.org From mark at hotpy.org Fri Feb 17 14:35:19 2012 From: mark at hotpy.org (Mark Shannon) Date: Fri, 17 Feb 2012 13:35:19 +0000 Subject: [Python-Dev] A new dictionary implementation In-Reply-To: <3E3ED4B6-FAF4-48AE-A6A9-FEDB1C659315@gmail.com> References: <4F252014.3080900@hotpy.org> <20120129160841.2343b62f@pitrou.net> <4F256EDC.70707@hotpy.org> <4F25D686.9070907@pearwood.info> <4F2AE13C.6010900@hotpy.org> <4F3291C9.9070305@hotpy.org> <4F33B343.1050801@voidspace.org.uk> <4F3902AA.3080300@hotpy.org> <4F3BF259.50502@hotpy.org> <3E3ED4B6-FAF4-48AE-A6A9-FEDB1C659315@gmail.com> Message-ID: <4F3E5797.7040705@hotpy.org> On 15/02/12 21:09, Yury Selivanov wrote: > Hello Mark, > > First, I've back-ported your patch on python 3.2.2 (which was relatively > easy). Almost all tests pass, and those that don't are always failing on > my machine if I remember. The patch can be found here: http://goo.gl/nSzzY > > Then, I compared memory footprint of one of our applications (300,000 LOC) > and saw it about 6% less than on vanilla python 3.2.2 (660 MB of reserved > process memory compared to 702 MB; Linux Gentoo 64bit) The application is > written in heavy OOP style (for instance, ~1000 classes are generated by our > ORM on the fly, and there are approximately the same amount of hand-written > ones) so I hoped for a much bigger saving. > > As for the patch itself I found one use-case, where python with the patch > behaves differently:: > > class Foo: > def __init__(self, msg): > self.msg = msg > > f = Foo('123') > > class _str(str): > pass > > print(f.msg) > print(getattr(f, _str('msg'))) > > The above snippet works perfectly on vanilla py3.2, but fails on the patched > one (even on 3.3 compiled from your 'cpython_new_dict' branch) I'm not sure > that it's a valid code, though. If not, then we need to fix some python > internals to add exact type check in 'getattr', in the 'operator.getattr', etc. > And if it is - your patch needs to be fixed. In any case, I propose to add > the above code to the python test-suite, with either expecting a result or an > exception. Your code is valid, the bug is in my code. I've fixed and updated the repository. More tests to be added later. Cheers, Mark. From solipsis at pitrou.net Fri Feb 17 14:34:20 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 17 Feb 2012 14:34:20 +0100 Subject: [Python-Dev] PEP for new dictionary implementation References: <4F32CA76.5040307@hotpy.org> <20120216214547.4d7487cc@pitrou.net> <4F3E51DB.2060201@hotpy.org> Message-ID: <20120217143420.36bc6a01@pitrou.net> On Fri, 17 Feb 2012 13:10:51 +0000 Mark Shannon wrote: > On 16/02/12 20:45, Antoine Pitrou wrote: > > > > On Wed, 08 Feb 2012 19:18:14 +0000 > > Mark Shannon wrote: > >> Proposed PEP for new dictionary implementation, PEP 410? > >> is attached. > >> > > > > So, I'm running a few benchmarks using Twisted's test suite > > (see https://bitbucket.org/pitrou/t3k/wiki/Home). 
> > > > At the end of `python -i bin/trial twisted.internet.test`: > > -> vanilla 3.3: RSS = 94 MB > > -> new dict: RSS = 91 MB > > > > At the end of `python -i bin/trial twisted.python.test`: > > -> vanilla 3.3: RSS = 31.5 MB > > -> new dict: RSS = 30 MB > > > > At the end of `python -i bin/trial twisted.conch.test`: > > -> vanilla 3.3: RSS = 68 MB > > -> new dict: RSS = 42 MB (!) > > > > At the end of `python -i bin/trial twisted.trial.test`: > > -> vanilla 3.3: RSS = 32 MB > > -> new dict: RSS = 30 MB > > > > At the end of `python -i bin/trial twisted.test`: > > -> vanilla 3.3: RSS = 62 MB > > -> new dict: RSS = 78 MB (!) > > In theory, new-dict should never use more a few kbs more than vanilla. > That looks like a serious leak. I'll investigate as soon as I get a chance. > Which revision of new-dict are you using? 6c4d5d9dfc6d Thanks :) Antoine. From larry at hastings.org Fri Feb 17 16:30:24 2012 From: larry at hastings.org (Larry Hastings) Date: Fri, 17 Feb 2012 07:30:24 -0800 Subject: [Python-Dev] PEP 410 (Decimal timestamp): the implementation is ready for a review In-Reply-To: <4F3CD721.7080602@v.loewis.de> References: <4f3bc4c3.a54ab60a.65e2.2f66@mx.google.com> <4F3C09A9.80009@v.loewis.de> <4F3CB87D.2060808@v.loewis.de> <4F3CD721.7080602@v.loewis.de> Message-ID: <4F3E7290.2060902@hastings.org> On 02/16/2012 02:14 AM, "Martin v. L?wis" wrote: > Am 16.02.2012 10:51, schrieb Victor Stinner: >> 2012/2/16 "Martin v. L?wis": >>>> Maybe an alternative PEP could be written that supports the filesystem >>>> copying use case only, using some specialized ns APIs? I really think >>>> that all you need is st_{a,c,m}time_ns fields and os.utime_ns(). >>> I'm -1 on that, because it will make people write complicated code. >> Python 3.3 *has already* APIs for nanosecond timestamps: >> os.utimensat(), os.futimens(), signal.sigtimedwait(), etc. These >> functions expect a (seconds: int, nanoseconds: int) tuple. > I'm -1 on adding these APIs, also. Since Python 3.3 is not released > yet, it's not too late to revert them. +1. I also think they should be removed in favor of adding support for a nanosecond-friendly representation to the existing APIs (os.utime, etc). Python is not C, we don't need three functions that do the same thing but take different representations as their arguments. /arry From status at bugs.python.org Fri Feb 17 18:07:36 2012 From: status at bugs.python.org (Python tracker) Date: Fri, 17 Feb 2012 18:07:36 +0100 (CET) Subject: [Python-Dev] Summary of Python tracker Issues Message-ID: <20120217170736.AF3351D1AB@psf.upfronthosting.co.za> ACTIVITY SUMMARY (2012-02-10 - 2012-02-17) Python tracker at http://bugs.python.org/ To view or respond to any of the issues listed below, click on the issue. Do NOT respond to this message. 
Issues counts and deltas: open 3257 (+11) closed 22567 (+44) total 25824 (+55) Open issues with patches: 1391 Issues opened (40) ================== #13609: Add "os.get_terminal_size()" function http://bugs.python.org/issue13609 reopened by Arfrever #13866: {urllib,urllib.parse}.urlencode should not use quote_plus http://bugs.python.org/issue13866 reopened by Stephen.Day #13989: gzip always returns byte strings, no text mode http://bugs.python.org/issue13989 opened by maubp #13990: Benchmarks: 2to3 failures on the py3 side http://bugs.python.org/issue13990 opened by francismb #13992: Segfault in PyTrash_destroy_chain http://bugs.python.org/issue13992 opened by Aaron.Staley #13997: Clearly explain the bare minimum Python 3 users should know ab http://bugs.python.org/issue13997 opened by ncoghlan #13998: Lookbehind assertions go behind the start position for the mat http://bugs.python.org/issue13998 opened by Devin Jeanpierre #13999: Queue references in multiprocessing doc points to Queue module http://bugs.python.org/issue13999 opened by sandro.tosi #14001: CVE-2012-0845 Python v2.7.2 / v3.2.2 (SimpleXMLRPCServer): DoS http://bugs.python.org/issue14001 opened by iankko #14002: distutils2 fails to install a package from PyPI on Python 2.7. http://bugs.python.org/issue14002 opened by pmoore #14003: __self__ on built-in functions is not as documented http://bugs.python.org/issue14003 opened by SpecLad #14004: Distutils filelist selects too many files on Windows http://bugs.python.org/issue14004 opened by jason.coombs #14005: IDLE Crash when running/saving a file http://bugs.python.org/issue14005 opened by Scott.Bowman #14006: Improve the documentation of xml.etree.ElementTree http://bugs.python.org/issue14006 opened by eli.bendersky #14007: xml.etree.ElementTree - XMLParser and TreeBuilder's doctype() http://bugs.python.org/issue14007 opened by eli.bendersky #14009: Clearer documentation for cElementTree http://bugs.python.org/issue14009 opened by eric.araujo #14010: deeply nested filter segfaults http://bugs.python.org/issue14010 opened by alex #14011: packaging should use shutil archiving functions transparently http://bugs.python.org/issue14011 opened by eric.araujo #14012: Misc tarfile fixes http://bugs.python.org/issue14012 opened by eric.araujo #14013: tarfile should expose supported formats http://bugs.python.org/issue14013 opened by eric.araujo #14014: codecs.StreamWriter.reset contract not fulfilled http://bugs.python.org/issue14014 opened by Jim.Jewett #14015: surrogateescape largely missing from documentation http://bugs.python.org/issue14015 opened by Jim.Jewett #14017: Make it easy to create a new TextIOWrapper based on an existin http://bugs.python.org/issue14017 opened by ncoghlan #14018: OS X installer does not detect bad symlinks created by Xcode 3 http://bugs.python.org/issue14018 opened by ned.deily #14019: Unify tests for str.format and string.Formatter http://bugs.python.org/issue14019 opened by ncoghlan #14020: Improve HTMLParser doc http://bugs.python.org/issue14020 opened by ezio.melotti #14023: bytes implied to be mutable http://bugs.python.org/issue14023 opened by SpecLad #14026: test_cmd_line_script should include more sys.argv checks http://bugs.python.org/issue14026 opened by ncoghlan #14027: distutils2 lack of pysetup.bat http://bugs.python.org/issue14027 opened by ??????.??? 
#14030: Be more careful about selecting the compiler in distutils http://bugs.python.org/issue14030 opened by djc #14032: test_cmd_line_script prints undefined 'data' variable http://bugs.python.org/issue14032 opened by Jason.Yeo #14034: the example in argparse doc is too complex http://bugs.python.org/issue14034 opened by tshepang #14035: behavior of test.support.import_fresh_module http://bugs.python.org/issue14035 opened by flox #14036: urlparse insufficient port property validation http://bugs.python.org/issue14036 opened by zulla #14037: Allow grouping of argparse subparser commands in help output http://bugs.python.org/issue14037 opened by ncoghlan #14038: Packaging test support code raises exception http://bugs.python.org/issue14038 opened by vinay.sajip #14039: Add "metavar" argument to add_subparsers() in argparse http://bugs.python.org/issue14039 opened by ncoghlan #14040: Deprecate some of the module file formats http://bugs.python.org/issue14040 opened by pitrou #14042: json.dumps() documentation is slightly incorrect. http://bugs.python.org/issue14042 opened by tomchristie #14043: Speed-up importlib's _FileFinder http://bugs.python.org/issue14043 opened by pitrou Most recent 15 issues with no replies (15) ========================================== #14043: Speed-up importlib's _FileFinder http://bugs.python.org/issue14043 #14042: json.dumps() documentation is slightly incorrect. http://bugs.python.org/issue14042 #14039: Add "metavar" argument to add_subparsers() in argparse http://bugs.python.org/issue14039 #14038: Packaging test support code raises exception http://bugs.python.org/issue14038 #14032: test_cmd_line_script prints undefined 'data' variable http://bugs.python.org/issue14032 #14027: distutils2 lack of pysetup.bat http://bugs.python.org/issue14027 #14023: bytes implied to be mutable http://bugs.python.org/issue14023 #14019: Unify tests for str.format and string.Formatter http://bugs.python.org/issue14019 #14018: OS X installer does not detect bad symlinks created by Xcode 3 http://bugs.python.org/issue14018 #14015: surrogateescape largely missing from documentation http://bugs.python.org/issue14015 #14014: codecs.StreamWriter.reset contract not fulfilled http://bugs.python.org/issue14014 #14013: tarfile should expose supported formats http://bugs.python.org/issue14013 #14012: Misc tarfile fixes http://bugs.python.org/issue14012 #14011: packaging should use shutil archiving functions transparently http://bugs.python.org/issue14011 #13999: Queue references in multiprocessing doc points to Queue module http://bugs.python.org/issue13999 Most recent 15 issues waiting for review (15) ============================================= #14043: Speed-up importlib's _FileFinder http://bugs.python.org/issue14043 #14040: Deprecate some of the module file formats http://bugs.python.org/issue14040 #14036: urlparse insufficient port property validation http://bugs.python.org/issue14036 #14035: behavior of test.support.import_fresh_module http://bugs.python.org/issue14035 #14020: Improve HTMLParser doc http://bugs.python.org/issue14020 #14013: tarfile should expose supported formats http://bugs.python.org/issue14013 #14012: Misc tarfile fixes http://bugs.python.org/issue14012 #14009: Clearer documentation for cElementTree http://bugs.python.org/issue14009 #14001: CVE-2012-0845 Python v2.7.2 / v3.2.2 (SimpleXMLRPCServer): DoS http://bugs.python.org/issue14001 #13974: packaging: test for set_platform() http://bugs.python.org/issue13974 #13973: urllib.parse is imported twice in 
xmlrpc.client http://bugs.python.org/issue13973 #13970: frameobject should not have f_yieldfrom attribute http://bugs.python.org/issue13970 #13969: path name must always be string (or None) http://bugs.python.org/issue13969 #13968: Support recursive globs http://bugs.python.org/issue13968 #13967: also test for an empty pathname http://bugs.python.org/issue13967 Top 10 most discussed issues (10) ================================= #13992: Segfault in PyTrash_destroy_chain http://bugs.python.org/issue13992 15 msgs #13609: Add "os.get_terminal_size()" function http://bugs.python.org/issue13609 14 msgs #13997: Clearly explain the bare minimum Python 3 users should know ab http://bugs.python.org/issue13997 14 msgs #13703: Hash collision security issue http://bugs.python.org/issue13703 11 msgs #14004: Distutils filelist selects too many files on Windows http://bugs.python.org/issue14004 8 msgs #14001: CVE-2012-0845 Python v2.7.2 / v3.2.2 (SimpleXMLRPCServer): DoS http://bugs.python.org/issue14001 7 msgs #14036: urlparse insufficient port property validation http://bugs.python.org/issue14036 7 msgs #13198: Remove duplicate definition of write_record_file http://bugs.python.org/issue13198 6 msgs #13579: string.Formatter doesn't understand the !a conversion specifie http://bugs.python.org/issue13579 6 msgs #13882: PEP 410: Use decimal.Decimal type for timestamps http://bugs.python.org/issue13882 6 msgs Issues closed (44) ================== #7644: bug in nntplib.body() method with possible fix http://bugs.python.org/issue7644 closed by pitrou #9750: sqlite3 iterdump fails on column with reserved name http://bugs.python.org/issue9750 closed by python-dev #10227: Improve performance of MemoryView slicing http://bugs.python.org/issue10227 closed by skrah #10287: NNTP authentication should check capabilities http://bugs.python.org/issue10287 closed by pitrou #11836: multiprocessing.queues.SimpleQueue is undocumented http://bugs.python.org/issue11836 closed by sandro.tosi #12297: Clarifications to atexit.register and unregister doc http://bugs.python.org/issue12297 closed by eric.araujo #13014: _ssl.c: refleak http://bugs.python.org/issue13014 closed by pitrou #13015: _collectionsmodule.c: refleak http://bugs.python.org/issue13015 closed by pitrou #13020: structseq.c: refleak http://bugs.python.org/issue13020 closed by pitrou #13089: parsetok.c: memory leak http://bugs.python.org/issue13089 closed by skrah #13092: pep-393: memory leaks #2 http://bugs.python.org/issue13092 closed by pitrou #13619: Add a new codec: "locale", the current locale encoding http://bugs.python.org/issue13619 closed by haypo #13878: test_sched failures on Windows buildbot http://bugs.python.org/issue13878 closed by neologix #13913: utf-8 or utf8 or utf-8 (codec display name inconsistency) http://bugs.python.org/issue13913 closed by haypo #13930: lib2to3 ability to output files into a different directory and http://bugs.python.org/issue13930 closed by gregory.p.smith #13948: rm needless use of set function http://bugs.python.org/issue13948 closed by eric.araujo #13949: rm needless use of pass statement http://bugs.python.org/issue13949 closed by eric.araujo #13950: rm commented-out code http://bugs.python.org/issue13950 closed by eric.araujo #13960: Handling of broken comments in HTMLParser http://bugs.python.org/issue13960 closed by ezio.melotti #13961: Have importlib use os.replace() http://bugs.python.org/issue13961 closed by brett.cannon #13972: set and frozenset constructors don't accept multiple iterables 
http://bugs.python.org/issue13972 closed by petri.lehtinen #13977: importlib simplification http://bugs.python.org/issue13977 closed by brett.cannon #13979: Automatic *libc.so loading behaviour http://bugs.python.org/issue13979 closed by meador.inge #13987: Handling of broken markup in HTMLParser on 2.7 http://bugs.python.org/issue13987 closed by ezio.melotti #13988: Expose the C implementation of ElementTree by default when imp http://bugs.python.org/issue13988 closed by flox #13991: namespace packages depending on order http://bugs.python.org/issue13991 closed by eric.araujo #13993: Handling of broken end tags in HTMLParser http://bugs.python.org/issue13993 closed by ezio.melotti #13994: incomplete revert in 2.7 Distutils left two copies of customiz http://bugs.python.org/issue13994 closed by ned.deily #13995: sqlite3 Cursor.rowcount documentation for old sqlite bug http://bugs.python.org/issue13995 closed by python-dev #13996: "What's New in Python" should have initial release date on hea http://bugs.python.org/issue13996 closed by rhettinger #14000: Subprocess stdin.flush does not flush http://bugs.python.org/issue14000 closed by rosslagerwall #14008: Python uses the new source when reporting an old exception http://bugs.python.org/issue14008 closed by flox #14016: Usage of socket.sendall() in multiple threads http://bugs.python.org/issue14016 closed by r.david.murray #14021: Write pkg_info with local encoding(GBK) will be a problem. http://bugs.python.org/issue14021 closed by eric.araujo #14022: bug in pkgutil.py with suggested fix http://bugs.python.org/issue14022 closed by ned.deily #14024: logging.Formatter Cache Prevents Exception Format Overriding http://bugs.python.org/issue14024 closed by vinay.sajip #14025: unittest.TestCase.assertEqual does not show diff when comparin http://bugs.python.org/issue14025 closed by michael.foord #14028: random.choice hits ValueError: cannot convert float NaN to int http://bugs.python.org/issue14028 closed by gregory.p.smith #14029: When using setattr identifiers can start with any character http://bugs.python.org/issue14029 closed by loewis #14031: logging module cannot format str.format log messages http://bugs.python.org/issue14031 closed by vinay.sajip #14033: distutils problem with setup.py build &setup.py install vs dir http://bugs.python.org/issue14033 closed by ??????.??? #14041: bsddb DB_RUNRECOVERY crash on write access http://bugs.python.org/issue14041 closed by jcea #1326113: Letting "build_ext --libraries" take more than one lib http://bugs.python.org/issue1326113 closed by eric.araujo #1051216: make distutils.core.run_setup re-entrant http://bugs.python.org/issue1051216 closed by eric.araujo From jimjjewett at gmail.com Fri Feb 17 18:42:40 2012 From: jimjjewett at gmail.com (Jim Jewett) Date: Fri, 17 Feb 2012 12:42:40 -0500 Subject: [Python-Dev] PEP for new dictionary implementation In-Reply-To: <4F3DF8A1.20708@v.loewis.de> References: <4f3d49d6.ec77ec0a.7988.ffffcebb@mx.google.com> <4F3D7648.6040600@v.loewis.de> <4F3DF8A1.20708@v.loewis.de> Message-ID: On Fri, Feb 17, 2012 at 1:50 AM, "Martin v. L?wis" wrote: >>> Good idea. However, how do you track per-dict how large the >>> table is? [Or, rather, what is the highest index needed to store any values that are actually set for this instance.] > To determine whether it needs to grow the array, it needs to find out > how large the array is, no? So: how do you do that? 
Ah, now I understand; you do need a single ssize_t either on the dict or at the head of the values array to indicate how many slots it has actually allocated. It *may* also be worthwhile to add a second ssize_t to indicate how many are currently in use, for faster results in case of len. But the dict is guaranteed to have at least one free slot, so that extra index will never make the allocation larger than the current code. -jJ From mark at hotpy.org Fri Feb 17 18:52:23 2012 From: mark at hotpy.org (Mark Shannon) Date: Fri, 17 Feb 2012 17:52:23 +0000 (GMT) Subject: [Python-Dev] PEP for new dictionary implementation In-Reply-To: References: <4f3d49d6.ec77ec0a.7988.ffffcebb@mx.google.com> <4F3D7648.6040600@v.loewis.de> <4F3DF8A1.20708@v.loewis.de> Message-ID: <1207577444.648837.1329501143303.JavaMail.open-xchange@email.1and1.co.uk> On 17 February 2012 at 17:42 Jim Jewett wrote: > On Fri, Feb 17, 2012 at 1:50 AM, "Martin v. L?wis" wrote: > >>> Good idea. However, how do you track per-dict how large the > >>> table is? > > [Or, rather, what is the highest index needed to store any values > that are actually set for this instance.] > > > To determine whether it needs to grow the array, it needs to find out > > how large the array is, no? So: how do you do that? > > Ah, now I understand; you do need a single ssize_t either on the dict > or at the head of the values array to indicate how many slots it has > actually allocated. It *may* also be worthwhile to add a second > ssize_t to indicate how many are currently in use, for faster results > in case of len. But the dict is guaranteed to have at least one free > slot, so that extra index will never make the allocation larger than > the current code. The dict already has a field indicating how many items are in use, the ma_used field. Cheers, Mark. From g.brandl at gmx.net Fri Feb 17 19:00:16 2012 From: g.brandl at gmx.net (Georg Brandl) Date: Fri, 17 Feb 2012 19:00:16 +0100 Subject: [Python-Dev] PEP 410 (Decimal timestamp): the implementation is ready for a review In-Reply-To: <4F3E1DBD.2080202@pearwood.info> References: <4f3bc4c3.a54ab60a.65e2.2f66@mx.google.com> <4F3C09A9.80009@v.loewis.de> <4F3CB87D.2060808@v.loewis.de> <4F3CD721.7080602@v.loewis.de> <4F3E1DBD.2080202@pearwood.info> Message-ID: Am 17.02.2012 10:28, schrieb Steven D'Aprano: > Georg Brandl wrote: >> Am 16.02.2012 11:14, schrieb "Martin v. L?wis": >>> Am 16.02.2012 10:51, schrieb Victor Stinner: >>>> 2012/2/16 "Martin v. L?wis" : >>>>>> Maybe an alternative PEP could be written that supports the filesystem >>>>>> copying use case only, using some specialized ns APIs? I really think >>>>>> that all you need is st_{a,c,m}time_ns fields and os.utime_ns(). >>>>> I'm -1 on that, because it will make people write complicated code. >>>> Python 3.3 *has already* APIs for nanosecond timestamps: >>>> os.utimensat(), os.futimens(), signal.sigtimedwait(), etc. These >>>> functions expect a (seconds: int, nanoseconds: int) tuple. >>> I'm -1 on adding these APIs, also. Since Python 3.3 is not released >>> yet, it's not too late to revert them. >> >> +1. > > Sorry, is that +1 on the revert, or +1 on the APIs? 
It's on what Martin said; you're right, it was a bit too ambiguous even
for a RM :)

Georg

From ncoghlan at gmail.com  Fri Feb 17 23:04:01 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 18 Feb 2012 08:04:01 +1000
Subject: [Python-Dev] folding cElementTree behind ElementTree in 3.3
In-Reply-To: <4F3D4B0F.5040605@gmail.com>
References: <4F3D3DA8.4010704@gmail.com> <20120216185554.6f890376@pitrou.net> <4F3D4B0F.5040605@gmail.com>
Message-ID: 

On Fri, Feb 17, 2012 at 4:29 AM, Ezio Melotti wrote:
> I'm assuming that eventually the module will be removed (maybe for Python
> 4?), and I don't expect nor want to see it removed in the near future.
> If something gets removed it should be deprecated first, and it's usually
> better to deprecate it sooner so that the developers have more time to
> update their code.

Not really - as soon as we programmatically deprecate something, it
means anyone with a strict warnings policy (or with customers that
have such a policy) has to update their code *now*. (Previously it was
even worse than that, which is why deprecation warnings are no longer
displayed by default).

For things that we have no intention of deprecating in 3.x, but will
likely ditch in a hypothetical future Python 4000, we'll almost
certainly do exactly what we did with Py3k: later in the 3.x series,
add a "-4" command line switch and a sys.py4kwarning flag to trigger
conditional deprecation warnings.

So, assuming things continue as they have for the first couple of
decades of Python's existence, we can probably start worrying about it
some time around 2020 :)

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ezio.melotti at gmail.com  Fri Feb 17 23:56:17 2012
From: ezio.melotti at gmail.com (Ezio Melotti)
Date: Sat, 18 Feb 2012 00:56:17 +0200
Subject: [Python-Dev] folding cElementTree behind ElementTree in 3.3
In-Reply-To: 
References: <4F3D3DA8.4010704@gmail.com> <20120216185554.6f890376@pitrou.net> <4F3D4B0F.5040605@gmail.com>
Message-ID: <4F3EDB11.3040404@gmail.com>

On 18/02/2012 0.04, Nick Coghlan wrote:
> On Fri, Feb 17, 2012 at 4:29 AM, Ezio Melotti wrote:
>> I'm assuming that eventually the module will be removed (maybe for Python
>> 4?), and I don't expect nor want to see it removed in the near future.
>> If something gets removed it should be deprecated first, and it's usually
>> better to deprecate it sooner so that the developers have more time to
>> update their code.
> Not really - as soon as we programmatically deprecate something, it
> means anyone with a strict warnings policy (or with customers that
> have such a policy) has to update their code *now*. (Previously it was
> even worse than that, which is why deprecation warnings are no longer
> displayed by default).

The ones with a strict warning policy should be ready to deal with this
situation. A possible solution (that I already proposed a while ago)
would be to reuse the 2to3 framework to provide fixers that could be
used for these "mechanical" updates between 3.x releases. For example I
wrote a 2to3 fixer to replace all the deprecated unittest methods
(fail*, some assert*) with the correct ones, but this can't be used to
fix them while moving from 3.1 to 3.2.

> For things that we have no intention of deprecating in 3.x, but will
> likely ditch in a hypothetical future Python 4000, we'll almost
> certainly do exactly what we did with Py3k: later in the 3.x series,
> add a "-4" command line switch and a sys.py4kwarning flag to trigger
> conditional deprecation warnings.
I think Guido mentioned somewhere that this hypothetical Python 4000
will most likely be backward compatible, so we would still need a
regular deprecation period.

> So, assuming things continue as they have for the first couple of
> decades of Python's existence, we can probably start worrying about it
> some time around 2020 :)

What bothers me most is that a valid mechanism to warn users who care
about things that will be removed is being hindered in several ways.
DeprecationWarnings were first silenced (and this is fine as long as
the developers are educated to enable warnings while testing), now
discouraged (because people are still able to make them visible and
also to turn them into errors), and on the tracker there's even a
discussion about making the deprecation notes in the doc less visible
(because the red boxes are too "scary").
See also http://mail.python.org/pipermail/python-dev/2011-October/114199.html

Best Regards,
Ezio Melotti

> Cheers,
> Nick.
>

From victor.stinner at gmail.com  Sat Feb 18 04:10:40 2012
From: victor.stinner at gmail.com (Victor Stinner)
Date: Sat, 18 Feb 2012 04:10:40 +0100
Subject: [Python-Dev] PEP 410, 3rd revision, Decimal timestamp
Message-ID: 

PEP: 410
Title: Use decimal.Decimal type for timestamps
Version: $Revision$
Last-Modified: $Date$
Author: Victor Stinner
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 01-February-2012
Python-Version: 3.3


Abstract
========

Decimal becomes the official type for high-resolution timestamps to make Python
support new functions using a nanosecond resolution without loss of precision.


Motivation
==========

Python 2.3 introduced float timestamps to support sub-second resolutions.
os.stat() uses float timestamps by default since Python 2.5. Python 3.3
introduced functions supporting nanosecond resolutions:

 * os module: futimens(), utimensat()
 * time module: clock_gettime(), clock_getres(), monotonic(), wallclock()

os.stat() reads nanosecond timestamps but returns timestamps as float.

The Python float type uses the binary64 format of the IEEE 754 standard. With a
resolution of one nanosecond (10\ :sup:`-9`), float timestamps lose precision
for values bigger than 2\ :sup:`24` seconds (194 days: 1970-07-14 for an Epoch
timestamp).

Nanosecond resolution is required to set the exact modification time on
filesystems supporting nanosecond timestamps (e.g. ext4, btrfs, NTFS, ...). It
also helps to compare the modification time to check if a file is newer than
another file. Use cases: copy the modification time of a file using
shutil.copystat(), create a TAR archive with the tarfile module, manage a
mailbox with the mailbox module, etc.

An arbitrary resolution is preferred over a fixed resolution (like nanosecond)
so that the API does not have to change when a better resolution is required.
For example, the NTP protocol uses fractions of 2\ :sup:`32` seconds
(approximately 2.3 × 10\ :sup:`-10` second), whereas the NTP protocol version 4
uses fractions of 2\ :sup:`64` seconds (5.4 × 10\ :sup:`-20` second).

.. note::
   With a resolution of 1 microsecond (10\ :sup:`-6`), float timestamps lose
   precision for values bigger than 2\ :sup:`33` seconds (272 years: 2242-03-16
   for an Epoch timestamp). With a resolution of 100 nanoseconds (10\ :sup:`-7`,
   resolution used on Windows), float timestamps lose precision for values
   bigger than 2\ :sup:`29` seconds (17 years: 1987-01-05 for an Epoch
   timestamp).
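The nanosecond case is easy to check in an interpreter session (a small
illustrative example, not part of the proposal)::

    >>> from decimal import Decimal
    >>> t = float(2**24)          # about 194 days after the Epoch
    >>> t + 1e-9 == t             # the nanosecond is rounded away in binary64
    True
    >>> Decimal(2**24) + Decimal('1E-9')
    Decimal('16777216.000000001')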
Specification
=============

Add decimal.Decimal as a new type for timestamps. Decimal supports any
timestamp resolution, supports arithmetic operations and is comparable. It is
possible to coerce a Decimal to float, even if the conversion may lose
precision. The clock resolution can also be stored in a Decimal object.

Add an optional *timestamp* argument to:

 * os module: fstat(), fstatat(), lstat(), stat() (st_atime, st_ctime and
   st_mtime fields of the stat structure), sched_rr_get_interval(), times(),
   wait3() and wait4()
 * resource module: ru_utime and ru_stime fields of getrusage()
 * signal module: getitimer(), setitimer()
 * time module: clock(), clock_gettime(), clock_getres(), monotonic(), time()
   and wallclock()

The *timestamp* argument value can be float or Decimal; float is still the
default for backward compatibility.

The following functions support Decimal as input:

 * datetime module: date.fromtimestamp(), datetime.fromtimestamp() and
   datetime.utcfromtimestamp()
 * os module: futimes(), futimesat(), lutimes(), utime()
 * select module: epoll.poll(), kqueue.control(), select()
 * signal module: setitimer(), sigtimedwait()
 * time module: ctime(), gmtime(), localtime(), sleep()

The os.stat_float_times() function is deprecated: use an explicit cast using
int() instead.

.. note::
   The decimal module is implemented in Python and is slower than float, but
   there is a new C implementation which is almost ready for inclusion in
   CPython.


Backwards Compatibility
=======================

The default timestamp type is unchanged, so there is no impact on backward
compatibility nor on performance. The new timestamp type, decimal.Decimal, is
only returned when requested explicitly.


Objection: clocks accuracy
==========================

Computer clocks and operating systems are inaccurate and fail to provide
nanosecond accuracy in practice. A nanosecond is what it takes to execute a
couple of CPU instructions. Even on a real-time operating system, a
nanosecond-precise measurement is already obsolete when it starts being
processed by the higher-level application. A single cache miss in the CPU will
make the precision worthless.

.. note::
   Linux *actually* is able to measure time in nanosecond precision, even
   though it is not able to keep its clock synchronized to UTC with a
   nanosecond accuracy.


Alternatives: Timestamp types
=============================

To support timestamps with an arbitrary or nanosecond resolution, the following
types have been considered:

 * number of nanoseconds
 * 128-bits float
 * decimal.Decimal
 * datetime.datetime
 * datetime.timedelta
 * tuple of integers
 * timespec structure

Criteria:

 * Doing arithmetic on timestamps must be possible
 * Timestamps must be comparable
 * An arbitrary resolution, or at least a resolution of one nanosecond without
   losing precision
 * It should be possible to coerce the new timestamp to float for backward
   compatibility

A resolution of one nanosecond is enough to support all current C functions.
The best resolution used by operating systems is one nanosecond. In practice,
most clock accuracy is closer to microseconds than nanoseconds. So it sounds
reasonable to use a fixed resolution of one nanosecond.


Number of nanoseconds (int)
---------------------------

A nanosecond resolution is enough for all current C functions and so a
timestamp can simply be a number of nanoseconds, an integer, not a float.
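A number of nanoseconds is easy to manipulate, but nothing in the type
distinguishes it from a number of seconds (the value below is illustrative)::

    >>> ns = 1329528910123456789    # nanoseconds... or a very distant second?
    >>> divmod(ns, 10**9)           # recover a (seconds, nanoseconds) pair
    (1329528910, 123456789)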
The number of nanoseconds format has been rejected because it would require
adding new specialized functions for this format: it is not possible to
differentiate a number of nanoseconds from a number of seconds just by checking
the object type.


128-bits float
--------------

Add a new IEEE 754-2008 quad-precision binary float type. The IEEE 754-2008
quad precision float has 1 sign bit, 15 bits of exponent and 112 bits of
mantissa. 128-bits float is supported by the GCC (4.3), Clang and ICC
compilers.

Python must be portable and so cannot rely on a type only available on some
platforms. For example, Visual C++ 2008 doesn't support 128-bits float, whereas
it is used to build the official Windows executables. Another example: GCC 4.3
does not support __float128 in 32-bit mode on x86 (but GCC 4.4 does).

There is also a license issue: GCC uses the MPFR library for 128-bits float, a
library distributed under the GNU LGPL license. This license is not compatible
with the Python license.

.. note::
   The x87 floating point unit of Intel CPUs supports 80-bit floats. This
   format is not supported by the SSE instruction set, which is now preferred
   over x87, especially on x86_64. Other CPU vendors don't support 80-bit
   float.


datetime.datetime
-----------------

The datetime.datetime type is the natural choice for a timestamp because it is
clear that this type contains a timestamp, whereas int, float and Decimal are
raw numbers. It is an absolute timestamp and so is well defined. It gives
direct access to the year, month, day, hours, minutes and seconds. It has
methods related to time like methods to format the timestamp as a string (e.g.
datetime.datetime.strftime).

The major issue is that, except for os.stat(), time.time() and
time.clock_gettime(time.CLOCK_REALTIME), all time functions have an
unspecified starting point and no timezone information, and so cannot be
converted to datetime.datetime.

datetime.datetime also has issues with timezones. For example, a datetime
object without a timezone (unaware) and a datetime with a timezone (aware)
cannot be compared. There is also an ordering issue with daylight saving time
(DST) during the duplicate hour when switching from DST back to normal time.

datetime.datetime has been rejected because it cannot be used for functions
using an unspecified starting point like os.times() or time.clock().

For time.time() and time.clock_gettime(time.CLOCK_REALTIME): it is already
possible to get the current time as a datetime.datetime object using::

    datetime.datetime.now(datetime.timezone.utc)

For os.stat(), it is simple to create a datetime.datetime object from a
decimal.Decimal timestamp in the UTC timezone::

    datetime.datetime.fromtimestamp(value, datetime.timezone.utc)

.. note::
   datetime.datetime only supports microsecond resolution, but can be enhanced
   to support nanosecond.


datetime.timedelta
------------------

datetime.timedelta is the natural choice for a relative timestamp because it is
clear that this type contains a timestamp, whereas int, float and Decimal are
raw numbers. It can be used with datetime.datetime to get an absolute timestamp
when the starting point is known.

datetime.timedelta has been rejected because it cannot be coerced to float and
has a fixed resolution. One new standard timestamp type is enough; Decimal is
preferred over datetime.timedelta. Converting a datetime.timedelta to float
requires an explicit call to the datetime.timedelta.total_seconds() method.

.. note::
   datetime.timedelta only supports microsecond resolution, but can be enhanced
   to support nanosecond.
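The coercion constraint mentioned above can be seen directly (values
illustrative)::

    >>> import datetime
    >>> td = datetime.timedelta(seconds=1329528910, microseconds=123456)
    >>> float(td)                   # no implicit coercion to float
    Traceback (most recent call last):
      ...
    TypeError: float() argument must be a string or a number
    >>> td.total_seconds()          # explicit conversion only
    1329528910.123456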
.. _tuple:

Tuple of integers
-----------------

To expose C functions in Python, a tuple of integers is the natural choice to
store a timestamp because the C language uses structures with integer fields
(e.g. timeval and timespec structures). Using only integers avoids the loss of
precision (Python supports integers of arbitrary length). Creating and parsing
a tuple of integers is simple and fast.

Depending on the exact format of the tuple, the precision can be arbitrary or
fixed. The precision can be chosen so that the loss of precision is smaller
than an arbitrary limit like one nanosecond.

Different formats have been proposed:

 * A: (numerator, denominator)

   * value = numerator / denominator
   * resolution = 1 / denominator
   * denominator > 0

 * B: (seconds, numerator, denominator)

   * value = seconds + numerator / denominator
   * resolution = 1 / denominator
   * 0 <= numerator < denominator
   * denominator > 0

 * C: (intpart, floatpart, base, exponent)

   * value = intpart + floatpart / base\ :sup:`exponent`
   * resolution = 1 / base\ :sup:`exponent`
   * 0 <= floatpart < base\ :sup:`exponent`
   * base > 0
   * exponent >= 0

 * D: (intpart, floatpart, exponent)

   * value = intpart + floatpart / 10\ :sup:`exponent`
   * resolution = 1 / 10\ :sup:`exponent`
   * 0 <= floatpart < 10\ :sup:`exponent`
   * exponent >= 0

 * E: (sec, nsec)

   * value = sec + nsec × 10\ :sup:`-9`
   * resolution = 10\ :sup:`-9` (nanosecond)
   * 0 <= nsec < 10\ :sup:`9`

All formats support an arbitrary resolution, except the format (E).

The format (D) may not be able to store the exact value (loss of precision) if
the clock frequency is arbitrary and cannot be expressed as a power of 10. The
format (C) has a similar issue, but in such a case, it is possible to use
base=frequency and exponent=1.

The formats (C), (D) and (E) allow optimization for conversion to float if the
base is 2, and to decimal.Decimal if the base is 10.

The format (A) is a simple fraction. It supports arbitrary precision, is simple
(only two fields), only requires a simple division to get the floating point
value, and is already used by float.as_integer_ratio().

To simplify the implementation (especially the C implementation to avoid
integer overflow), a numerator bigger than the denominator can be accepted.
The tuple may be normalized later.

Tuples of integers have been rejected because they don't support arithmetic
operations.

.. note::
   On Windows, the ``QueryPerformanceCounter()`` clock uses the frequency of
   the processor, which is an arbitrary number and so may not be a power of 2
   or 10. The frequency can be read using ``QueryPerformanceFrequency()``.
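For illustration, converting a format (A) fraction to the two supported result
types (the clock values below are hypothetical)::

    import decimal

    numerator, denominator = 1329528910123456789, 10**9
    lossy = numerator / denominator     # float: last nanosecond digits rounded
    exact = decimal.Decimal(numerator) / decimal.Decimal(denominator)
    assert exact == decimal.Decimal('1329528910.123456789')
    assert float(exact) == lossy        # coercion to float stays consistent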
timespec structure
------------------

timespec is the C structure used to store timestamps with a nanosecond
resolution. Python can use a type with the same structure: (seconds,
nanoseconds). For convenience, arithmetic operations on timespec are
supported.

Example of an incomplete timespec type supporting addition, subtraction and
coercion to float::

    class timespec(tuple):
        def __new__(cls, sec, nsec):
            if not isinstance(sec, int):
                raise TypeError
            if not isinstance(nsec, int):
                raise TypeError
            asec, nsec = divmod(nsec, 10 ** 9)
            sec += asec
            obj = tuple.__new__(cls, (sec, nsec))
            obj.sec = sec
            obj.nsec = nsec
            return obj

        def __float__(self):
            return self.sec + self.nsec * 1e-9

        def total_nanoseconds(self):
            return self.sec * 10 ** 9 + self.nsec

        def __add__(self, other):
            if not isinstance(other, timespec):
                raise TypeError
            ns_sum = self.total_nanoseconds() + other.total_nanoseconds()
            return timespec(*divmod(ns_sum, 10 ** 9))

        def __sub__(self, other):
            if not isinstance(other, timespec):
                raise TypeError
            ns_diff = self.total_nanoseconds() - other.total_nanoseconds()
            return timespec(*divmod(ns_diff, 10 ** 9))

        def __str__(self):
            if self.sec < 0 and self.nsec:
                sec = abs(1 + self.sec)
                nsec = 10 ** 9 - self.nsec
                return '-%i.%09u' % (sec, nsec)
            else:
                return '%i.%09u' % (self.sec, self.nsec)

        def __repr__(self):
            return '<timespec(%s, %s)>' % (self.sec, self.nsec)

The timespec type is similar to the format (E) tuple of integers, except
that it supports arithmetic and coercion to float.

The timespec type was rejected because it only supports nanosecond
resolution and requires implementing each arithmetic operation, whereas the
Decimal type is already implemented and well tested.

Alternatives: API design
========================

Add a string argument to specify the return type
------------------------------------------------

Add a string argument to functions returning timestamps, for example:
time.time(format="datetime"). A string is more extensible than a type: it is
possible to request a format that has no matching type, like a tuple of
integers.

This API was rejected because it would require implicitly importing modules
to instantiate objects (e.g. importing datetime to create datetime.datetime
objects). Importing a module may raise an exception and may be slow; such
behaviour is unexpected and surprising.

Add a global flag to change the timestamp type
----------------------------------------------

A global flag like os.stat_decimal_times(), similar to
os.stat_float_times(), can be added to set the timestamp type globally.

A global flag may cause issues with libraries and applications expecting
float instead of Decimal. Decimal is not fully compatible with float:
float+Decimal raises a TypeError, for example. The os.stat_float_times()
case is different because an int can be coerced to float, and int+float
gives float.

Add a protocol to create a timestamp
------------------------------------

Instead of hard-coding how timestamps are created, a new protocol can be
added to create a timestamp from a fraction.

For example, time.time(timestamp=type) would call the class method
type.__fromfraction__(numerator, denominator) to create a timestamp object
of the specified type. If the type doesn't support the protocol, a fallback
is used: type(numerator) / type(denominator).

A variant is to use a "converter" callback to create a timestamp. Example
creating a float timestamp::

    def timestamp_to_float(numerator, denominator):
        return float(numerator) / float(denominator)

Common converters can be provided by time, datetime and other modules, or
maybe a specific "hires" module. Users can define their own converters.
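A minimal sketch of how such a protocol could be dispatched (illustrative
only; the ``timestamp`` argument and ``__fromfraction__`` method are the
hypothetical names from the proposal above)::

    from decimal import Decimal

    def make_timestamp(timestamp_type, numerator, denominator):
        # what time.time(timestamp=type) could do internally
        fromfraction = getattr(timestamp_type, '__fromfraction__', None)
        if fromfraction is not None:
            return fromfraction(numerator, denominator)
        # fallback for types that don't support the protocol
        return timestamp_type(numerator) / timestamp_type(denominator)

    class DecimalTimestamp(Decimal):
        @classmethod
        def __fromfraction__(cls, numerator, denominator):
            return cls(Decimal(numerator) / Decimal(denominator))

    make_timestamp(float, 1329006975681211, 10 ** 6)
    # 1329006975.681211
    make_timestamp(DecimalTimestamp, 1329006975681211, 10 ** 6)
    # Decimal('1329006975.681211')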
Such a protocol has a limitation: the timestamp structure has to be decided
once and cannot be changed later. For example, adding a timezone or the
absolute start of the timestamp would break the API.

The protocol proposition was seen as being excessive given the requirements,
but the specific syntax proposed (time.time(timestamp=type)) would allow it
to be introduced later if compelling use cases are discovered.

.. note::
   Other formats may be used instead of a fraction: see the tuple of
   integers section for example.

Add new fields to os.stat
-------------------------

To get the creation, modification and access time of a file with a
nanosecond resolution, three fields can be added to the os.stat() structure.

The new fields can be timestamps with nanosecond resolution (e.g. Decimal)
or the nanosecond part of each timestamp (int).

If the new fields are timestamps with nanosecond resolution, populating the
extra fields would be time-consuming. Any call to os.stat() would be slower,
even if os.stat() is only called to check if a file exists. A parameter can
be added to os.stat() to make these fields optional; the structure would
then have a variable number of fields.

If the new fields only contain the fractional part (nanoseconds), os.stat()
would remain efficient. These fields would always be present, and so would
be set to zero if the operating system does not support sub-second
resolution. Splitting a timestamp into two parts, seconds and nanoseconds,
is similar to the timespec type and the tuple of integers, and so has the
same drawbacks.

Adding new fields to the os.stat() structure does not solve the nanosecond
issue in other modules (e.g. the time module).
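For illustration, the "nanosecond part as an int" variant could look like
this to user code; the ``st_mtime_nsec`` field name is hypothetical::

    from decimal import Decimal
    import os

    st = os.stat('some_file')
    # seconds as today, plus a hypothetical field holding only the
    # nanosecond (fractional) part of the modification time
    mtime = (Decimal(int(st.st_mtime))
             + Decimal(st.st_mtime_nsec) * Decimal(10) ** -9)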
Add a boolean argument
----------------------

Because we only need one new type (Decimal), a simple boolean flag can be
added. Example: time.time(decimal=True) or time.time(hires=True).

Such a flag would require a hidden import, which is considered bad practice.

The boolean argument API was rejected because it is not "pythonic". Changing
the return type depending on a parameter value is preferred over a boolean
parameter (a flag).

Add new functions
-----------------

Add new functions for each type, examples:

* time.clock_decimal()
* time.time_decimal()
* os.stat_decimal()
* os.stat_timespec()
* etc.

Adding a new function for each function creating timestamps duplicates a lot
of code and would be a pain to maintain.

Add a new hires module
----------------------

Add a new module called "hires" with the same API as the time module, except
that it would return timestamps with high resolution, e.g. decimal.Decimal.

Adding a new module avoids linking low-level modules like time or os to the
decimal module.

This idea was rejected because it would require duplicating most of the code
of the time module, would be a pain to maintain, and timestamps are used in
modules other than the time module. Examples: signal.sigtimedwait(),
select.select(), resource.getrusage(), os.stat(), etc. Duplicating the code
in each module is not acceptable.

Links
=====

Python:

* `Issue #7652: Merge C version of decimal into py3k
  <http://bugs.python.org/issue7652>`_ (cdecimal)
* `Issue #11457: os.stat(): add new fields to get timestamps as Decimal
  objects with nanosecond resolution <http://bugs.python.org/issue11457>`_
* `Issue #13882: PEP 410: Use decimal.Decimal type for timestamps
  <http://bugs.python.org/issue13882>`_
* `[Python-Dev] Store timestamps as decimal.Decimal objects `_

Other languages:

* Ruby (1.9.3), the `Time class `_ supports picosecond (10\ :sup:`-12`)
  resolution
* .NET framework, `DateTime type `_: the number of 100-nanosecond intervals
  that have elapsed since 12:00:00 midnight, January 1, 0001.
  DateTime.Ticks uses a signed 64-bit integer.
* Java (1.5), `System.nanoTime() `_: wall clock with an unspecified starting
  point, as a number of nanoseconds; uses a signed 64-bit integer (long).
* Perl, the `Time::HiRes module `_: uses floats, so it has the same
  loss-of-precision issue with nanosecond resolution as Python float
  timestamps.

Copyright
=========

This document has been placed in the public domain.

From victor.stinner at gmail.com Sat Feb 18 04:22:30 2012
From: victor.stinner at gmail.com (Victor Stinner)
Date: Sat, 18 Feb 2012 04:22:30 +0100
Subject: [Python-Dev] PEP 410, 3rd revision, Decimal timestamp
In-Reply-To: References: Message-ID:

As asked by Martin, I tried to list *all* objections and alternatives.

> * A: (numerator, denominator)
>
>   * value = numerator / denominator
>   * resolution = 1 / denominator
>   * denominator > 0
> (...)
> Tuples of integers have been rejected because they don't support
> arithmetic operations.

Oh, after writing the 3rd version of this PEP, I realized that
fractions.Fraction is very close to this format, except that it can be
coerced to float, and arithmetic on Fraction and float is allowed
(returning float). My implementation of the PEP implements something like
Fraction in C, but something more specific to timestamps (e.g. without
arithmetic).

I don't know yet if Fraction is better or worse than Decimal. I see at
least one drawback, str(Fraction): 5576475333606653/4194304 is less
readable than 1329535325.43341.
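A quick sketch reproducing that readability difference (illustrative, with
the same float timestamp):

    from fractions import Fraction
    from decimal import Decimal

    t = 1329535325.43341                 # a float timestamp
    Fraction(*t.as_integer_ratio())      # e.g. Fraction(5576475333606653, 4194304)
    Decimal(str(t))                      # Decimal('1329535325.43341')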
> * Ruby (1.9.3), the `Time class `_
>   supports picosecond (10\ :sup:`-12`)

We must do better than Ruby: support arbitrary precision! :-D

Victor

From skippy.hammond at gmail.com Sat Feb 18 05:51:31 2012
From: skippy.hammond at gmail.com (Mark Hammond)
Date: Sat, 18 Feb 2012 15:51:31 +1100
Subject: [Python-Dev] dll name for embedding?
In-Reply-To: <4F3E1359.7030703@googlemail.com>
References: <4F3E1359.7030703@googlemail.com>
Message-ID: <4F3F2E53.9050303@gmail.com>

On 17/02/2012 7:44 PM, Egon Smiwa wrote:
> Hi all,
> I'm an app developer with a CPython dll in the folder of that app.
> In general, are there strict requirements about the dll name
> (a preference would be "python.dll" (easy to update (simple replace) ).
> I successfully used "python.dll" and a few standard modules,
> then I tried to use the sympy library and its import fails with an
> AV exception, unless I rename the dll back to the original "python32.dll"
> Is there an intrinsic filename requirement inside the CPython dll, modules,
> or are name-restrictions to be presumed only in case of third-party libs?

Note that this is off-topic for python-dev, which is for the development of
Python - python-list would be a better choice.

But the short story is that given Python extensions have a link-time
dependency on the core Python DLL, it isn't possible to rename the DLL
without breaking all extensions built against the original name - this is
just how link-time dependencies work on Windows. You may also find
http://www.python.org/dev/peps/pep-0384 of interest, but this still
includes the major version in the DLL name and also depends on the authors
of the extensions you want to use opting in.

As mentioned above, please follow up on python-list.

Cheers,

Mark.

From skippy.hammond at gmail.com Sat Feb 18 06:24:15 2012
From: skippy.hammond at gmail.com (Mark Hammond)
Date: Sat, 18 Feb 2012 16:24:15 +1100
Subject: [Python-Dev] Status of PEP 397 - Python launcher for Windows
Message-ID: <4F3F35FF.1010506@gmail.com>

I'm wondering what thoughts are on PEP 397, the Python launcher for
Windows. I've been using the implementation for a number of months now and
I find it incredibly useful.

To my mind, the specific steps would be:

* Have someone pronounce it as accepted (or suggest steps to be taken
before such a pronouncement). I can't recall the current process - does
Guido have to pronounce personally or formally delegate to a czar?

* Move the source into the Python tree and update the build process.

* Arrange for it to be installed with the next release of 3.2 and all
future versions - I'm happy to try and help with that, but will probably
need some help from Martin.

* Write some user-oriented docs.

Thoughts or comments?

Mark

From brian at python.org Sat Feb 18 06:37:01 2012
From: brian at python.org (Brian Curtin)
Date: Fri, 17 Feb 2012 23:37:01 -0600
Subject: [Python-Dev] Status of PEP 397 - Python launcher for Windows
In-Reply-To: <4F3F35FF.1010506@gmail.com>
References: <4F3F35FF.1010506@gmail.com>
Message-ID:

On Fri, Feb 17, 2012 at 23:24, Mark Hammond wrote:
> I'm wondering what thoughts are on PEP 397, the Python launcher for Windows.
> I've been using the implementation for a number of months now and I find it
> incredibly useful.
>
> To my mind, the specific steps would be:
>
> * Arrange for it to be installed with the next release of 3.2 and all future
> versions - I'm happy to try and help with that, but will probably need some
> help from Martin.

I've been doing some installer work lately and would be willing to help out
if I can.

> Thoughts or comments?

Will you be at PyCon, specifically at the language summit? I proposed a
side-track to discuss this PEP, and I say side-track since a great majority
of the group are not Windows users, so I don't think it's a topic to bring
before the entire group.
I've been using the implementation for a number of months
> now and I find it incredibly useful.

I wonder what the rationale for the PEP (as opposed to the rationale for
the launcher) is - why do you need to have a PEP for it? As written, it
specifies some "guidelines" that some software package of yours might
adhere to. You don't need a PEP for that, just write the software and
adhere to the guidelines, possibly putting them into the documentation.

A PEP needs to have controversial issues, or else there would not have been
a point in writing it in the first place. Also, it needs to concern
CPython, or the Python language, else it does not need to be a *P*EP.

To be a proper PEP, you need to include these things:
- what is the action that you want to see taken?
- what is the Python version (or versions) that you want to see the action
taken for?
- what alternative actions have been proposed, and what are (in your
opinion, and the opinion of readers) pros and cons of each action?

Assuming you are proposing some future action for CPython, I'm opposed to
the notion that the implementation of the launcher is the specification.
The specification needs to be in the PEP. It may leave room, in which case
the remaining details need to be specified in the documentation.

A critical question (IMO) is the question of how the launcher gets onto
systems. Will people have to download and install it themselves, or will it
come as part of some Python distribution? If it comes with the Python
distribution, how do multiple copies of the launcher get coordinated?

Also: what's the name of the launcher? How can I actually use it?

Regards,
Martin

From vahid_male1384 at yahoo.com Sat Feb 18 18:15:53 2012
From: vahid_male1384 at yahoo.com (Vahid Ghaderi)
Date: Sat, 18 Feb 2012 09:15:53 -0800 (PST)
Subject: [Python-Dev] problem after installing python 3.2.2
Message-ID: <1329585353.30293.YahooMailClassic@web161302.mail.bf1.yahoo.com>

hi, i have downloaded and installed python 3.2.2 but still when i use
python in the terminal it shows:
root at debian:~# python
Python 2.6.6 (r266:84292, Dec 27 2010, 00:02:40)
[GCC 4.4.5] on linux2
Type "help", "copyright", "credits" or "license" for more information.
how can i change the default to python 3.2.2?
Thanks.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From sandro.tosi at gmail.com Sat Feb 18 18:21:06 2012
From: sandro.tosi at gmail.com (Sandro Tosi)
Date: Sat, 18 Feb 2012 18:21:06 +0100
Subject: [Python-Dev] problem after installing python 3.2.2
In-Reply-To: <1329585353.30293.YahooMailClassic@web161302.mail.bf1.yahoo.com>
References: <1329585353.30293.YahooMailClassic@web161302.mail.bf1.yahoo.com>
Message-ID:

Hello Vahid,
I'm sorry, but this mailing list is not the right place to ask such a
question; I suggest getting in touch with
http://mail.python.org/mailman/listinfo/python-list for support.

On Sat, Feb 18, 2012 at 18:15, Vahid Ghaderi wrote:
> i have downloaded and installed python 3.2.2 but still when i use python
> in the terminal it shows:
> root at debian:~# python

you're using the system 'python' here, not the newly installed one, which
probably landed in /usr/local (or wherever you installed it).

> Python 2.6.6 (r266:84292, Dec 27 2010, 00:02:40)
> [GCC 4.4.5] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
> how can i change the default to python 3.2.2?
from a Debian Developer perspective, I'd suggest you not switch the Debian
default interpreter to python3.2, since it will make several system
tools/Debian packages fail. If you need 3.2 explicitly, state it in the
shebang or call the script with python3.2 explicitly.

Regards,
--
Sandro Tosi (aka morph, morpheus, matrixhasu)
My website: http://matrixhasu.altervista.org/
Me at Debian: http://wiki.debian.org/SandroTosi

From greg.ewing at canterbury.ac.nz Sun Feb 19 01:09:28 2012
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sun, 19 Feb 2012 13:09:28 +1300
Subject: [Python-Dev] PEP 410 (Decimal timestamp): the implementation is ready for a review
In-Reply-To: References: <4f3bc4c3.a54ab60a.65e2.2f66@mx.google.com> <4F3C77C1.7070706@hastings.org> <20120216135641.5ef37c64@pitrou.net> <1329398799.3407.5.camel@localhost.localdomain>
Message-ID: <4F403DB8.1060101@canterbury.ac.nz>

Guido van Rossum wrote:
> if there is an *actual*
> causal link between file A and B, the difference in timestamps should
> always be much larger than 100 ns.

And if there isn't a causal link, simultaneity is relative anyway. To Fred
sitting at his computer, file A might have been created before file B, but
to George running from the other end of the building in response to an
urgent bug report, it could be the other way around.

So to be *really* accurate, the API needs a way for the caller to indicate
a frame of reference.

--
Greg

From ben+python at benfinney.id.au Sun Feb 19 01:40:51 2012
From: ben+python at benfinney.id.au (Ben Finney)
Date: Sun, 19 Feb 2012 11:40:51 +1100
Subject: [Python-Dev] PEP 410 (Decimal timestamp): the implementation is ready for a review
References: <4f3bc4c3.a54ab60a.65e2.2f66@mx.google.com> <4F3C77C1.7070706@hastings.org> <20120216135641.5ef37c64@pitrou.net> <1329398799.3407.5.camel@localhost.localdomain> <4F403DB8.1060101@canterbury.ac.nz>
Message-ID: <87r4xrwt7w.fsf@benfinney.id.au>

Greg Ewing writes:

> Guido van Rossum wrote:
> > if there is an *actual* causal link between file A and B, the
> > difference in timestamps should always be much larger than 100 ns.
>
> And if there isn't a causal link, simultaneity is relative anyway. To
> Fred sitting at his computer, file A might have been created before
> file B, but to George running from the other end of the building in
> response to an urgent bug report, it could be the other way around.

Does that change if Fred and George are separated in the building by
twenty floors?

--
 \       "Kill myself? Killing myself is the last thing I'd ever do." |
  `\                                          --Homer, _The Simpsons_ |
_o__)                                                                 |
Ben Finney

From skippy.hammond at gmail.com Sun Feb 19 04:08:06 2012
From: skippy.hammond at gmail.com (Mark Hammond)
Date: Sun, 19 Feb 2012 14:08:06 +1100
Subject: [Python-Dev] Status of PEP 397 - Python launcher for Windows
In-Reply-To: <20120218130821.Horde.udcfeFNNcXdPP5S1S4Syu-A@webmail.df.eu>
References: <4F3F35FF.1010506@gmail.com> <20120218130821.Horde.udcfeFNNcXdPP5S1S4Syu-A@webmail.df.eu>
Message-ID: <4F406796.7050007@gmail.com>

On 18/02/2012 11:08 PM, martin at v.loewis.de wrote:
>
> Zitat von Mark Hammond :
>
>> I'm wondering what thoughts are on PEP 397, the Python launcher for
>> Windows. I've been using the implementation for a number of months now
>> and I find it incredibly useful.
>
> I wonder what the rationale for the PEP (as opposed to the rationale
> for the launcher) is - why do you need to have a PEP for it? As
> written, it specifies some "guidelines" that some software package
> of yours might adhere to.
You don't need a PEP for that, just write
> the software and adhere to the guidelines, possibly putting them into
> the documentation.
>
> A PEP needs to have controversial issues, or else there would not
> have been a point in writing it in the first place. Also, it needs
> to concern CPython, or the Python language, else it does not need to
> be a *P*EP.

The launcher was slightly controversial when the PEP was initially written
12 months ago. If you believe the creation of the PEP was procedurally
incorrect I'm happy to withdraw it - obviously I just want the launcher,
with or without a PEP. Alternatively, if you think the format of the PEP
needs to change before it can be accepted, then I'm happy to do that too if
you can be very specific about what you want changed. If you mean something
else entirely then please be very specific - I admit I'm not clear on the
point of your message at all.

>
> To be a proper PEP, you need to include these things:
> - what is the action that you want to see taken?
> - what is the Python version (or versions) that you
> want to see the action taken for?
> - what alternative actions have been proposed, and what
> are (in your opinion, and the opinion of readers) pros
> and cons of each action?
>
> Assuming you are proposing some future action for CPython,
> I'm opposed to the notion that the implementation of the
> launcher is the specification. The specification needs to be
> in the PEP. It may leave room, in which case the remaining
> details need to be specified in the documentation.

I'm really not sure what you are trying to say here. That the PEP should
remove all references to an implementation specification, or that the PEP
simply should be withdrawn? As above, I don't care - I just want the
launcher with the least amount of bureaucracy possible.

> A critical question (IMO) is the question of how the launcher
> gets onto systems. Will people have to download and install
> it themselves, or will it come as part of some Python
> distribution?

This is addressed in the PEP: "The launcher will be distributed with all
future versions of Python ..."

> If it comes with the Python distribution,
> how do multiple copies of the launcher get coordinated?

This may not be specified as well as it could, but: "Future versions of the
launcher should remain backwards compatible with older versions, so later
versions of Python can install an updated version of the launcher without
impacting how the previously installed version of the launcher is used."

> Also: what's the name of the launcher? How can I actually use
> it?

This too is there: "The console launcher will be named 'py.exe' and the
Windows one named 'pyw.exe'" and there is discussion of the command-line
args.

Mark

From ncoghlan at gmail.com Sun Feb 19 04:18:09 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 19 Feb 2012 13:18:09 +1000
Subject: [Python-Dev] Status of PEP 397 - Python launcher for Windows
In-Reply-To: <4F406796.7050007@gmail.com>
References: <4F3F35FF.1010506@gmail.com> <20120218130821.Horde.udcfeFNNcXdPP5S1S4Syu-A@webmail.df.eu> <4F406796.7050007@gmail.com>
Message-ID:

On Sun, Feb 19, 2012 at 1:08 PM, Mark Hammond wrote:
> The launcher was slightly controversial when the PEP was initially written
> 12 months ago. If you believe the creation of the PEP was procedurally
> incorrect I'm happy to withdraw it - obviously I just want the launcher,
> with or without a PEP.
Alternatively, if you think the format of the PEP
> needs to change before it can be accepted, then I'm happy to do that too if
> you can be very specific about what you want changed. If you mean something
> else entirely then please be very specific - I admit I'm not clear on the
> point of your message at all.

I think the PEP is appropriate, but some of the details that are currently
embedded in the prose should be extracted out to a clear "specification"
section:

- two launcher binaries (one for .py files, one for .pyw) will be added to
the system PATH
- the launcher will be shipped as part of the default CPython Windows
installers (starting with Python 3.3)
- the launcher will handle launching both Python 2 and Python 3 scripts
- the launcher will be overwritten when upgrading CPython

As a practical matter, it *may* be worth having the launcher available as
an independent installer that just gets bundled with the CPython one, but
that shouldn't be a requirement in the PEP.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From v+python at g.nevcal.com Sun Feb 19 04:35:35 2012
From: v+python at g.nevcal.com (Glenn Linderman)
Date: Sat, 18 Feb 2012 19:35:35 -0800
Subject: [Python-Dev] Status of PEP 397 - Python launcher for Windows
In-Reply-To: <4F3F35FF.1010506@gmail.com>
References: <4F3F35FF.1010506@gmail.com>
Message-ID: <4F406E07.4010702@g.nevcal.com>

On 2/17/2012 9:24 PM, Mark Hammond wrote:
> I've been using the implementation for a number of months now and I
> find it incredibly useful.

+1
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From ejjyrex at gmail.com Sun Feb 19 05:54:28 2012
From: ejjyrex at gmail.com (Ejaj Hassan)
Date: Sun, 19 Feb 2012 10:24:28 +0530
Subject: [Python-Dev] compiling cpython in visual studio 2010
Message-ID:

Hello everyone,
I am trying to work on Python bugs following the tutorial given on the
Python website. I have installed TortoiseSVN and Visual Studio 2010, and I
cloned a copy of cpython as advised on the website; however, I am having
some problems compiling it using Visual Studio 2010. I request someone to
kindly help me understand the full steps for solving bugs.
With regards,
Ejaj Hassan

From tjreedy at udel.edu Sun Feb 19 06:36:54 2012
From: tjreedy at udel.edu (Terry Reedy)
Date: Sun, 19 Feb 2012 00:36:54 -0500
Subject: [Python-Dev] compiling cpython in visual studio 2010
In-Reply-To: References: Message-ID:

On 2/18/2012 11:54 PM, Ejaj Hassan wrote:
> Hello everyone,
> I am trying to work on Python bugs following the tutorial given on the
> Python website. I have installed TortoiseSVN and Visual Studio 2010, and I
> cloned a copy of cpython as advised on the website; however, I am having
> some problems compiling it using Visual Studio 2010.

As the devguide says, you need VS2008 or the C++ Express edition. 3.3 may
be released compiled with 2010 (that is being worked on) but I believe 2008
will still be needed for 2.7.
--
Terry Jan Reedy

From martin at v.loewis.de Sun Feb 19 09:41:32 2012
From: martin at v.loewis.de (martin at v.loewis.de)
Date: Sun, 19 Feb 2012 09:41:32 +0100
Subject: [Python-Dev] Status of PEP 397 - Python launcher for Windows
In-Reply-To: <4F406796.7050007@gmail.com>
References: <4F3F35FF.1010506@gmail.com> <20120218130821.Horde.udcfeFNNcXdPP5S1S4Syu-A@webmail.df.eu> <4F406796.7050007@gmail.com>
Message-ID: <20120219094132.Horde.QohsX8L8999PQLW8Aj5ExSA@webmail.df.eu>

> The launcher was slightly controversial when the PEP was initially
> written 12 months ago.

So what were the objections?

>> Assuming you are proposing some future action for CPython,
>> I'm opposed to the notion that the implementation of the
>> launcher is the specification. The specification needs to be
>> in the PEP. It may leave room, in which case the remaining
>> details need to be specified in the documentation.
>
> I'm really not sure what you are trying to say here.

Let me try again: I dislike the phrase "written in C, which defines the
detailed implementation". That means that in order to find out what the
launcher does, you have to read its source code. I also dislike the phrase
"but instead to offer guidelines the launcher should adhere to"; the PEP
should not just be guidelines, but a clear, prescriptive specification.

I admit that I had difficulty finding the places in the PEP where it
specifies things, as opposed to explaining things. It seems that all of the
sections

- An overview of the launcher.
- Guidelines for a Python launcher.
- Shebang line parsing
- Virtual commands in shebang lines:
- Customized Commands:
- Python Version Qualifiers
- Command-line handling
- Process Launching

are specification, so it may help to group them as subsections of a
top-level heading "Specification". OTOH, "Process Launching" has 4
paragraphs of discussion, then two sentences of specification, then 1.5
sentences of discussion. I wish it were easier to find out what the PEP
actually says.

> That the PEP should remove all references to an implementation
> specification, or that the PEP simply should be withdrawn?

Having references to the implementation is fine; saying that you have to
read the code to understand what it does, and that the code takes
precedence over the PEP is not.

>> If it comes with the Python distribution,
>> how do multiple copies of the launcher get coordinated?
>
> This may not be specified as well as it could, but: "Future versions
> of the launcher should remain backwards compatible with older
> versions, so later versions of Python can install an updated version
> of the launcher without impacting how the previously installed
> version of the launcher is used."

That's not really my concern. I didn't originally find the place where it
said that the launcher goes into the Windows directory. Now that I see it:
how do you prevent deinstallation of py.exe when some version of Python is
uninstalled, but other versions remain?

Regards,
Martin

From p.f.moore at gmail.com Sun Feb 19 10:03:31 2012
From: p.f.moore at gmail.com (Paul Moore)
Date: Sun, 19 Feb 2012 09:03:31 +0000
Subject: [Python-Dev] Status of PEP 397 - Python launcher for Windows
In-Reply-To: <4F4067E4.3040903@skippinet.com.au>
References: <4F3F35FF.1010506@gmail.com> <4F4067E4.3040903@skippinet.com.au>
Message-ID:

On 19 February 2012 03:09, Mark Hammond wrote:
> Thanks for the note Paul, but did you also mean to CC python-dev?

Yes, I did, sorry.
>
> On 18/02/2012 9:15 PM, Paul Moore wrote:
>> On 18 February 2012 05:24, Mark Hammond wrote:
>>> I'm wondering what thoughts are on PEP 397, the Python launcher for
>>> Windows.
>>> I've been using the implementation for a number of months now and I find
>>> it
>>> incredibly useful.
>>
>> I use it all the time. It's extremely useful, and I wouldn't be without
>> it.
>>
>> [OT: One interesting property I find useful - if you put an alias
>> "vpython=python.exe" in the ini file (no path to Python) then
>> #!vpython picks up whatever Python is on PATH - this can be very
>> useful if you use virtualenvs a lot and want to run a script with the
>> current virtualenv]
>>
>>> Thoughts or comments?
>>
>>
>> IIRC, one question was how to manage multiple installs - if I install
>> Python 3.3 and 3.4, both of which install py.exe, and then uninstall
>> one, what happens to py.exe. What if I then uninstall the second?
>> Reference count the launcher?
>>
>> If it were possible to package up the launcher installer with the
>> Python installer, but then install it as a separate item, that might
>> be best (it's what MS products seem to do a lot - install one, get ten
>> extra - I don't like the way MS do it, but it seems appropriate here).
>> This would also allow python.org to host a standalone version of the
>> installer which could be installed seamlessly for older versions.
>>
>> Paul
>
>

From nad at acm.org Sun Feb 19 10:29:50 2012
From: nad at acm.org (Ned Deily)
Date: Sun, 19 Feb 2012 10:29:50 +0100
Subject: [Python-Dev] PEP 394 request for pronouncement (python2 symlink in *nix systems)
References: <4F37FD96.2010603@v.loewis.de> <20120212203043.GA10257@cskk.homeip.net> <4F38244D.1000908@v.loewis.de> <20120213170845.3ee5d4b4@resist.wooz.org> <20120214094435.745d06e6@limelight.wooz.org> <4F3CD403.7070102@v.loewis.de>
Message-ID:

In article <4F3CD403.7070102 at v.loewis.de>, "Martin v. Lowis" wrote:
> > There are two issues that I know of for OS X. One is just getting a
> > python2 symlink into the bin directory of a framework build. That's
> > easy.
>
> Where exactly in the Makefile is that reflected? ISTM that the current
> patch already covers that, since the framework* targets are not concerned
> with the bin directory.

When a framework build is enabled in configure, several additional targets
from Mac/Makefile are called from the main Makefile. The creation of the
links in the framework bin directory and in $prefix/bin (/usr/local/bin)
are handled there. (See the checked-in patch for 2.7 for gory details:
http://hg.python.org/cpython/rev/499796937b7a)

> > The other is managing symlinks (python, python2, and python3)
> > across framework bin directories; currently there's no infrastructure
> > for that. That part will probably have to wait until PyCon.
>
> What is the "framework bin directory"? The links are proposed for
> /usr/local/bin resp. /usr/bin. The proposed patch already manages
> these links across releases (the most recent install wins).

The framework bin directory is a bin directory within a framework. The
default location for 2.7 is:

/Library/Frameworks/Python.framework/Versions/2.7/bin

This is where the python executable, aux programs like idle, 2to3, pydoc,
python-config, as well as all Distutils-installed scripts go. Mac/Makefile
and the Mac installer each optionally create symlinks from /usr/local/bin
(default) to the items in the framework bin directory at build or install
time for the standard items but not for subsequent Distutils-installed
scripts.
Normally, the /usr/local/bin links are not needed with framework builds as
the framework bin directory is added to the user's $PATH during
installation.

> If you are concerned about multiple feature releases: this is not an
> issue, since the links are just proposed for Python 2.7 (distributions
> may also add them for 2.6 and earlier, but we are not going to make
> a release in that direction).

It is more of an issue for multiple Python 3 versions. But the whole
mechanism of managing multiple framework versions (2 and/or 3) is messy
right now. But that's a separate topic that I plan to address later. For
now, I believe all that is needed for PEP 394 is now checked in.

--
Ned Deily, nad at acm.org

From solipsis at pitrou.net Sun Feb 19 13:08:11 2012
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Sun, 19 Feb 2012 13:08:11 +0100
Subject: [Python-Dev] compiling cpython in visual studio 2010
References: Message-ID: <20120219130811.24db9654@pitrou.net>

Hello,

> I am trying to work on Python bugs following the tutorial given on the
> Python website. I have installed TortoiseSVN and Visual Studio 2010, and I
> cloned a copy of cpython as advised on the website; however, I am having
> some problems compiling it using Visual Studio 2010.

Which tutorial have you been reading? I'm not sure it's a typo in your
e-mail, but you should be installing TortoiseHg, not TortoiseSVN. There's
more information about that in the devguide:
http://docs.python.org/devguide/index.html

Regards
Antoine.

From solipsis at pitrou.net Sun Feb 19 13:11:22 2012
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Sun, 19 Feb 2012 13:11:22 +0100
Subject: [Python-Dev] cpython: allow arbitrary attributes on classmethod and staticmethod (closes #14051)
References: Message-ID: <20120219131122.3f4ced58@pitrou.net>

Hi,

> +static PyObject *
> +cm_get___dict__(classmethod *cm, void *closure)
> +{
> + Py_INCREF(cm->cm_dict);
> + return cm->cm_dict;
> +}

>>> def f(): pass
...
>>> cm = classmethod(f)
>>> cm.__dict__
Erreur de segmentation (segmentation fault)

Regards
Antoine.

From solipsis at pitrou.net Sun Feb 19 13:21:11 2012
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Sun, 19 Feb 2012 13:21:11 +0100
Subject: [Python-Dev] cpython: allow arbitrary attributes on classmethod and staticmethod (closes #14051)
References: <20120219131122.3f4ced58@pitrou.net>
Message-ID: <20120219132111.311ea8d4@pitrou.net>

That's in reply to Benjamin's f46deae68e34, by the way.

Regards
Antoine.

On Sun, 19 Feb 2012 13:11:22 +0100 Antoine Pitrou wrote:
>
> Hi,
>
> > +static PyObject *
> > +cm_get___dict__(classmethod *cm, void *closure)
> > +{
> > + Py_INCREF(cm->cm_dict);
> > + return cm->cm_dict;
> > +}
>
> >>> def f(): pass
> ...
> >>> cm = classmethod(f)
> >>> cm.__dict__
> Erreur de segmentation (segmentation fault)
>
>
> Regards
>
> Antoine.
>
>

From stefan_ml at behnel.de Sun Feb 19 14:04:37 2012
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Sun, 19 Feb 2012 14:04:37 +0100
Subject: [Python-Dev] C-API functions for reading/writing tstate->exc_* ?
Message-ID:

Hi,

the Cython and PyPy projects are currently working on getting
Cython-implemented extensions to build and run in PyPy, interfacing at the
C-API level for now. One problem we encountered was that there is currently
no "abstract" way to access tstate->exc_type and friends, i.e. the last
exception that was caught, a.k.a. sys.exc_info(). Apparently, PyPy stores
them at a frame level, whereas CPython makes them available in thread local
storage as bare struct fields.
Even if PyPy put them there on request, any changes to these fields would pass unnoticed. And Cython needs to set them in order to properly implement the semantics of a try-except statement (and in some other places where exception state is scoped). When compiling for PyPy, Cython therefore needs a way to tell PyPy about any changes. For the tstate->curexc_* fields, there are the two functions PyErr_Fetch() and PyErr_Restore(). Could we have two similar "official" functions for the exc_* fields? Maybe PyErr_FetchLast() and PyErr_RestoreLast()? Note that Cython would not have a reason to actually use them in CPython, and it should be uncommon for non-Cython extension modules to care about the exc_* fields at all. So these functions won't be of much use if actually implemented in CPython (although I wouldn't mind doing that). The question is just if we could have two officially named functions that PyPy (and maybe other Pythons) could implement in order to access the last raised exception in a way that does not depend on implementation details. Stefan From p.f.moore at gmail.com Sun Feb 19 15:18:34 2012 From: p.f.moore at gmail.com (Paul Moore) Date: Sun, 19 Feb 2012 14:18:34 +0000 Subject: [Python-Dev] C-API functions for reading/writing tstate->exc_* ? In-Reply-To: References: Message-ID: On 19 February 2012 13:04, Stefan Behnel wrote: > When compiling for PyPy, Cython therefore needs a way to tell PyPy about > any changes. For the tstate->curexc_* fields, there are the two functions > PyErr_Fetch() and PyErr_Restore(). Could we have two similar "official" > functions for the exc_* fields? Maybe PyErr_FetchLast() and > PyErr_RestoreLast()? It sounds reasonable to simply write a patch implementing and documenting these functions, put it in the tracker, and ask for it to be applied - I can't see an obvious reason not to do so. (There may be reasons not to put them in the "stable API", but that's not so relevant here). > Note that Cython would not have a reason to actually use them in CPython, > and it should be uncommon for non-Cython extension modules to care about > the exc_* fields at all. So these functions won't be of much use if > actually implemented in CPython (although I wouldn't mind doing that). The > question is just if we could have two officially named functions that PyPy > (and maybe other Pythons) could implement in order to access the last > raised exception in a way that does not depend on implementation details. You're probably worrying too much here. Get them added to Python 3.3, and then you're fine (if PyPy need to implement them for earlier versions, that's no problem for CPython, presumably PyPy don't have quite so stringent backward compatibility requirements yet, and the fact that they exist in 3.3 gives you the standardisation you need). Certainly "to have a standard API for getting at this information, even though it's probably not necessary for CPython extensions" isn't the best justification, but it's not the worst I've seen either :-) Of course, you could always go through the Python API, getting the sys module, extracting the relevant functions and calling them using the abstract API. That's what I'd recommend if this were purely a CPython question. But I assume that (for some reason) that's not appropriate for PyPy. Of course, my opinion doesn't carry a lot of weight here, so don't read too much into this :-) Paul. 
From ncoghlan at gmail.com Sun Feb 19 15:31:04 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 20 Feb 2012 00:31:04 +1000
Subject: [Python-Dev] C-API functions for reading/writing tstate->exc_* ?
In-Reply-To: References: Message-ID:

On Mon, Feb 20, 2012 at 12:18 AM, Paul Moore wrote:
> Of course, you could always go through the Python API, getting the sys
> module, extracting the relevant functions and calling them using the
> abstract API. That's what I'd recommend if this were purely a CPython
> question. But I assume that (for some reason) that's not appropriate
> for PyPy.

I had the same thought, but it actually only works cleanly for the "fetch"
half of the equation. You'd have to muck around with actually raising an
exception to handle *setting* those fields (which is *really* relying on
implementation details). That said, it may be worth further exploring the
idea of invoking appropriate snippets of Python code to get the desired
effect.

My other question would be whether there's an existing *private* C API with
the desired behaviour that could be made public, or if this would be a
genuinely new addition to the API.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From mail at kerrickstaley.com Sun Feb 19 17:16:44 2012
From: mail at kerrickstaley.com (Kerrick Staley)
Date: Sun, 19 Feb 2012 10:16:44 -0600
Subject: [Python-Dev] PEP 394 accepted
In-Reply-To: References: <4F3DFA6D.6090403@v.loewis.de>
Message-ID:

Thanks Nick, Ned, and everyone else who worked on implementing this! If any
further work on the text of the PEP or on the Makefile patch is needed,
please shoot me an email (I have GMail set to archive messages to
python-dev unless they explicitly CC me).

-Kerrick Staley

On Fri, Feb 17, 2012 at 6:44 AM, Nick Coghlan wrote:
> On Fri, Feb 17, 2012 at 10:27 PM, Nick Coghlan wrote:
> > Unfortunately, dinsdale appears to have fallen over again, so I can't
> > push the change right now :(
>
> It appears that was a temporary glitch - the 2.7 change is now in
> Mercurial.
>
> Cheers,
> Nick.
>
> --
> Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> http://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe:
> http://mail.python.org/mailman/options/python-dev/mail%40kerrickstaley.com
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From stefan_ml at behnel.de Sun Feb 19 22:10:50 2012
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Sun, 19 Feb 2012 22:10:50 +0100
Subject: [Python-Dev] C-API functions for reading/writing tstate->exc_* ?
In-Reply-To: References: Message-ID:

Nick Coghlan, 19.02.2012 15:31:
> On Mon, Feb 20, 2012 at 12:18 AM, Paul Moore wrote:
>> Of course, you could always go through the Python API, getting the sys
>> module, extracting the relevant functions and calling them using the
>> abstract API. That's what I'd recommend if this were purely a CPython
>> question. But I assume that (for some reason) that's not appropriate
>> for PyPy.
>
> I had the same thought, but it actually only works cleanly for the
> "fetch" half of the equation. You'd have to muck around with actually
> raising an exception to handle *setting* those fields (which is
> *really* relying on implementation details). That said, it may be
> worth further exploring the idea of invoking appropriate snippets of
> Python code to get the desired effect.
Actually, we currently inline the straight C code for this in CPython for
performance reasons, so, no, going through Python code isn't going to be a
good idea.

> My other question would be whether there's an existing *private* C API
> with the desired behaviour that could be made public, or if this would
> be a genuinely new addition to the API.

I'm not aware of any. The code in CPython (especially in ceval.c) always
uses direct field access.

Stefan

From martin at v.loewis.de Sun Feb 19 23:24:57 2012
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 19 Feb 2012 23:24:57 +0100
Subject: [Python-Dev] C-API functions for reading/writing tstate->exc_* ?
In-Reply-To: References: Message-ID: <4F4176B9.4080403@v.loewis.de>

> When compiling for PyPy, Cython therefore needs a way to tell PyPy about
> any changes. For the tstate->curexc_* fields, there are the two functions
> PyErr_Fetch() and PyErr_Restore(). Could we have two similar "official"
> functions for the exc_* fields? Maybe PyErr_FetchLast() and
> PyErr_RestoreLast()?

I wouldn't call the functions *Last, as this may cause confusion with
sys.last_*. I'm also unsure why the current API uses this Fetch/Restore
pair of functions where Fetch clears the variables. A Get/Set pair of
functions would be more natural, IMO (where Get returns "new" references).
This would give PyErr_GetExcInfo/PyErr_SetExcInfo.

Regards,
Martin
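For concreteness, a rough sketch of what such a Get/Set pair could look
like, with reference semantics mirroring PyErr_Fetch()/PyErr_Restore();
this is hypothetical code at this point, not an existing CPython API:

    #include "Python.h"

    /* Sketch: return new references to the last caught exception,
       without clearing the thread state (unlike PyErr_Fetch). */
    void
    PyErr_GetExcInfo(PyObject **p_type, PyObject **p_value,
                     PyObject **p_traceback)
    {
        PyThreadState *tstate = PyThreadState_GET();
        *p_type = tstate->exc_type;
        *p_value = tstate->exc_value;
        *p_traceback = tstate->exc_traceback;
        Py_XINCREF(*p_type);
        Py_XINCREF(*p_value);
        Py_XINCREF(*p_traceback);
    }

    /* Sketch: steal references to the new exception info and
       release the previously stored values. */
    void
    PyErr_SetExcInfo(PyObject *type, PyObject *value, PyObject *traceback)
    {
        PyThreadState *tstate = PyThreadState_GET();
        PyObject *old_type = tstate->exc_type;
        PyObject *old_value = tstate->exc_value;
        PyObject *old_traceback = tstate->exc_traceback;
        tstate->exc_type = type;
        tstate->exc_value = value;
        tstate->exc_traceback = traceback;
        Py_XDECREF(old_type);
        Py_XDECREF(old_value);
        Py_XDECREF(old_traceback);
    }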
From martin at v.loewis.de Sun Feb 19 23:47:29 2012
From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=)
Date: Sun, 19 Feb 2012 23:47:29 +0100
Subject: [Python-Dev] PEP 410, 3rd revision, Decimal timestamp
In-Reply-To: References: Message-ID: <4F417C01.4010200@v.loewis.de>

>> * Ruby (1.9.3), the `Time class `_
>> supports picosecond (10\ :sup:`-12`)
>
> We must do better than Ruby: support arbitrary precision! :-D

Seriously, I do consider that a necessary requirement for the PEP (which
the Decimal type actually meets). I don't want to deal with this issue
*again* in my lifetime (now being the second time), so it's either
arbitrary-precision, or highest-possible precision. When I was in school,
attoseconds (as) were the shortest named second fraction. Today, it seems
we should go for yoctoseconds (ys). There is an absolute boundary, though:
it seems there is no point in going shorter than the Planck time
(5.4 * 10**-44).

Regards,
Martin

From martin at v.loewis.de Mon Feb 20 00:12:33 2012
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 20 Feb 2012 00:12:33 +0100
Subject: [Python-Dev] folding cElementTree behind ElementTree in 3.3
In-Reply-To: References: <4F34E554.7090600@v.loewis.de> <4F3CC8C3.8070103@v.loewis.de>
Message-ID: <4F4181E1.9040909@v.loewis.de>

> The change of backing ElementTree by cElementTree has already been
> implemented in the default branch (3.3) by Florent Xicluna with careful
> review from me and others. etree has an extensive (albeit a bit clumsy)
> set of tests which keep passing successfully after the change.

I just noticed an incompatible change: xml.etree.ElementTree.Element used
to be a type, but is now a function.

> In the past couple of years Florent has been the de-facto maintainer of
> etree in the standard library, although I don't think he ever
> "committed" to keep maintaining it for years to come. Neither can I make
> this commitment, however I do declare that I will do my best to keep the
> library functional, and I also plan to work on improving its
> documentation and cleaning up some of the accumulated cruft in its
> implementation. I also have every intention of taking the blame if
> something breaks.

Would you mind adding yourself to

http://docs.python.org/devguide/experts.html

Regards,
Martin

From martin at v.loewis.de Mon Feb 20 00:20:39 2012
From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=)
Date: Mon, 20 Feb 2012 00:20:39 +0100
Subject: [Python-Dev] PEP for new dictionary implementation
In-Reply-To: <1207577444.648837.1329501143303.JavaMail.open-xchange@email.1and1.co.uk>
References: <4f3d49d6.ec77ec0a.7988.ffffcebb@mx.google.com> <4F3D7648.6040600@v.loewis.de> <4F3DF8A1.20708@v.loewis.de> <1207577444.648837.1329501143303.JavaMail.open-xchange@email.1and1.co.uk>
Message-ID: <4F4183C7.8000602@v.loewis.de>

>> Ah, now I understand; you do need a single ssize_t either on the dict
>> or at the head of the values array to indicate how many slots it has
>> actually allocated. It *may* also be worthwhile to add a second
>> ssize_t to indicate how many are currently in use, for faster results
>> in case of len. But the dict is guaranteed to have at least one free
>> slot, so that extra index will never make the allocation larger than
>> the current code.
>
> The dict already has a field indicating how many items are in use,
> the ma_used field.

So what do you think about Jim's proposal to make the values indexed not by
hash value, but by an index that is stored in the shared keys? Since the
load will be < 2/3, this should save 1/3 of the value storage (typically
more than that, if you initialize the values array always to the current
number of shared keys).

Regards,
Martin

From michael at voidspace.org.uk Mon Feb 20 01:37:20 2012
From: michael at voidspace.org.uk (Michael Foord)
Date: Mon, 20 Feb 2012 00:37:20 +0000
Subject: [Python-Dev] Links to last binary builds to download pages
Message-ID: <2550EB08-9991-47B9-A923-BB3322DB5B8A@voidspace.org.uk>

Hey folks,

When we do security-only releases of Python we regularly get emails to
webmaster at python.org asking where to find binary builds. If you want to
find the most recent binary builds of Python 2.5 & 2.6, it used to involve
clicking through quite a few links.

I've added links to the latest binary releases to the 2.6.7, 2.5.6 and
2.5.5 security release download pages. These are primarily for the benefit
of Windows and Mac OS X users who wouldn't normally compile their own
builds from source.

It would be helpful if release managers for security, source-only, releases
could include similar links in future.

All the best,

Michael

--
http://www.voidspace.org.uk/

May you do good and not evil
May you find forgiveness for yourself and forgive others
May you share freely, never taking more than you give.
-- the sqlite blessing
http://www.sqlite.org/different.html

From ncoghlan at gmail.com Mon Feb 20 01:51:11 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 20 Feb 2012 10:51:11 +1000
Subject: [Python-Dev] [Python-checkins] cpython: Fix a failing importlib test under Windows.
In-Reply-To: References: Message-ID:

On Mon, Feb 20, 2012 at 10:36 AM, brett.cannon wrote:
> -    if sys_module.platform in CASE_INSENSITIVE_PLATFORMS:
> +    if any(sys_module.platform.startswith(x)
> +           for x in CASE_INSENSITIVE_PLATFORMS):

Since C_I_P is a tuple, that condition can be written as:
"sys_module.platform.startswith(CASE_INSENSITIVE_PLATFORMS)"

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From ncoghlan at gmail.com Mon Feb 20 04:09:28 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 20 Feb 2012 13:09:28 +1000
Subject: [Python-Dev] [Python-checkins] cpython: Issue #14043: Speed up importlib's _FileFinder by at least 8x, and add a new
In-Reply-To: References: Message-ID:

On Mon, Feb 20, 2012 at 10:55 AM, antoine.pitrou wrote:
> +def _relax_case():
> +    """True if filenames must be checked case-insensitively."""
> +    if any(map(sys.platform.startswith, CASE_INSENSITIVE_PLATFORMS)):
> +        def _relax_case():
> +            return b'PYTHONCASEOK' in _os.environ
>     else:
> -        return True

Wow, that's horrendously confusing. Please change the name of the factory
function to "_make_relax_case" (or something, anything that isn't
"_relax_case" would be an improvement). Also, the docstring should be on
the created functions, not the factory function.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From ncoghlan at gmail.com Mon Feb 20 04:15:54 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 20 Feb 2012 13:15:54 +1000
Subject: [Python-Dev] [Python-checkins] cpython: Issue #14043: Speed up importlib's _FileFinder by at least 8x, and add a new
In-Reply-To: References: Message-ID:

However, "very cool" on adding the caching in the default importers :)

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From g.brandl at gmx.net Mon Feb 20 08:45:14 2012
From: g.brandl at gmx.net (Georg Brandl)
Date: Mon, 20 Feb 2012 08:45:14 +0100
Subject: [Python-Dev] cpython: add generic implementation of a __dict__ descriptor for C types
In-Reply-To: References: Message-ID:

On 20.02.2012 02:04, benjamin.peterson wrote:
> http://hg.python.org/cpython/rev/78f93eb7dd75
> changeset: 75050:78f93eb7dd75
> user: Benjamin Peterson
> date: Sun Feb 19 19:59:10 2012 -0500
> summary:
> add generic implementation of a __dict__ descriptor for C types
>
> files:
> Doc/c-api/object.rst | 12 +++++++++
> Doc/c-api/type.rst | 1 -
> Include/object.h | 2 +
> Misc/NEWS | 4 +++
> Objects/object.c | 42 ++++++++++++++++++++++++++++++
> Objects/typeobject.c | 22 +++-------------
> 6 files changed, 64 insertions(+), 19 deletions(-)
>
>
> diff --git a/Doc/c-api/object.rst b/Doc/c-api/object.rst
> --- a/Doc/c-api/object.rst
> +++ b/Doc/c-api/object.rst
> @@ -101,6 +101,18 @@
> This is the equivalent of the Python statement ``del o.attr_name``.
>
>
> +.. c:function:: PyObject* PyType_GenericGetDict(PyObject *o, void *context)
> +
> + A generic implementation for the getter of a ``__dict__`` descriptor. It
> + creates the dictionary if necessary.
> +
> +
> +.. c:function:: int PyType_GenericSetDict(PyObject *o, void *context)
> +
> + A generic implementation for the setter of a ``__dict__`` descriptor. This
> + implementation does not allow the dictionary to be deleted.
> +
> +
> .. c:function:: PyObject* PyObject_RichCompare(PyObject *o1, PyObject *o2, int opid)

versionadded, please?
Georg

From steve at pearwood.info Mon Feb 20 11:59:51 2012
From: steve at pearwood.info (Steven D'Aprano)
Date: Mon, 20 Feb 2012 21:59:51 +1100
Subject: [Python-Dev] PEP 410, 3rd revision, Decimal timestamp
In-Reply-To: <4F417C01.4010200@v.loewis.de>
References: <4F417C01.4010200@v.loewis.de>
Message-ID: <4F4227A7.2080408@pearwood.info>

Martin v. Löwis wrote:
>>> * Ruby (1.9.3), the `Time class `_
>>> supports picosecond (10\ :sup:`-12`)
>> We must do better than Ruby: support arbitrary precision! :-D
>
> Seriously, I do consider that a necessary requirement for the PEP (which
> the Decimal type actually meets). I don't want to deal with
> this issue *again* in my lifetime (now being the second time),
> so it's either arbitrary-precision, or highest-possible precision.
> When I was in school, attoseconds (as) were the shortest named
> second fraction. Today, it seems we should go for yoctoseconds (ys).
> There is an absolute boundary, though: it seems there is no point in
> going shorter than the Planck time (5.4 * 10**-44).

That is an implementation detail. Python implementations in other universes
may not have that limitation.

Besides, if any of the speed of light c, gravitational constant G, Planck's
constant h, or pi change, so will Planck time. Perhaps they should be
environment variables?

Not-quite-sure-how-seriously-you-intend-supporting-yoctoseconds-ly y'rs,

--
Steven

From eliben at gmail.com Mon Feb 20 12:36:25 2012
From: eliben at gmail.com (Eli Bendersky)
Date: Mon, 20 Feb 2012 13:36:25 +0200
Subject: [Python-Dev] folding cElementTree behind ElementTree in 3.3
In-Reply-To: <4F4181E1.9040909@v.loewis.de>
References: <4F34E554.7090600@v.loewis.de> <4F3CC8C3.8070103@v.loewis.de> <4F4181E1.9040909@v.loewis.de>
Message-ID:

On Mon, Feb 20, 2012 at 01:12, "Martin v. Löwis" wrote:
> > The change of backing ElementTree by cElementTree has already been
> > implemented in the default branch (3.3) by Florent Xicluna with careful
> > review from me and others. etree has an extensive (albeit a bit clumsy)
> > set of tests which keep passing successfully after the change.
>
> I just noticed an incompatible change: xml.etree.ElementTree.Element
> used to be a type, but is now a function.

Yes, this is a result of an incompatibility between the Python and C
implementations of ElementTree. Since these have now been merged by
default, one or the other had to be kept, and the choice of cElementTree
appeared to be sensible since this is what most people are expected to use
in 2.7 and 3.2 anyway. I have an issue open for converting some function
constructors into classes. Perhaps Element should also have this fate.

> > In the past couple of years Florent has been the de-facto maintainer of
> > etree in the standard library, although I don't think he ever
> > "committed" to keep maintaining it for years to come. Neither can I make
> > this commitment, however I do declare that I will do my best to keep the
> > library functional, and I also plan to work on improving its
> > documentation and cleaning up some of the accumulated cruft in its
> > implementation. I also have every intention of taking the blame if
> > something breaks.
>
> Would you mind adding yourself to
>
> http://docs.python.org/devguide/experts.html

Sure, I'll do that.

Eli
-------------- next part --------------
An HTML attachment was scrubbed...
URL: From python-dev at masklinn.net Mon Feb 20 12:43:50 2012 From: python-dev at masklinn.net (Xavier Morel) Date: Mon, 20 Feb 2012 12:43:50 +0100 Subject: [Python-Dev] folding cElementTree behind ElementTree in 3.3 In-Reply-To: References: <4F34E554.7090600@v.loewis.de> <4F3CC8C3.8070103@v.loewis.de> <4F4181E1.9040909@v.loewis.de> Message-ID: On 2012-02-20, at 12:36 , Eli Bendersky wrote: > On Mon, Feb 20, 2012 at 01:12, "Martin v. Löwis" wrote: > >>> The change of backing ElementTree by cElementTree has already been >>> implemented in the default branch (3.3) by Florent Xicluna with careful >>> review from me and others. etree has an extensive (albeit a bit clumsy) >>> set of tests which keep passing successfully after the change. >> >> I just noticed an incompatible change: xml.etree.ElementTree.Element >> used to be a type, but is now a function. >> > > Yes, this is a result of an incompatibility between the Python and C > implementations of ElementTree. Since these have now been merged by > default, one or the other had to be kept and the choice of cElementTree > appeared to be sensible since this is what most people are expected to use > in 2.7 and 3.2 anyway. I have an issue open for converting some function > constructors into classes. Perhaps Element should also have this fate. I'm not sure that's much of an issue, Element (and most of the top-level utility "constructors") are documented as being frontend interfaces with no specific type of their own, and indeed they are simply functions in lxml, just as they are in cElementTree. Others will probably disagree, but as far as I am concerned these can stay functions, avoid issues when switching to lxml or between ElementTree and lxml (from one project to the next). From ncoghlan at gmail.com Mon Feb 20 13:50:43 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 20 Feb 2012 22:50:43 +1000 Subject: [Python-Dev] folding cElementTree behind ElementTree in 3.3 In-Reply-To: References: <4F34E554.7090600@v.loewis.de> <4F3CC8C3.8070103@v.loewis.de> <4F4181E1.9040909@v.loewis.de> Message-ID: On Mon, Feb 20, 2012 at 9:43 PM, Xavier Morel wrote: > I'm not sure that's much of an issue, Element (and most of the top-level > utility "constructors") are documented as being frontend interfaces with > no specific type of their own, and indeed they are simply functions in > lxml, just as they are in cElementTree. > > Others will probably disagree, but as far as I am concerned these can stay > functions, avoid issues when switching to lxml or between ElementTree and > lxml (from one project to the next). For a similar situation (only the other way around), we're probably going to add a pure Python variant of functools.partial as a staticmethod, while the C version is a callable extension type. (see http://bugs.python.org/issue12428) Basically, if something is just documented as being callable without subclassing or instance checks being mentioned as supported in the docs, it can be implemented as either a type or an ordinary function, or pretty much any other kind of callable without being deemed an API change (obviously that's not a free pass to go make such implementation changes without a compelling reason, but C vs Python often *is* such a reason).
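For reference, the pure Python variant in question is essentially the rough equivalent already shown in the functools docs (a sketch, minus the corner cases the C version handles):

def partial(func, *args, **keywords):
    """Return a new function with partial application of the given arguments."""
    def newfunc(*fargs, **fkeywords):
        newkeywords = keywords.copy()
        newkeywords.update(fkeywords)
        return func(*(args + fargs), **newkeywords)
    newfunc.func = func
    newfunc.args = args
    newfunc.keywords = keywords
    return newfunc

It satisfies the documented "callable returning a callable" contract even though the result is an ordinary function rather than an instance of a partial type.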
Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From techtonik at gmail.com Mon Feb 20 13:49:05 2012 From: techtonik at gmail.com (anatoly techtonik) Date: Mon, 20 Feb 2012 15:49:05 +0300 Subject: [Python-Dev] Python in Native Client Message-ID: Hi, People on NaCl list are asking about Python support for development of native web applications in Python. Does anybody have experience compiling Python for NaCl? 1. https://groups.google.com/d/topic/native-client-discuss/ioY2jmw_OUQ/discussion -- anatoly t. -------------- next part -------------- An HTML attachment was scrubbed... URL: From victor.stinner at gmail.com Mon Feb 20 14:01:54 2012 From: victor.stinner at gmail.com (Victor Stinner) Date: Mon, 20 Feb 2012 14:01:54 +0100 Subject: [Python-Dev] PEP 410, 3rd revision, Decimal timestamp In-Reply-To: <4F4227A7.2080408@pearwood.info> References: <4F417C01.4010200@v.loewis.de> <4F4227A7.2080408@pearwood.info> Message-ID: >>> We must do better than Ruby: support arbitrary precision! :-D >> >> Seriously, I do consider that a necessary requirement for the PEP (which >> the Decimal type actually meets). (...) > > (...) > Not-quite-sure-how-seriously-you-intend-supporting-yoctoseconds-ly y'rs, The point is not supporting yoctosecond resolution, but not having to change the API each time that the resolution becomes better. The time resolution already changed 3 times in UNIX/Linux history: 1 second, 1 ms, 1 us, 1 ns. So the C library has to maintain APIs for all these resolutions: time_t, timeb, timeval, timespec... ftime() and usleep() are deprecated by POSIX 2008 for example. http://news.bbc.co.uk/2/hi/technology/5099584.stm "The prototype operates at speeds up to 500 gigahertz (GHz), more than 100 times faster than desktop PC chips." "A decade ago we couldn't even envisage being able to run at these speeds." 500 GHz means a theoretical resolution of 2 picoseconds (10^-12). So nanosecond might not be enough for the next 10 years. This is theoretical. In practice, Linux already uses nanosecond timestamps, and shutil.copystat() has an issue with such timestamps (it is unable to copy them without loss of precision).
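The loss is easy to demonstrate (illustrative timestamp value; the exact float digits depend on rounding):

from decimal import Decimal

ns = 1329750000123456789           # a timestamp in integer nanoseconds
as_float = ns / 10**9              # ~1329750000.1234567: the nanoseconds are gone
as_decimal = Decimal(ns) / 10**9   # Decimal('1329750000.123456789'): exact

A C double carries only about 16 significant decimal digits, while ten digits of epoch seconds plus nine fractional digits need 19, so float timestamps silently discard the low-order nanoseconds.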
Victor From ncoghlan at gmail.com Mon Feb 20 14:23:13 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 20 Feb 2012 23:23:13 +1000 Subject: [Python-Dev] PEP czar for PEP 3144? Message-ID: Does anyone object to me naming myself PEP czar for PEP 3144? I've collated the objections to the original proposal on a few different occasions throughout the (long!) PEP review process, and as noted in the Background section, the latest version of the PEP [1] has addressed the key concerns that were raised: - the "strict" flag for Network objects is gone (instead, the validation differences between IP Network and IP Interface definitions are handled as different classes with otherwise similar interfaces) - the factory function naming scheme follows PEP 8 - some properties have been given new names that make it clearer what kind of object they produce - the module itself has been given a new name (ipaddress) to avoid clashing with the existing ipaddr module on PyPI There's also basic-but-usable module documentation available (http://code.google.com/p/ipaddr-py/wiki/Using3144). So, unless there are any new objections, I'd like to: - approve ipaddress for inclusion in Python 3.3 - grant Peter Moody push access as the module maintainer - create a tracker issue to cover incorporating the new module into the standard library, documentation and test suite (There are still a few places in both the PEP and the preliminary documentation that say "ipaddr" instead of "ipaddress", but those can be cleaned up as the module gets integrated). I don't personally think the module API needs the provisional disclaimer as the core functionality has been tested for years in ipaddr and the API changes in ipaddress are just cosmetic ones either for PEP 8 conformance, or to make the API map more cleanly to the underlying networking concepts. However, I'd be willing to include that proviso if anyone else has lingering concerns. Regards, Nick. [1] http://www.python.org/dev/peps/pep-3144/ -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From solipsis at pitrou.net Mon Feb 20 14:55:05 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 20 Feb 2012 14:55:05 +0100 Subject: [Python-Dev] PEP czar for PEP 3144? References: Message-ID: <20120220145505.238e6adb@pitrou.net> On Mon, 20 Feb 2012 23:23:13 +1000 Nick Coghlan wrote: > Does anyone object to me naming myself PEP czar for PEP 3144? “Tsar is a title used to designate certain European Slavic monarchs or supreme rulers.” Is this our official word? > There's also basic-but-usable module documentation available > (http://code.google.com/p/ipaddr-py/wiki/Using3144). Mmmh, some comments: - a network can be "in" another network? Sounds strange. Compare with sets, which can be ordered, but not contained one within another. The idea of an address or network being "in" an interface sounds even stranger. - iterhosts()? Why not simply hosts()? - “A TypeError exception is raised if you try to compare objects of different versions or different types.”: I hope equality still works? Regards Antoine. From techtonik at gmail.com Mon Feb 20 15:58:32 2012 From: techtonik at gmail.com (anatoly techtonik) Date: Mon, 20 Feb 2012 17:58:32 +0300 Subject: [Python-Dev] PEP 394 Message-ID: On Mon, Feb 20, 2012 at 4:58 PM, Nick Coghlan wrote: > PEP 394 > was at the top of my list recently > I've tried to edit it to be a little bit shorter (perhaps cleaner) and commented (up to revision 2) up to Migration Notes. http://piratepad.net/pep-0394 The main points: 1. `python2.7` should be `python27` 2. until platform supports Python 2, `python` should link to python2 binary 3. python2 should always point to the latest version available on the system (I didn't write that in comments) -- anatoly t. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Mon Feb 20 16:09:22 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 21 Feb 2012 01:09:22 +1000 Subject: [Python-Dev] PEP czar for PEP 3144? In-Reply-To: <20120220145505.238e6adb@pitrou.net> References: <20120220145505.238e6adb@pitrou.net> Message-ID: On Mon, Feb 20, 2012 at 11:55 PM, Antoine Pitrou wrote: > On Mon, 20 Feb 2012 23:23:13 +1000 > Nick Coghlan wrote: >> Does anyone object to me naming myself PEP czar for PEP 3144? > > “Tsar is a title used to designate certain European Slavic monarchs or > supreme rulers.” > > Is this our official word? PEP czar/tsar and BDFOP (Benevolent Dictator for One PEP) are the two names I've seen for the role. I don't have a strong preference either way (just a mild preference for 'czar').
>> There's also basic-but-usable module documentation available >> (http://code.google.com/p/ipaddr-py/wiki/Using3144). > > Mmmh, some comments: > - a network can be "in" another network? Sounds strange. Compare with > sets, which can be ordered, but not contained one within another. > The idea of an address or network being "in" an interface sounds even > stranger. Ah, I'd missed that one. Yes, I think this is a holdover from the main ipaddr module which plays fast and loose with type correctness by implicitly converting between networks and addresses in all sorts of places. It doesn't have Network and Interface as separate types (calling them both "Networks") and it appears the current incarnation of the Interface API still retains a few too many Network-specific behaviours. I agree the "container" behaviour should be reserved for the actual Network API, with Interface objects behaving more like Addresses in that respect. I also agree Network subset and superset checks should follow a set-style API rather than overloading the containment checks. There are actually a few other behaviours (like compare_networks()) that should probably be moved to the Network objects, and accessed via the "network" property for Interface objects. > - iterhosts()? Why not simply hosts()? And I missed that one, too. Perhaps that provisional marker wouldn't be such a bad idea after all... One requirement for integration would be fleshing out the standard library version of the documentation to include a full public API reference for the module and public classes, which will also help highlight any lingering naming problems, as well as areas where APIs that currently return realised lists should probably be returning iterators instead (there's currently iter_subnets() and subnet(), which should just be a single subnets() iterator). > - “A TypeError exception is raised if you try to compare objects of > different versions or different types.”: I hope equality still works? It looks like it's supposed to (and does for Address objects), but there's currently a bug in the _BaseInterface.__eq__ impl that makes it return None instead of False (the method impl *should* be returning NotImplemented, just as _BaseAddress does, with the interpreter then reporting False if both sides return NotImplemented). There's currently an implicit promotion of Address objects to Interface objects, such that "network_or_interface == address" is the same as "network_or_interface.ip == address". So yes, with the appropriate boundaries between the different types of objects still being a little blurred, I think a "provisional" marker is definitely warranted. Some of the APIs that are currently available directly on Interface objects should really be accessed via their .network property instead.
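The equality convention at issue, in miniature (a toy class, not the module's actual implementation):

class Address:
    def __init__(self, ip):
        self._ip = ip

    def __eq__(self, other):
        if not isinstance(other, Address):
            return NotImplemented   # not None, and not raising TypeError
        return self._ip == other._ip

When both operands return NotImplemented, the interpreter falls back to identity comparison, so == between unrelated types still evaluates to False instead of leaking None out of the method.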
Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From dirkjan at ochtman.nl Mon Feb 20 16:20:15 2012 From: dirkjan at ochtman.nl (Dirkjan Ochtman) Date: Mon, 20 Feb 2012 16:20:15 +0100 Subject: [Python-Dev] PEP czar for PEP 3144?
In-Reply-To: References: Message-ID: On Mon, Feb 20, 2012 at 14:23, Nick Coghlan wrote: > I don't personally think the module API needs the provisional > disclaimer as the core functionality has been tested for years in > ipaddr and the API changes in ipaddress are just cosmetic ones either > for PEP 8 conformance, or to make the API map more cleanly to the > underlying networking concepts. However, I'd be willing to include > that proviso if anyone else has lingering concerns. Should it be net.ipaddress instead of just ipaddress? Somewhat nested is better than fully flat. Cheers, Dirkjan From solipsis at pitrou.net Mon Feb 20 16:27:25 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 20 Feb 2012 16:27:25 +0100 Subject: [Python-Dev] PEP czar for PEP 3144? References: Message-ID: <20120220162725.29328d77@pitrou.net> On Mon, 20 Feb 2012 16:20:15 +0100 Dirkjan Ochtman wrote: > On Mon, Feb 20, 2012 at 14:23, Nick Coghlan wrote: > > I don't personally think the module API needs the provisional > > disclaimer as the core functionality has been tested for years in > > ipaddr and the API changes in ipaddress are just cosmetic ones either > > for PEP 8 conformance, or to make the API map more cleanly to the > > underlying networking concepts. However, I'd be willing to include > > that proviso if anyone else has lingering concerns. > > Should it be net.ipaddress instead of just ipaddress? > > Somewhat nested is better than fully flat. IMHO, nesting without a good, consistent, systematic categorization leads to very unpleasant results (e.g. "from urllib.request import urlopen"). Historically, our stdlib has been flat and I think it should stay so, short of redoing the whole hierarchy. (note this has nothing to do with the possible implementation of modules as packages, such as unittest or importlib) Regards Antoine. From breamoreboy at yahoo.co.uk Mon Feb 20 16:54:31 2012 From: breamoreboy at yahoo.co.uk (Mark Lawrence) Date: Mon, 20 Feb 2012 15:54:31 +0000 Subject: [Python-Dev] Status of PEP 397 - Python launcher for Windows In-Reply-To: <4F3F35FF.1010506@gmail.com> References: <4F3F35FF.1010506@gmail.com> Message-ID: On 18/02/2012 05:24, Mark Hammond wrote: > I'm wondering what thoughts are on PEP 397, the Python launcher for > Windows. I've been using the implementation for a number of months now > and I find it incredibly useful. > > To my mind, the specific steps would be: > > * Have someone pronounce it as accepted (or suggest steps to be taken > before such a pronouncement). I can't recall the current process - does > Guido have to pronounce personally or formally delegate to a czar? > > * Move the source into the Python tree and update the build process. > > * Arrange for it to be installed with the next release of 3.2 and all > future versions - I'm happy to try and help with that, but will probably > need some help from Martin. > > * Write some user-oriented docs. The section in the docs "Using Python on Windows" would need to be updated, but would this have to happen for every current version of Python? The docs here https://bitbucket.org/vinay.sajip/pylauncher/src/tip/Doc/launcher.rst are in my view possibly overkill, what do the rest of you think? The output from py --help seems fine but nothing happens when pyw --help is entered, is this by accident or design? > > Thoughts or comments? > > Mark A cracking bit of kit :) -- Cheers. Mark Lawrence.
From martin at v.loewis.de Mon Feb 20 16:55:11 2012 From: martin at v.loewis.de (martin at v.loewis.de) Date: Mon, 20 Feb 2012 16:55:11 +0100 Subject: [Python-Dev] folding cElementTree behind ElementTree in 3.3 In-Reply-To: References: <4F34E554.7090600@v.loewis.de> <4F3CC8C3.8070103@v.loewis.de> <4F4181E1.9040909@v.loewis.de> Message-ID: <20120220165511.Horde.c-6iIaGZi1VPQmzfA-JBT4A@webmail.df.eu> > Basically, if something is just documented as being callable without > subclassing or instance checks being mentioned as supported in the > docs, it can be implemented as either a type or an ordinary function, > or pretty much any other kind of callable without being deemed an API > change So what would be your evaluation of http://docs.python.org/library/xml.etree.elementtree.html#xml.etree.ElementTree.Element in that respect? Regards, Martin From martin at v.loewis.de Mon Feb 20 17:07:19 2012 From: martin at v.loewis.de (martin at v.loewis.de) Date: Mon, 20 Feb 2012 17:07:19 +0100 Subject: [Python-Dev] PEP czar for PEP 3144? In-Reply-To: <20120220145505.238e6adb@pitrou.net> References: <20120220145505.238e6adb@pitrou.net> Message-ID: <20120220170719.Horde.wKTtdKGZi1VPQm_37xohbhA@webmail.df.eu> >> Does anyone object to me naming myself PEP czar for PEP 3144? > > “Tsar is a title used to designate certain European Slavic monarchs or > supreme rulers.” > > Is this our official word? "supreme ruler" sounds good to me. I could go for "inquisitor" instead of "czar" as well... Regards, Martin From senthil at uthcode.com Mon Feb 20 17:28:41 2012 From: senthil at uthcode.com (Senthil Kumaran) Date: Tue, 21 Feb 2012 00:28:41 +0800 Subject: [Python-Dev] PEP czar for PEP 3144? In-Reply-To: <20120220170719.Horde.wKTtdKGZi1VPQm_37xohbhA@webmail.df.eu> References: <20120220145505.238e6adb@pitrou.net> <20120220170719.Horde.wKTtdKGZi1VPQm_37xohbhA@webmail.df.eu> Message-ID: On Tue, Feb 21, 2012 at 12:07 AM, wrote: > "supreme ruler" sounds good to me. I could go for "inquisitor" instead > of "czar" as well... But that would be bad for developers from Spain as nobody would expect a Spanish inquisition. :-) -- Senthil From breamoreboy at yahoo.co.uk Mon Feb 20 17:50:19 2012 From: breamoreboy at yahoo.co.uk (Mark Lawrence) Date: Mon, 20 Feb 2012 16:50:19 +0000 Subject: [Python-Dev] PEP czar for PEP 3144? In-Reply-To: References: <20120220145505.238e6adb@pitrou.net> <20120220170719.Horde.wKTtdKGZi1VPQm_37xohbhA@webmail.df.eu> Message-ID: On 20/02/2012 16:28, Senthil Kumaran wrote: > On Tue, Feb 21, 2012 at 12:07 AM, wrote: >> "supreme ruler" sounds good to me. I could go for "inquisitor" instead >> of "czar" as well... > > But that would be bad for developers from Spain as nobody would expect > a Spanish inquisition. > > :-) > How about Big Brother then? Has anyone worked in room 101? -- Cheers. Mark Lawrence. From andrew.svetlov at gmail.com Mon Feb 20 17:52:57 2012 From: andrew.svetlov at gmail.com (Andrew Svetlov) Date: Mon, 20 Feb 2012 18:52:57 +0200 Subject: [Python-Dev] PEP 394 In-Reply-To: References: Message-ID: ArchLinux has used `python` as alias for `python3` while `python2` is still supported. On Mon, Feb 20, 2012 at 4:58 PM, anatoly techtonik wrote: > On Mon, Feb 20, 2012 at 4:58 PM, Nick Coghlan wrote: >> >> PEP 394 >> was at the top of my list recently > > > I've tried to edit it to be a little bit shorter (perhaps cleaner) and > commented (up to revision 2) up to Migration Notes. > http://piratepad.net/pep-0394 > > The main points: > 1. `python2.7` should be `python27` > 2. 
until platform supports Python 2, `python` should link to python2 binary > 3. python2 should always point to the latest version available on the system > (I didn't write that in comments) > -- > anatoly t. From andrew.svetlov at gmail.com Mon Feb 20 17:53:58 2012 From: andrew.svetlov at gmail.com (Andrew Svetlov) Date: Mon, 20 Feb 2012 18:53:58 +0200 Subject: [Python-Dev] PEP czar for PEP 3144? In-Reply-To: References: <20120220145505.238e6adb@pitrou.net> <20120220170719.Horde.wKTtdKGZi1VPQm_37xohbhA@webmail.df.eu> Message-ID: I like 'PEP czar' On Mon, Feb 20, 2012 at 6:50 PM, Mark Lawrence wrote: > On 20/02/2012 16:28, Senthil Kumaran wrote: >> On Tue, Feb 21, 2012 at 12:07 AM, wrote: >>> "supreme ruler" sounds good to me. I could go for "inquisitor" instead >>> of "czar" as well... >> >> But that would be bad for developers from Spain as nobody would expect >> a Spanish inquisition. >> >> :-) >> > How about Big Brother then? Has anyone worked in room 101? > > -- > Cheers. > > Mark Lawrence. From dirkjan at ochtman.nl Mon Feb 20 19:06:23 2012 From: dirkjan at ochtman.nl (Dirkjan Ochtman) Date: Mon, 20 Feb 2012 19:06:23 +0100 Subject: [Python-Dev] PEP czar for PEP 3144? In-Reply-To: <20120220162725.29328d77@pitrou.net> References: <20120220162725.29328d77@pitrou.net> Message-ID: On Mon, Feb 20, 2012 at 16:27, Antoine Pitrou wrote: >> Should it be net.ipaddress instead of just ipaddress? >> >> Somewhat nested is better than fully flat. > > IMHO, nesting without a good, consistent, systematic categorization > leads to very unpleasant results (e.g. "from urllib.request import > urlopen"). > > Historically, our stdlib has been flat and I think it should stay so, > short of redoing the whole hierarchy. > > (note this has nothing to do with the possible implementation of > modules as packages, such as unittest or importlib) I thought Python 3 already came with a net package, but apparently that plan has long been discarded. So I retract my suggestion. Cheers, Dirkjan From brett at python.org Mon Feb 20 19:20:50 2012 From: brett at python.org (Brett Cannon) Date: Mon, 20 Feb 2012 13:20:50 -0500 Subject: [Python-Dev] [Python-checkins] cpython: Issue #14043: Speed up importlib's _FileFinder by at least 8x, and add a new In-Reply-To: References: Message-ID: On Sun, Feb 19, 2012 at 22:15, Nick Coghlan wrote: > However, "very cool" on adding the caching in the default importers :) Thanks to PJE for bringing the idea up again and Antoine discovering the approach *independently* from PJE and myself and actually writing the code. Now I *really* need to get that C hybrid for __import__() finished (in the middle of implementing _gcd_import() in C) to see where the performance ends up.
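For the curious, the approach is roughly a per-directory listing cache invalidated when the directory's mtime changes — a simplified sketch, not the actual importlib code:

import os

class DirectoryCache:
    """Re-list a directory only when its modification time changes."""

    def __init__(self, path):
        self.path = path
        self._mtime = -1.0
        self._entries = frozenset()

    def entries(self):
        mtime = os.stat(self.path).st_mtime
        if mtime != self._mtime:            # first call, or directory changed
            self._entries = frozenset(os.listdir(self.path))
            self._mtime = mtime
        return self._entries

A finder can then answer "is there a foo.py here?" with a set lookup instead of a round of stat calls per import attempt.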
-------------- next part -------------- An HTML attachment was scrubbed... URL: From anacrolix at gmail.com Mon Feb 20 19:39:01 2012 From: anacrolix at gmail.com (Matt Joiner) Date: Tue, 21 Feb 2012 02:39:01 +0800 Subject: [Python-Dev] PEP czar for PEP 3144? In-Reply-To: <20120220162725.29328d77@pitrou.net> References: <20120220162725.29328d77@pitrou.net> Message-ID: On Mon, Feb 20, 2012 at 11:27 PM, Antoine Pitrou wrote: > IMHO, nesting without a good, consistent, systematic categorization > leads to very unpleasant results (e.g. "from urllib.request import > urlopen"). > > Historically, our stdlib has been flat and I think it should stay so, > short of redoing the whole hierarchy. I concur. Arbitrary nesting should be avoided. From tjreedy at udel.edu Mon Feb 20 20:13:01 2012 From: tjreedy at udel.edu (Terry Reedy) Date: Mon, 20 Feb 2012 14:13:01 -0500 Subject: [Python-Dev] PEP czar for PEP 3144? In-Reply-To: References: Message-ID: On 2/20/2012 8:23 AM, Nick Coghlan wrote: > Does anyone object to me naming myself PEP czar for PEP 3144? I think it great that you volunteer to be the PEP czar and hope Guido appoints you -- especially after your response to Antoine. Since this is a Python 3 module, let us start off with a modern Python 3 interface. That includes returning iterators instead of lists unless there is a really good reason. I can see how an outside developer could have difficulty getting integrated into our collective PEP process ;-). -- Terry Jan Reedy From tjreedy at udel.edu Mon Feb 20 20:51:08 2012 From: tjreedy at udel.edu (Terry Reedy) Date: Mon, 20 Feb 2012 14:51:08 -0500 Subject: [Python-Dev] Python in Native Client In-Reply-To: References: Message-ID: On 2/20/2012 7:49 AM, anatoly techtonik wrote: > People on NaCl list are asking about Python support for development > of native web applications in Python. Does anybody have experience > compiling Python for NaCl? > https://groups.google.com/d/topic/native-client-discuss/ioY2jmw_OUQ/discussion I suggest you ask this on python-list also. -- Terry Jan Reedy From guido at python.org Mon Feb 20 22:09:58 2012 From: guido at python.org (Guido van Rossum) Date: Mon, 20 Feb 2012 13:09:58 -0800 Subject: [Python-Dev] PEP czar for PEP 3144? In-Reply-To: References: Message-ID: Approved. Nick is PEP czar for PEP 3144. Thanks Nick! On Mon, Feb 20, 2012 at 11:13 AM, Terry Reedy wrote: > On 2/20/2012 8:23 AM, Nick Coghlan wrote: >> >> Does anyone object to me naming myself PEP czar for PEP 3144? > > > I think it great that you volunteer to be the PEP czar and hope Guido > appoints you -- especially after your response to Antoine. Since this is a > Python 3 module, let us start off with a modern Python 3 interface. That > includes returning iterators instead of lists unless there is a really good > reason. > > I can see how an outside developer could have difficulty getting integrated > into our collective PEP process ;-).
> > -- > Terry Jan Reedy -- --Guido van Rossum (python.org/~guido) From tjreedy at udel.edu Mon Feb 20 22:31:13 2012 From: tjreedy at udel.edu (Terry Reedy) Date: Mon, 20 Feb 2012 16:31:13 -0500 Subject: [Python-Dev] [Python-checkins] cpython: Issue #13641: Decoding functions in the base64 module now accept ASCII-only In-Reply-To: References: Message-ID: <4F42BBA1.6040507@udel.edu> On 2/20/2012 1:33 PM, antoine.pitrou wrote: > http://hg.python.org/cpython/rev/c760bd844222 > changeset: 75058:c760bd844222 > user: Antoine Pitrou > date: Mon Feb 20 19:30:23 2012 +0100 > summary: > Issue #13641: Decoding functions in the base64 module now accept ASCII-only unicode strings. > Patch by Catalin Iacob. > + tests = {b"d3d3LnB5dGhvbi5vcmc=": b"www.python.org", > + b'AA==': b'\x00', > + b"YQ==": b"a", > + b"YWI=": b"ab", > + b"YWJj": b"abc", > + b"YWJjZGVmZ2hpamtsbW5vcHFyc3R1dnd4eXpBQkNE" > + b"RUZHSElKS0xNTk9QUVJTVFVWV1hZWjAxMjM0\nNT" > + b"Y3ODkhQCMwXiYqKCk7Ojw+LC4gW117fQ==": > + > + b"abcdefghijklmnopqrstuvwxyz" > + b"ABCDEFGHIJKLMNOPQRSTUVWXYZ" > + b"0123456789!@#0^&*();:<>,. []{}", > + b'': b'', > + } > + for data, res in tests.items(): I am a little puzzled why a constant sequence of pairs is being stored as a mapping instead of a tuple (or list) of 2-tuples (which is compiled more efficiently). As near as I can tell, 'tests' and similar constructs later in the file are never used as mappings. Am I missing something or is this just the way Catalin wrote it? -- Terry Jan Reedy From ncoghlan at gmail.com Mon Feb 20 23:51:34 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 21 Feb 2012 08:51:34 +1000 Subject: [Python-Dev] folding cElementTree behind ElementTree in 3.3 In-Reply-To: <20120220165511.Horde.c-6iIaGZi1VPQmzfA-JBT4A@webmail.df.eu> References: <4F34E554.7090600@v.loewis.de> <4F3CC8C3.8070103@v.loewis.de> <4F4181E1.9040909@v.loewis.de> <20120220165511.Horde.c-6iIaGZi1VPQmzfA-JBT4A@webmail.df.eu> Message-ID: On Tue, Feb 21, 2012 at 1:55 AM, wrote: >> Basically, if something is just documented as being callable without >> subclassing or instance checks being mentioned as supported in the >> docs, it can be implemented as either a type or an ordinary function, >> or pretty much any other kind of callable without being deemed an API >> change > > So what would be your evaluation of > > http://docs.python.org/library/xml.etree.elementtree.html#xml.etree.ElementTree.Element > > in that respect? Completely different from the functools.partial case - with that, the docs are very careful to *never* call functools.partial a class (instead saying "returns a callable object"). The ElementTree docs unambiguously call Element a class (several times), so a conforming implementation must provide it as a class (i.e. supporting use in isinstance() checks, inheritance, etc) rather than as just a callable. A factory function is not a backwards compatible replacement (sorry Eli - given those docs, I'm definitely with Martin on this one).
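The kind of user code those docs promise will keep working is simply:

import xml.etree.ElementTree as ET

class AnnotatedElement(ET.Element):   # legal only if Element is a real class
    pass

elem = AnnotatedElement('root')
assert isinstance(elem, ET.Element)

With a factory function in Element's place, the class statement itself fails with a TypeError, so by the documented contract this is an API change.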
Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Mon Feb 20 23:56:48 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 21 Feb 2012 08:56:48 +1000 Subject: [Python-Dev] PEP 394 In-Reply-To: References: Message-ID: On Tue, Feb 21, 2012 at 12:58 AM, anatoly techtonik wrote: > On Mon, Feb 20, 2012 at 4:58 PM, Nick Coghlan wrote: >> >> PEP 394 >> was at the top of my list recently > > I've tried to edit it to be a little bit shorter (perhaps cleaner) and > commented (up to revision 2) up to Migration Notes. > http://piratepad.net/pep-0394 > > The main points: > 1. `python2.7` should be `python27` No, it shouldn't. The default *nix links include the period (it's only the Windows binaries that leave it out) > 2. until platform supports Python 2, `python` should link to python2 binary That's a distro decision - if their Python 2 code is all updated to specifically use "python2", they can switch the default whenever they want. > 3. python2 should always point to the latest version available on the system No, it should point to the distro installed version or wherever the system admin decides to point it. So long as it points to *some* flavour of Python 2, it's in line with the recommendation. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From pje at telecommunity.com Mon Feb 20 23:29:48 2012 From: pje at telecommunity.com (PJ Eby) Date: Mon, 20 Feb 2012 17:29:48 -0500 Subject: [Python-Dev] [Python-checkins] cpython: Issue #14043: Speed up importlib's _FileFinder by at least 8x, and add a new In-Reply-To: References: Message-ID: On Mon, Feb 20, 2012 at 1:20 PM, Brett Cannon wrote: > On Sun, Feb 19, 2012 at 22:15, Nick Coghlan wrote: > >> However, "very cool" on adding the caching in the default importers :) > > Thanks to PJE for bringing the idea up again and Antoine discovering the > approach *independently* from PJE and myself and actually writing the code. > Where is the code, btw? (I looked at your sandbox and didn't see it.) -------------- next part -------------- An HTML attachment was scrubbed... URL: From brett at python.org Tue Feb 21 00:06:00 2012 From: brett at python.org (Brett Cannon) Date: Mon, 20 Feb 2012 18:06:00 -0500 Subject: [Python-Dev] [Python-checkins] cpython: Issue #14043: Speed up importlib's _FileFinder by at least 8x, and add a new In-Reply-To: References: Message-ID: On Mon, Feb 20, 2012 at 17:29, PJ Eby wrote: > On Mon, Feb 20, 2012 at 1:20 PM, Brett Cannon wrote: > >> On Sun, Feb 19, 2012 at 22:15, Nick Coghlan wrote: >> >>> However, "very cool" on adding the caching in the default importers :) >> >> Thanks to PJE for bringing the idea up again and Antoine discovering the >> approach *independently* from PJE and myself and actually writing the code. >> > Where is the code, btw? (I looked at your sandbox and didn't see it.) > It's not in the sandbox until I do a merge; Antoine committed the code to default so it's already in Python 3.3. -------------- next part -------------- An HTML attachment was scrubbed...
URL: From solipsis at pitrou.net Tue Feb 21 00:13:24 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 21 Feb 2012 00:13:24 +0100 Subject: [Python-Dev] [Python-checkins] cpython: Issue #13641: Decoding functions in the base64 module now accept ASCII-only References: <4F42BBA1.6040507@udel.edu> Message-ID: <20120221001324.2935cab7@pitrou.net> On Mon, 20 Feb 2012 16:31:13 -0500 Terry Reedy wrote: > > I am a little puzzled why a constant sequence of pairs is being stored > as a mapping instead of a tuple (or list) of 2-tuples (which is compiled > more efficiently). As near as I can tell, 'tests' and similar constructs > later in the file are never used as mappings. Am I missing something or > is this just the way Catalin wrote it? This is just the way Catalin wrote it. If the style bothers you, you can always change it. I don't think it makes much of a difference either way. (but if you really care about compilation efficiency in tests you are probably the victim of premature optimization) Regards Antoine. From skippy.hammond at gmail.com Tue Feb 21 00:48:01 2012 From: skippy.hammond at gmail.com (Mark Hammond) Date: Tue, 21 Feb 2012 10:48:01 +1100 Subject: [Python-Dev] Status of PEP 397 - Python launcher for Windows In-Reply-To: References: <4F3F35FF.1010506@gmail.com> Message-ID: <4F42DBB1.6030309@gmail.com> On 21/02/2012 2:54 AM, Mark Lawrence wrote: > On 18/02/2012 05:24, Mark Hammond wrote: ... >> * Write some user-oriented docs. > > The section in the docs "Using Python on Windows" would need to be > updated, but would this have to happen for every current version of Python? I'm not sure what docs you are referring to here? > The docs here > https://bitbucket.org/vinay.sajip/pylauncher/src/tip/Doc/launcher.rst > are in my view possibly overkill, what do the rest of you think? Even though I had no input into those docs, I actually think they are fairly good and can't see what should be dropped. It may make sense to split the docs so there is a separate "advanced" doc page. Further, I think there is something that could be added to those docs - the use of PATHEXT and the fact that once the shebang line is in place, a command-prompt could do just "hello.py" rather than needing "py hello.py". > The output from py --help seems fine but nothing happens when pyw --help > is entered, is this by accident or design? I guess "accident" - or more accurately, the lack of doing anything special. It could be useful to have that display a message box with the usage - while that would break "pyw --help > out.txt", I doubt that really is useful for anyone. Alternatively, instead of trying to display all the usage in "pyw --help", it could display a short message indicating what the program is for and refer to "py.exe --help" for more information. Possibly a plain "pyw" (with no args) could do the same thing - nothing useful happens in that case either. >> Thoughts or comments? >> >> Mark > > A cracking bit of kit :) Thanks! Vinay's implementation is great, I agree. Thanks, Mark
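The PATHEXT point in practice: with the launcher handling shebang lines and ".PY" added to PATHEXT, a script such as

#!/usr/bin/env python3
# hello.py -- the launcher reads the shebang and picks a Python 3 interpreter
print("hello from the interpreter the shebang selected")

can be run from a command prompt as just "hello", with no explicit "py hello.py" (a sketch; the exact shebang forms the launcher accepts are the ones listed in the draft docs).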
From breamoreboy at yahoo.co.uk Tue Feb 21 01:01:56 2012 From: breamoreboy at yahoo.co.uk (Mark Lawrence) Date: Tue, 21 Feb 2012 00:01:56 +0000 Subject: [Python-Dev] Status of PEP 397 - Python launcher for Windows In-Reply-To: <4F42DBB1.6030309@gmail.com> References: <4F3F35FF.1010506@gmail.com> <4F42DBB1.6030309@gmail.com> Message-ID: On 20/02/2012 23:48, Mark Hammond wrote: > On 21/02/2012 2:54 AM, Mark Lawrence wrote: >> The section in the docs "Using Python on Windows" would need to be >> updated, but would this have to happen for every current version of >> Python? > > I'm not sure what docs you are referring to here? > See http://docs.python.org/using/windows.html > Mark > -- Cheers. Mark Lawrence. From steve at pearwood.info Tue Feb 21 01:26:04 2012 From: steve at pearwood.info (Steven D'Aprano) Date: Tue, 21 Feb 2012 11:26:04 +1100 Subject: [Python-Dev] PEP czar for PEP 3144? In-Reply-To: References: <20120220145505.238e6adb@pitrou.net> Message-ID: <4F42E49C.7080108@pearwood.info> Nick Coghlan wrote: > On Mon, Feb 20, 2012 at 11:55 PM, Antoine Pitrou wrote: >> On Mon, 20 Feb 2012 23:23:13 +1000 >> Nick Coghlan wrote: >>> Does anyone object to me naming myself PEP czar for PEP 3144? >> “Tsar is a title used to designate certain European Slavic monarchs or >> supreme rulers.” >> >> Is this our official word? > > PEP czar/tsar and BDFOP (Benevolent Dictator for One PEP) are the two > names I've seen for the role. I don't have a strong preference either > way (just a mild preference for 'czar'). Also, "Czar" is commonly used in US politics as an informal term for the top official responsible for an area. "Drug Czar" is only the most familiar: http://en.wikipedia.org/wiki/List_of_U.S._executive_branch_%27czars%27 -- Steven From stephen at xemacs.org Tue Feb 21 01:53:47 2012 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Tue, 21 Feb 2012 09:53:47 +0900 Subject: [Python-Dev] PEP czar for PEP 3144? In-Reply-To: <4F42E49C.7080108@pearwood.info> References: <20120220145505.238e6adb@pitrou.net> <4F42E49C.7080108@pearwood.info> Message-ID: <87sji5m2g4.fsf@uwakimon.sk.tsukuba.ac.jp> Steven D'Aprano writes: > Also, "Czar" is commonly used in US politics as an informal term for the top > official responsible for an area. I think here the most important connotation is that in US parlance a "czar" does not report to a committee, and with the exception of a case where Sybil is appointed czar, cannot bikeshed. Decisions get made (what a concept!) From rdmurray at bitdance.com Tue Feb 21 02:24:16 2012 From: rdmurray at bitdance.com (R. David Murray) Date: Mon, 20 Feb 2012 20:24:16 -0500 Subject: [Python-Dev] accept string in a2b and base64? Message-ID: <20120221012417.202D02500E7@webabinitio.net> Two patches have been committed to 3.3 that I am very uncomfortable with. See issue 13637 and issue 13641, respectively. It seems to me that part of the point of the byte/string split (and the lack of automatic coercion) is to make the programmer be explicit about converting between unicode and bytes. Having these functions, which convert between binary formats (ASCII-only representations of binary data and back) accept unicode strings is reintroducing automatic coercions, and I think it will lead to the same kind of bugs that automatic string coercions yielded in Python2: a program works fine until the input turns out to have non-ASCII data in it, and then it blows up with an unexpected UnicodeError. You can see Antoine's counter arguments in the issue, and I'm sure he'll chime in here.
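For concreteness, what the committed patches allow looks roughly like this (the exception type and message are my reading of the patch, not gospel):

>>> import base64
>>> base64.b64decode(b"d3d3LnB5dGhvbi5vcmc=")
b'www.python.org'
>>> base64.b64decode("d3d3LnB5dGhvbi5vcmc=")    # ASCII-only str now accepted
b'www.python.org'
>>> base64.b64decode("d3d3LnB5dGhvbi5vcmc=\xff")
Traceback (most recent call last):
  ...
ValueError: string argument should contain only ASCII characters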
If most people agree with Antoine I won't fight it, but it seems to me that accepting unicode in the binascii and base64 APIs is a bad idea. I'm on vacation this week so I may not be very responsive on this thread, but unless other people agree with me (and will therefore advance the relevant arguments) the thread can die and the patches can stay in. --David From eliben at gmail.com Tue Feb 21 02:39:16 2012 From: eliben at gmail.com (Eli Bendersky) Date: Tue, 21 Feb 2012 03:39:16 +0200 Subject: [Python-Dev] folding cElementTree behind ElementTree in 3.3 In-Reply-To: References: <4F34E554.7090600@v.loewis.de> <4F3CC8C3.8070103@v.loewis.de> <4F4181E1.9040909@v.loewis.de> <20120220165511.Horde.c-6iIaGZi1VPQmzfA-JBT4A@webmail.df.eu> Message-ID: On Tue, Feb 21, 2012 at 00:51, Nick Coghlan wrote: > On Tue, Feb 21, 2012 at 1:55 AM, wrote: > >> Basically, if something is just documented as being callable without > >> subclassing or instance checks being mentioned as supported in the > >> docs, it can be implemented as either a type or an ordinary function, > >> or pretty much any other kind of callable without being deemed an API > >> change > > > > > > So what would be your evaluation of > > > > > http://docs.python.org/library/xml.etree.elementtree.html#xml.etree.ElementTree.Element > > > > in that respect? > > Completely different from the functools.partial case - with that, the > docs are very careful to *never* call functools.partial a class > (instead saying "returns a callable object"). > > The ElementTree docs unambiguously call Element a class (several > times), so a conforming implementation must provide it as a class > (i.e. supporting use in isinstance() checks. inheritance, etc) rather > than as just a callable. A factory function is not a backwards > compatible replacement (sorry Eli - given those docs, I'm definitely > with Martin on this one). > No need to be sorry :-) I don't think my view differs from Martin's here, by the way. My point is just that this isn't a regression, since "use cElementTree" is ubiquitous advice, and the C implementation has Element as a factory function and not a class, so the documentation wasn't correct to begin with. So the documentation isn't correct for previous versions any way you look at it. There's a conflict in that it says Element is a class and also that cElementTree implements the same API. So the two choices here are either change the documentation or the C implementation to actually make Element a class. The first is of course simpler. However, someone somewhere may have written code that knowingly forces the Python implementation to be used and subclasses Element. Such code will break in 3.3, so it probably makes sense to invest in making Element a class in the C implementation as well. Eli -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Tue Feb 21 02:48:00 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 21 Feb 2012 11:48:00 +1000 Subject: [Python-Dev] PEP czar for PEP 3144? In-Reply-To: References: Message-ID: On Tue, Feb 21, 2012 at 7:09 AM, Guido van Rossum wrote: > Approved. Nick is PEP czar for PEP 3144. Thanks Nick! In that case the addition of the "ipaddress" module is approved for 3.3, with a provisional caveat on the API details. 
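In concrete terms, the semantic model being approved behaves like this (an untested sketch using the PEP's factory functions; under that caveat, the exact spellings are the part that may still shift):

>>> import ipaddress
>>> addr = ipaddress.ip_address('192.0.2.1')
>>> net = ipaddress.ip_network('192.0.2.0/24')
>>> iface = ipaddress.ip_interface('192.0.2.1/24')
>>> addr in net           # a Network is a container of Addresses
True
>>> iface.ip == addr      # an Interface *is* an Address...
True
>>> iface.network == net  # ...that *has* an associated Network
True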
I'm doing it that way because I think those remaining details can be better fleshed out by the integration process (in particular, populating full module API reference documentation) than they could by another round of updates on the PEP and the ipaddr 3144 branch. At the very least: - the IP Interface API needs to move to a point where it more clearly *is* an IP Address and *has* an associated IP Network (rather than being the other way around) - IP Network needs to behave more like an ordered set of sequential IP Addresses (without sometimes behaving like an Address in its own right) - iterable APIs should consistently produce iterators (leaving users free to wrap list() around the calls if they want the concrete realisation) Initial maintainers will be me (for the semantically cleaner incarnation of the module API) and Peter (for the IPv4 and IPv6 correctness heavy lifting and ensuring any API updates only change the spelling of particular operations, such as adding a ".network." to some current operations on Interface objects, rather than reducing overall module functionality). This approach means we will still gain the key benefits of using the PyPI-tested ipaddr as a base (i.e. correct IP address parsing and generation, full coverage of the same set of supported operations) while exposing a simpler semantic model for new users that first encounter these concepts through the standard library module documentation: - IP Address as the core abstraction - IP Network as a container for IP Addresses - IP Interface as an IP Address with an associated IP Network Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From barry at python.org Tue Feb 21 02:49:20 2012 From: barry at python.org (Barry Warsaw) Date: Mon, 20 Feb 2012 20:49:20 -0500 Subject: [Python-Dev] hash randomization in the 2.6 branch Message-ID: <20120220204920.62ce1501@resist.wooz.org> I've just committed a back port of issue 13703, the hash randomization patch, to the Python 2.6 branch. I have left the forward porting of this to Python 2.7 to Benjamin. test_json will fail with randomization enabled since there is a sort order dependency in the __init__.py doctest. I'm not going to fix this (it can't be fixed in the same way 2.7 can), but I'd gladly accept a patch if it's not too nasty. If not, then it doesn't bother me because we previously agreed that it is not a showstopper for the tests to pass in Python 2.6 with randomization enabled. Please however, do test the Python 2.6 branch thoroughly, both with and without randomization. We'll be coordinating release candidates between all affected branches fairly soon. Cheers, -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: not available URL: From solipsis at pitrou.net Tue Feb 21 02:47:22 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 21 Feb 2012 02:47:22 +0100 Subject: [Python-Dev] accept string in a2b and base64? References: <20120221012417.202D02500E7@webabinitio.net> Message-ID: <20120221024722.4be6bcfa@pitrou.net> On Mon, 20 Feb 2012 20:24:16 -0500 "R. David Murray" wrote: > > It seems to me that part of the point of the byte/string split (and the > lack of automatic coercion) is to make the programmer be explicit about > converting between unicode and bytes.
Having these functions, which > convert between binary formats (ASCII-only representations of binary data > and back) accept unicode strings is reintroducing automatic coercions, Whether a baseXX representation is binary or text can probably be argued endlessly. As a data point, hex() returns str, not bytes, so at least base16 can be considered (potentially) text. And the point of baseXX representations is generally to embed binary data safely into text, which explains why you may commonly need to baseXX-decode some chunk of text. This occurred to me when porting Twisted to py3k; I'm sure other networking code would also benefit. Really, I think there's no problem with coercions when they are unambiguous and safe (which they are, in the committed patches). They make writing and porting code easier. For example, we already have: >>> int("10") 10 >>> int(b"10") 10 Regards Antoine. From solipsis at pitrou.net Tue Feb 21 02:53:17 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 21 Feb 2012 02:53:17 +0100 Subject: [Python-Dev] cpython (2.6): - Issue #13703: oCERT-2011-003: add -R command-line option and PYTHONHASHSEED References: Message-ID: <20120221025317.11054eed@pitrou.net> On Tue, 21 Feb 2012 02:44:32 +0100 barry.warsaw wrote: > + This is intended to provide protection against a denial-of-service caused by > + carefully-chosen inputs that exploit the worst case performance of a dict > + insertion, O(n^2) complexity. See > + http://www.ocert.org/advisories/ocert-2011-003.html for details. The worst case performance of a dict insertion is O(n) (not counting potential resizes, whose cost is amortized by the overallocation heuristic). It's dict construction that has O(n**2) worst case complexity. > @@ -1232,9 +1233,9 @@ > flags__doc__, /* doc */ > flags_fields, /* fields */ > #ifdef RISCOS > + 17 > +#else > 16 > -#else > - 15 > #endif Changing the sequence size of sys.flags can break existing code (e.g. tuple-unpacking). Regards Antoine.
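The distinction matters for the attack: a single insertion among n colliding keys costs O(n) probes, so building a whole dict of colliders costs O(n**2). A toy demonstration:

class Collider:
    def __hash__(self):
        return 42            # force every key onto the same probe chain
    # the default identity __eq__ keeps the keys distinct

keys = [Collider() for _ in range(10000)]
d = dict.fromkeys(keys)      # each insert re-probes all earlier colliders: quadratic

Even 10000 keys are visibly slow here; carefully chosen string keys in a request dict exploit exactly this behaviour.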
From ncoghlan at gmail.com Tue Feb 21 02:59:54 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 21 Feb 2012 11:59:54 +1000 Subject: [Python-Dev] folding cElementTree behind ElementTree in 3.3 In-Reply-To: References: <4F34E554.7090600@v.loewis.de> <4F3CC8C3.8070103@v.loewis.de> <4F4181E1.9040909@v.loewis.de> <20120220165511.Horde.c-6iIaGZi1VPQmzfA-JBT4A@webmail.df.eu> Message-ID: On Tue, Feb 21, 2012 at 11:39 AM, Eli Bendersky wrote: > So the two choices here are either change the documentation or the C > implementation to actually make Element a class. The first is of course > simpler. However, someone somewhere may have written code that knowingly > forces the Python implementation to be used and subclasses Element. Such > code will break in 3.3, so it probably makes sense to invest in making > Element a class in the C implementation as well. Yeah, that's my take as well (especially since, in 3.2 and earlier, "forcing" use of the pure Python version was just a matter of importing ElementTree instead of cElementTree). While Xavier's point about lxml following cElementTree's lead and using a factory function is an interesting one, I think in this case the documented behaviour + pure Python implementation win out over the C accelerator behaviour. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From benjamin at python.org Tue Feb 21 03:04:51 2012 From: benjamin at python.org (Benjamin Peterson) Date: Mon, 20 Feb 2012 21:04:51 -0500 Subject: [Python-Dev] cpython (2.6): - Issue #13703: oCERT-2011-003: add -R command-line option and PYTHONHASHSEED In-Reply-To: <20120221025317.11054eed@pitrou.net> References: <20120221025317.11054eed@pitrou.net> Message-ID: 2012/2/20 Antoine Pitrou : > On Tue, 21 Feb 2012 02:44:32 +0100 > barry.warsaw wrote: >> +   This is intended to provide protection against a denial-of-service caused by >> +   carefully-chosen inputs that exploit the worst case performance of a dict >> +   insertion, O(n^2) complexity.  See >> +   http://www.ocert.org/advisories/ocert-2011-003.html for details. > > The worst case performance of a dict insertion is O(n) (not counting > potential resizes, whose cost is amortized by the overallocation > heuristic). It's dict construction that has O(n**2) worst case > complexity. > >> @@ -1232,9 +1233,9 @@ >>      flags__doc__,       /* doc */ >>      flags_fields,       /* fields */ >>  #ifdef RISCOS >> +    17 >> +#else >>      16 >> -#else >> -    15 >>  #endif > > Changing the sequence size of sys.flags can break existing code (e.g. > tuple-unpacking). I told George I didn't think it was a major problem. How much code have you seen trying to unpack sys.flags? (Moreover, such code would have been broken by previous minor releases.) -- Regards, Benjamin From benjamin at python.org Tue Feb 21 03:05:13 2012 From: benjamin at python.org (Benjamin Peterson) Date: Mon, 20 Feb 2012 21:05:13 -0500 Subject: [Python-Dev] cpython (2.6): - Issue #13703: oCERT-2011-003: add -R command-line option and PYTHONHASHSEED In-Reply-To: References: <20120221025317.11054eed@pitrou.net> Message-ID: 2012/2/20 Benjamin Peterson : > I told George Sorry, Georg! -- Regards, Benjamin From ncoghlan at gmail.com Tue Feb 21 03:51:08 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 21 Feb 2012 12:51:08 +1000 Subject: [Python-Dev] accept string in a2b and base64? In-Reply-To: <20120221012417.202D02500E7@webabinitio.net> References: <20120221012417.202D02500E7@webabinitio.net> Message-ID: On Tue, Feb 21, 2012 at 11:24 AM, R. David Murray wrote: > If most people agree with Antoine I won't fight it, but it seems to me > that accepting unicode in the binascii and base64 APIs is a bad idea. I see it as essentially the same as the changes I made in urllib.urlparse to support pure ASCII bytes->bytes in many of the APIs (which work by doing an implicit ascii+strict decode at the beginning of the function, and then reversing that at the end). For those, if your byte sequence has non-ASCII data in it, they'll throw a UnicodeDecodeError and it's up to you to figure out where those non-ASCII bytes are coming from. Similarly, if one of these updated APIs throws ValueError, then you'll have to figure out where the non-ASCII code points are coming from. Yes, it's a niggling irritation from a purist point of view, but it's also an acknowledgement of the fact that whether a pure ASCII sequence should be treated as a sequence of bytes or a sequence of code points is going to be application and context dependent. Sometimes it will make more sense to treat it as binary data, other times as text. The key point is that any multimode support that depends on implicit type conversion from bytes->str (or vice-versa) really needs to be limited to *strict* ASCII only (if no other information on the encoding is available).
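Reduced to a toy helper, the pattern is (a hypothetical decorator; urllib.parse spells this out longhand rather than as a decorator):

def ascii_polymorphic(func):
    """Let a str-based API also accept pure-ASCII bytes."""
    def wrapper(arg):
        if isinstance(arg, (bytes, bytearray)):
            # strict ASCII decode on the way in, matching encode back out
            return func(arg.decode('ascii')).encode('ascii')
        return func(arg)
    return wrapper

Any byte with the high bit set fails the strict decode immediately, with a UnicodeDecodeError at the conversion boundary, rather than being silently reinterpreted later.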
If something is 7-bit ASCII pure, then odds are very good that it really *is* ASCII text. As soon as that high-order bit gets set though, all bets are off and we have to push the text encoding problem back on the API caller to figure out. The reason Python 2's implicit str<->unicode conversions are so problematic isn't just because they're implicit: it's because they effectively assume *latin-1* as the encoding on the 8-bit str side. That means reliance on implicit decoding can silently corrupt non-ASCII data instead of triggering exceptions at the point of implicit conversion. If you're lucky, some *other* part of the application will detect the corruption and you'll have at least a vague hope of tracking it down. Otherwise, the corrupted data may escape the application and you'll have an even *thornier* debugging problem on your hands. My one concern with the base64 patch is that it doesn't test that mixing types triggers TypeError. While this shouldn't require any extra code (the error should arise naturally from the method implementation), it should still be tested explicitly to ensure type mismatches fail as expected. Checking explicitly for mismatches in the code would then just be a matter of wanting to emit nice error messages explaining the problem rather than being needed for correctness reasons (e.g. urlparse uses pre-checks in order to emit a clear error message for type mismatches, but it has significantly longer function signatures to deal with). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From guido at python.org Tue Feb 21 05:52:20 2012 From: guido at python.org (Guido van Rossum) Date: Mon, 20 Feb 2012 20:52:20 -0800 Subject: [Python-Dev] PEP czar for PEP 3144? In-Reply-To: <87sji5m2g4.fsf@uwakimon.sk.tsukuba.ac.jp> References: <20120220145505.238e6adb@pitrou.net> <4F42E49C.7080108@pearwood.info> <87sji5m2g4.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: On Mon, Feb 20, 2012 at 4:53 PM, Stephen J. Turnbull wrote: > Steven D'Aprano writes: > > > Also, "Czar" is commonly used in US politics as an informal term for the top > > official responsible for an area. > > I think here the most important connotation is that in US parlance a > "czar" does not report to a committee, and with the exception of a > case where Sybil is appointed czar, cannot bikeshed. Decisions get > made (what a concept!) I'm curious how old that usage is. I first encountered it around '88 when I interned for a summer at DEC SRC (long since subsumed into HP Labs); the person in charge of deciding a particular aspect of their software or organization was called a czar, e.g. the documentation czar. -- --Guido van Rossum (python.org/~guido) From tjreedy at udel.edu Tue Feb 21 06:07:07 2012 From: tjreedy at udel.edu (Terry Reedy) Date: Tue, 21 Feb 2012 00:07:07 -0500 Subject: [Python-Dev] PEP czar for PEP 3144? In-Reply-To: References: <20120220145505.238e6adb@pitrou.net> <4F42E49C.7080108@pearwood.info> <87sji5m2g4.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: On 2/20/2012 11:52 PM, Guido van Rossum wrote: > On Mon, Feb 20, 2012 at 4:53 PM, Stephen J. Turnbull wrote: >> Steven D'Aprano writes: >> >> > Also, "Czar" is commonly used in US politics as an informal term for the top >> > official responsible for an area. >> >> I think here the most important connotation is that in US parlance a >> "czar" does not report to a committee, and with the exception of a >> case where Sybil is appointed czar, cannot bikeshed. Decisions get >> made (what a concept!) > > I'm curious how old that usage is. I first encountered it around '88 > when I interned for a summer at DEC SRC (long since subsumed into HP > Labs); the person in charge of deciding a particular aspect of their > software or organization was called a czar, e.g. the documentation > czar.
> I'm curious how old that usage is. I first encountered it around '88
> when I interned for a summer at DEC SRC (long since subsumed into HP
> Labs); the person in charge of deciding a particular aspect of their
> software or organization was called a czar, e.g. the documentation
> czar.

In US politics, the first I remember was the Drug Czar about that time. It really came into currency during Clinton's admin.

-- Terry Jan Reedy

From greg at krypto.org Tue Feb 21 07:30:10 2012
From: greg at krypto.org (Gregory P. Smith) Date: Mon, 20 Feb 2012 22:30:10 -0800 Subject: [Python-Dev] [Python-checkins] cpython (merge 2.6 -> 2.7): merge 2.6 with hash randomization fix In-Reply-To: References: Message-ID:

On Mon, Feb 20, 2012 at 10:19 PM, Gregory P. Smith wrote:
> Look at PCbuild/pythoncore.vcproj within this commit; it looks like
> you committed (or merged) a merge conflict marker in the file.
>
> -gps
>
> On Mon, Feb 20, 2012 at 6:49 PM, benjamin.peterson
> wrote:
>> http://hg.python.org/cpython/rev/a0f43f4481e0
>> changeset:   75102:a0f43f4481e0
>> branch:      2.7

Never mind, I see the follow-up fixing commits now. :)

From g.brandl at gmx.net Tue Feb 21 08:06:29 2012
From: g.brandl at gmx.net (Georg Brandl) Date: Tue, 21 Feb 2012 08:06:29 +0100 Subject: [Python-Dev] cpython (2.7): use set In-Reply-To: References: Message-ID:

On 21.02.2012 05:13, benjamin.peterson wrote:
> http://hg.python.org/cpython/rev/98732d20b6d1
> changeset: 75112:98732d20b6d1
> branch: 2.7
> user: Benjamin Peterson
> date: Mon Feb 20 23:11:19 2012 -0500
> summary:
> use set
>
> files:
> Lib/re.py | 5 +----
> 1 files changed, 1 insertions(+), 4 deletions(-)
>
>
> diff --git a/Lib/re.py b/Lib/re.py
> --- a/Lib/re.py
> +++ b/Lib/re.py
> @@ -198,10 +198,7 @@
> "Compile a template pattern, returning a pattern object"
> return _compile(pattern, flags|T)
>
> -_alphanum = {}
> -for c in 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ01234567890':
> - _alphanum[c] = 1
> -del c
> +_alphanum = set('abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ01234567890')

Seems the "0" appears twice in that set. ;-)

Georg

From merwok at netwok.org Tue Feb 21 08:39:53 2012
From: merwok at netwok.org (Éric Araujo) Date: Tue, 21 Feb 2012 08:39:53 +0100 Subject: [Python-Dev] cpython (2.6): - Issue #13703: oCERT-2011-003: add -R command-line option and PYTHONHASHSEED In-Reply-To: References: <20120221025317.11054eed@pitrou.net> Message-ID: <4F434A49.2040409@netwok.org>

On 21/02/2012 03:04, Benjamin Peterson wrote:
> 2012/2/20 Antoine Pitrou :
>> Changing the sequence size of sys.flags can break existing code (e.g.
>> tuple-unpacking).
>
> I told George I didn't think it was a major problem. How much code
> have you seen trying to unpack sys.flags? (Moreover, such code would
> have been broken by previous minor releases.)

If by "minor" you mean the Y in Python X.Y.Z, then I think the precedent does not apply here: people expect to have to check their code when going from X.Y to X.Y+1, but not when they update X.Y.Z to X.Y.Z+1. But I agree this is rather theoretical, as I don't see why anyone would iterate over sys.flags. The important point IMO is having clear policies for us and our users and sticking with them; here the decision was that adding a new flag in a bugfix release was needed, so it's fine.
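The failure mode is easy to demonstrate, by the way - a sketch, with the 2.6 field names written from memory (they may be slightly off):

    import sys
    # Hypothetical user code written against a 15-element sys.flags:
    (debug, py3k_warning, division_warning, division_new, inspect,
     interactive, optimize, dont_write_bytecode, no_user_site, no_site,
     ignore_environment, tabcheck, verbose, unicode, bytes_warning) = sys.flags
    # Once a bugfix release appends hash_randomization, this line dies
    # with "ValueError: too many values to unpack".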
Regards

From fperez.net at gmail.com Tue Feb 21 08:44:41 2012
From: fperez.net at gmail.com (Fernando Perez) Date: Tue, 21 Feb 2012 07:44:41 +0000 (UTC) Subject: [Python-Dev] A panel with Guido/python-dev on scientific uses and Python 3 at Google HQ, March 2nd Message-ID:

Hi all,

I wanted to point out to you folks, and invite any of you who could make it in person, to a panel discussion we'll be having on Friday March 2nd, at 3pm, during the PyData workshop that will take place at Google's headquarters in Mountain View:

http://pydataworkshop.eventbrite.com

The PyData workshop is organized by several developers coming from the numerical/scientific side of the Python world, and we thought this would be a good opportunity, both timing- and logistics-wise, for a discussion with as many Python developers as possible. The upcoming Python 3.3 release, the lifting of the language moratorium, the gradual (but slow) uptake of Python 3 in science, the continued and increasing growth of Python as a tool in scientific research and education, etc, are all good reasons for thinking this could be a productive discussion.

This is the thread on the Numpy mailing list where we've had some back-and-forth about ideas:

http://mail.scipy.org/pipermail/numpy-discussion/2012-February/060437.html

Guido has already agreed to participate, and a number of developers for 'core' scientific Python projects will be present at the panel, including:

- Travis Oliphant, Peter Wang, Mark Wiebe, Stefan van der Walt (Numpy, Scipy)
- John Hunter (Matplotlib)
- Fernando Perez, Brian Granger, Min Ragan-Kelley (IPython)
- Dag Sverre Seljebotn (Numpy, Cython)

It would be great if as many core Python developers for whom a Bay Area Friday afternoon drive to Mountain View is feasible could attend. Those of you already at Google will hopefully all make it, of course :)

We hope this discussion will be a good start for interesting developments that require dialog between the 'science crowd' and python-dev. Several of us will also be available at PyCon 2012, so if there's interest we can organize an informal follow-up/BoF on this topic the next week at PyCon.

Please forward this information to anyone you think might be interested (I'll be posting in a second to the Bay Piggies list). If you are not a Googler nor already registered for PyData, but would like to attend, please let me know by emailing me at: fernando.perez at berkeley.edu

We have room for a few extra people (in addition to PyData attendees) for this particular meeting, and we'll do our best to accommodate you. Please let me know if you're a core python committer in your message.

I'd like to thank Google for their hospitality in hosting us for PyData, and Guido for his willingness to take part in this discussion. I hope it will be a productive one for all involved.

Best, f

From mal at egenix.com Tue Feb 21 10:36:28 2012
From: mal at egenix.com (M.-A. Lemburg) Date: Tue, 21 Feb 2012 10:36:28 +0100 Subject: [Python-Dev] accept string in a2b and base64? In-Reply-To: References: <20120221012417.202D02500E7@webabinitio.net> Message-ID: <4F43659C.3010301@egenix.com>

Nick Coghlan wrote:
> The reason Python 2's implicit str<->unicode conversions are so
> problematic isn't just because they're implicit: it's because they
> effectively assume *latin-1* as the encoding on the 8-bit str side.

The implicit conversion in Python2 only works with ASCII content, pretty much like what you describe here. Note that e.g.
UTF-16 is not an ASCII superset, but the ASCII assumption still works:

>>> u'abc'.encode('utf-16-le').decode('ascii')
u'a\x00b\x00c\x00'

Apart from that nit (which can be resolved in most cases by disallowing 0 bytes), I still believe that the Python2 implicit conversion between Unicode and 8-bit strings is a very useful feature in practice.

-- Marc-Andre Lemburg eGenix.com

Professional Python Services directly from the Source (#1, Feb 21 2012)
>>> Python/Zope Consulting and Support ... http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/
________________________________________________________________________
2012-02-13: Released eGenix pyOpenSSL 0.13 http://egenix.com/go26
2012-02-09: Released mxODBC.Zope.DA 2.0.2 http://egenix.com/go25
2012-02-06: Released eGenix mx Base 3.2.3 http://egenix.com/go24

::: Try our new mxODBC.Connect Python Database Interface for free ! ::::

eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/

From robert.kern at gmail.com Tue Feb 21 11:38:53 2012
From: robert.kern at gmail.com (Robert Kern) Date: Tue, 21 Feb 2012 10:38:53 +0000 Subject: [Python-Dev] PEP czar for PEP 3144? In-Reply-To: References: <20120220145505.238e6adb@pitrou.net> <4F42E49C.7080108@pearwood.info> <87sji5m2g4.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID:

On 2/21/12 4:52 AM, Guido van Rossum wrote:
> On Mon, Feb 20, 2012 at 4:53 PM, Stephen J. Turnbull wrote:
>> Steven D'Aprano writes:
>>
>> > Also, "Czar" is commonly used in US politics as an informal term for the top
>> > official responsible for an area.
>>
>> I think here the most important connotation is that in US parlance a
>> "czar" does not report to a committee, and with the exception of a
>> case where Sybil is appointed czar, cannot bikeshed. Decisions get
>> made (what a concept!)
>
> I'm curious how old that usage is. I first encountered it around '88
> when I interned for a summer at DEC SRC (long since subsumed into HP
> Labs); the person in charge of deciding a particular aspect of their
> software or organization was called a czar, e.g. the documentation
> czar.

From the Wikipedia article Steven cited:

"""
The earliest known use of the term for a U.S. government official was in the administration of Franklin Roosevelt (1933-1945), during which eleven unique positions (or twelve if one were to count "Economic Czar" and "Economic Czar of World War II" as separate) were so described. The term was revived, mostly by the press, to describe officials in the Nixon and Ford administrations and continues today.
"""

http://en.wikipedia.org/wiki/List_of_U.S._executive_branch_%27czars%27

-- Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth."
-- Umberto Eco

From eliben at gmail.com Tue Feb 21 11:41:17 2012
From: eliben at gmail.com (Eli Bendersky) Date: Tue, 21 Feb 2012 12:41:17 +0200 Subject: [Python-Dev] folding cElementTree behind ElementTree in 3.3 In-Reply-To: References: <4F34E554.7090600@v.loewis.de> <4F3CC8C3.8070103@v.loewis.de> <4F4181E1.9040909@v.loewis.de> <20120220165511.Horde.c-6iIaGZi1VPQmzfA-JBT4A@webmail.df.eu> Message-ID:

On Tue, Feb 21, 2012 at 03:59, Nick Coghlan wrote:
> On Tue, Feb 21, 2012 at 11:39 AM, Eli Bendersky wrote:
> > So the two choices here are either change the documentation or the C
> > implementation to actually make Element a class. The first is of course
> > simpler. However, someone somewhere may have written code that knowingly
> > forces the Python implementation to be used and subclasses Element. Such
> > code will break in 3.3, so it probably makes sense to invest in making
> > Element a class in the C implementation as well.
>
> Yeah, that's my take as well (especially since, in 3.2 and earlier,
> "forcing" use of the pure Python version was just a matter of
> importing ElementTree instead of cElementTree).

I can't fathom why someone would do it though, since bar tiny differences (like this one) cET is just a faster ET and it's available practically everywhere with CPython. I mean, is it really important to be able to subclass ET.Element? What goal does it serve?

Eli

From solipsis at pitrou.net Tue Feb 21 13:28:07 2012
From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 21 Feb 2012 13:28:07 +0100 Subject: [Python-Dev] accept string in a2b and base64? References: <20120221012417.202D02500E7@webabinitio.net> Message-ID: <20120221132807.7b40efc5@pitrou.net>

On Tue, 21 Feb 2012 12:51:08 +1000 Nick Coghlan wrote:
>
> My one concern with the base64 patch is that it doesn't test that
> mixing types triggers TypeError. While this shouldn't require any
> extra code (the error should arise naturally from the method
> implementation), it should still be tested explicitly to ensure type
> mismatches fail as expected.

I don't think mixing types is a concern. The extra parameters to the base64 functions aren't mixed into the original string, they are used to modify the decoding algorithm. So it's like typing `open(b"LICENSE", "r")`: the fact that `b"LICENSE"` is bytes while `"r"` is str isn't really a problem.

Regards Antoine.

From ncoghlan at gmail.com Tue Feb 21 13:49:09 2012
From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 21 Feb 2012 22:49:09 +1000 Subject: [Python-Dev] accept string in a2b and base64? In-Reply-To: <20120221132807.7b40efc5@pitrou.net> References: <20120221012417.202D02500E7@webabinitio.net> <20120221132807.7b40efc5@pitrou.net> Message-ID:

On Tue, Feb 21, 2012 at 10:28 PM, Antoine Pitrou wrote:
> So it's like typing `open(b"LICENSE", "r")`: the fact that `b"LICENSE"`
> is bytes while `"r"` is str isn't really a problem.

Ah, right - I misunderstood how the different arguments were being used.

Cheers, Nick.

-- Nick Coghlan | ncoghlan at gmail.com |
Brisbane, Australia

From vinay_sajip at yahoo.co.uk Tue Feb 21 16:50:47 2012
From: vinay_sajip at yahoo.co.uk (Vinay Sajip) Date: Tue, 21 Feb 2012 15:50:47 +0000 (UTC) Subject: [Python-Dev] Status of PEP 397 - Python launcher for Windows References: <4F3F35FF.1010506@gmail.com> <4F42DBB1.6030309@gmail.com> Message-ID:

Mark Hammond writes:

> I think there is something that could be added to those docs - the use of
> PATHEXT and the fact that once the shebang line is in place, a
> command-prompt could do just "hello.py" rather than needing "py hello.py".

Or even just "hello" should work.

Regards, Vinay Sajip

From solipsis at pitrou.net Tue Feb 21 19:19:28 2012
From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 21 Feb 2012 19:19:28 +0100 Subject: [Python-Dev] hash randomization in 3.3 Message-ID: <20120221191928.406b8dcc@pitrou.net>

Hello,

Shouldn't it be enabled by default in 3.3? It's currently disabled.

$ ./python -c "print(hash('aa'))"
12416074593111936
[44297 refs]
$ ./python -c "print(hash('aa'))"
12416074593111936
[44297 refs]

Thanks Antoine.

From guido at python.org Tue Feb 21 19:35:42 2012
From: guido at python.org (Guido van Rossum) Date: Tue, 21 Feb 2012 10:35:42 -0800 Subject: [Python-Dev] What do you want me to discuss at PyCon's keynote? Message-ID:

I'm starting to think about my annual PyCon keynote. I don't want it to be just a feel-good motivational speech (I'm no good at those), nor a dry "state of the Python union" talk (I'm bored with those), but I'd like to hear what Python users care about. I've created a Google+ post for feedback:

https://plus.google.com/u/0/115212051037621986145/posts/P8XZ5Zxvpxk

Post your questions as comments there, or vote on others' questions using your +1 button. If you don't have a G+ account, you can mail questions to guido at python.org -- I will then copy them here anonymously for voting, unless you ask me not to, or ask me to add your name.

-- --Guido van Rossum (python.org/~guido)

From fperez.net at gmail.com Tue Feb 21 19:36:11 2012
From: fperez.net at gmail.com (Fernando Perez) Date: Tue, 21 Feb 2012 18:36:11 +0000 (UTC) Subject: [Python-Dev] A panel with Guido/python-dev on scientific uses and Python 3 at Google HQ, March 2nd References: Message-ID:

On Tue, 21 Feb 2012 07:44:41 +0000, Fernando Perez wrote:

> I wanted to point out to you folks, and invite any of you who could make
> it in person, to a panel discussion we'll be having on Friday March 2nd,
> at 3pm, during the PyData workshop that will take place at Google's
> headquarters in Mountain View:
>
> http://pydataworkshop.eventbrite.com

As luck would have it, it seems that *today* eventbrite revamped their URL handling and the URL I gave yesterday no longer works; it's now:

http://pydataworkshop-esearch.eventbrite.com/?srnk=1

Sorry for the hassle, folks. Ah, Murphy's law, web edition...

Cheers, f

From benjamin at python.org Tue Feb 21 20:58:41 2012
From: benjamin at python.org (Benjamin Peterson) Date: Tue, 21 Feb 2012 14:58:41 -0500 Subject: [Python-Dev] hash randomization in 3.3 In-Reply-To: <20120221191928.406b8dcc@pitrou.net> References: <20120221191928.406b8dcc@pitrou.net> Message-ID:

2012/2/21 Antoine Pitrou :
>
> Hello,
>
> Shouldn't it be enabled by default in 3.3?

Should you be able to disable it?
-- Regards, Benjamin

From solipsis at pitrou.net Tue Feb 21 20:59:52 2012
From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 21 Feb 2012 20:59:52 +0100 Subject: [Python-Dev] hash randomization in 3.3 In-Reply-To: References: <20120221191928.406b8dcc@pitrou.net> Message-ID: <20120221205952.3694f2a2@pitrou.net>

On Tue, 21 Feb 2012 14:58:41 -0500 Benjamin Peterson wrote:
> 2012/2/21 Antoine Pitrou :
> >
> > Hello,
> >
> > Shouldn't it be enabled by default in 3.3?
>
> Should you be able to disable it?

PYTHONHASHSEED=0 should disable it. Do we also need a command-line option?

Regards Antoine.

From benjamin at python.org Tue Feb 21 21:04:43 2012
From: benjamin at python.org (Benjamin Peterson) Date: Tue, 21 Feb 2012 15:04:43 -0500 Subject: [Python-Dev] hash randomization in 3.3 In-Reply-To: <20120221205952.3694f2a2@pitrou.net> References: <20120221191928.406b8dcc@pitrou.net> <20120221205952.3694f2a2@pitrou.net> Message-ID:

2012/2/21 Antoine Pitrou :
> On Tue, 21 Feb 2012 14:58:41 -0500
> Benjamin Peterson wrote:
>> 2012/2/21 Antoine Pitrou :
>> >
>> > Hello,
>> >
>> > Shouldn't it be enabled by default in 3.3?
>>
>> Should you be able to disable it?
>
> PYTHONHASHSEED=0 should disable it.  Do we also need a command-line
> option?

I don't think so. I was just wondering if we should force people to use it.

-- Regards, Benjamin

From barry at python.org Tue Feb 21 21:05:33 2012
From: barry at python.org (Barry Warsaw) Date: Tue, 21 Feb 2012 15:05:33 -0500 Subject: [Python-Dev] hash randomization in 3.3 In-Reply-To: References: <20120221191928.406b8dcc@pitrou.net> Message-ID: <20120221150533.4d968c83@resist.wooz.org>

On Feb 21, 2012, at 02:58 PM, Benjamin Peterson wrote:

>2012/2/21 Antoine Pitrou :
>>
>> Hello,
>>
>> Shouldn't it be enabled by default in 3.3?

Yes.

>Should you be able to disable it?

No, but you should be able to provide a seed.

-Barry

From v+python at g.nevcal.com Tue Feb 21 21:08:26 2012
From: v+python at g.nevcal.com (Glenn Linderman) Date: Tue, 21 Feb 2012 12:08:26 -0800 Subject: [Python-Dev] hash randomization in 3.3 In-Reply-To: References: <20120221191928.406b8dcc@pitrou.net> Message-ID: <4F43F9BA.2080500@g.nevcal.com>

On 2/21/2012 11:58 AM, Benjamin Peterson wrote:
> 2012/2/21 Antoine Pitrou:
>> Hello,
>>
>> Shouldn't it be enabled by default in 3.3?
> Should you be able to disable it?

Yes, absolutely.

From brett at python.org Tue Feb 21 21:24:59 2012
From: brett at python.org (Brett Cannon) Date: Tue, 21 Feb 2012 15:24:59 -0500 Subject: [Python-Dev] hash randomization in 3.3 In-Reply-To: <20120221150533.4d968c83@resist.wooz.org> References: <20120221191928.406b8dcc@pitrou.net> <20120221150533.4d968c83@resist.wooz.org> Message-ID:

On Tue, Feb 21, 2012 at 15:05, Barry Warsaw wrote:

> On Feb 21, 2012, at 02:58 PM, Benjamin Peterson wrote:
>
> >2012/2/21 Antoine Pitrou :
> >>
> >> Hello,
> >>
> >> Shouldn't it be enabled by default in 3.3?
>
> Yes.
>
> >Should you be able to disable it?
>
> No, but you should be able to provide a seed.

I think that's inviting trouble if you can provide the seed. It leads to a false sense of security in that providing some seed secures them instead of just making it a tad harder for the attacker. And it won't help with keeping compatibility with Python 2.7 installations that don't have randomization turned on by default. If we are going to allow people to turn this off then it should be basically the inverse of the default under Python 2.7 and no more.
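For reference, the knob being debated behaves roughly like this (a sketch assuming the 2.6.8-style PYTHONHASHSEED semantics carry over to 3.3; the value shown for PYTHONHASHSEED=0 is the one from Antoine's earlier mail, the others vary):

    $ PYTHONHASHSEED=random ./python -c "print(hash('aa'))"
    (a different value on each invocation)
    $ PYTHONHASHSEED=0 ./python -c "print(hash('aa'))"
    12416074593111936
    $ PYTHONHASHSEED=12345 ./python -c "print(hash('aa'))"
    (some fixed value, the same on every run with that seed)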
From sumerc at gmail.com Tue Feb 21 22:00:56 2012
From: sumerc at gmail.com (Sümer Cip) Date: Tue, 21 Feb 2012 23:00:56 +0200 Subject: [Python-Dev] CPU vs Wall time Profiling Message-ID:

Hi all,

Is there a reason behind the fact that the Python profilers work with wall time by default? There are OS-dependent ways to get the CPU time of a thread, and giving that choice to the user _somehow_ (to use wall vs CPU time) might be a good feature?

-- Sumer Cip

From benjamin at python.org Tue Feb 21 22:08:32 2012
From: benjamin at python.org (Benjamin Peterson) Date: Tue, 21 Feb 2012 16:08:32 -0500 Subject: [Python-Dev] hash randomization in 3.3 In-Reply-To: <20120221191928.406b8dcc@pitrou.net> References: <20120221191928.406b8dcc@pitrou.net> Message-ID:

2012/2/21 Antoine Pitrou :
>
> Hello,
>
> Shouldn't it be enabled by default in 3.3?

I've now enabled it by default in 3.3.

-- Regards, Benjamin

From python-dev at masklinn.net Tue Feb 21 21:58:18 2012
From: python-dev at masklinn.net (Xavier Morel) Date: Tue, 21 Feb 2012 21:58:18 +0100 Subject: [Python-Dev] hash randomization in 3.3 In-Reply-To: References: <20120221191928.406b8dcc@pitrou.net> <20120221150533.4d968c83@resist.wooz.org> Message-ID: <542685C8-C032-4F6E-B011-ABAD0CAC0D2E@masklinn.net>

On 2012-02-21, at 21:24 , Brett Cannon wrote:
> On Tue, Feb 21, 2012 at 15:05, Barry Warsaw wrote:
>
>> On Feb 21, 2012, at 02:58 PM, Benjamin Peterson wrote:
>>
>>> 2012/2/21 Antoine Pitrou :
>>>>
>>>> Hello,
>>>>
>>>> Shouldn't it be enabled by default in 3.3?
>>
>> Yes.
>>
>>> Should you be able to disable it?
>>
>> No, but you should be able to provide a seed.
>
> I think that's inviting trouble if you can provide the seed. It leads to a
> false sense of security in that providing some seed secures them instead of
> just making it a tad harder for the attacker.

I might have misunderstood something, but wouldn't providing a seed always make it *easier* for the attacker, compared to a randomized hash?

From fijall at gmail.com Tue Feb 21 22:21:43 2012
From: fijall at gmail.com (Maciej Fijalkowski) Date: Tue, 21 Feb 2012 14:21:43 -0700 Subject: [Python-Dev] CPU vs Wall time Profiling In-Reply-To: References: Message-ID:

On Tue, Feb 21, 2012 at 2:00 PM, Sümer Cip wrote:
> Hi all,
>
> Is there a reason behind the fact that the Python profilers work with wall
> time by default? There are OS-dependent ways to get the CPU time of a
> thread, and giving that choice to the user _somehow_ (to use wall vs CPU
> time) might be a good feature?

What would you use on Linux, for example?

Cheers, fijal

From brett at python.org Tue Feb 21 22:31:17 2012
From: brett at python.org (Brett Cannon) Date: Tue, 21 Feb 2012 16:31:17 -0500 Subject: [Python-Dev] hash randomization in 3.3 In-Reply-To: <542685C8-C032-4F6E-B011-ABAD0CAC0D2E@masklinn.net> References: <20120221191928.406b8dcc@pitrou.net> <20120221150533.4d968c83@resist.wooz.org> <542685C8-C032-4F6E-B011-ABAD0CAC0D2E@masklinn.net> Message-ID:

On Tue, Feb 21, 2012 at 15:58, Xavier Morel wrote:
> On 2012-02-21, at 21:24 , Brett Cannon wrote:
> > On Tue, Feb 21, 2012 at 15:05, Barry Warsaw wrote:
> >
> >> On Feb 21, 2012, at 02:58 PM, Benjamin Peterson wrote:
> >>
> >>> 2012/2/21 Antoine Pitrou :
> >>>>
> >>>> Hello,
> >>>>
> >>>> Shouldn't it be enabled by default in 3.3?
> >>
> >> Yes.
> >>
> >>> Should you be able to disable it?
> >>
> >> No, but you should be able to provide a seed.
> >
> > I think that's inviting trouble if you can provide the seed. It leads to a
> > false sense of security in that providing some seed secures them instead of
> > just making it a tad harder for the attacker.
>
> I might have misunderstood something, but wouldn't providing a seed always
> make it *easier* for the attacker, compared to a randomized hash?

Yes, that was what I was trying to convey.

From barry at python.org Tue Feb 21 22:33:10 2012
From: barry at python.org (Barry Warsaw) Date: Tue, 21 Feb 2012 16:33:10 -0500 Subject: [Python-Dev] hash randomization in 3.3 In-Reply-To: <542685C8-C032-4F6E-B011-ABAD0CAC0D2E@masklinn.net> References: <20120221191928.406b8dcc@pitrou.net> <20120221150533.4d968c83@resist.wooz.org> <542685C8-C032-4F6E-B011-ABAD0CAC0D2E@masklinn.net> Message-ID: <20120221163310.00b06bd1@resist.wooz.org>

On Feb 21, 2012, at 09:58 PM, Xavier Morel wrote:

>On 2012-02-21, at 21:24 , Brett Cannon wrote:
>> On Tue, Feb 21, 2012 at 15:05, Barry Warsaw wrote:
>>
>>> On Feb 21, 2012, at 02:58 PM, Benjamin Peterson wrote:
>>>
>>>> 2012/2/21 Antoine Pitrou :
>>>>>
>>>>> Hello,
>>>>>
>>>>> Shouldn't it be enabled by default in 3.3?
>>>
>>> Yes.
>>>
>>>> Should you be able to disable it?
>>>
>>> No, but you should be able to provide a seed.
>>
>> I think that's inviting trouble if you can provide the seed. It leads to a
>> false sense of security in that providing some seed secures them instead of
>> just making it a tad harder for the attacker.
>
>I might have misunderstood something, but wouldn't providing a seed always
>make it *easier* for the attacker, compared to a randomized hash?

I don't think so. You'd have to somehow coerce the sys.hash_seed out of the process. Not impossible perhaps, but unlikely unless the application isn't written well and leaks that information (which is not Python's fault). Plus, with randomization enabled, that won't help you much past the current invocation of Python.

-Barry

From martin at v.loewis.de Tue Feb 21 22:51:48 2012
From: martin at v.loewis.de ("Martin v. Löwis") Date: Tue, 21 Feb 2012 22:51:48 +0100 Subject: [Python-Dev] hash randomization in 3.3 In-Reply-To: <20120221205952.3694f2a2@pitrou.net> References: <20120221191928.406b8dcc@pitrou.net> <20120221205952.3694f2a2@pitrou.net> Message-ID: <4F4411F4.6020208@v.loewis.de>

On 21.02.2012 20:59, Antoine Pitrou wrote:
> On Tue, 21 Feb 2012 14:58:41 -0500
> Benjamin Peterson wrote:
>> 2012/2/21 Antoine Pitrou :
>>>
>>> Hello,
>>>
>>> Shouldn't it be enabled by default in 3.3?
>>
>> Should you be able to disable it?
>
> PYTHONHASHSEED=0 should disable it. Do we also need a command-line
> option?

On the contrary. PYTHONHASHSEED should go in 3.3, as should any facility to disable or otherwise fix the seed.

Regards, martin

From martin at v.loewis.de Tue Feb 21 22:55:02 2012
From: martin at v.loewis.de ("Martin v. Löwis") Date: Tue, 21 Feb 2012 22:55:02 +0100 Subject: [Python-Dev] hash randomization in 3.3 In-Reply-To: <20120221150533.4d968c83@resist.wooz.org> References: <20120221191928.406b8dcc@pitrou.net> <20120221150533.4d968c83@resist.wooz.org> Message-ID: <4F4412B6.70800@v.loewis.de>

>> Should you be able to disable it?
> No, but you should be able to provide a seed.

Why exactly is that? We should take an attitude that Python hash values are completely arbitrary and can change at any point without notice. The only strict requirement should be that hashing must be consistent with equality; everything else should be an implementation detail. With that attitude, supporting explicit seeds is counter-productive.

Regards, Martin

From martin at v.loewis.de Tue Feb 21 22:47:43 2012
From: martin at v.loewis.de ("Martin v. Löwis") Date: Tue, 21 Feb 2012 22:47:43 +0100 Subject: [Python-Dev] folding cElementTree behind ElementTree in 3.3 In-Reply-To: References: <4F34E554.7090600@v.loewis.de> <4F3CC8C3.8070103@v.loewis.de> <4F4181E1.9040909@v.loewis.de> <20120220165511.Horde.c-6iIaGZi1VPQmzfA-JBT4A@webmail.df.eu> Message-ID: <4F4410FF.8010208@v.loewis.de>

On 21.02.2012 11:41, Eli Bendersky wrote:
>
> On Tue, Feb 21, 2012 at 03:59, Nick Coghlan wrote:
>
>     On Tue, Feb 21, 2012 at 11:39 AM, Eli Bendersky wrote:
>     > So the two choices here are either change the documentation or the C
>     > implementation to actually make Element a class. The first is of course
>     > simpler. However, someone somewhere may have written code that knowingly
>     > forces the Python implementation to be used and subclasses Element. Such
>     > code will break in 3.3, so it probably makes sense to invest in making
>     > Element a class in the C implementation as well.
>
>     Yeah, that's my take as well (especially since, in 3.2 and earlier,
>     "forcing" use of the pure Python version was just a matter of
>     importing ElementTree instead of cElementTree).
>
> I can't fathom why someone would do it though, since bar tiny
> differences (like this one) cET is just a faster ET and it's available
> practically everywhere with CPython. I mean, is it really important to
> be able to subclass ET.Element? What goal does it serve?

Statements like this make me *extremely* worried. Please try to adopt a position of much higher caution, accepting that a change is "incompatible" if there is a remote possibility that someone might actually rely on the original behavior. Otherwise, I predict that you will get flooded with complaints that you broke ET for no good reason.

In the specific case, I tried to write a script that determines the memory usage of ET. As Element is lacking gc.get_referents support, I tried isinstance(o, Element), which failed badly. Feel free to dismiss this application as irrelevant, but I do wish that somebody was in charge of ET who was taking backwards compatibility as seriously as Fredrik Lundh.

Regards, Martin

From victor.stinner at haypocalc.com Tue Feb 21 23:00:07 2012
From: victor.stinner at haypocalc.com (Victor Stinner) Date: Tue, 21 Feb 2012 23:00:07 +0100 Subject: [Python-Dev] CPU vs Wall time Profiling In-Reply-To: References: Message-ID:

Python 3.3 has two new functions in the time module: monotonic() and wallclock().
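In the meantime, a rough way to see the wall-vs-CPU difference without the new functions (a sketch only: os.times() stands in here for the more precise clock_gettime() that a profiler would want):

    import os, time

    def cpu_seconds():
        # user + system CPU time of the current process
        user, system = os.times()[:2]
        return user + system

    wall0, cpu0 = time.time(), cpu_seconds()
    sum(i * i for i in range(10**6))   # burns CPU and wall time
    time.sleep(0.5)                    # burns wall time only
    print("wall: %.3fs  cpu: %.3fs" % (time.time() - wall0,
                                       cpu_seconds() - cpu0))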
Victor

From solipsis at pitrou.net Tue Feb 21 23:06:22 2012
From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 21 Feb 2012 23:06:22 +0100 Subject: [Python-Dev] folding cElementTree behind ElementTree in 3.3 References: <4F34E554.7090600@v.loewis.de> <4F3CC8C3.8070103@v.loewis.de> <4F4181E1.9040909@v.loewis.de> <20120220165511.Horde.c-6iIaGZi1VPQmzfA-JBT4A@webmail.df.eu> Message-ID: <20120221230622.2a696bc4@pitrou.net>

On Tue, 21 Feb 2012 12:41:17 +0200 Eli Bendersky wrote:
> On Tue, Feb 21, 2012 at 03:59, Nick Coghlan wrote:
>
> > On Tue, Feb 21, 2012 at 11:39 AM, Eli Bendersky wrote:
> > > So the two choices here are either change the documentation or the C
> > > implementation to actually make Element a class. The first is of course
> > > simpler. However, someone somewhere may have written code that knowingly
> > > forces the Python implementation to be used and subclasses Element. Such
> > > code will break in 3.3, so it probably makes sense to invest in making
> > > Element a class in the C implementation as well.
> >
> > Yeah, that's my take as well (especially since, in 3.2 and earlier,
> > "forcing" use of the pure Python version was just a matter of
> > importing ElementTree instead of cElementTree).
>
> I can't fathom why someone would do it though, since bar tiny differences
> (like this one) cET is just a faster ET and it's available practically
> everywhere with CPython. I mean, is it really important to be able to
> subclass ET.Element? What goal does it serve?

It probably wouldn't be very difficult to make element_new() the tp_new of Element_Type, and expose that type as "Element". That would settle the issue nicely and avoid compatibility concerns :)

Regards Antoine.

From solipsis at pitrou.net Tue Feb 21 23:07:10 2012
From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 21 Feb 2012 23:07:10 +0100 Subject: [Python-Dev] hash randomization in 3.3 In-Reply-To: <4F4411F4.6020208@v.loewis.de> References: <20120221191928.406b8dcc@pitrou.net> <20120221205952.3694f2a2@pitrou.net> <4F4411F4.6020208@v.loewis.de> Message-ID: <20120221230710.4a8c625d@pitrou.net>

On Tue, 21 Feb 2012 22:51:48 +0100 "Martin v. Löwis" wrote:
> On 21.02.2012 20:59, Antoine Pitrou wrote:
> > On Tue, 21 Feb 2012 14:58:41 -0500
> > Benjamin Peterson wrote:
> >> 2012/2/21 Antoine Pitrou :
> >>>
> >>> Hello,
> >>>
> >>> Shouldn't it be enabled by default in 3.3?
> >>
> >> Should you be able to disable it?
> >
> > PYTHONHASHSEED=0 should disable it. Do we also need a command-line
> > option?
>
> On the contrary. PYTHONHASHSEED should go in 3.3, as should any
> facility to disable or otherwise fix the seed.

Being able to reproduce exact output is useful to chase sporadic test failures (as with the --randseed option to regrtest).

Regards Antoine.

From ncoghlan at gmail.com Tue Feb 21 23:21:21 2012
From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 22 Feb 2012 08:21:21 +1000 Subject: [Python-Dev] folding cElementTree behind ElementTree in 3.3 In-Reply-To: <4F4410FF.8010208@v.loewis.de> References: <4F34E554.7090600@v.loewis.de> <4F3CC8C3.8070103@v.loewis.de> <4F4181E1.9040909@v.loewis.de> <20120220165511.Horde.c-6iIaGZi1VPQmzfA-JBT4A@webmail.df.eu> <4F4410FF.8010208@v.loewis.de> Message-ID:

On Wed, Feb 22, 2012 at 7:47 AM, "Martin v. Löwis" wrote:
> On 21.02.2012 11:41, Eli Bendersky wrote:
>> I can't fathom why someone would do it though, since bar tiny
>> differences (like this one) cET is just a faster ET and it's available
>> practically everywhere with CPython.
>> I mean, is it really important to
>> be able to subclass ET.Element? What goal does it serve?
>
> Statements like this make me *extremely* worried. Please try to adopt
> a position of much higher caution, accepting that a change is
> "incompatible" if there is a remote possibility that someone might
> actually rely on the original behavior. Otherwise, I predict that you
> will get flooded with complaints that you broke ET for no good reason.

Indeed. It's a *major* PITA at times (and has definitely led to some ugly workarounds), but we have to take documented API compatibility very seriously. We're often even reluctant to change long-standing *de facto* behaviour, let alone things that are written up in the docs as being explicitly supported.

In Python 3, merely saying "this class" or "this type" is as good as saying "this instance of the type metaclass" as far as API guarantees go. That's the reason for the awkward phrasing in the functools docs - specifically to *avoid* saying that functools.partial is a class, as we want to allow closure-based implementations as well.

The key thing to remember is that the web-style "eh, just change it, people can fix their code to cope" mentality is a tiny *minority* in the global software development community. There's a huge amount of Python code out there, and a lot of it is hidden behind corporate firewalls. Our attention to backward compatibility concerns is one of the reasons why Python's reach extends into so many different areas.

Cheers, Nick.

-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From amcnabb at mcnabbs.org Tue Feb 21 23:16:10 2012
From: amcnabb at mcnabbs.org (Andrew McNabb) Date: Tue, 21 Feb 2012 15:16:10 -0700 Subject: [Python-Dev] folding cElementTree behind ElementTree in 3.3 In-Reply-To: <4F4410FF.8010208@v.loewis.de> References: <4F4181E1.9040909@v.loewis.de> <20120220165511.Horde.c-6iIaGZi1VPQmzfA-JBT4A@webmail.df.eu> <4F4410FF.8010208@v.loewis.de> Message-ID: <20120221221610.GB3000@mcnabbs.org>

On Tue, Feb 21, 2012 at 10:47:43PM +0100, "Martin v. Löwis" wrote:
> > I can't fathom why someone would do it though, since bar tiny
> > differences (like this one) cET is just a faster ET and it's available
> > practically everywhere with CPython. I mean, is it really important to
> > be able to subclass ET.Element? What goal does it serve?
>
> Statements like this make me *extremely* worried. Please try to adopt
> a position of much higher caution, accepting that a change is
> "incompatible" if there is a remote possibility that someone might
> actually rely on the original behavior. Otherwise, I predict that you
> will get flooded with complaints that you broke ET for no good reason.

I'm happy to stand up as an example of someone who uses a custom Element class. My specific use case is loading the Project Gutenberg database, which is a 210MB XML file. I created a custom Element class which I use for the top-level element (a custom element_factory passed to TreeBuilder distinguishes between the top-level element and all others). The custom Element class doesn't add children, so it keeps ElementTree from storing all of the elements it's seen so far. On a system with 1 GB of RAM, there was no other way to get the file to load.

So, I would be one of those people who would flood in the complaints. :)
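Roughly, the trick looked like this - a from-memory sketch against the pure-Python ElementTree (where Element is an ordinary class), simplified; the real version also processed each record before it got dropped:

    import xml.etree.ElementTree as ET   # the pure Python implementation

    class RootElement(ET.Element):
        def append(self, element):
            # Drop children: each completed subtree becomes garbage
            # as soon as the parser is done with it.
            pass

    def element_factory(tag, attrib):
        # The first element the builder creates is the top-level one.
        if element_factory.root is None:
            element_factory.root = RootElement(tag, attrib)
            return element_factory.root
        return ET.Element(tag, attrib)
    element_factory.root = None

    builder = ET.TreeBuilder(element_factory=element_factory)
    parser = ET.XMLParser(target=builder)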
-- Andrew McNabb http://www.mcnabbs.org/andrew/ PGP Fingerprint: 8A17 B57C 6879 1863 DE55 8012 AB4D 6098 8826 6868

From ncoghlan at gmail.com Tue Feb 21 23:25:30 2012
From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 22 Feb 2012 08:25:30 +1000 Subject: [Python-Dev] hash randomization in 3.3 In-Reply-To: <20120221230710.4a8c625d@pitrou.net> References: <20120221191928.406b8dcc@pitrou.net> <20120221205952.3694f2a2@pitrou.net> <4F4411F4.6020208@v.loewis.de> <20120221230710.4a8c625d@pitrou.net> Message-ID:

On Wed, Feb 22, 2012 at 8:07 AM, Antoine Pitrou wrote:
> On Tue, 21 Feb 2012 22:51:48 +0100
> "Martin v. Löwis" wrote:
>> On the contrary. PYTHONHASHSEED should go in 3.3, as should any
>> facility to disable or otherwise fix the seed.
>
> Being able to reproduce exact output is useful to chase sporadic test
> failures (as with the --randseed option to regrtest).

I'm with Antoine here - being able to force a particular seed still matters for testing purposes. However, the documentation of the option may need to be updated for 3.3 to emphasise that it should only be used to reproduce sporadic failures. Using it to work around applications that can't cope with randomised hashes would be rather ill-advised.

Cheers, Nick.

-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From ncoghlan at gmail.com Tue Feb 21 23:45:54 2012
From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 22 Feb 2012 08:45:54 +1000 Subject: [Python-Dev] folding cElementTree behind ElementTree in 3.3 In-Reply-To: <20120221221610.GB3000@mcnabbs.org> References: <4F4181E1.9040909@v.loewis.de> <20120220165511.Horde.c-6iIaGZi1VPQmzfA-JBT4A@webmail.df.eu> <4F4410FF.8010208@v.loewis.de> <20120221221610.GB3000@mcnabbs.org> Message-ID:

On Wed, Feb 22, 2012 at 8:16 AM, Andrew McNabb wrote:
> So, I would be one of those people who would flood in the complaints. :)

As another "you don't know what you're going to break" war story: In Python 2.5, using "-m" with a package appeared to work, but actually slightly corrupted the import state (mostly in a benign way, but if it ever bit you it would lead to some very confusing behaviour). Since I'd never intended to allow that to happen (as I knew about the state corruption problem), for 2.6 I added back the "this doesn't work properly" guard that had been present in the earlier versions of 2.5, but had been lost when some duplicate code in pkgutil and runpy was merged into a single version.

Doing that broke things: http://bugs.python.org/issue4195

The basic rule is, if it's documented to work a certain way and the current implementation does work that way, then, someone, somewhere is relying on it working as documented. If it *doesn't* actually work that way (or the behaviour isn't explicitly documented at all), then we have some leeway to decide whether to bring the docs in line with the actual behaviour or vice-versa.

For the Element case though, there's no such discrepancy - the docs and implementation have been consistent for years, so we need to maintain the current behaviour if the C acceleration is going to be used implicitly.

Cheers, Nick.

-- Nick Coghlan | ncoghlan at gmail.com |
Brisbane, Australia

From ncoghlan at gmail.com Tue Feb 21 23:49:02 2012
From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 22 Feb 2012 08:49:02 +1000 Subject: [Python-Dev] [Python-checkins] cpython (2.6): ensure no one tries to hash things before the random seed is found In-Reply-To: References: Message-ID:

On Wed, Feb 22, 2012 at 2:24 AM, benjamin.peterson wrote:
> http://hg.python.org/cpython/rev/357e268e7c5f
> changeset:   75133:357e268e7c5f
> branch:      2.6
> parent:      75124:04738f35e0ec
> user:        Benjamin Peterson
> date:        Tue Feb 21 11:08:50 2012 -0500
> summary:
>  ensure no one tries to hash things before the random seed is found

Won't this trigger in the -Wd case that led to the PyStr_Fini workaround?

Cheers, Nick.

-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From ncoghlan at gmail.com Tue Feb 21 23:51:46 2012
From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 22 Feb 2012 08:51:46 +1000 Subject: [Python-Dev] [Python-checkins] cpython (2.6): ensure no one tries to hash things before the random seed is found In-Reply-To: References: Message-ID:

On Wed, Feb 22, 2012 at 8:49 AM, Nick Coghlan wrote:
> Won't this trigger in the -Wd case that led to the PyStr_Fini workaround?

Never mind, just saw the later series of checkins that fixed it.

Cheers, Nick.

-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From ncoghlan at gmail.com Tue Feb 21 23:53:39 2012
From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 22 Feb 2012 08:53:39 +1000 Subject: [Python-Dev] [Python-checkins] cpython: enable hash randomization by default In-Reply-To: References: Message-ID:

On Wed, Feb 22, 2012 at 7:08 AM, benjamin.peterson wrote:
> +      Changing hash values affects the order in which keys are retrieved from a
> +      dict.  Although Python has never made guarantees about this ordering (and
> +      it typically varies between 32-bit and 64-bit builds), enough real-world
> +      code implicitly relies on this non-guaranteed behavior that the
> +      randomization is disabled by default.

That last sentence needs to change for 3.3.

Cheers, Nick.

-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From skippy.hammond at gmail.com Tue Feb 21 23:58:14 2012
From: skippy.hammond at gmail.com (Mark Hammond) Date: Wed, 22 Feb 2012 09:58:14 +1100 Subject: [Python-Dev] Status of PEP 397 - Python launcher for Windows In-Reply-To: References: <4F3F35FF.1010506@gmail.com> <4F42DBB1.6030309@gmail.com> Message-ID: <4F442186.3050709@gmail.com>

On 22/02/2012 2:50 AM, Vinay Sajip wrote:
> Mark Hammond writes:
>
>> I think there is something that could be added to those docs - the use of
>> PATHEXT and the fact that once the shebang line is in place, a
>> command-prompt could do just "hello.py" rather than needing "py hello.py".
>
> Or even just "hello" should work.

Oops - right. IIRC, "hello.py" will work without anything special in PATHEXT and just "hello" would work with a modified PATHEXT.

Mark

From guido at python.org Wed Feb 22 00:30:28 2012
From: guido at python.org (Guido van Rossum) Date: Tue, 21 Feb 2012 15:30:28 -0800 Subject: [Python-Dev] CPU vs Wall time Profiling In-Reply-To: References: Message-ID:

On Tue, Feb 21, 2012 at 1:00 PM, Sümer Cip wrote:
> Is there a reason behind the fact that the Python profilers work with wall
> time by default? There are OS-dependent ways to get the CPU time of a
> thread, and giving that choice to the user _somehow_ (to use wall vs CPU
> time) might be a good feature?
The original reason was that the Unix wall clock was more accurate than its CPU clock. If that's changed we should probably (perhaps in a platform-dependent way) change the default to the most accurate clock available.

-- --Guido van Rossum (python.org/~guido)

From eliben at gmail.com Wed Feb 22 03:24:38 2012
From: eliben at gmail.com (Eli Bendersky) Date: Wed, 22 Feb 2012 04:24:38 +0200 Subject: [Python-Dev] folding cElementTree behind ElementTree in 3.3 In-Reply-To: <20120221221610.GB3000@mcnabbs.org> References: <4F4181E1.9040909@v.loewis.de> <20120220165511.Horde.c-6iIaGZi1VPQmzfA-JBT4A@webmail.df.eu> <4F4410FF.8010208@v.loewis.de> <20120221221610.GB3000@mcnabbs.org> Message-ID:

> I'm happy to stand up as an example of someone who uses a custom Element
> class. My specific use case is loading the Project Gutenberg database,
> which is a 210MB XML file. I created a custom Element class which I use
> for the top-level element (a custom element_factory passed to
> TreeBuilder distinguishes between the top-level element and all others).
> The custom Element class doesn't add children, so it keeps ElementTree
> from storing all of the elements it's seen so far. On a system with 1 GB
> of RAM, there was no other way to get the file to load.
>
> So, I would be one of those people who would flood in the complaints. :)
>

Andrew, could you elaborate on your use case? Are you using cElementTree to do the parsing, or ElementTree (the Python implementation)? Can you show a short code sample?

Thanks in advance, Eli

From eliben at gmail.com Wed Feb 22 03:30:17 2012
From: eliben at gmail.com (Eli Bendersky) Date: Wed, 22 Feb 2012 04:30:17 +0200 Subject: [Python-Dev] folding cElementTree behind ElementTree in 3.3 In-Reply-To: <4F4410FF.8010208@v.loewis.de> References: <4F34E554.7090600@v.loewis.de> <4F3CC8C3.8070103@v.loewis.de> <4F4181E1.9040909@v.loewis.de> <20120220165511.Horde.c-6iIaGZi1VPQmzfA-JBT4A@webmail.df.eu> <4F4410FF.8010208@v.loewis.de> Message-ID:

> > I can't fathom why someone would do it though, since bar tiny
> > differences (like this one) cET is just a faster ET and it's available
> > practically everywhere with CPython. I mean, is it really important to
> > be able to subclass ET.Element? What goal does it serve?
>
> Statements like this make me *extremely* worried. Please try to adopt
> a position of much higher caution, accepting that a change is
> "incompatible" if there is a remote possibility that someone might
> actually rely on the original behavior. Otherwise, I predict that you
> will get flooded with complaints that you broke ET for no good reason.
>

No need to be worried, Martin. If you read back in this thread you'll see that I agree that backwards compatibility should be preserved, by making the Element exposed from _elementtree also a type. I was simply trying to have a discussion to better understand the use cases and implications. I hope that's all right.

> In the specific case, I tried to write a script that determines the
> memory usage of ET. As Element is lacking gc.get_referents support,
> I tried isinstance(o, Element), which failed badly.
>

Thanks for describing the use case. By the way, when working with ET I also wanted to track the memory usage of the package a couple of times, which made me lament that there's no useful memory profiler in the stdlib.

Eli
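P.S. By "memory profiler" I mean even something as crude as this - a rough, hypothetical sketch, nothing like real stdlib code (and sys.getsizeof only counts direct object overhead, so the traversal has to be tuned per application):

    import gc, sys

    def total_size(root):
        # Rough total: sum getsizeof() over everything reachable from root,
        # visiting each object once.
        seen, stack, total = set(), [root], 0
        while stack:
            obj = stack.pop()
            if id(obj) in seen:
                continue
            seen.add(id(obj))
            total += sys.getsizeof(obj)
            stack.extend(gc.get_referents(obj))
        return total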
From brian at python.org Wed Feb 22 04:32:23 2012
From: brian at python.org (Brian Curtin) Date: Tue, 21 Feb 2012 21:32:23 -0600 Subject: [Python-Dev] Windows build - fixing compile warnings before VS2010 Message-ID:

While some effort has gone on to get the 32-bit build to compile without warnings (thanks for that!), 64-bit still has numerous warnings. Before I push forward on more of the VS2010 port, I'd like to have a clean 2008 build all around so we can more easily track what may have changed. In completing that effort, I'm using a guideline Martin set out in #9566 [0], and please let me know if there are any others to follow.

I kind of doubt anyone is against this, but if you are, please speak up before I start pushing changes.

While I have your attention, I'd like to throw two other things out there to follow up the above effort:
1. Is anyone opposed to moving up to Level 4 warnings?
...take a deep breath...
2. Is anyone opposed to enabling warnings as errors?

[0] http://bugs.python.org/issue9566#msg113574

From martin at v.loewis.de Wed Feb 22 06:20:21 2012
From: martin at v.loewis.de (martin at v.loewis.de) Date: Wed, 22 Feb 2012 06:20:21 +0100 Subject: [Python-Dev] hash randomization in 3.3 In-Reply-To: References: <20120221191928.406b8dcc@pitrou.net> <20120221205952.3694f2a2@pitrou.net> <4F4411F4.6020208@v.loewis.de> <20120221230710.4a8c625d@pitrou.net> Message-ID: <20120222062021.Horde.InHoBElCcOxPRHsVzdc1BnA@webmail.df.eu>

> I'm with Antoine here - being able to force a particular seed still
> matters for testing purposes. However, the documentation of the option
> may need to be updated for 3.3 to emphasise that it should only be
> used to reproduce sporadic failures. Using it to work around
> applications that can't cope with randomised hashes would be rather
> ill-advised.

In the tracker, someone proposed that the option is necessary to synchronize the seed across processes in a cluster. I'm sure people will use it for that if they can.

Regards, Martin

From martin at v.loewis.de Wed Feb 22 06:36:41 2012
From: martin at v.loewis.de (martin at v.loewis.de) Date: Wed, 22 Feb 2012 06:36:41 +0100 Subject: [Python-Dev] folding cElementTree behind ElementTree in 3.3 In-Reply-To: References: <4F34E554.7090600@v.loewis.de> <4F3CC8C3.8070103@v.loewis.de> <4F4181E1.9040909@v.loewis.de> <20120220165511.Horde.c-6iIaGZi1VPQmzfA-JBT4A@webmail.df.eu> <4F4410FF.8010208@v.loewis.de> Message-ID: <20120222063641.Horde.I4glR0lCcOxPRH7pLoL1HLA@webmail.df.eu>

> Thanks for describing the use case. By the way, when working with ET I also
> wanted to track the memory usage of the package a couple of times, which
> made me lament that there's no useful memory profiler in the stdlib.

A memory profiler can be a ten-line Python function which, however, does need to be tuned to the application. So I'm not sure it can be provided by the stdlib in a reasonable fashion beyond what's already there, but it may not be necessary to have it in the stdlib, either.

Regards, Martin

From martin at v.loewis.de Wed Feb 22 06:45:48 2012
From: martin at v.loewis.de (martin at v.loewis.de) Date: Wed, 22 Feb 2012 06:45:48 +0100 Subject: [Python-Dev] Windows build - fixing compile warnings before VS2010 In-Reply-To: References: Message-ID: <20120222064548.Horde.ZXw4cklCcOxPRIEMZQ7FIuA@webmail.df.eu>

Quoting Brian Curtin:

> While some effort has gone on to get the 32-bit build to compile
> without warnings (thanks for that!), 64-bit still has numerous
> warnings.
> Before I push forward on more of the VS2010 port, I'd like
> to have a clean 2008 build all around so we can more easily track what
> may have changed.

Does that *really* have to be a prerequisite for porting to VS 2010? If yes, then my hopes that we can move to VS 2010 before 3.3 are falling...

> While I have your attention, I'd like to throw two other things out
> there to follow up the above effort:
> 1. Is anyone opposed to moving up to Level 4 warnings?

Not sure what this means. What kind of warnings would this get us?

MS says "This option should be used only to provide "lint" level warnings and is not recommended as your usual warning level setting."

Usually, following MS recommendations is a good thing to do on Windows. But then, the documentation goes on saying

"For a new project, it may be best to use /W4 in all compilations. This will ensure the fewest possible hard-to-find code defects."

> ...take a deep breath...
> 2. Is anyone opposed to enabling warnings as errors?

The immediate consequence would be that the Windows buildbots break when somebody makes a checkin on Unix, and they cannot easily figure out how to rewrite the code to make the compiler happy. So I guess I'm -1.

Regards, Martin

From sumerc at gmail.com Wed Feb 22 07:49:07 2012
From: sumerc at gmail.com (Sümer Cip) Date: Wed, 22 Feb 2012 08:49:07 +0200 Subject: [Python-Dev] CPU vs Wall time Profiling In-Reply-To: References: Message-ID:

> The original reason was that the Unix wall clock was more accurate
> than its CPU clock. If that's changed we should probably (perhaps in a
> platform-dependent way) change the default to the most accurate clock
> available.

Currently it seems the clock_gettime() APIs have nanosecond resolution, while gettimeofday() has microsecond resolution. Other than that, clock_gettime() has a significant advantage: it has a per-process timer available, which will increase the accuracy of the timing information of the profiled application.

-- Sumer Cip

From ncoghlan at gmail.com Wed Feb 22 07:57:54 2012
From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 22 Feb 2012 16:57:54 +1000 Subject: [Python-Dev] hash randomization in 3.3 In-Reply-To: <20120222062021.Horde.InHoBElCcOxPRHsVzdc1BnA@webmail.df.eu> References: <20120221191928.406b8dcc@pitrou.net> <20120221205952.3694f2a2@pitrou.net> <4F4411F4.6020208@v.loewis.de> <20120221230710.4a8c625d@pitrou.net> <20120222062021.Horde.InHoBElCcOxPRHsVzdc1BnA@webmail.df.eu> Message-ID:

On Wed, Feb 22, 2012 at 3:20 PM, wrote:
>> I'm with Antoine here - being able to force a particular seed still
>> matters for testing purposes. However, the documentation of the option
>> may need to be updated for 3.3 to emphasise that it should only be
>> used to reproduce sporadic failures. Using it to work around
>> applications that can't cope with randomised hashes would be rather
>> ill-advised.
>
> In the tracker, someone proposed that the option is necessary to synchronize
> the seed across processes in a cluster. I'm sure people will use it for that
> if they can.

Yeah, that use case sounds reasonable, too. Another example is that, even within a machine, if two processes are using shared memory rather than serialised IPC, synchronising the hashes may be necessary. The key point is that there *are* valid use cases for forcing a particular seed, so we shouldn't take that ability away.

Cheers, Nick.

-- Nick Coghlan | ncoghlan at gmail.com |
Brisbane, Australia

From stephen at xemacs.org Wed Feb 22 08:37:55 2012
From: stephen at xemacs.org (Stephen J. Turnbull) Date: Wed, 22 Feb 2012 16:37:55 +0900 Subject: [Python-Dev] accept string in a2b and base64? In-Reply-To: <20120221012417.202D02500E7@webabinitio.net> References: <20120221012417.202D02500E7@webabinitio.net> Message-ID: <87mx8bmi7g.fsf@uwakimon.sk.tsukuba.ac.jp>

R. David Murray writes:

> If most people agree with Antoine I won't fight it, but it seems to me
> that accepting unicode in the binascii and base64 APIs is a bad
> idea.

First, I agree with David that this change should have been brought up on python-dev before committing it. The distinctions Python 3 has made between APIs for bytes and those for str are both obviously controversial and genuinely delicate.

Second, if Unicode is to be accepted in these APIs, there is a doc issue (which I haven't checked). It must be made clear that the "printable ASCII" in question is the set represented by the *integers* 33 to 126, *not* the ASCII characters ! to ~. Those characters are present in the Unicode repertoire in many other places (specifically the "full-width ASCII" compatibility character set around U+FF20, but also several Greek and Cyrillic characters, and possibly others.)

I'm going to side with Antoine and Nick on these particular changes because in practice (except maybe in the email module :-( ) the BASE-encoded "text" to be decoded is going to be consistently defined by the client as either str or bytes, but not both. The fact that the repr of the encoded text is identical (except for the presence or absence of a leading "b") is very suggestive here.

I do harbor a slight niggle that I think there is more room for confusion here than in Nick's urllib work. However, once we clarify that confusion in *our* minds, I don't think there's much potential for dangerous confusion for API clients. (I agree with Antoine on that point.)

The BASE## decoding APIs in abstract are "text" to bytes. Pedantically in Python that suggests a str -> bytes signature, but RFC 4648 doesn't anywhere require a 1-byte representation of ASCII, only that the representation be interpreted as integers in the ASCII coding. However, an RFC-4648-conforming implementation MUST reject any string containing characters not allowed in the representation, so it's actually stricter than requiring ASCII. I see no problem with allowing str-or-bytes -> bytes polymorphism here. The remaining issue to my mind is we'd also like bytes -> str-or-bytes polymorphism for symmetry, but this is not Haskell; we can't have it.

The same is true for binascii, I suppose -- assuming that the module is specified (as the name suggests) to produce and consume only ASCII text as a representation of bytes.

From martin at v.loewis.de Wed Feb 22 10:35:01 2012
From: martin at v.loewis.de ("Martin v. Löwis") Date: Wed, 22 Feb 2012 10:35:01 +0100 Subject: [Python-Dev] accept string in a2b and base64? In-Reply-To: <20120221012417.202D02500E7@webabinitio.net> References: <20120221012417.202D02500E7@webabinitio.net> Message-ID: <4F44B6C5.4010800@v.loewis.de>

> It seems to me that part of the point of the byte/string split (and the
> lack of automatic coercion) is to make the programmer be explicit about
> converting between unicode and bytes.
Having these functions, which > convert between binary formats (ASCII-only representations of binary data > and back) accept unicode strings is reintroducing automatic coercions, > and I think it will lead to the same kind of bugs that automatic string > coercions yielded in Python2: a program works fine until the input > turns out to have non-ASCII data in it, and then it blows up with an > unexpected UnicodeError. I agree with the change in principle, but I also agree in the choice of error with you: py> binascii.a2b_hex("MURRAY") Traceback (most recent call last): File "", line 1, in binascii.Error: Non-hexadecimal digit found py> binascii.a2b_hex("VL?WIS") Traceback (most recent call last): File "", line 1, in ValueError: string argument should contain only ASCII characters I think it should give binascii.Error in both cases: ? is as much a non-hexadecimal digit as M. With that changed, I'd have no issues with the patch: these functions are already fairly strict in their input, whether it's bytes or Unicode. So the chances that non-ASCII characters get it to fall over in a way that never causes problems in pure-ASCII communities are very low. > If most people agree with Antoine I won't fight it, but it seems to me > that accepting unicode in the binascii and base64 APIs is a bad idea. No - it's only the choice of error that is a bad idea. Regards, Martin From stephen at xemacs.org Wed Feb 22 13:04:38 2012 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Wed, 22 Feb 2012 21:04:38 +0900 Subject: [Python-Dev] hash randomization in 3.3 In-Reply-To: References: <20120221191928.406b8dcc@pitrou.net> <20120221150533.4d968c83@resist.wooz.org> Message-ID: <87ipizm5ux.fsf@uwakimon.sk.tsukuba.ac.jp> Brett Cannon writes: > I think that's inviting trouble if you can provide the seed. It leads to a > false sense of security I thought the point of providing the seed was for reproducability of tests and the like? As for "false sense", can't we document this and chalk up hubristic behavior to "consenting adults"? From solipsis at pitrou.net Wed Feb 22 15:43:23 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 22 Feb 2012 15:43:23 +0100 Subject: [Python-Dev] Windows build - fixing compile warnings before VS2010 References: Message-ID: <20120222154323.617e1c22@pitrou.net> On Tue, 21 Feb 2012 21:32:23 -0600 Brian Curtin wrote: > While some effort has gone on to get the 32-bit build to compile > without warnings (thanks for that!), 64-bit still has numerous > warnings. Before I push forward on more of the VS2010 port, I'd like > to have a clean 2008 build all around so we can more easily track what > may have changed. +1. Of course, it doesn't help that Microsoft implements POSIX APIs incorrectly (for example, Microsoft's read() take the length as an int, not a size_t). Regards Antoine. From solipsis at pitrou.net Wed Feb 22 15:47:05 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 22 Feb 2012 15:47:05 +0100 Subject: [Python-Dev] cpython: Optimize str%arg for number formats: %i, %d, %u, %x, %p References: Message-ID: <20120222154705.0eac15b1@pitrou.net> On Wed, 22 Feb 2012 13:58:45 +0100 victor.stinner wrote: > > +/* Copy a ASCII or latin1 char* string into a Python Unicode string. > + Return the length of the input string. > + > + WARNING: Don't copy the terminating null character and don't check the > + maximum character (may write a latin1 character in an ASCII string). */ If this is a description of what the function does, it should say "doesn't", not "don't". 
Right now this comment is ambiguous (is it an order given to the reader?). Regards Antoine. From brian at python.org Wed Feb 22 15:58:06 2012 From: brian at python.org (Brian Curtin) Date: Wed, 22 Feb 2012 08:58:06 -0600 Subject: [Python-Dev] Windows build - fixing compile warnings before VS2010 In-Reply-To: <20120222064548.Horde.ZXw4cklCcOxPRIEMZQ7FIuA@webmail.df.eu> References: <20120222064548.Horde.ZXw4cklCcOxPRIEMZQ7FIuA@webmail.df.eu> Message-ID: On Tue, Feb 21, 2012 at 23:45, wrote: > > Zitat von Brian Curtin : > > >> While some effort has gone on to get the 32-bit build to compile >> without warnings (thanks for that!), 64-bit still has numerous >> warnings. Before I push forward on more of the VS2010 port, I'd like >> to have a clean 2008 build all around so we can more easily track what >> may have changed. > > > Does that *really* have to be a prerequisite for porting to VS 2010? > If yes, then my hopes that we can move to VS 2010 before 3.3 are > falling... Is it a prerequisite? No. I guess with this question all I'm asking is "Can I fix a lot of these warnings without someone wanting to undo them for the sake of cleaner merges or neat hg history?" I'd prefer not to take 315 warnings into a compiler change, come out with 550, and not know what potentially went wrong. In a previous company, we changed from 2008 to 2010 by upping the warning level, fixing all warnings, then enabling warnings-as-errors (I'll address this later) - the port to 2010 went nicely and we experienced a very smooth transition. Much more smoothly than 2005 to 2008. I just cut out around 100 warnings last night in 45 minutes, so I don't plan on having this take several months or anything. If I get stuck, I'll just give it up. >> While I have your attention, I'd like to throw two other things out >> there to follow up the above effort: >> 1. Is anyone opposed to moving up to Level 4 warnings? > > > Not sure what this means. What kind of warnings would this get us? > > MS says "This option should be used only to provide "lint" level > warnings and is not recommended as your usual warning level setting." > > Usually, following MS recommendations is a good thing to do on Windows. > But then, the documentation goes on saying > > "For a new project, it may be best to use /W4 in all compilations. > This will ensure the fewest possible hard-to-find code defects." The last sentence (but applied to old projects) says it all. Like I mentioned above, my last company jacked everything up to the highest levels and stuck with it, and I think we wrote nicer code. That's really all I can say. No metrics, no strong support, no debate. You could just say "no" and I'll probably accept it. >> ...take a deep breath... >> 2. Is anyone opposed to enabling warnings as errors? > > > The immediate consequence would be that the Windows buildbots > break when somebody makes a checkin on Unix, and they cannot > easily figure out how to rewrite the code to make the compiler > happy. So I guess I'm -1. I didn't think about that, so yeah, I'm probably -1 here as well. From shibturn at gmail.com Wed Feb 22 17:04:34 2012 From: shibturn at gmail.com (shibturn) Date: Wed, 22 Feb 2012 16:04:34 +0000 Subject: [Python-Dev] Windows build - fixing compile warnings before VS2010 In-Reply-To: References: Message-ID: On 22/02/2012 3:32am, Brian Curtin wrote: > 1. Is anyone opposed to moving up to Level 4 warnings? 
At that level I think it complains about common things like the "do {...} while (0)" idiom, and the unreferenced self parameter of builtin functions. Presumably you would have to disable those specific warnings and any other overly annoying ones? sbt From brian at python.org Wed Feb 22 17:12:50 2012 From: brian at python.org (Brian Curtin) Date: Wed, 22 Feb 2012 10:12:50 -0600 Subject: [Python-Dev] Windows build - fixing compile warnings before VS2010 In-Reply-To: References: Message-ID: On Wed, Feb 22, 2012 at 10:04, shibturn wrote: > On 22/02/2012 3:32am, Brian Curtin wrote: >> >> 1. Is anyone opposed to moving up to Level 4 warnings? > > > At that level I think it complains about common things like the "do {...} > while (0)" idiom, and the unreferenced self parameter of builtin functions. > > Presumably you would have to disable those specific warnings and any other > overly annoying ones? What we did was fix what was reasonable, then disable warnings which were unreasonable. If that's reasonable, that's how I would do it. (just to say it one more time: reasonable) From amcnabb at mcnabbs.org Wed Feb 22 17:46:10 2012 From: amcnabb at mcnabbs.org (Andrew McNabb) Date: Wed, 22 Feb 2012 09:46:10 -0700 Subject: [Python-Dev] folding cElementTree behind ElementTree in 3.3 In-Reply-To: References: <20120220165511.Horde.c-6iIaGZi1VPQmzfA-JBT4A@webmail.df.eu> <4F4410FF.8010208@v.loewis.de> <20120221221610.GB3000@mcnabbs.org> Message-ID: <20120222164610.GA8547@mcnabbs.org> On Wed, Feb 22, 2012 at 04:24:38AM +0200, Eli Bendersky wrote: > > Andrew, could you elaborate on your use case? Are you using cElementTree to > do the parsing, or ElementTree (the Python implementation). Can you show a > short code sample? I'm mostly using ElementTree because several classes/functions that I need are not in cElementTree or behave differently. Specifically, the program loads TreeBuilder, XMLParser, iterparse from ElementTree; the only class from cElementTree that works is Element. A shortened version of the program is available here: http://aml.cs.byu.edu/~amcnabb/gutenberg-short.py By the way, this code is just one example of how one might rely on the documented extensibility of ElementTree. There are probably many other examples out there that look nothing like mine. -- Andrew McNabb http://www.mcnabbs.org/andrew/ PGP Fingerprint: 8A17 B57C 6879 1863 DE55 8012 AB4D 6098 8826 6868 From martin at v.loewis.de Wed Feb 22 17:59:52 2012 From: martin at v.loewis.de (martin at v.loewis.de) Date: Wed, 22 Feb 2012 17:59:52 +0100 Subject: [Python-Dev] The ultimate question of life, the universe, and everything Message-ID: <20120222175952.Horde.23E4TKGZi1VPRR8IeRuzQ8A@webmail.df.eu> What is the hash of "ePjNTUhitHkL"? Regards, Martin P.S. It took me roughly 86h to compute 150 strings colliding for the 64-bit hash function. From barry at python.org Wed Feb 22 18:59:33 2012 From: barry at python.org (Barry Warsaw) Date: Wed, 22 Feb 2012 12:59:33 -0500 Subject: [Python-Dev] hash randomization in 3.3 References: <20120221191928.406b8dcc@pitrou.net> <20120221150533.4d968c83@resist.wooz.org> <87ipizm5ux.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <20120222125933.62a848cd@resist.wooz.org> On Feb 22, 2012, at 09:04 PM, Stephen J. Turnbull wrote: >Brett Cannon writes: > > > I think that's inviting trouble if you can provide the seed. It leads to a > > false sense of security > >I thought the point of providing the seed was for reproducability of >tests and the like? 
> >As for "false sense", can't we document this and chalk up hubristic >behavior to "consenting adults"? +1 -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: not available URL: From barry at python.org Wed Feb 22 19:14:49 2012 From: barry at python.org (Barry Warsaw) Date: Wed, 22 Feb 2012 13:14:49 -0500 Subject: [Python-Dev] Issue 13703 is closed for the Python 2.6 branch In-Reply-To: <20120221025317.11054eed@pitrou.net> References: <20120221025317.11054eed@pitrou.net> Message-ID: <20120222131449.3a05de23@resist.wooz.org> Two more small details to address, and then I think we're ready to start creating release candidates. - sys.flags.hash_randomization In the tracker issue, I had previously stated a preference that this flag only reflect the state of the -R command line option, not the $PYTHONHASHSEED environment variable. Well, that's not the way other options/envars such as -O/$PYTHONOPTIMIZE work. sys.flags.optimize gets set if either of those two things set it, so sys.flags.hash_randomization needs to follow that convention. Thus no change is necessary here. - sys.hash_seed In the same tracker issue, I expressed my opinion that the hash seed should be exposed in sys.hash_seed for reproducibility. There's a complication that Victor first mentioned in IRC, but I didn't quite understand the implications of at first. When PYTHONHASHSEED=random is set, there *is no* hash seed. We pull random data straight out of urandom and use that directly as the secret, so there's nothing to expose in sys.hash_seed. In that case, sys.hash_seed is pretty much redundant, since Python code could just check getenv('PYTHONHASHSEED') and be done with it. I don't think there's anything useful to expose to Python or communicated between Python executables when truly random hash data is used. Thus, unless there are objections, I consider the current state of the Python 2.6 branch to be finished wrt issue 13703. Cheers, -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: not available URL: From solipsis at pitrou.net Wed Feb 22 19:26:11 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 22 Feb 2012 19:26:11 +0100 Subject: [Python-Dev] hash randomization in 3.3 References: <20120221191928.406b8dcc@pitrou.net> <20120221150533.4d968c83@resist.wooz.org> <87ipizm5ux.fsf@uwakimon.sk.tsukuba.ac.jp> <20120222125933.62a848cd@resist.wooz.org> Message-ID: <20120222192611.1a9ba4c6@pitrou.net> On Wed, 22 Feb 2012 12:59:33 -0500 Barry Warsaw wrote: > On Feb 22, 2012, at 09:04 PM, Stephen J. Turnbull wrote: > > >Brett Cannon writes: > > > > > I think that's inviting trouble if you can provide the seed. It leads to a > > > false sense of security > > > >I thought the point of providing the seed was for reproducability of > >tests and the like? > > > >As for "false sense", can't we document this and chalk up hubristic > >behavior to "consenting adults"? > > +1 How is it a "false sense of security" at all? It's the same as setting a private secret for e.g. session cookies in Web applications. As long as you don't leak the seed, it's (should be) secure. (the only hypothetical issue being with Victor's choice of an LCG pseudo-random generator to generate the secret from the seed) Regards Antoine. 
From fijall at gmail.com Wed Feb 22 19:46:02 2012 From: fijall at gmail.com (Maciej Fijalkowski) Date: Wed, 22 Feb 2012 11:46:02 -0700 Subject: [Python-Dev] The ultimate question of life, the universe, and everything In-Reply-To: <20120222175952.Horde.23E4TKGZi1VPRR8IeRuzQ8A@webmail.df.eu> References: <20120222175952.Horde.23E4TKGZi1VPRR8IeRuzQ8A@webmail.df.eu> Message-ID: On Wed, Feb 22, 2012 at 9:59 AM, wrote: > What is the hash of "ePjNTUhitHkL"? > > Regards, > Martin > > P.S. It took me roughly 86h to compute 150 strings colliding for the 64-bit > hash function. You should have used pypy, should have been faster. From martin at v.loewis.de Wed Feb 22 19:55:33 2012 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 22 Feb 2012 19:55:33 +0100 Subject: [Python-Dev] The ultimate question of life, the universe, and everything In-Reply-To: References: <20120222175952.Horde.23E4TKGZi1VPRR8IeRuzQ8A@webmail.df.eu> Message-ID: <4F453A25.1020302@v.loewis.de> Am 22.02.2012 19:46, schrieb Maciej Fijalkowski: > On Wed, Feb 22, 2012 at 9:59 AM, wrote: >> What is the hash of "ePjNTUhitHkL"? >> >> Regards, >> Martin >> >> P.S. It took me roughly 86h to compute 150 strings colliding for the 64-bit >> hash function. > > You should have used pypy, should have been faster. It was actually a C program - I doubt PyPy would have been faster. Perhaps my algorithm wasn't good enough if you think this is slow. Regards, Martin From fijall at gmail.com Wed Feb 22 20:01:00 2012 From: fijall at gmail.com (Maciej Fijalkowski) Date: Wed, 22 Feb 2012 12:01:00 -0700 Subject: [Python-Dev] The ultimate question of life, the universe, and everything In-Reply-To: <4F453A25.1020302@v.loewis.de> References: <20120222175952.Horde.23E4TKGZi1VPRR8IeRuzQ8A@webmail.df.eu> <4F453A25.1020302@v.loewis.de> Message-ID: On Wed, Feb 22, 2012 at 11:55 AM, "Martin v. L?wis" wrote: > Am 22.02.2012 19:46, schrieb Maciej Fijalkowski: >> On Wed, Feb 22, 2012 at 9:59 AM, ? wrote: >>> What is the hash of "ePjNTUhitHkL"? >>> >>> Regards, >>> Martin >>> >>> P.S. It took me roughly 86h to compute 150 strings colliding for the 64-bit >>> hash function. >> >> You should have used pypy, should have been faster. > > It was actually a C program - I doubt PyPy would have been faster. > Perhaps my algorithm wasn't good enough if you think this is slow. > > Regards, > Martin That was entirely a joke, sorry if a bad one :) I seriously doubt it'll be faster. From martin at v.loewis.de Wed Feb 22 20:23:03 2012 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 22 Feb 2012 20:23:03 +0100 Subject: [Python-Dev] Windows build - fixing compile warnings before VS2010 In-Reply-To: References: <20120222064548.Horde.ZXw4cklCcOxPRIEMZQ7FIuA@webmail.df.eu> Message-ID: <4F454097.7010807@v.loewis.de> > I just cut out around 100 warnings last night in 45 minutes, so I > don't plan on having this take several months or anything. If I get > stuck, I'll just give it up. Would you mind posting a batch of these to the tracker? I'd like to review them, just to be sure we have the same understanding. 
Regards, Martin From tjreedy at udel.edu Wed Feb 22 22:05:21 2012 From: tjreedy at udel.edu (Terry Reedy) Date: Wed, 22 Feb 2012 16:05:21 -0500 Subject: [Python-Dev] hash randomization in 3.3 In-Reply-To: References: <20120221191928.406b8dcc@pitrou.net> <20120221205952.3694f2a2@pitrou.net> <4F4411F4.6020208@v.loewis.de> <20120221230710.4a8c625d@pitrou.net> <20120222062021.Horde.InHoBElCcOxPRHsVzdc1BnA@webmail.df.eu> Message-ID: On 2/22/2012 1:57 AM, Nick Coghlan wrote: >> In the tracker, someone proposed that the option is necessary to synchronize >> the seed across processes in a cluster. I'm sure people will use it for that >> if they can. > > Yeah, that use case sounds reasonable, too. Another example is that, > even within a machine, if two processes are using shared memory rather > than serialised IPC, synchronising the hashes may be necessary. The > key point is that there *are* valid use cases for forcing a particular > seed, so we shouldn't take that ability away. When we document the option to set the seed, we could mention that synchronization of processes that share data is the main intended use. -- Terry Jan Reedy From greg at krypto.org Wed Feb 22 23:05:07 2012 From: greg at krypto.org (Gregory P. Smith) Date: Wed, 22 Feb 2012 14:05:07 -0800 Subject: [Python-Dev] Issue 13703 is closed for the Python 2.6 branch In-Reply-To: <20120222131449.3a05de23@resist.wooz.org> References: <20120221025317.11054eed@pitrou.net> <20120222131449.3a05de23@resist.wooz.org> Message-ID: On Wed, Feb 22, 2012 at 10:14 AM, Barry Warsaw wrote: > Two more small details to address, and then I think we're ready to start > creating release candidates. > > ?- sys.flags.hash_randomization > > ? In the tracker issue, I had previously stated a preference that this flag > ? only reflect the state of the -R command line option, not the > ? $PYTHONHASHSEED environment variable. ?Well, that's not the way other > ? options/envars such as -O/$PYTHONOPTIMIZE work. ?sys.flags.optimize gets > ? set if either of those two things set it, so sys.flags.hash_randomization > ? needs to follow that convention. ?Thus no change is necessary here. > > ?- sys.hash_seed > > ? In the same tracker issue, I expressed my opinion that the hash seed should > ? be exposed in sys.hash_seed for reproducibility. ?There's a complication > ? that Victor first mentioned in IRC, but I didn't quite understand the > ? implications of at first. ?When PYTHONHASHSEED=random is set, there *is no* > ? hash seed. ?We pull random data straight out of urandom and use that > ? directly as the secret, so there's nothing to expose in sys.hash_seed. > > In that case, sys.hash_seed is pretty much redundant, since Python code could > just check getenv('PYTHONHASHSEED') and be done with it. ?I don't think > there's anything useful to expose to Python or communicated between Python > executables when truly random hash data is used. > > Thus, unless there are objections, I consider the current state of the Python > 2.6 branch to be finished wrt issue 13703. +10 From merwok at netwok.org Thu Feb 23 01:21:48 2012 From: merwok at netwok.org (=?UTF-8?B?w4lyaWMgQXJhdWpv?=) Date: Thu, 23 Feb 2012 01:21:48 +0100 Subject: [Python-Dev] http://pythonmentors.com/ In-Reply-To: References: <3DC2DAB56C4C469380525FFB08E3A2D9@gmail.com> Message-ID: <4F45869C.4060107@netwok.org> Le 11/02/2012 12:00, Eli Bendersky a ?crit : > Well, I think the situation is pretty good now. 
If one goes to > python.org and is interested in contributing, clicking on the "Core > Development" link is a sensible step, right? Maybe, depending on your knowledge of jargon. How about rewording that link to ?Contributing?? Regards From merwok at netwok.org Thu Feb 23 01:26:33 2012 From: merwok at netwok.org (=?UTF-8?B?w4lyaWMgQXJhdWpv?=) Date: Thu, 23 Feb 2012 01:26:33 +0100 Subject: [Python-Dev] requirements for moving __import__ over to importlib? In-Reply-To: References: <20120207152435.379ac6f4@resist.wooz.org> Message-ID: <4F4587B9.4080502@netwok.org> Hi Brett, I think this message went unanswered, so here?s a late reply: Le 07/02/2012 23:21, Brett Cannon a ?crit : > On Tue, Feb 7, 2012 at 15:28, Dirkjan Ochtman wrote: >> [...] >> Anyway, I think there was enough of a python3 port for Mercurial (from >> various GSoC students) that you can probably run some of the very >> simple commands (like hg parents or hg id), which should be enough for >> your purposes, right? > > Possibly. Where is the code? # get Mercurial from a repo or tarball hg clone http://selenic.com/repo/hg/ cd hg # convert files in place (don?t commit after this :) python3.2 contrib/setup3k.py # the makefile is not py3k-aware, need to run manually # the current stable head fails with a TypeError for me PYTHONPATH=. python3.2 build/scripts-3.2 Cheers From brian at python.org Thu Feb 23 01:31:29 2012 From: brian at python.org (Brian Curtin) Date: Wed, 22 Feb 2012 18:31:29 -0600 Subject: [Python-Dev] http://pythonmentors.com/ In-Reply-To: <4F45869C.4060107@netwok.org> References: <3DC2DAB56C4C469380525FFB08E3A2D9@gmail.com> <4F45869C.4060107@netwok.org> Message-ID: On Wed, Feb 22, 2012 at 18:21, ?ric Araujo wrote: > Le 11/02/2012 12:00, Eli Bendersky a ?crit : >> Well, I think the situation is pretty good now. If one goes to >> python.org and is interested in contributing, clicking on the "Core >> Development" link is a sensible step, right? > > Maybe, depending on your knowledge of jargon. ?How about rewording that > link to ?Contributing?? If you want to contribute to development, I think you'll know that a link about development is relevant. If you want to contribute money, a contribute link about development means you have to try again to give away your money. From stephen at xemacs.org Thu Feb 23 08:12:39 2012 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Thu, 23 Feb 2012 16:12:39 +0900 Subject: [Python-Dev] hash randomization in 3.3 In-Reply-To: <20120222192611.1a9ba4c6@pitrou.net> References: <20120221191928.406b8dcc@pitrou.net> <20120221150533.4d968c83@resist.wooz.org> <87ipizm5ux.fsf@uwakimon.sk.tsukuba.ac.jp> <20120222125933.62a848cd@resist.wooz.org> <20120222192611.1a9ba4c6@pitrou.net> Message-ID: <87booqm3a0.fsf@uwakimon.sk.tsukuba.ac.jp> Antoine Pitrou writes: > How is it a "false sense of security" at all? It's the same as > setting a private secret for e.g. session cookies in Web applications. > As long as you don't leak the seed, it's (should be) secure. That's true. The problem is, the precondition that you won't leak the seed is all too often false. If a user takes advantage of the ability to set the seed, she can leak it, or a coworker (or a virus) can steal it from her source or keystroke logging, etc. And it's not the same, at least not for a highly secure application. 
In high-quality security, session keys are generated for each session (and changed frequently); the user doesn't know them (of course, he can always find out if he really wants to know, and sometimes that's necessary -- Hello, Debian OpenSSH maintainer!), and so can't leak them. From stephen at xemacs.org Thu Feb 23 08:15:48 2012 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Thu, 23 Feb 2012 16:15:48 +0900 Subject: [Python-Dev] http://pythonmentors.com/ In-Reply-To: References: <3DC2DAB56C4C469380525FFB08E3A2D9@gmail.com> <4F45869C.4060107@netwok.org> Message-ID: <87aa4am34r.fsf@uwakimon.sk.tsukuba.ac.jp> Brian Curtin writes: > If you want to contribute to development, I think you'll know that a > link about development is relevant. For values of "you" in "experienced programmers", yes. But translators and tech writers don't consider what they do to be "development." From brian at python.org Thu Feb 23 08:24:23 2012 From: brian at python.org (Brian Curtin) Date: Thu, 23 Feb 2012 01:24:23 -0600 Subject: [Python-Dev] http://pythonmentors.com/ In-Reply-To: <87aa4am34r.fsf@uwakimon.sk.tsukuba.ac.jp> References: <3DC2DAB56C4C469380525FFB08E3A2D9@gmail.com> <4F45869C.4060107@netwok.org> <87aa4am34r.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: On Thu, Feb 23, 2012 at 01:15, Stephen J. Turnbull wrote: > Brian Curtin writes: > > ?> If you want to contribute to development, I think you'll know that a > ?> link about development is relevant. > > For values of "you" in "experienced programmers", yes. ?But > translators and tech writers don't consider what they do to be > "development." I don't know what this is saying, but I'll guess it's some suggestion that we should still name the link "Contributing". Keep in mind that the current "Core Development" link on the front page goes directly to http://docs.python.org/devguide/ -- getting this page in people's hands earlier is a Good Thing. However, this is not a correct link from something named "Contributing". It would have to say "Contributing Code", but then it leaves out docs and translations and our resident spelling bee contestants. Paint the bike shed any way you want except the plain "Contributing" color, please. From stefan_ml at behnel.de Thu Feb 23 09:01:44 2012 From: stefan_ml at behnel.de (Stefan Behnel) Date: Thu, 23 Feb 2012 09:01:44 +0100 Subject: [Python-Dev] C-API functions for reading/writing tstate->exc_* ? In-Reply-To: <4F4176B9.4080403@v.loewis.de> References: <4F4176B9.4080403@v.loewis.de> Message-ID: "Martin v. L?wis", 19.02.2012 23:24: >> When compiling for PyPy, Cython therefore needs a way to tell PyPy about >> any changes. For the tstate->curexc_* fields, there are the two functions >> PyErr_Fetch() and PyErr_Restore(). Could we have two similar "official" >> functions for the exc_* fields? Maybe PyErr_FetchLast() and >> PyErr_RestoreLast()? > > I wouldn't call the functions *Last, as this may cause confusion with > sys.last_*. I'm also unsure why the current API uses this Fetch/Restore > pair of functions where Fetch clears the variables. A Get/Set pair of > functions would be more natural, IMO (where Get returns "new" > references). This would give PyErr_GetExcInfo/PyErr_SetExcInfo. Ok, I added a tracker ticket and I'm working on a patch. http://bugs.python.org/issue14098 Stefan From stephen at xemacs.org Thu Feb 23 09:44:31 2012 From: stephen at xemacs.org (Stephen J. 
Turnbull) Date: Thu, 23 Feb 2012 17:44:31 +0900 Subject: [Python-Dev] http://pythonmentors.com/ In-Reply-To: References: <3DC2DAB56C4C469380525FFB08E3A2D9@gmail.com> <4F45869C.4060107@netwok.org> <87aa4am34r.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <877gzelz0w.fsf@uwakimon.sk.tsukuba.ac.jp> Brian Curtin writes: > On Thu, Feb 23, 2012 at 01:15, Stephen J. Turnbull wrote: > > Brian Curtin writes: > > > > ?> If you want to contribute to development, I think you'll know that a > > ?> link about development is relevant. > > > > For values of "you" in "experienced programmers", yes. ?But > > translators and tech writers don't consider what they do to be > > "development." > > I don't know what this is saying, but I'll guess it's some suggestion > that we should still name the link "Contributing". No, it's saying that there are a lot of potential contributors for whom "Core Development" is pretty obviously not where they want to go. There should be a link that is the obvious place for them to go, and there currently isn't one. I don't have a problem with the presence of a "core development" link that goes to the devguide. I do have a problem with failing to invite people who are not at present interested in contributing code or money to contribute what they have. The next question is how many links do we want in that sidebar; I think there may already be too many. But I'm not a web designer to have a strong opinion on that. From mark at hotpy.org Thu Feb 23 11:12:13 2012 From: mark at hotpy.org (Mark Shannon) Date: Thu, 23 Feb 2012 10:12:13 +0000 Subject: [Python-Dev] Exceptions in LOAD_GLOBAL and LOAD_NAME In-Reply-To: <4F454097.7010807@v.loewis.de> References: <20120222064548.Horde.ZXw4cklCcOxPRIEMZQ7FIuA@webmail.df.eu> <4F454097.7010807@v.loewis.de> Message-ID: <4F4610FD.2060501@hotpy.org> The code below causes different behaviour for LOAD_GLOBAL and LOAD_NAME. Which is correct? Should exceptions raised in the equality test be converted to a NameError or just propogated? Cheers, Mark. ------------------------------------- import sys class S(str): pass def eq_except(self, other): if isinstance(other, str): raise TypeError("Cannot compare S and str") globals()[S("a")] = 0 S.__eq__ = eq_except def f(): print(a) try: f() except: print(sys.exc_info()[1]) try: print(a) except: print(sys.exc_info()[1]) ---------------------------------- Output: TypeError('Cannot compare S and str',) NameError("name 'a' is not defined",) From ncoghlan at gmail.com Thu Feb 23 13:28:42 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 23 Feb 2012 22:28:42 +1000 Subject: [Python-Dev] Exceptions in LOAD_GLOBAL and LOAD_NAME In-Reply-To: <4F4610FD.2060501@hotpy.org> References: <20120222064548.Horde.ZXw4cklCcOxPRIEMZQ7FIuA@webmail.df.eu> <4F454097.7010807@v.loewis.de> <4F4610FD.2060501@hotpy.org> Message-ID: On Thu, Feb 23, 2012 at 8:12 PM, Mark Shannon wrote: > Should exceptions raised in the equality test be converted to a NameError or > just propogated? Our general trend has been towards letting such exceptions escape the operation that caused them rather than implicitly suppressing them. In this case, the NameError message that results is also misleading (since "print(globals().keys())" will definitely show an 'a' entry). Given the effort you have to go to to trigger it, I'd consider fixing this low priority, but I agree that the conversion of the TypeError to NameError is a bug (likely resolved by adding a KeyError exception type check in the appropriate location). Cheers, Nick. -- Nick Coghlan?? |?? 
ncoghlan at gmail.com?? |?? Brisbane, Australia From mark at hotpy.org Thu Feb 23 15:09:38 2012 From: mark at hotpy.org (Mark Shannon) Date: Thu, 23 Feb 2012 14:09:38 +0000 Subject: [Python-Dev] Exceptions in LOAD_GLOBAL and LOAD_NAME In-Reply-To: References: <20120222064548.Horde.ZXw4cklCcOxPRIEMZQ7FIuA@webmail.df.eu> <4F454097.7010807@v.loewis.de> <4F4610FD.2060501@hotpy.org> Message-ID: <4F4648A2.9010506@hotpy.org> Nick Coghlan wrote: > On Thu, Feb 23, 2012 at 8:12 PM, Mark Shannon wrote: >> Should exceptions raised in the equality test be converted to a NameError or >> just propogated? > > Our general trend has been towards letting such exceptions escape the > operation that caused them rather than implicitly suppressing them. In > this case, the NameError message that results is also misleading > (since "print(globals().keys())" will definitely show an 'a' entry). > > Given the effort you have to go to to trigger it, I'd consider fixing > this low priority, but I agree that the conversion of the TypeError to > NameError is a bug (likely resolved by adding a KeyError exception > type check in the appropriate location). It is not a difficult fix. Just replacing calls to PyDict_GetItem with PyDict_GetItemWithError and raising NameError only if no Exception has occurred. Cheers, Mark From solipsis at pitrou.net Thu Feb 23 16:43:59 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 23 Feb 2012 16:43:59 +0100 Subject: [Python-Dev] cpython: Refactor importlib to make it easier to re-implement in C. References: Message-ID: <20120223164359.3266bd92@pitrou.net> On Thu, 23 Feb 2012 16:18:19 +0100 brett.cannon wrote: > def _sanity_check(name, package, level): > """Verify arguments are "sane".""" > + if not hasattr(name, 'rpartition'): > + raise TypeError("module name must be str, not {}".format(type(name))) Why don't you simply use isinstance()? (bytes objects also have rpartition()) Regards Antoine. From jimjjewett at gmail.com Thu Feb 23 17:37:34 2012 From: jimjjewett at gmail.com (Jim Jewett) Date: Thu, 23 Feb 2012 11:37:34 -0500 Subject: [Python-Dev] [Python-checkins] peps: Switch back to named functions, since the Ellipsis version degenerated badly In-Reply-To: References: Message-ID: On Wed, Feb 22, 2012 at 10:22 AM, nick.coghlan wrote: > + ? ?in x = weakref.ref(target, report_destruction) > + ? ?def report_destruction(obj): > ? ? ? ? print("{} is being destroyed".format(obj)) > +If the repetition of the name seems especially annoying, then a throwaway > +name like ``f`` can be used instead:: > + ? ?in x = weakref.ref(target, f) > + ? ?def f(obj): > + ? ? ? ?print("{} is being destroyed".format(obj)) I still feel that the helper function (or class) is subordinate, and should be indented. Thinking of "in ..." as a decorator helps, but makes it seem that the helper function is the important part (which it sometimes is...) I understand that adding a colon and indent has its own problems, but ... I'm not certain this is better, and I am certain that the desire for indentation is strong enough to at least justify discussion in the PEP. -jJ From brett at python.org Thu Feb 23 17:36:06 2012 From: brett at python.org (Brett Cannon) Date: Thu, 23 Feb 2012 11:36:06 -0500 Subject: [Python-Dev] cpython: Refactor importlib to make it easier to re-implement in C. 
In-Reply-To: <20120223164359.3266bd92@pitrou.net> References: <20120223164359.3266bd92@pitrou.net> Message-ID: On Thu, Feb 23, 2012 at 10:43, Antoine Pitrou wrote: > On Thu, 23 Feb 2012 16:18:19 +0100 > brett.cannon wrote: > > def _sanity_check(name, package, level): > > """Verify arguments are "sane".""" > > + if not hasattr(name, 'rpartition'): > > + raise TypeError("module name must be str, not > {}".format(type(name))) > > Why don't you simply use isinstance()? > (bytes objects also have rpartition()) > I think I was on a interface-conformance kick at the time and didn't want to restrict to a specific type over a specific interface. But since subclasses is not exactly complicated I can change this (which will also match potential C code more with a PyUnicode_Check()). -------------- next part -------------- An HTML attachment was scrubbed... URL: From larry at hastings.org Thu Feb 23 22:28:14 2012 From: larry at hastings.org (Larry Hastings) Date: Thu, 23 Feb 2012 13:28:14 -0800 Subject: [Python-Dev] Proposing an alternative to PEP 410 Message-ID: <4F46AF6E.2030300@hastings.org> I've been meditating on the whole os.stat mtime representation thing. Here's a possible alternative approach. * Improve datetime.datetime objects so they support nanosecond resolution, in such a way that it's 100% painless to make them even more precise in the future. * Add support to datetime objects that allows adding and subtracting ints and floats as seconds. This behavior is controllable with a flag on the object--by default this behavior is off. * Support accepting naive datetime.datetime objects in all functions that accept a timestamp in os (utime etc). * Change the result of os.stat to be a custom class rather than a PyStructSequence. Support the sequence protocol on the custom class but mark it PendingDeprecation, to be removed completely in 3.5. (I can't take credit for this idea; MvL suggested it to me once while we were talking about this issue. Now that the os.stat object has named fields, who uses the struct unpacking anymore?) * Add support for setting "stat_float_times=2" (or perhaps "stat_float_times=datetime.datetime" ?) to enable returning st_[acm]time as naive datetime.datetime objects--specifically, ones that allow addition and subtraction of ints and floats. The value would be similar to calling datetime.datetime.fromdatetime() on the current float timestamp, but would preserve all available precision. * Add a new parameter to functions that produce stat-like timestamps to explicitly specify the type of the timestamps (float or datetime), as proposed in PEP 410. I realize datetime objects aren't a drop-in replacement for floats (or ints). In particular their str/repr representations are much more ornate. So I'd expect some breakage. Personally I think the adding/subtracting ints change is a tiny bit smelly--but this is a practicality beating purity thing. I propose making it non-default behavior just to minimize the effects of the change. Similarly, I realize os.stat_float_times was always a bit of a hack, what with it being global state and all. However the approach has the virtue of having worked in the past. I disagree with PEP 410's conclusions about the suitability of datetime as a timestamp object. I think "naive" datetime objects are a perfect fit. Specficially addressing PEP 410's concerns: * I don't propose doing anything about the other functions that have no explicit start time; I'm only proposing changing the functions that deal with timestamps. 
(Perhaps the right thing for epoch-less times like time.clock would be timedelta? But I think we can table this discussion for now.) * "You can't compare naive and non-naive datetimes." So what? The existing timestamp from os.stat is a float, and you can't compare floats and non-naive datetimes. How is this an issue? Perhaps someone else can propose something even better, //arry/ From victor.stinner at gmail.com Thu Feb 23 23:35:24 2012 From: victor.stinner at gmail.com (Victor Stinner) Date: Thu, 23 Feb 2012 23:35:24 +0100 Subject: [Python-Dev] Proposing an alternative to PEP 410 In-Reply-To: <4F46AF6E.2030300@hastings.org> References: <4F46AF6E.2030300@hastings.org> Message-ID: I rejected datetime.datetime because I want to get nanosecond resolution for time and os modules, not only for the os module. If we choose to only patch the os module (*stat() and *utime*() functions), datetime.datetime would be meaningful (e.g. it's easier to format datetime for an human, than a Epoch timestamp). I don't think that it's a real issue that datetime is not fully compatible with float. If os.stat() continues to return float by default, programs asking explicitly for datetime would be prepared to handle this type. I have the same rationale with Decimal :-) I don't think that there is a need to support datetime+int or datetime-float, there is already the timedelta type which is well defined. For os.stat(), you should use the UTC timezone, not a naive datetime. > * Add a new parameter to functions that produce stat-like timestamps to > ?explicitly specify the type of the timestamps (float or datetime), > ?as proposed in PEP 410. What is a stat-like timestamp? Which functions are concerned? > Similarly, I realize os.stat_float_times was always a bit of a hack, what > with it being global state and all. ?However the approach has the virtue of > having worked in the past. A global switch to get timestamps as datetime or Decimal would break libraries and programs unable to handle these types. I prefer adding an argument to os.*stat() functions to avoid border effects. Read also: http://www.python.org/dev/peps/pep-0410/#add-a-global-flag-to-change-the-timestamp-type > Specficially addressing PEP 410's concerns: > > ?* I don't propose doing anything about the other functions that have no > ? ?explicit start time; I'm only proposing changing the functions that deal > ? ?with timestamps. ?(Perhaps the right thing for epoch-less times like > ? ?time.clock would be timedelta? ?But I think we can table this discussion > ? ?for now.) We may choose a different solution for the os.stat()/os.utime() and for the others functions (see the PEP 410 for the full list). But I would prefer an unified solution to provide nanosecond resolution in all modules. It would avoid to have to support two new types for example. Victor From larry at hastings.org Fri Feb 24 00:47:05 2012 From: larry at hastings.org (Larry Hastings) Date: Thu, 23 Feb 2012 15:47:05 -0800 Subject: [Python-Dev] Proposing an alternative to PEP 410 In-Reply-To: References: <4F46AF6E.2030300@hastings.org> Message-ID: <4F46CFF9.3070503@hastings.org> On 02/23/2012 02:35 PM, Victor Stinner wrote: > I rejected datetime.datetime because I want to get nanosecond > resolution for time and os modules, not only for the os module. If we > choose to only patch the os module (*stat() and *utime*() functions), > datetime.datetime would be meaningful (e.g. it's easier to format > datetime for an human, than a Epoch timestamp). I think a piecemeal approach would be better. 
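As a side note on the arithmetic being debated: the addition Larry describes
(a timestamp plus a plain number of seconds) is spelled today with an explicit
timedelta. A minimal sketch, with an arbitrary illustrative timestamp:

from datetime import datetime, timedelta

mtime = datetime.fromtimestamp(1330000000.5)

# Today: adding seconds to a datetime requires an explicit timedelta.
bumped = mtime + timedelta(seconds=2.5)

# Under the proposal, an opted-in datetime would additionally accept
# plain ints/floats as seconds: bumped = mtime + 2.5

assert bumped - mtime == timedelta(seconds=2.5)
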
I'm aware of a specific problem with os.stat / os.utime--the loss of precision problem that's already been endlessly discussed. Is there a similar problem with these other functions? > I don't > think that there is a need to support datetime+int or datetime-float, > there is already the timedelta type which is well defined. I suggest this because I myself have written (admittedly sloppy) code that assumed it could perform simple addition with st_mtime. Instead of finding out the current timestamp and writing that out properly, I occasionally read in the file's mtime, add a small integer (or even smaller float), and write it back out. > For os.stat(), you should use the UTC timezone, not a naive datetime. Why is that more appropriate? IIUC, timestamps ignore leap seconds and strictly represent "seconds since the epoch". In order to correctly return a time in the UTC time zone we'd have to adjust for leap seconds. Naive datetimes bask in their happy ignorance of such complexities. //arry/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Fri Feb 24 01:43:49 2012 From: guido at python.org (Guido van Rossum) Date: Thu, 23 Feb 2012 16:43:49 -0800 Subject: [Python-Dev] Proposing an alternative to PEP 410 In-Reply-To: <4F46CFF9.3070503@hastings.org> References: <4F46AF6E.2030300@hastings.org> <4F46CFF9.3070503@hastings.org> Message-ID: On Thu, Feb 23, 2012 at 3:47 PM, Larry Hastings wrote: > On 02/23/2012 02:35 PM, Victor Stinner wrote: > > For os.stat(), you should use the UTC timezone, not a naive datetime. > > Why is that more appropriate?? IIUC, timestamps ignore leap seconds and > strictly represent "seconds since the epoch".? In order to correctly return > a time in the UTC time zone we'd have to adjust for leap seconds.? Naive > datetimes bask in their happy ignorance of such complexities. You seem to have the meaning of "ignore leap seconds" backwards. POSIX timestamps are not *literally* seconds since the epoch. They are *non-leap* seconds since the epoch, which is just what you want. IOW the simple calculation ignoring leap seconds (found e.g. in calendar.py) will always produce the right value. -- --Guido van Rossum (python.org/~guido) From brett at python.org Fri Feb 24 02:15:57 2012 From: brett at python.org (Brett Cannon) Date: Thu, 23 Feb 2012 20:15:57 -0500 Subject: [Python-Dev] requirements for moving __import__ over to importlib? In-Reply-To: <4F4587B9.4080502@netwok.org> References: <20120207152435.379ac6f4@resist.wooz.org> <4F4587B9.4080502@netwok.org> Message-ID: I just tried this and I get a str/bytes issue. I also think your setup3k.py command is missing ``build`` and your build/scripts-3.2 is missing ``/hg``. On Wed, Feb 22, 2012 at 19:26, ?ric Araujo wrote: > Hi Brett, > > I think this message went unanswered, so here?s a late reply: > > Le 07/02/2012 23:21, Brett Cannon a ?crit : > > On Tue, Feb 7, 2012 at 15:28, Dirkjan Ochtman > wrote: > >> [...] > >> Anyway, I think there was enough of a python3 port for Mercurial (from > >> various GSoC students) that you can probably run some of the very > >> simple commands (like hg parents or hg id), which should be enough for > >> your purposes, right? > > > > Possibly. Where is the code? 
> > # get Mercurial from a repo or tarball > hg clone http://selenic.com/repo/hg/ > cd hg > > # convert files in place (don?t commit after this :) > python3.2 contrib/setup3k.py > > # the makefile is not py3k-aware, need to run manually > # the current stable head fails with a TypeError for me > PYTHONPATH=. python3.2 build/scripts-3.2 > > Cheers > -------------- next part -------------- An HTML attachment was scrubbed... URL: From greg at krypto.org Fri Feb 24 02:51:07 2012 From: greg at krypto.org (Gregory P. Smith) Date: Thu, 23 Feb 2012 17:51:07 -0800 Subject: [Python-Dev] [Python-checkins] cpython (3.2): logging: Added locking in flush() and close() handler methods. Thanks to Fayaz In-Reply-To: References: Message-ID: On Thu, Feb 23, 2012 at 12:04 PM, vinay.sajip wrote: > http://hg.python.org/cpython/rev/b2adcd90e656 > changeset: ? 75211:b2adcd90e656 > branch: ? ? ?3.2 > parent: ? ? ?75200:85d08a1ba74e > user: ? ? ? ?Vinay Sajip > date: ? ? ? ?Thu Feb 23 19:45:52 2012 +0000 > summary: > ?logging: Added locking in flush() and close() handler methods. Thanks to Fayaz Yusuf Khan for the suggestion. > > files: > ?Lib/logging/__init__.py | ?24 +++++++++------- > ?Lib/logging/handlers.py | ?40 +++++++++++++++------------- > ?2 files changed, 35 insertions(+), 29 deletions(-) > > > diff --git a/Lib/logging/__init__.py b/Lib/logging/__init__.py > --- a/Lib/logging/__init__.py > +++ b/Lib/logging/__init__.py > @@ -1,4 +1,4 @@ > -# Copyright 2001-2010 by Vinay Sajip. All Rights Reserved. > +# Copyright 2001-2012 by Vinay Sajip. All Rights Reserved. > ?# > ?# Permission to use, copy, modify, and distribute this software and its > ?# documentation for any purpose and without fee is hereby granted, > @@ -16,9 +16,9 @@ > > ?""" > ?Logging package for Python. Based on PEP 282 and comments thereto in > -comp.lang.python, and influenced by Apache's log4j system. > +comp.lang.python. > > -Copyright (C) 2001-2011 Vinay Sajip. All Rights Reserved. > +Copyright (C) 2001-2012 Vinay Sajip. All Rights Reserved. > > ?To use, simply 'import logging' and log away! > ?""" > @@ -917,8 +917,9 @@ > ? ? ? ? """ > ? ? ? ? Flushes the stream. > ? ? ? ? """ > - ? ? ? ?if self.stream and hasattr(self.stream, "flush"): > - ? ? ? ? ? ?self.stream.flush() > + ? ? ? ?with self.lock: > + ? ? ? ? ? ?if self.stream and hasattr(self.stream, "flush"): > + ? ? ? ? ? ? ? ?self.stream.flush() I don't know if anyone actually builds Python without thread support anymore, but if so, self.lock will be set to None and these "with self.lock"s will fail. Perhaps change lock = None to self.lock = some dummy duck typed lock that supports use as a context manager and acquire/release calls? > > ? ? def emit(self, record): > ? ? ? ? """ > @@ -969,12 +970,13 @@ > ? ? ? ? """ > ? ? ? ? Closes the stream. > ? ? ? ? """ > - ? ? ? ?if self.stream: > - ? ? ? ? ? ?self.flush() > - ? ? ? ? ? ?if hasattr(self.stream, "close"): > - ? ? ? ? ? ? ? ?self.stream.close() > - ? ? ? ? ? ?StreamHandler.close(self) > - ? ? ? ? ? ?self.stream = None > + ? ? ? ?with self.lock: > + ? ? ? ? ? ?if self.stream: > + ? ? ? ? ? ? ? ?self.flush() > + ? ? ? ? ? ? ? ?if hasattr(self.stream, "close"): > + ? ? ? ? ? ? ? ? ? ?self.stream.close() > + ? ? ? ? ? ? ? ?StreamHandler.close(self) > + ? ? ? ? ? ? ? ?self.stream = None > > ? ? def _open(self): > ? ? ? ? """ > diff --git a/Lib/logging/handlers.py b/Lib/logging/handlers.py > --- a/Lib/logging/handlers.py > +++ b/Lib/logging/handlers.py > @@ -1,4 +1,4 @@ > -# Copyright 2001-2010 by Vinay Sajip. All Rights Reserved. 
> +# Copyright 2001-2012 by Vinay Sajip. All Rights Reserved. > ?# > ?# Permission to use, copy, modify, and distribute this software and its > ?# documentation for any purpose and without fee is hereby granted, > @@ -16,10 +16,9 @@ > > ?""" > ?Additional handlers for the logging package for Python. The core package is > -based on PEP 282 and comments thereto in comp.lang.python, and influenced by > -Apache's log4j system. > +based on PEP 282 and comments thereto in comp.lang.python. > > -Copyright (C) 2001-2010 Vinay Sajip. All Rights Reserved. > +Copyright (C) 2001-2012 Vinay Sajip. All Rights Reserved. > > ?To use, simply 'import logging.handlers' and log away! > ?""" > @@ -554,10 +553,11 @@ > ? ? ? ? """ > ? ? ? ? Closes the socket. > ? ? ? ? """ > - ? ? ? ?if self.sock: > - ? ? ? ? ? ?self.sock.close() > - ? ? ? ? ? ?self.sock = None > - ? ? ? ?logging.Handler.close(self) > + ? ? ? ?with self.lock: > + ? ? ? ? ? ?if self.sock: > + ? ? ? ? ? ? ? ?self.sock.close() > + ? ? ? ? ? ? ? ?self.sock = None > + ? ? ? ? ? ?logging.Handler.close(self) > > ?class DatagramHandler(SocketHandler): > ? ? """ > @@ -752,9 +752,10 @@ > ? ? ? ? """ > ? ? ? ? Closes the socket. > ? ? ? ? """ > - ? ? ? ?if self.unixsocket: > - ? ? ? ? ? ?self.socket.close() > - ? ? ? ?logging.Handler.close(self) > + ? ? ? ?with self.lock: > + ? ? ? ? ? ?if self.unixsocket: > + ? ? ? ? ? ? ? ?self.socket.close() > + ? ? ? ? ? ?logging.Handler.close(self) > > ? ? def mapPriority(self, levelName): > ? ? ? ? """ > @@ -1095,7 +1096,8 @@ > > ? ? ? ? This version just zaps the buffer to empty. > ? ? ? ? """ > - ? ? ? ?self.buffer = [] > + ? ? ? ?with self.lock: > + ? ? ? ? ? ?self.buffer = [] > > ? ? def close(self): > ? ? ? ? """ > @@ -1145,18 +1147,20 @@ > > ? ? ? ? The record buffer is also cleared by this operation. > ? ? ? ? """ > - ? ? ? ?if self.target: > - ? ? ? ? ? ?for record in self.buffer: > - ? ? ? ? ? ? ? ?self.target.handle(record) > - ? ? ? ? ? ?self.buffer = [] > + ? ? ? ?with self.lock: > + ? ? ? ? ? ?if self.target: > + ? ? ? ? ? ? ? ?for record in self.buffer: > + ? ? ? ? ? ? ? ? ? ?self.target.handle(record) > + ? ? ? ? ? ? ? ?self.buffer = [] > > ? ? def close(self): > ? ? ? ? """ > ? ? ? ? Flush, set the target to None and lose the buffer. > ? ? ? ? """ > ? ? ? ? self.flush() > - ? ? ? ?self.target = None > - ? ? ? ?BufferingHandler.close(self) > + ? ? ? ?with self.lock: > + ? ? ? ? ? ?self.target = None > + ? ? ? ? ? ?BufferingHandler.close(self) > > > ?class QueueHandler(logging.Handler): > > -- > Repository URL: http://hg.python.org/cpython > > _______________________________________________ > Python-checkins mailing list > Python-checkins at python.org > http://mail.python.org/mailman/listinfo/python-checkins > From greg at krypto.org Fri Feb 24 02:52:46 2012 From: greg at krypto.org (Gregory P. Smith) Date: Thu, 23 Feb 2012 17:52:46 -0800 Subject: [Python-Dev] [Python-checkins] cpython (3.2): logging: Added locking in flush() and close() handler methods. Thanks to Fayaz In-Reply-To: References: Message-ID: On Thu, Feb 23, 2012 at 5:51 PM, Gregory P. Smith wrote: > On Thu, Feb 23, 2012 at 12:04 PM, vinay.sajip > wrote: >> http://hg.python.org/cpython/rev/b2adcd90e656 >> changeset: ? 75211:b2adcd90e656 >> branch: ? ? ?3.2 >> parent: ? ? ?75200:85d08a1ba74e >> user: ? ? ? ?Vinay Sajip >> date: ? ? ? ?Thu Feb 23 19:45:52 2012 +0000 >> summary: >> ?logging: Added locking in flush() and close() handler methods. Thanks to Fayaz Yusuf Khan for the suggestion. 
>> >> files: >> ?Lib/logging/__init__.py | ?24 +++++++++------- >> ?Lib/logging/handlers.py | ?40 +++++++++++++++------------- >> ?2 files changed, 35 insertions(+), 29 deletions(-) >> >> >> diff --git a/Lib/logging/__init__.py b/Lib/logging/__init__.py >> --- a/Lib/logging/__init__.py >> +++ b/Lib/logging/__init__.py >> @@ -1,4 +1,4 @@ >> -# Copyright 2001-2010 by Vinay Sajip. All Rights Reserved. >> +# Copyright 2001-2012 by Vinay Sajip. All Rights Reserved. >> ?# >> ?# Permission to use, copy, modify, and distribute this software and its >> ?# documentation for any purpose and without fee is hereby granted, >> @@ -16,9 +16,9 @@ >> >> ?""" >> ?Logging package for Python. Based on PEP 282 and comments thereto in >> -comp.lang.python, and influenced by Apache's log4j system. >> +comp.lang.python. >> >> -Copyright (C) 2001-2011 Vinay Sajip. All Rights Reserved. >> +Copyright (C) 2001-2012 Vinay Sajip. All Rights Reserved. >> >> ?To use, simply 'import logging' and log away! >> ?""" >> @@ -917,8 +917,9 @@ >> ? ? ? ? """ >> ? ? ? ? Flushes the stream. >> ? ? ? ? """ >> - ? ? ? ?if self.stream and hasattr(self.stream, "flush"): >> - ? ? ? ? ? ?self.stream.flush() >> + ? ? ? ?with self.lock: >> + ? ? ? ? ? ?if self.stream and hasattr(self.stream, "flush"): >> + ? ? ? ? ? ? ? ?self.stream.flush() > > I don't know if anyone actually builds Python without thread support > anymore, but if so, self.lock will be set to None and these "with > self.lock"s will fail. > > Perhaps change lock = None to self.lock = some dummy duck typed lock > that supports use as a context manager and acquire/release calls? whoops. once again there I go reading the commit log in order. Looks like you already fixed this in http://hg.python.org/cpython/rev/2ab3a97d544c. thanks! :) > >> >> ? ? def emit(self, record): >> ? ? ? ? """ >> @@ -969,12 +970,13 @@ >> ? ? ? ? """ >> ? ? ? ? Closes the stream. >> ? ? ? ? """ >> - ? ? ? ?if self.stream: >> - ? ? ? ? ? ?self.flush() >> - ? ? ? ? ? ?if hasattr(self.stream, "close"): >> - ? ? ? ? ? ? ? ?self.stream.close() >> - ? ? ? ? ? ?StreamHandler.close(self) >> - ? ? ? ? ? ?self.stream = None >> + ? ? ? ?with self.lock: >> + ? ? ? ? ? ?if self.stream: >> + ? ? ? ? ? ? ? ?self.flush() >> + ? ? ? ? ? ? ? ?if hasattr(self.stream, "close"): >> + ? ? ? ? ? ? ? ? ? ?self.stream.close() >> + ? ? ? ? ? ? ? ?StreamHandler.close(self) >> + ? ? ? ? ? ? ? ?self.stream = None >> >> ? ? def _open(self): >> ? ? ? ? """ >> diff --git a/Lib/logging/handlers.py b/Lib/logging/handlers.py >> --- a/Lib/logging/handlers.py >> +++ b/Lib/logging/handlers.py >> @@ -1,4 +1,4 @@ >> -# Copyright 2001-2010 by Vinay Sajip. All Rights Reserved. >> +# Copyright 2001-2012 by Vinay Sajip. All Rights Reserved. >> ?# >> ?# Permission to use, copy, modify, and distribute this software and its >> ?# documentation for any purpose and without fee is hereby granted, >> @@ -16,10 +16,9 @@ >> >> ?""" >> ?Additional handlers for the logging package for Python. The core package is >> -based on PEP 282 and comments thereto in comp.lang.python, and influenced by >> -Apache's log4j system. >> +based on PEP 282 and comments thereto in comp.lang.python. >> >> -Copyright (C) 2001-2010 Vinay Sajip. All Rights Reserved. >> +Copyright (C) 2001-2012 Vinay Sajip. All Rights Reserved. >> >> ?To use, simply 'import logging.handlers' and log away! >> ?""" >> @@ -554,10 +553,11 @@ >> ? ? ? ? """ >> ? ? ? ? Closes the socket. >> ? ? ? ? """ >> - ? ? ? ?if self.sock: >> - ? ? ? ? ? ?self.sock.close() >> - ? ? ? ? ? ?self.sock = None >> - ? ? ? 
?logging.Handler.close(self) >> + ? ? ? ?with self.lock: >> + ? ? ? ? ? ?if self.sock: >> + ? ? ? ? ? ? ? ?self.sock.close() >> + ? ? ? ? ? ? ? ?self.sock = None >> + ? ? ? ? ? ?logging.Handler.close(self) >> >> ?class DatagramHandler(SocketHandler): >> ? ? """ >> @@ -752,9 +752,10 @@ >> ? ? ? ? """ >> ? ? ? ? Closes the socket. >> ? ? ? ? """ >> - ? ? ? ?if self.unixsocket: >> - ? ? ? ? ? ?self.socket.close() >> - ? ? ? ?logging.Handler.close(self) >> + ? ? ? ?with self.lock: >> + ? ? ? ? ? ?if self.unixsocket: >> + ? ? ? ? ? ? ? ?self.socket.close() >> + ? ? ? ? ? ?logging.Handler.close(self) >> >> ? ? def mapPriority(self, levelName): >> ? ? ? ? """ >> @@ -1095,7 +1096,8 @@ >> >> ? ? ? ? This version just zaps the buffer to empty. >> ? ? ? ? """ >> - ? ? ? ?self.buffer = [] >> + ? ? ? ?with self.lock: >> + ? ? ? ? ? ?self.buffer = [] >> >> ? ? def close(self): >> ? ? ? ? """ >> @@ -1145,18 +1147,20 @@ >> >> ? ? ? ? The record buffer is also cleared by this operation. >> ? ? ? ? """ >> - ? ? ? ?if self.target: >> - ? ? ? ? ? ?for record in self.buffer: >> - ? ? ? ? ? ? ? ?self.target.handle(record) >> - ? ? ? ? ? ?self.buffer = [] >> + ? ? ? ?with self.lock: >> + ? ? ? ? ? ?if self.target: >> + ? ? ? ? ? ? ? ?for record in self.buffer: >> + ? ? ? ? ? ? ? ? ? ?self.target.handle(record) >> + ? ? ? ? ? ? ? ?self.buffer = [] >> >> ? ? def close(self): >> ? ? ? ? """ >> ? ? ? ? Flush, set the target to None and lose the buffer. >> ? ? ? ? """ >> ? ? ? ? self.flush() >> - ? ? ? ?self.target = None >> - ? ? ? ?BufferingHandler.close(self) >> + ? ? ? ?with self.lock: >> + ? ? ? ? ? ?self.target = None >> + ? ? ? ? ? ?BufferingHandler.close(self) >> >> >> ?class QueueHandler(logging.Handler): >> >> -- >> Repository URL: http://hg.python.org/cpython >> >> _______________________________________________ >> Python-checkins mailing list >> Python-checkins at python.org >> http://mail.python.org/mailman/listinfo/python-checkins >> From martin at v.loewis.de Fri Feb 24 11:01:27 2012 From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=) Date: Fri, 24 Feb 2012 11:01:27 +0100 Subject: [Python-Dev] New shared-keys dictionary implementation (issue13903) In-Reply-To: <4F475880.6050301@hotpy.org> References: <20120220161438.9745.3132@psf.upfronthosting.co.za> <4F475880.6050301@hotpy.org> Message-ID: <4F475FF7.2020306@v.loewis.de> > Unfortunately it seems to be the norm in CPython to publish almost > everything in header files that get included in "Python.h". In many cases, this is purely for historic reasons. In many additional cases, it's to support fast access macros, at least in the interpreter itself, but then also in extension modules. I agree that moving the structures into the implementation is fine, as long as there are sufficient access functions (for dictionaries, there are plenty, of course). Regards, Martin From solipsis at pitrou.net Fri Feb 24 13:19:07 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 24 Feb 2012 13:19:07 +0100 Subject: [Python-Dev] cpython: Issue #13706: Fix format(int, "n") for locale with non-ASCII thousands separator References: Message-ID: <20120224131907.155820cc@pitrou.net> On Fri, 24 Feb 2012 00:49:31 +0100 victor.stinner wrote: > http://hg.python.org/cpython/rev/f89e2f4cda88 > changeset: 75231:f89e2f4cda88 > user: Victor Stinner > date: Fri Feb 24 00:37:51 2012 +0100 > summary: > Issue #13706: Fix format(int, "n") for locale with non-ASCII thousands separator Can you please check the buildbots after you commit something large? 
From martin at v.loewis.de Fri Feb 24 11:01:27 2012
From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=)
Date: Fri, 24 Feb 2012 11:01:27 +0100
Subject: [Python-Dev] New shared-keys dictionary implementation (issue13903)
In-Reply-To: <4F475880.6050301@hotpy.org>
References: <20120220161438.9745.3132@psf.upfronthosting.co.za> <4F475880.6050301@hotpy.org>
Message-ID: <4F475FF7.2020306@v.loewis.de>

> Unfortunately it seems to be the norm in CPython to publish almost
> everything in header files that get included in "Python.h".

In many cases, this is purely for historic reasons. In many additional
cases, it's to support fast access macros, at least in the interpreter
itself, but then also in extension modules.

I agree that moving the structures into the implementation is fine, as
long as there are sufficient access functions (for dictionaries, there
are plenty, of course).

Regards,
Martin

From solipsis at pitrou.net Fri Feb 24 13:19:07 2012
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Fri, 24 Feb 2012 13:19:07 +0100
Subject: [Python-Dev] cpython: Issue #13706: Fix format(int, "n") for locale with non-ASCII thousands separator
References:
Message-ID: <20120224131907.155820cc@pitrou.net>

On Fri, 24 Feb 2012 00:49:31 +0100
victor.stinner wrote:
> http://hg.python.org/cpython/rev/f89e2f4cda88
> changeset: 75231:f89e2f4cda88
> user: Victor Stinner
> date: Fri Feb 24 00:37:51 2012 +0100
> summary:
> Issue #13706: Fix format(int, "n") for locale with non-ASCII thousands separator

Can you please check the buildbots after you commit something large?
This commit broke compilation under Windows.

cheers

Antoine.

From victor.stinner at gmail.com Fri Feb 24 13:54:18 2012
From: victor.stinner at gmail.com (Victor Stinner)
Date: Fri, 24 Feb 2012 13:54:18 +0100
Subject: [Python-Dev] cpython: Issue #13706: Fix format(int, "n") for locale with non-ASCII thousands separator
In-Reply-To: <20120224131907.155820cc@pitrou.net>
References: <20120224131907.155820cc@pitrou.net>
Message-ID:

Oh sorry and thanks for the fix.

Victor

2012/2/24 Antoine Pitrou :
> On Fri, 24 Feb 2012 00:49:31 +0100
> victor.stinner wrote:
>> http://hg.python.org/cpython/rev/f89e2f4cda88
>> changeset:   75231:f89e2f4cda88
>> user:        Victor Stinner
>> date:        Fri Feb 24 00:37:51 2012 +0100
>> summary:
>> Issue #13706: Fix format(int, "n") for locale with non-ASCII thousands separator
>
> Can you please check the buildbots after you commit something large?
> This commit broke compilation under Windows.

From tshepang at gmail.com Fri Feb 24 17:24:50 2012
From: tshepang at gmail.com (Tshepang Lekhonkhobe)
Date: Fri, 24 Feb 2012 18:24:50 +0200
Subject: [Python-Dev] Status regarding Old vs. Advanced String Formating
Message-ID:

Hi,

I was of the thought that Old String Formatting |"%s" % foo| was to be
phased out by Advanced String Formatting |"{}".format(foo)|. I however
keep seeing new code committed into the main VCS using the old style.
Is this okay? Is there a policy? I ask also because I expect CPython
to lead by example.

On another note, will the old format ever be deprecated? Is there a date?

From benjamin at python.org Fri Feb 24 17:41:31 2012
From: benjamin at python.org (Benjamin Peterson)
Date: Fri, 24 Feb 2012 11:41:31 -0500
Subject: [Python-Dev] Status regarding Old vs. Advanced String Formating
In-Reply-To:
References:
Message-ID:

2012/2/24 Tshepang Lekhonkhobe :
> Hi,
>
> I was of the thought that Old String Formatting |"%s" % foo| was to be
> phased out by Advanced String Formatting |"{}".format(foo)|. I however
> keep seeing new code committed into the main VCS using the old style.
> Is this okay? Is there a policy? I ask also because I expect CPython
> to lead by example.

Using either is fine I think. It doesn't hurt anyone to have old
string formatting around for a long time. In general, +0 for using new
string formatting.

>
> On another note, will the old format ever be deprecated? Is there a date?

I doubt it.

--
Regards,
Benjamin

From status at bugs.python.org Fri Feb 24 18:07:36 2012
From: status at bugs.python.org (Python tracker)
Date: Fri, 24 Feb 2012 18:07:36 +0100 (CET)
Subject: [Python-Dev] Summary of Python tracker Issues
Message-ID: <20120224170736.8EF241DEB5@psf.upfronthosting.co.za>

ACTIVITY SUMMARY (2012-02-17 - 2012-02-24)
Python tracker at http://bugs.python.org/

To view or respond to any of the issues listed below, click on the issue.
Do NOT respond to this message.
Issues counts and deltas: open 3277 (+20) closed 22611 (+44) total 25888 (+64) Open issues with patches: 1406 Issues opened (46) ================== #13637: binascii.a2b_* functions could accept unicode strings http://bugs.python.org/issue13637 reopened by r.david.murray #13641: decoding functions in the base64 module could accept unicode s http://bugs.python.org/issue13641 reopened by r.david.murray #14044: IncompleteRead error with urllib2 or urllib.request -- fine wi http://bugs.python.org/issue14044 opened by Alex Quinn #14046: argparse: assertion failure if optional argument has square/ro http://bugs.python.org/issue14046 opened by oxplot #14049: execfile() fails on files that use global variables inside fun http://bugs.python.org/issue14049 opened by techtonik #14050: Tutorial, list.sort() and items comparability http://bugs.python.org/issue14050 opened by sandro.tosi #14055: Implement __sizeof__ for etree Element http://bugs.python.org/issue14055 opened by loewis #14056: Misc doc changes for tarfile http://bugs.python.org/issue14056 opened by eric.araujo #14057: Speedup sysconfig startup http://bugs.python.org/issue14057 opened by haypo #14059: Implement multiprocessing.Barrier http://bugs.python.org/issue14059 opened by anacrolix #14060: Implement a CSP-style channel http://bugs.python.org/issue14060 opened by anacrolix #14061: Clean up archiving code in shutil http://bugs.python.org/issue14061 opened by eric.araujo #14062: UTF-8 Email Subject problem http://bugs.python.org/issue14062 opened by msladek #14065: Element should support cyclic GC http://bugs.python.org/issue14065 opened by loewis #14067: Avoid more stat() calls in importlib http://bugs.python.org/issue14067 opened by pitrou #14069: In extensions (?...) the lookbehind assertion cannot choose be http://bugs.python.org/issue14069 opened by py.user #14070: Idea: Add a flag to reload from source, e.g. reload(module, ig http://bugs.python.org/issue14070 opened by timClicks #14071: allow more than one hash seed per process (move _Py_HashSecret http://bugs.python.org/issue14071 opened by gregory.p.smith #14072: urlparse on tel: URI-s misses the scheme in some cases http://bugs.python.org/issue14072 opened by ivan_herman #14074: argparse does not allow nargs>1 for positional arguments but d http://bugs.python.org/issue14074 opened by tshepang #14075: argparse: unused method? 
http://bugs.python.org/issue14075 opened by tshepang #14076: sqlite3 module ignores placeholers in CREATE TRIGGER code http://bugs.python.org/issue14076 opened by GuGu #14078: Add 'sourceline' property to xml.etree Elements http://bugs.python.org/issue14078 opened by leonov #14080: Sporadic test_imp failure http://bugs.python.org/issue14080 opened by pitrou #14081: Allow "maxsplit" argument to str.split() to be passed as a key http://bugs.python.org/issue14081 opened by ncoghlan #14082: shutil doesn't support extended attributes http://bugs.python.org/issue14082 opened by pitrou #14085: PyUnicode_WRITE: "comparison is always true" warnings http://bugs.python.org/issue14085 opened by skrah #14087: multiprocessing.Condition.wait_for missing http://bugs.python.org/issue14087 opened by sbt #14088: sys.executable generating canonical path http://bugs.python.org/issue14088 opened by alvesjnr #14089: Patch to increase fractions lib test coverage http://bugs.python.org/issue14089 opened by Oleg.Plakhotnyuk #14092: __name__ inconsistently applied in class definition http://bugs.python.org/issue14092 opened by eukreign #14093: Mercurial version information not appearing in Windows builds http://bugs.python.org/issue14093 opened by vinay.sajip #14094: nt.realpath() should use GetFinalPathNameByHandle() when avail http://bugs.python.org/issue14094 opened by haypo #14095: type_new() removes __qualname__ from the input dictionary http://bugs.python.org/issue14095 opened by haypo #14097: Improve the "introduction" page of the tutorial http://bugs.python.org/issue14097 opened by ezio.melotti #14098: provide public C-API for reading/setting sys.exc_info() http://bugs.python.org/issue14098 opened by scoder #14099: ZipFile.open() should not reopen the underlying file http://bugs.python.org/issue14099 opened by kasal #14100: Add a missing info to PEP 393 + link from whatsnew 3.3 http://bugs.python.org/issue14100 opened by tshepang #14101: example function in tertools.count docstring is misindented http://bugs.python.org/issue14101 opened by zbysz #14102: argparse: add ability to create a man page http://bugs.python.org/issue14102 opened by Daniel.Walsh #14103: argparse: add ability to create a bash_completion script http://bugs.python.org/issue14103 opened by Daniel.Walsh #14104: Implement time.monotonic() on Mac OS X http://bugs.python.org/issue14104 opened by haypo #14105: Breakpoints in debug lost if line is inserted; IDLE http://bugs.python.org/issue14105 opened by ltaylor934 #14106: Distutils manifest: recursive-(include|exclude) matches suffix http://bugs.python.org/issue14106 opened by nadeem.vawda #14107: Debian bigmem buildbot hanging in test_bigmem http://bugs.python.org/issue14107 opened by nadeem.vawda #14073: allow per-thread atexit() http://bugs.python.org/issue14073 opened by tarek Most recent 15 issues with no replies (15) ========================================== #14107: Debian bigmem buildbot hanging in test_bigmem http://bugs.python.org/issue14107 #14106: Distutils manifest: recursive-(include|exclude) matches suffix http://bugs.python.org/issue14106 #14105: Breakpoints in debug lost if line is inserted; IDLE http://bugs.python.org/issue14105 #14104: Implement time.monotonic() on Mac OS X http://bugs.python.org/issue14104 #14094: nt.realpath() should use GetFinalPathNameByHandle() when avail http://bugs.python.org/issue14094 #14089: Patch to increase fractions lib test coverage http://bugs.python.org/issue14089 #14082: shutil doesn't support extended attributes 
http://bugs.python.org/issue14082 #14078: Add 'sourceline' property to xml.etree Elements http://bugs.python.org/issue14078 #14076: sqlite3 module ignores placeholers in CREATE TRIGGER code http://bugs.python.org/issue14076 #14074: argparse does not allow nargs>1 for positional arguments but d http://bugs.python.org/issue14074 #14072: urlparse on tel: URI-s misses the scheme in some cases http://bugs.python.org/issue14072 #14069: In extensions (?...) the lookbehind assertion cannot choose be http://bugs.python.org/issue14069 #14065: Element should support cyclic GC http://bugs.python.org/issue14065 #14062: UTF-8 Email Subject problem http://bugs.python.org/issue14062 #14055: Implement __sizeof__ for etree Element http://bugs.python.org/issue14055 Most recent 15 issues waiting for review (15) ============================================= #14106: Distutils manifest: recursive-(include|exclude) matches suffix http://bugs.python.org/issue14106 #14101: example function in tertools.count docstring is misindented http://bugs.python.org/issue14101 #14100: Add a missing info to PEP 393 + link from whatsnew 3.3 http://bugs.python.org/issue14100 #14099: ZipFile.open() should not reopen the underlying file http://bugs.python.org/issue14099 #14098: provide public C-API for reading/setting sys.exc_info() http://bugs.python.org/issue14098 #14097: Improve the "introduction" page of the tutorial http://bugs.python.org/issue14097 #14095: type_new() removes __qualname__ from the input dictionary http://bugs.python.org/issue14095 #14093: Mercurial version information not appearing in Windows builds http://bugs.python.org/issue14093 #14089: Patch to increase fractions lib test coverage http://bugs.python.org/issue14089 #14088: sys.executable generating canonical path http://bugs.python.org/issue14088 #14087: multiprocessing.Condition.wait_for missing http://bugs.python.org/issue14087 #14085: PyUnicode_WRITE: "comparison is always true" warnings http://bugs.python.org/issue14085 #14081: Allow "maxsplit" argument to str.split() to be passed as a key http://bugs.python.org/issue14081 #14078: Add 'sourceline' property to xml.etree Elements http://bugs.python.org/issue14078 #14075: argparse: unused method? http://bugs.python.org/issue14075 Top 10 most discussed issues (10) ================================= #13703: Hash collision security issue http://bugs.python.org/issue13703 25 msgs #14080: Sporadic test_imp failure http://bugs.python.org/issue14080 13 msgs #6884: Impossible to include file in sdist that starts with 'build' o http://bugs.python.org/issue6884 11 msgs #13641: decoding functions in the base64 module could accept unicode s http://bugs.python.org/issue13641 11 msgs #13405: Add DTrace probes http://bugs.python.org/issue13405 9 msgs #14073: allow per-thread atexit() http://bugs.python.org/issue14073 9 msgs #13447: Add tests for some scripts in Tools/scripts http://bugs.python.org/issue13447 8 msgs #13873: SIGBUS in test_big_buffer() of test_zlib on Debian bigmem buil http://bugs.python.org/issue13873 8 msgs #14088: sys.executable generating canonical path http://bugs.python.org/issue14088 8 msgs #2377: Replace __import__ w/ importlib.__import__ http://bugs.python.org/issue2377 7 msgs Issues closed (45) ================== #1659: Tests needing network flag? 
http://bugs.python.org/issue1659 closed by eric.araujo
#6039: cygwin compilers should not check compiler versions
http://bugs.python.org/issue6039 closed by eric.araujo
#6807: No such file or directory: 'msisupport.dll' in msi.py
http://bugs.python.org/issue6807 closed by loewis
#7813: Bug in command-line module launcher
http://bugs.python.org/issue7813 closed by eric.araujo
#7966: mhlib does not emit deprecation warning
http://bugs.python.org/issue7966 closed by eric.araujo
#8033: sqlite: broken long integer handling for arguments to user-def
http://bugs.python.org/issue8033 closed by python-dev
#9691: sdist includes files that are not in MANIFEST.in
http://bugs.python.org/issue9691 closed by eric.araujo
#10580: Minor grammar change in Python's MSI installer
http://bugs.python.org/issue10580 closed by loewis
#11689: sqlite: Incorrect unit test fails to detect failure
http://bugs.python.org/issue11689 closed by python-dev
#12406: msi.py needs updating for Python 3.3
http://bugs.python.org/issue12406 closed by loewis
#12627: Implement PEP 394: The "python" Command on Unix-Like Systems
http://bugs.python.org/issue12627 closed by ned.deily
#12702: shutil.copytree() should use os.lutimes() to copy the metadata
http://bugs.python.org/issue12702 closed by petri.lehtinen
#12817: test_multiprocessing: io.BytesIO() requires bytearray buffers
http://bugs.python.org/issue12817 closed by skrah
#13909: Ordering of free variables in dis is dependent on dict orderin
http://bugs.python.org/issue13909 closed by Mark.Shannon
#13974: packaging: test for set_platform()
http://bugs.python.org/issue13974 closed by eric.araujo
#13978: OSError exception in multiprocessing module when using os.remo
http://bugs.python.org/issue13978 closed by neologix
#14001: CVE-2012-0845 Python v2.7.2 / v3.2.2 (SimpleXMLRPCServer): DoS
http://bugs.python.org/issue14001 closed by neologix
#14004: Distutils filelist selects too many files on Windows
http://bugs.python.org/issue14004 closed by eric.araujo
#14005: IDLE Crash when running/saving a file
http://bugs.python.org/issue14005 closed by ned.deily
#14020: Improve HTMLParser doc
http://bugs.python.org/issue14020 closed by ezio.melotti
#14023: bytes implied to be mutable
http://bugs.python.org/issue14023 closed by terry.reedy
#14038: Packaging test support code raises exception
http://bugs.python.org/issue14038 closed by eric.araujo
#14040: Deprecate some of the module file formats
http://bugs.python.org/issue14040 closed by pitrou
#14043: Speed-up importlib's _FileFinder
http://bugs.python.org/issue14043 closed by pitrou
#14045: In regex pattern long unicode character isn't recognized by re
http://bugs.python.org/issue14045 closed by loewis
#14047: UTF-8 Email Header
http://bugs.python.org/issue14047 closed by eric.araujo
#14048: calendar bug related to September 2-14, 1752
http://bugs.python.org/issue14048 closed by mark.dickinson
#14051: Cannot set attributes on staticmethod
http://bugs.python.org/issue14051 closed by python-dev
#14052: importlib mixes up '.'
and os.getcwd() http://bugs.python.org/issue14052 closed by brett.cannon #14053: Make patchcheck work with MQ http://bugs.python.org/issue14053 closed by nadeem.vawda #14054: test_importlib failures under Windows http://bugs.python.org/issue14054 closed by brett.cannon #14058: test_sys has started failing http://bugs.python.org/issue14058 closed by vinay.sajip #14063: test_importlib failure on Mac OS X http://bugs.python.org/issue14063 closed by pitrou #14064: collections module imported twice in urllib/parse.py http://bugs.python.org/issue14064 closed by benjamin.peterson #14066: Crash in imputil.imp.find_module when a *directory* exists wit http://bugs.python.org/issue14066 closed by dmalcolm #14068: problem with re split http://bugs.python.org/issue14068 closed by ezio.melotti #14077: sporadic test_multiprocessing failure http://bugs.python.org/issue14077 closed by neologix #14079: Problems with recent test_subprocess changes http://bugs.python.org/issue14079 closed by vinay.sajip #14083: Use local timezone offset by default in datetime.timezone http://bugs.python.org/issue14083 closed by ncoghlan #14084: test_imp resource leak http://bugs.python.org/issue14084 closed by pitrou #14086: str(KeyError("Foo")) Unexpected Result http://bugs.python.org/issue14086 closed by vencabot_teppoo #14090: Bus error on test_big_buffer() of test_zlib, buildbot AMD64 de http://bugs.python.org/issue14090 closed by nadeem.vawda #14091: python subprocess hangs if script called from another director http://bugs.python.org/issue14091 closed by Massimo.Paladin #14096: IDLE quits unexpectedly when some keys are pressed http://bugs.python.org/issue14096 closed by guxianminer #1173134: improvement of the script adaptation for the win32 platform http://bugs.python.org/issue1173134 closed by eric.araujo From ncoghlan at gmail.com Fri Feb 24 18:24:27 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 25 Feb 2012 03:24:27 +1000 Subject: [Python-Dev] PEP 413: Faster evolution of the Python Standard Library Message-ID: To allow the PEP 407 authors to focus on making the case for doing full CPython releases every 6 months (including language spec updates), I've created PEP 413 as a competing proposal. It leaves the current language versioning (and rate of change) alone, while adding an additional date based standard library version number and adopting a new development workflow that will allow "standard library" releases to be made alongside each new maintenance release. PEP text is below, or you can read it online: http://www.python.org/dev/peps/pep-0413/ Cheers, Nick. PEP: 413 Title: Faster evolution of the Python Standard Library Version: $Revision$ Last-Modified: $Date$ Author: Nick Coghlan Status: Draft Type: Process Content-Type: text/x-rst Created: 2012-02-24 Post-History: 2012-02-24 Resolution: TBD Abstract ======== This PEP proposes the adoption of a new date-based versioning scheme for the standard library (distinct from, but coupled to, the existing language versioning scheme) that allows accelerated releases of the Python standard library, while maintaining (or even slowing down) the current rate of change in the core language definition. Like PEP 407, it aims to adjust the current balance between measured change that allows the broader community time to adapt and being able to keep pace with external influences that evolve more rapidly than the current release cycle can handle (this problem is particularly notable for standard library elements that relate to web technologies). 
However, it's more conservative in its aims than PEP 407, seeking to restrict the increased pace of development to builtin and standard library interfaces, without affecting the rate of change for other elements such as the language syntax and version numbering as well as the CPython binary API and bytecode format.

Rationale
=========

To quote the PEP 407 abstract:

Finding a release cycle for an open-source project is a delicate exercise in managing mutually contradicting constraints: developer manpower, availability of release management volunteers, ease of maintenance for users and third-party packagers, quick availability of new features (and behavioural changes), availability of bug fixes without pulling in new features or behavioural changes.

The current release cycle errs on the conservative side. It is adequate for people who value stability over reactivity. This PEP is an attempt to keep the stability that has become a Python trademark, while offering a more fluid release of features, by introducing the notion of long-term support versions.

I agree with the PEP 407 authors that the current release cycle of the *standard library* is too slow to effectively cope with the pace of change in some key programming areas (specifically, web protocols and related technologies, including databases, templating and serialisation formats).

However, I have written this competing PEP because I believe that the approach proposed in PEP 407 of offering full, potentially binary incompatible releases of CPython every 6 months places too great a burden on the wider Python ecosystem.

Under the current CPython release cycle, distributors of key binary extensions will often support Python releases even after the CPython branches enter "security fix only" mode (for example, Twisted currently ships binaries for 2.5, 2.6 and 2.7, NumPy and SciPy support those 3 along with 3.1 and 3.2, PyGame adds a 2.4 binary release, wxPython provides both 32-bit and 64-bit binaries for 2.6 and 2.7, etc). If CPython were to triple (or more) its rate of releases, the developers of those libraries (many of which are even more resource starved than CPython) would face an unpalatable choice: either adopt the faster release cycle themselves (up to 18 simultaneous binary releases for PyGame!), drop older Python versions more quickly, or else tell their users to stick to the CPython LTS releases (thus defeating the entire point of speeding up the CPython release cycle in the first place).

Similarly, many support tools for Python (e.g. syntax highlighters) can take quite some time to catch up with language level changes.

At a cultural level, the Python community is also accustomed to a certain meaning for Python version numbers - they're linked to deprecation periods, support periods, all sorts of things. PEP 407 proposes that collective knowledge all be swept aside, without offering a compelling rationale for why such a course of action is actually *necessary* (aside from, perhaps, making the lives of the CPython core developers a little easier at the expense of everyone else).

But, if we go back to the primary rationale for increasing the pace of change (i.e. more timely support for web protocols and related technologies), we can note that those only require *standard library* changes.
That means many (perhaps even most) of the negative effects on the wider community can be avoided by explicitly limiting which parts of CPython are affected by the new release cycle, and allowing other parts to evolve at their current, more sedate, pace.

Proposal
========

This PEP proposes the addition of a new ``sys.stdlib_info`` attribute that records a date based standard library version above and beyond the underlying interpreter version::

    sys.stdlib_info(year=2012, month=8, micro=0, releaselevel='final', serial=0)

This information would also be included in the ``sys.version`` string::

    Python 3.3.0 (12.08.0, default:c1a07c8092f7+, Feb 17 2012, 23:03:41) [GCC 4.6.1]

When maintenance releases are created, *two* new versions of Python would actually be published on python.org (using the first 3.3 maintenance release, planned for February 2013 as an example)::

    3.3.1 + 12.08.1  # Maintenance release
    3.3.1 + 13.02.0  # Standard library release

A standard library release would just be the corresponding maintenance release, with the following additional, backwards compatible changes:

* new features in pure Python modules
* new features in C extension modules (subject to PEP 399 compatibility requirements)
* new features in language builtins (provided the C ABI remains unaffected)

A further 6 months later, the next 3.3 maintenance release would again be accompanied by a new standard library release::

    3.3.2 + 12.08.2  # Maintenance release
    3.3.2 + 13.08.1  # Standard library release

Again, the standard library release would be binary compatible with the previous language release, merely offering additional features at the Python level.

Finally, 18 months after the release of 3.3, a new language release would be made around the same time as the final 3.3 maintenance release::

    3.3.3 + 12.08.3  # Maintenance release
    3.4.0 + 14.02.0  # Language release

Language releases would then contain all the features that are not permitted in standard library releases:

* new language syntax
* new deprecation warnings
* removal of previously deprecated features
* changes to the emitted bytecode
* changes to the AST
* any other significant changes to the compilation toolchain
* changes to the C ABI

The 3.4 release cycle would then follow a similar pattern to that for 3.3::

    3.4.1 + 14.02.1  # Maintenance release
    3.4.1 + 14.08.0  # Standard library release
    3.4.2 + 14.02.2  # Maintenance release
    3.4.2 + 15.02.0  # Standard library release
    3.4.3 + 14.02.3  # Maintenance release
    3.5.0 + 15.08.0  # Language release
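An illustrative sketch of how user code could consume the proposed attribute (hypothetical: ``sys.stdlib_info`` exists only in this proposal and was never added to CPython)::

    import sys

    # Gate a feature on the date-based standard library version,
    # independently of the language version.
    stdlib = getattr(sys, "stdlib_info", None)
    if stdlib is not None and (stdlib.year, stdlib.month) >= (2013, 2):
        pass  # rely on a 13.02.0+ standard library feature
    else:
        pass  # fall back to what the base 3.3.0 (12.08.0) stdlib offers
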
Effects
=======

Effect on development cycle
---------------------------

Similar to PEP 407, this PEP will break up the delivery of new features into more discrete chunks. Instead of a whole raft of changes landing all at once in a language release, each language release will be limited to 6 months worth of standard library changes, as well as any changes associated with new syntax.

Effect on workflow
------------------

This PEP proposes the creation of a single additional branch for use in the normal workflow. After the release of 3.3, the following branches would be in use::

    2.7         # Maintenance branch, no change
    3.3         # Maintenance branch, as for 3.2
    3.3-compat  # New branch, backwards compatible changes
    default     # Language changes, standard library updates that depend on them

When working on a new feature, developers will need to decide whether or not it is an acceptable change for a standard library release. If so, then it should be checked in on ``3.3-compat`` and then merged to ``default``. Otherwise it should be checked in directly to ``default``.

Effect on bugfix cycle
----------------------

The effect on the bug fix cycle is essentially the same as that on the workflow for new features - there is one additional branch to pass through before the change reaches the default branch.

Effect on the community
-----------------------

PEP 407 has this to say about the effects on the community:

People who value stability can just synchronize on the LTS releases which, with the proposed figures, would give a similar support cycle (both in duration and in stability).

I believe this statement is just plain wrong. Life isn't that simple. Instead, developers of third party modules and frameworks will come under pressure to support the full pace of the new release cycle with binary updates, teachers and book authors will receive complaints that they're only covering an "old" version of Python ("You're only using 3.3, the latest is 3.5!"), etc.

As the minor version number starts climbing 3 times faster than it has in the past, I believe perceptions of language stability would also fall (whether such opinions were justified or not).

I believe isolating the increased pace of change to the standard library, and clearly delineating it with a separate date-based version number will greatly reassure the rest of the community that no, we're not suddenly asking them to triple their own rate of development. Instead, we're merely going to ship standard library updates for the next language release in three 6-monthly installments rather than delaying them all, even those that are backwards compatible with the previously released version of Python.

The community benefits listed in PEP 407 are equally applicable to this PEP, at least as far as the standard library is concerned:

People who value reactivity and access to new features (without taking the risk to install alpha versions or Mercurial snapshots) would get much more value from the new release cycle than currently.

People who want to contribute new features or improvements would be more motivated to do so, knowing that their contributions will be more quickly available to normal users.

If the faster release cycle encourages more people to focus on contributing to the standard library rather than proposing changes to the language definition, I don't see that as a bad thing.

Handling News Updates
=====================

What's New?
-----------

The "What's New" documents would be split out into separate documents for standard library releases and language releases. If the major version number only continues to increase once every decade or so, resolving the eventual numbering conflict can be safely deemed somebody else's problem :)

NEWS
----

Merge conflicts on the NEWS file are already a hassle. Since this PEP proposes introduction of an additional branch into the normal workflow, resolving this becomes even more critical. While Mercurial phases will help to some degree, it would be good to eliminate the problem entirely.

One suggestion from Barry Warsaw is to adopt a non-conflicting separate-files-per-change approach, similar to that used by Twisted [2_].
For this PEP, one possible layout for such an approach (adopted following the release of 3.3.0+12.8.0 using the existing NEWS process) might look like::

    Misc/
      lang_news/
        3.3.1/
        3.4.0/
      stdlib_news/
        12.08.1/
          builtins/
          extensions/
          library/
          documentation/
          tests/
        13.02.0/
          builtins/
          extensions/
          library/
          documentation/
          tests/
      NEWS  # Now autogenerated from lang_news and stdlib_news

Putting the version information in the directory hierarchy isn't strictly necessary (since the NEWS file generator could figure it out from the version history), but does make it easy for *humans* to keep the different versions in order.

Why isn't PEP 384 enough?
=========================

PEP 384 introduced the notion of a "Stable ABI" for CPython, a limited subset of the full C ABI that is guaranteed to remain stable. Extensions built against the stable ABI should be able to support all subsequent Python versions with the same binary.

This will help new projects to avoid coupling their C extension modules too closely to a specific version of CPython. For existing modules, however, migrating to the stable ABI can involve quite a lot of work (especially for extension modules that define a lot of classes). With limited development resources available, any time spent on such a change is time that could otherwise have been spent working on features that offer more direct benefits to end users.

Why not separate out the standard library entirely?
===================================================

Because it's a lot of work for next to no pay-off. CPython without the standard library is useless (the build chain won't even finish). You can't create a standalone pure Python standard library, because too many "modules" are actually tightly linked in to the internal details of the respective interpreters (e.g. ``weakref``, ``gc``, ``sys``).

Creating a separate development branch that is kept compatible with the previous feature release should provide most of the benefits of a separate standard library repository with only a fraction of the pain.

Acknowledgements
================

Thanks go to the PEP 407 authors for starting this discussion, as well as to those authors and Larry Hastings for initial discussions of the proposal made in this PEP.

References
==========

.. [1] PEP 407: New release cycle and introducing long-term support versions
   http://www.python.org/dev/peps/pep-0407/

.. [2] Twisted's "topfiles" approach to NEWS generation
   http://twistedmatrix.com/trac/wiki/ReviewProcess#Newsfiles

Copyright
=========

This document has been placed in the public domain.

..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End:

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From ncoghlan at gmail.com Fri Feb 24 18:09:48 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 25 Feb 2012 03:09:48 +1000
Subject: [Python-Dev] Status regarding Old vs. Advanced String Formating
In-Reply-To:
References:
Message-ID:

On Sat, Feb 25, 2012 at 2:41 AM, Benjamin Peterson wrote:
> 2012/2/24 Tshepang Lekhonkhobe :
>> Hi,
>>
>> I was of the thought that Old String Formatting |"%s" % foo| was to be
>> phased out by Advanced String Formatting |"{}".format(foo)|. I however
>> keep seeing new code committed into the main VCS using the old style.
>> Is this okay? Is there a policy? I ask also because I expect CPython
>> to lead by example.
>
> Using either is fine I think. It doesn't hurt anyone to have old
> string formatting around for a long time.
> In general, +0 for using new
> string formatting.

Yep. Also, the two can work nicely in tandem as templates for each
other that don't need tons of escaping.

>>
>> On another note, will the old format ever be deprecated? Is there a date?
>
> I doubt it.

*If* the old format were ever to be deprecated (and that's a very big
if), Python 4k would be the earliest it could happen. More likely the
old format will just hang around indefinitely, though.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From solipsis at pitrou.net Fri Feb 24 18:46:22 2012
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Fri, 24 Feb 2012 18:46:22 +0100
Subject: [Python-Dev] PEP 413: Faster evolution of the Python Standard Library
References:
Message-ID: <20120224184622.5df22ade@pitrou.net>

Hello,

On Sat, 25 Feb 2012 03:24:27 +1000
Nick Coghlan wrote:
> To allow the PEP 407 authors to focus on making the case for doing
> full CPython releases every 6 months (including language spec
> updates), I've created PEP 413 as a competing proposal.
>
> It leaves the current language versioning (and rate of change) alone,
> while adding an additional date based standard library version number
> and adopting a new development workflow that will allow "standard
> library" releases to be made alongside each new maintenance release.

Overall, I like the principle of this PEP, but I really dislike the
dual version numbering it introduces. Such a numbering scheme will be
cryptic and awkward for anyone but Python specialists.

I also think the branches and releases management should be even
simpler:

- 2.7: as today
- 3.3: bug fixes + stdlib enhancements
- default: language enhancements / ABI-breaking changes

Every 6 months, a new stdlib + bugfix release would be cut (3.3.1,
3.3.2, etc.), while language enhancement releases (3.4, 3.5...) would
still happen every 18 months.

If people really want some bugfix releases without any stdlib
enhancements, we have two solutions:

- let them handle patch maintenance themselves
- have a community-maintained bugfix branch or repo somewhere, where
interested contributors can backport selected bugfixes

Regards

Antoine.

From ethan at stoneleaf.us Fri Feb 24 18:00:12 2012
From: ethan at stoneleaf.us (Ethan Furman)
Date: Fri, 24 Feb 2012 09:00:12 -0800
Subject: [Python-Dev] Status regarding Old vs. Advanced String Formating
In-Reply-To:
References:
Message-ID: <4F47C21C.50805@stoneleaf.us>

Tshepang Lekhonkhobe wrote:
> Hi,
>
> I was of the thought that Old String Formatting |"%s" % foo| was to be
> phased out by Advanced String Formatting |"{}".format(foo)|. I however
> keep seeing new code committed into the main VCS using the old style.
> Is this okay? Is there a policy? I ask also because I expect CPython
> to lead by example.
>
> On another note, will the old format ever be deprecated? Is there a date?

Brett Cannon wrote:
> On Tue, Feb 22, 2011 at 10:43, Ethan Furman wrote:
>
>> Greetings!
>>
>> According to these release notes in Python 3.0, %-formatting will be
>> going away.
>>
>> http://docs.python.org/release/3.0.1/whatsnew/3.0.html#pep-3101-a-new-approach-to-string-formatting
>>
>> However, I was unable to find any further evidence of actual
>> deprecation in 3.1 or 3.2... does anybody know the status of this?
>>
>
> There isn't one. =)
>
> The very long term view is for %-formatting to go away, but that's as
> far as the thinking has gone.
> There are currently no plans to
> introduce any deprecation warning, and I highly doubt we will even
> remove the feature in Python 3, giving you probably at least another
> decade of use at our current major version release schedule. =)

From g.brandl at gmx.net Fri Feb 24 19:23:36 2012
From: g.brandl at gmx.net (Georg Brandl)
Date: Fri, 24 Feb 2012 19:23:36 +0100
Subject: [Python-Dev] PEP 413: Faster evolution of the Python Standard Library
In-Reply-To: <20120224184622.5df22ade@pitrou.net>
References: <20120224184622.5df22ade@pitrou.net>
Message-ID:

On 24.02.2012 18:46, Antoine Pitrou wrote:
>
> Hello,
>
> On Sat, 25 Feb 2012 03:24:27 +1000
> Nick Coghlan wrote:
>> To allow the PEP 407 authors to focus on making the case for doing
>> full CPython releases every 6 months (including language spec
>> updates), I've created PEP 413 as a competing proposal.
>>
>> It leaves the current language versioning (and rate of change) alone,
>> while adding an additional date based standard library version number
>> and adopting a new development workflow that will allow "standard
>> library" releases to be made alongside each new maintenance release.
>
> Overall, I like the principle of this PEP, but I really dislike the
> dual version numbering it introduces. Such a numbering scheme will be
> cryptic and awkward for anyone but Python specialists.

I agree.

> I also think the branches and releases management should be even
> simpler:
>
> - 2.7: as today
> - 3.3: bug fixes + stdlib enhancements
> - default: language enhancements / ABI-breaking changes
>
> Every 6 months, a new stdlib + bugfix release would be cut (3.3.1,
> 3.3.2, etc.), while language enhancement releases (3.4, 3.5...) would
> still happen every 18 months.

Sorry, I don't think that's feasible at all. For one, it removes the
possibility to target a stable set of features for a longer time.

In short, the only usable solution I see is PEP 407-style versioning
with language changes only in LTS releases.

Georg

From brett at python.org Fri Feb 24 19:59:55 2012
From: brett at python.org (Brett Cannon)
Date: Fri, 24 Feb 2012 13:59:55 -0500
Subject: [Python-Dev] PEP 413: Faster evolution of the Python Standard Library
In-Reply-To:
References: <20120224184622.5df22ade@pitrou.net>
Message-ID:

On Fri, Feb 24, 2012 at 13:23, Georg Brandl wrote:

> On 24.02.2012 18:46, Antoine Pitrou wrote:
> >
> > Hello,
> >
> > On Sat, 25 Feb 2012 03:24:27 +1000
> > Nick Coghlan wrote:
> >> To allow the PEP 407 authors to focus on making the case for doing
> >> full CPython releases every 6 months (including language spec
> >> updates), I've created PEP 413 as a competing proposal.
> >>
> >> It leaves the current language versioning (and rate of change) alone,
> >> while adding an additional date based standard library version number
> >> and adopting a new development workflow that will allow "standard
> >> library" releases to be made alongside each new maintenance release.
> >
> > Overall, I like the principle of this PEP, but I really dislike the
> > dual version numbering it introduces. Such a numbering scheme will be
> > cryptic and awkward for anyone but Python specialists.
>
> I agree.
>

Ditto. You could also mention that this could help other VMs by getting
compatibility fixes into the stdlib faster than they do currently, letting
other VMs hit minor release compat faster.
> > I also think the branches and releases management should be even
> > simpler:
> >
> > - 2.7: as today
> > - 3.3: bug fixes + stdlib enhancements
> > - default: language enhancements / ABI-breaking changes
> >
> > Every 6 months, a new stdlib + bugfix release would be cut (3.3.1,
> > 3.3.2, etc.), while language enhancement releases (3.4, 3.5...) would
> > still happen every 18 months.
>
> Sorry, I don't think that's feasible at all. For one, it removes the
> possibility to target a stable set of features for a longer time.
>
> In short, the only usable solution I see is PEP 407-style versioning
> with language changes only in LTS releases.

While I personally would rather switch to making the major version mean a
language change has occurred instead of reserving it for completely
backwards-incompatible language changes, I know history is going to lead
people to kill that idea, so I agree with Georg on using an LTS delineation
for language-changing releases and keeping them patched up until the next
LTS is better.

IOW we cut 3.3.0 as an LTS and then have 3.4 and 3.5 only contain stdlib
changes while also releasing 3.3.1 and 3.3.2 for bugfixes on 3.3. There
would be no micro releases for 3.4 and 3.5 (sans an emergency brown bag
release) since the next release is coming in 6 months anyway with any
previous bugfixes + changes that might break compatibility with 3.3 in the
stdlib.

My worry with Antoine's approach is that it will cause pain for people by
us mucking around with stdlib stuff that will break compatibility somehow,
preventing some people from upgrading immediately while leaving them out in
the cold when it comes to bugfixes since they can't grab the next release
yet. The only way for Antoine's approach to work is if we made some release
count guarantee (like 3 releases) for anything that might break someone,
making stdlib-only releases more accumulative than iterative.

From tjreedy at udel.edu Fri Feb 24 20:52:09 2012
From: tjreedy at udel.edu (Terry Reedy)
Date: Fri, 24 Feb 2012 14:52:09 -0500
Subject: [Python-Dev] Status regarding Old vs. Advanced String Formating
In-Reply-To:
References:
Message-ID:

On 2/24/2012 11:41 AM, Benjamin Peterson wrote:
> 2012/2/24 Tshepang Lekhonkhobe:
>> Hi,
>>
>> I was of the thought that Old String Formatting |"%s" % foo| was to be
>> phased out by Advanced String Formatting |"{}".format(foo)|. I however
>> keep seeing new code committed into the main VCS using the old style.
>> Is this okay? Is there a policy? I ask also because I expect CPython
>> to lead by example.
>
> Using either is fine I think. It doesn't hurt anyone to have old
> string formatting around for a long time.

It is a burden for some people to learn and remember the exact details
of both systems and exactly how they differ. Having both in the stdlib
hurts readability for such people. I would prefer that the stdlib only
used {} formatting or if not that, that it only used the simple,
hard-to-forget forms of % formatting (%s, %d, %f, without elaboration).

--
Terry Jan Reedy

From martin at v.loewis.de Fri Feb 24 22:39:47 2012
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Fri, 24 Feb 2012 22:39:47 +0100
Subject: [Python-Dev] Status regarding Old vs. Advanced String Formating
In-Reply-To:
References:
Message-ID: <4F4803A3.7040803@v.loewis.de>

> It is a burden for some people to learn and remember the exact details
> of both systems and exactly how they differ.
> Having both in the stdlib
> hurts readability for such people. I would prefer that the stdlib only
> used {} formatting or if not that, that it only used the simple,
> hard-to-forget forms of % formatting (%s, %d, %f, without elaboration).

If that issue was getting serious, I would prefer if the .format method
was deprecated, and only % formatting was kept.

I doubt this is as much of an issue as you think, though.

Regards,
Martin

From martin at v.loewis.de Fri Feb 24 22:37:40 2012
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Fri, 24 Feb 2012 22:37:40 +0100
Subject: [Python-Dev] Status regarding Old vs. Advanced String Formating
In-Reply-To:
References:
Message-ID: <4F480324.7000308@v.loewis.de>

> I was of the thought that Old String Formatting |"%s" % foo| was to be
> phased out by Advanced String Formatting |"{}".format(foo)|.

This is actually not the case, and never was. Some people would indeed
like to see that happen, and others are strongly opposed.

As a consequence, both APIs for formatting will coexist for a long time
to come (ten years at least); no deprecation is planned.

Regards,
Martin

From tshepang at gmail.com Fri Feb 24 22:52:01 2012
From: tshepang at gmail.com (Tshepang Lekhonkhobe)
Date: Fri, 24 Feb 2012 23:52:01 +0200
Subject: [Python-Dev] Status regarding Old vs. Advanced String Formating
In-Reply-To: <4F4803A3.7040803@v.loewis.de>
References: <4F4803A3.7040803@v.loewis.de>
Message-ID:

On Fri, Feb 24, 2012 at 23:39, "Martin v. Löwis" wrote:
>> It is a burden for some people to learn and remember the exact details
>> of both systems and exactly how they differ. Having both in the stdlib
>> hurts readability for such people. I would prefer that the stdlib only
>> used {} formatting or if not that, that it only used the simple,
>> hard-to-forget forms of % formatting (%s, %d, %f, without elaboration).
>
> If that issue was getting serious, I would prefer if the .format method
> was deprecated, and only % formatting was kept.

Why is that? Isn't .format regarded superior? Or is this just a matter of taste?

From greg at krypto.org Fri Feb 24 22:56:44 2012
From: greg at krypto.org (Gregory P. Smith)
Date: Fri, 24 Feb 2012 13:56:44 -0800
Subject: [Python-Dev] Status regarding Old vs. Advanced String Formating
In-Reply-To:
References: <4F4803A3.7040803@v.loewis.de>
Message-ID:

On Fri, Feb 24, 2012 at 1:52 PM, Tshepang Lekhonkhobe wrote:
> On Fri, Feb 24, 2012 at 23:39, "Martin v. Löwis" wrote:
>>> It is a burden for some people to learn and remember the exact details
>>> of both systems and exactly how they differ. Having both in the stdlib
>>> hurts readability for such people. I would prefer that the stdlib only
>>> used {} formatting or if not that, that it only used the simple,
>>> hard-to-forget forms of % formatting (%s, %d, %f, without elaboration).
>>
>> If that issue was getting serious, I would prefer if the .format method
>> was deprecated, and only % formatting was kept.
>
> Why is that? Isn't .format regarded superior? Or is this just a matter of taste?

It has superior features, but its current implementation is much slower
and there is a HUGE body of existing code that would need conversion (a
lot of that is automatable), including most uses of the logging module.
% formatting is also familiar to anyone who uses C and C++ on a regular
basis.

as martin said, both will exist for a long time.

-gps
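For readers comparing the two systems discussed in this thread, a few
equivalent spellings (an illustration added here, not from the original
posts):

    name, count = "spam", 3

    "%s occurs %d times" % (name, count)        # printf-style, positional
    "{} occurs {} times".format(name, count)    # str.format, auto-numbered

    "%(name)s" % {"name": name}                 # printf-style, named
    "{name}".format(name=name)                  # str.format, named
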
From barry at python.org Fri Feb 24 23:02:55 2012
From: barry at python.org (Barry Warsaw)
Date: Fri, 24 Feb 2012 17:02:55 -0500
Subject: [Python-Dev] Status regarding Old vs. Advanced String Formating
In-Reply-To: <4F4803A3.7040803@v.loewis.de>
References: <4F4803A3.7040803@v.loewis.de>
Message-ID: <20120224170255.5df74a81@resist.wooz.org>

On Feb 24, 2012, at 10:39 PM, Martin v. Löwis wrote:

>> It is a burden for some people to learn and remember the exact details
>> of both systems and exactly how they differ. Having both in the stdlib
>> hurts readability for such people. I would prefer that the stdlib only
>> used {} formatting or if not that, that it only used the simple,
>> hard-to-forget forms of % formatting (%s, %d, %f, without elaboration).
>
>If that issue was getting serious, I would prefer if the .format method
>was deprecated, and only % formatting was kept.
>
>I doubt this is as much of an issue as you think, though.

I personally prefer .format() these days, but I agree that it will be a long
time, if ever, we see one of the styles getting dropped. I don't have much
of a problem with the stdlib containing both styles as appropriate and
preferred by the maintainer, as long as the module is consistent.

Also, modules which provide a format-ish API should accept both styles, such
as what logging currently does. (It's nice that logging also accepts PEP 292
$-strings too, don't forget about those. :)

-Barry
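On the logging behaviour Barry mentions: since Python 3.2,
logging.Formatter accepts a ``style`` argument selecting any of the
three formatting systems (a brief illustration, with hypothetical
format strings):

    import logging

    # All three formatters produce equivalent output; only the
    # template syntax differs.
    percent_fmt = logging.Formatter("%(levelname)s:%(name)s:%(message)s")
    brace_fmt = logging.Formatter("{levelname}:{name}:{message}", style="{")
    dollar_fmt = logging.Formatter("${levelname}:${name}:${message}", style="$")
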
From martin at v.loewis.de Sat Feb 25 01:20:39 2012
From: martin at v.loewis.de (martin at v.loewis.de)
Date: Sat, 25 Feb 2012 01:20:39 +0100
Subject: [Python-Dev] Status regarding Old vs. Advanced String Formating
In-Reply-To:
References: <4F4803A3.7040803@v.loewis.de>
Message-ID: <20120225012039.Horde.G9vmccL8999PSClXHxABAoA@webmail.df.eu>

Quoting Tshepang Lekhonkhobe :

> On Fri, Feb 24, 2012 at 23:39, "Martin v. Löwis" wrote:
>>> It is a burden for some people to learn and remember the exact details
>>> of both systems and exactly how they differ. Having both in the stdlib
>>> hurts readability for such people. I would prefer that the stdlib only
>>> used {} formatting or if not that, that it only used the simple,
>>> hard-to-forget forms of % formatting (%s, %d, %f, without elaboration).
>>
>> If that issue was getting serious, I would prefer if the .format method
>> was deprecated, and only % formatting was kept.
>
> Why is that? Isn't .format regarded superior?

I find the .format syntax too complicated and difficult to learn. It
has so many bells and whistles, making it more than just a *mini*
language. So for my own code, I always prefer % formatting for
simplicity.

Regards,
Martin

From breamoreboy at yahoo.co.uk Sat Feb 25 01:23:41 2012
From: breamoreboy at yahoo.co.uk (Mark Lawrence)
Date: Sat, 25 Feb 2012 00:23:41 +0000
Subject: [Python-Dev] Status regarding Old vs. Advanced String Formating
In-Reply-To: <4F480324.7000308@v.loewis.de>
References: <4F480324.7000308@v.loewis.de>
Message-ID:

On 24/02/2012 21:37, "Martin v. Löwis" wrote:
>> I was of the thought that Old String Formatting |"%s" % foo| was to be
>> phased out by Advanced String Formatting |"{}".format(foo)|.
>
> This is actually not the case, and never was. Some people would indeed
> like to see that happen, and others are strongly opposed.
>
> As a consequence, both APIs for formatting will coexist for a long time
> to come (ten years at least); no deprecation is planned.
>
> Regards,
> Martin

Quoting the docs http://docs.python.org/py3k/library/stdtypes.html

4.6.2. Old String Formatting Operations

Note

The formatting operations described here are obsolete and may go away
in future versions of Python. Use the new String Formatting in new code.

I think this is daft because all of the code has to be supported for the
ten years that MVL has suggested. I suggest a plan that says something
like:-

Until Python 3.5 both methods of string formatting will be supported.
In Python 3.6 the old formatting style will be deprecated.
In Python 3.7 the old style is dead.

I'm fully aware that it isn't likely to be that easy; I'm simply trying
to spark ideas from the core developers and users who are in a far
better situation to judge this situation than I am.

--
Cheers.

Mark Lawrence.

From brian at python.org Sat Feb 25 03:05:33 2012
From: brian at python.org (Brian Curtin)
Date: Fri, 24 Feb 2012 20:05:33 -0600
Subject: [Python-Dev] Status regarding Old vs. Advanced String Formating
In-Reply-To:
References: <4F480324.7000308@v.loewis.de>
Message-ID:

On Feb 24, 2012 6:26 PM, "Mark Lawrence" wrote:
>
> On 24/02/2012 21:37, "Martin v. Löwis" wrote:
>>>
>>> I was of the thought that Old String Formatting |"%s" % foo| was to be
>>> phased out by Advanced String Formatting |"{}".format(foo)|.
>>
>>
>> This is actually not the case, and never was. Some people would indeed
>> like to see that happen, and others are strongly opposed.
>>
>> As a consequence, both APIs for formatting will coexist for a long time
>> to come (ten years at least); no deprecation is planned.
>>
>> Regards,
>> Martin
>
>
>
> Quoting the docs http://docs.python.org/py3k/library/stdtypes.html
>
> 4.6.2. Old String Formatting Operations
>
> Note
>
> The formatting operations described here are obsolete and may go away in
future versions of Python. Use the new String Formatting in new code.
>
>
>
> I think this is daft because all of the code has to be supported for the
ten years that MVL has suggested. I suggest a plan that says something
like:-
>
> Until Python 3.5 both methods of string formatting will be supported.
> In Python 3.6 the old formatting style will be deprecated.
> In Python 3.7 the old style is dead.
>
> I'm fully aware that it isn't likely to be that easy; I'm simply trying
to spark ideas from the core developers and users who are in a far better
situation to judge this situation than I am.

-infinity. We can't do that as outlined earlier in the thread.

From anacrolix at gmail.com Sat Feb 25 03:08:05 2012
From: anacrolix at gmail.com (Matt Joiner)
Date: Sat, 25 Feb 2012 10:08:05 +0800
Subject: [Python-Dev] PEP 413: Faster evolution of the Python Standard Library
In-Reply-To:
References: <20120224184622.5df22ade@pitrou.net>
Message-ID:

I think every minor release should be fully supported. The current rate of
change is very high and there's a huge burden on implementers and
production users to keep up, so much so that upgrading is undesirable
except for serious enthusiasts.

Include just the basics and CPython specific modules in the core release
and version the stdlib separately. The stdlib should be supported such
that it can be installed to an arbitrary version of Python.

Better yet I'd like to see the stdlib become a list of vetted external
libraries that meet some requirements on usefulness, stability and
compatibility (PEP), that get cut at regular intervals.
This takes the
burden away from core, improves innovation, allows for different
implementations, and ensures that the Python package management system is
actually useful.

From brett at python.org Sat Feb 25 03:53:26 2012
From: brett at python.org (Brett Cannon)
Date: Fri, 24 Feb 2012 21:53:26 -0500
Subject: [Python-Dev] PEP 413: Faster evolution of the Python Standard Library
In-Reply-To:
References: <20120224184622.5df22ade@pitrou.net>
Message-ID:

On Fri, Feb 24, 2012 at 21:08, Matt Joiner wrote:

> I think every minor release should be fully supported. The current rate of
> change is very high and there's a huge burden on implementers and
> production users to keep up, so much so that upgrading is undesirable
> except for serious enthusiasts.
>
> Include just the basics and CPython specific modules in the core release
> and version the stdlib separately. The stdlib should be supported such that
> it can be installed to an arbitrary version of Python.
>

That idea has been put forth and shot down. The stdlib has to be tied to
at least some version of Python just like any other project. Plus the
stdlib is where we try out new language features to make sure they make
sense. Making it a separate project is not that feasible.

> Better yet I'd like to see the stdlib become a list of vetted external
> libraries that meet some requirements on usefulness, stability and
> compatibility (PEP), that get cut at regular intervals. This takes the
> burden away from core, improves innovation, allows for different
> implementations, and ensures that the Python package management system is
> actually useful.
>

That's been called a sumo release and proposed before, but no one has
taken the time to do it (although the 3rd-party releases of Python somewhat
take this view). Thinning out the stdlib in favour of the community
providing solutions is another can of worms which does not directly impact
the discussion of how to handle stdlib releases unless you are pushing to
simply drop the stdlib, which is not possible, as Python itself depends on
it.

From ned at nedbatchelder.com Sat Feb 25 04:10:28 2012
From: ned at nedbatchelder.com (Ned Batchelder)
Date: Fri, 24 Feb 2012 22:10:28 -0500
Subject: [Python-Dev] Status regarding Old vs. Advanced String Formating
In-Reply-To:
References: <4F480324.7000308@v.loewis.de>
Message-ID: <4F485124.1020800@nedbatchelder.com>

On 2/24/2012 7:23 PM, Mark Lawrence wrote:
> I think this is daft because all of the code has to be supported for
> the ten years that MVL has suggested. I suggest a plan that says
> something like:-
>
> Until Python 3.5 both methods of string formatting will be supported.
> In Python 3.6 the old formatting style will be deprecated.
> In Python 3.7 the old style is dead.
>
> I'm fully aware that it isn't likely to be that easy; I'm simply
> trying to spark ideas from the core developers and users who are in a
> far better situation to judge this situation than I am.

I don't understand why we'd even consider getting rid of old-style
formatting. Python does a great job keeping things working into the
future, and there are so many features in the language and library that
are there to keep old code working in spite of newer ways to accomplish
the same task. Has Python *ever* removed a feature except in X.0
releases? Why are we even discussing this?
Two ways to format strings is no big deal, especially considering how
heavily used these tools are. And btw, if you include the
almost-never-mentioned string.Template, there are at least three ways to
do it, maybe more.

--Ned.

From anacrolix at gmail.com Sat Feb 25 04:32:36 2012
From: anacrolix at gmail.com (Matt Joiner)
Date: Sat, 25 Feb 2012 11:32:36 +0800
Subject: [Python-Dev] PEP 413: Faster evolution of the Python Standard Library
In-Reply-To:
References: <20120224184622.5df22ade@pitrou.net>
Message-ID:

Why not cut the (external) stdlib before each minor release? Testing new
language features is not the role of a public release; this is no reason
to require ownership on a module.

Evidently some modules have to ship with core if they are required (sys),
or expose internals (os, gc). Others are clearly extras (async{ore,hat},
subprocess, unittest, select).

There are so many third party modules languishing because inferior forms
exist in the stdlib, and no centralized method for their recommendation
and discovery. Breaking out optional parts of the stdlib is an enabling
step towards addressing this.

I would suggest Haskell, node.js and golang as examples of how stdlibs
are minimal enough to define basic idiomatic interfaces but allow and
encourage extension.

On Feb 25, 2012 10:53 AM, "Brett Cannon" wrote:
>
>
> On Fri, Feb 24, 2012 at 21:08, Matt Joiner wrote:
>
>> I think every minor release should be fully supported. The current rate
>> of change is very high and there's a huge burden on implementers and
>> production users to keep up, so much so that upgrading is undesirable
>> except for serious enthusiasts.
>>
>> Include just the basics and CPython specific modules in the core release
>> and version the stdlib separately. The stdlib should be supported such that
>> it can be installed to an arbitrary version of Python.
>>
>
> That idea has been put forth and shot down. The stdlib has to be tied to
> at least some version of Python just like any other project. Plus the
> stdlib is where we try out new language features to make sure they make
> sense. Making it a separate project is not that feasible.
>
>
>> Better yet I'd like to see the stdlib become a list of vetted external
>> libraries that meet some requirements on usefulness, stability and
>> compatibility (PEP), that get cut at regular intervals. This takes the
>> burden away from core, improves innovation, allows for different
>> implementations, and ensures that the Python package management system is
>> actually useful.
>>
>
> That's been called a sumo release and proposed before, but no one has
> taken the time to do it (although the 3rd-party releases of Python somewhat
> take this view). Thinning out the stdlib in favour of the community
> providing solutions is another can of worms which does not directly impact
> the discussion of how to handle stdlib releases unless you are pushing to
> simply drop the stdlib, which is not possible, as Python itself depends on
> it.

From ncoghlan at gmail.com Sat Feb 25 06:55:38 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 25 Feb 2012 15:55:38 +1000
Subject: [Python-Dev] Status regarding Old vs. Advanced String Formating
In-Reply-To:
References: <4F480324.7000308@v.loewis.de>
Message-ID:

On Sat, Feb 25, 2012 at 10:23 AM, Mark Lawrence wrote:
>
> Quoting the docs http://docs.python.org/py3k/library/stdtypes.html
>
> 4.6.2. Old String Formatting Operations
>
> Note
>
> The formatting operations described here are obsolete and may go away in
> future versions of Python. Use the new String Formatting in new code.
>
>
>
> I think this is daft because all of the code has to be supported for the ten
> years that MVL has suggested.

Indeed, that note was written before we decided that getting rid of
"%" formatting altogether would be a bad idea. It would be better to
update it to say something like:

"The formatting operations described here are modelled on C's printf()
syntax. They only support formatting of certain builtin types, and the
use of a binary operator means that care may be needed in order to
format tuples and dictionaries correctly. As the new string formatting
syntax is more powerful, flexible, extensible and handles tuples and
dictionaries naturally, it is recommended for new code. However, there
are no current plans to deprecate printf-style formatting."

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From ncoghlan at gmail.com Sat Feb 25 07:21:56 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 25 Feb 2012 16:21:56 +1000
Subject: [Python-Dev] Status regarding Old vs. Advanced String Formating
In-Reply-To: <20120225012039.Horde.G9vmccL8999PSClXHxABAoA@webmail.df.eu>
References: <4F4803A3.7040803@v.loewis.de> <20120225012039.Horde.G9vmccL8999PSClXHxABAoA@webmail.df.eu>
Message-ID:

On Sat, Feb 25, 2012 at 10:20 AM, wrote:
> I find the .format syntax too complicated and difficult to learn. It has
> so many bells and whistles, making it more than just a *mini* language.
> So for my own code, I always prefer % formatting for simplicity.
Old String Formatting Operations > > Note > > The formatting operations described here are obsolete and may go away in > future versions of Python. Use the new String Formatting in new code. > > > > I think this is daft because all of the code has to be supported for the ten > years that MVL has suggested. Indeed, that note was written before we decided that getting rid of "%" formatting altogether would be a bad idea. It would be better to update it to say something like: "The formatting operations described here are modelled on C's printf() syntax. They only support formatting of certain builtin types, and the use of a binary operator means that care may be needed in order to format tuples and dictionaries correctly. As the new string formatting syntax is more powerful, flexible, extensible and handles tuples and dictionaries naturally, it is recommended for new code. However, there are no current plans to deprecate printf-style formatting." Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Sat Feb 25 07:21:56 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 25 Feb 2012 16:21:56 +1000 Subject: [Python-Dev] Status regarding Old vs. Advanced String Formating In-Reply-To: <20120225012039.Horde.G9vmccL8999PSClXHxABAoA@webmail.df.eu> References: <4F4803A3.7040803@v.loewis.de> <20120225012039.Horde.G9vmccL8999PSClXHxABAoA@webmail.df.eu> Message-ID: On Sat, Feb 25, 2012 at 10:20 AM, wrote: > I find the .format syntax too complicated and difficult to learn. It has > so many bells and whistles, making it more than just a *mini* language. > So for my own code, I always prefer % formatting for simplicity. Heh, I've switched almost entirely to .format() just so I never have to worry if: fmt % arg should actually be written as: fmt % (arg,) With fmt.format(arg), the question just never comes up. Since 90%+ of the replacement field specifiers I use are just "{}" or "{!r}" (the format-style equivalents to "%s" and "%r" in printf-style formatting), and most of the rest are either using field references or just {:number_spec} (where the number formatting is the same as that in printf-style) the complexity of the full mini-language doesn't really come into play (although I did find use recently for several of the features in order to emit nicely formatted data from a command line utility). Another *very* handy trick is "{0.attr} {0.attr2}" for concise attribute formatting. Really, what the new-style formatting is most lacking is a tutorial or cookbook style reference to teach people things like: - here's how to do basic string interpolation ("{}") - here's how to use repr() or ascii() instead of str() ("{!r}", "{!a}") - here's how to explicitly number your fields so you can refer to the same argument more than once ("{0}") - here's how to name your fields so you can use keyword arguments or format_map() - here's how to access attributes of an object being formatted - here's how to access individual items in a container being formatted - here's how to do implicit string interpolation with format_map() on locals() or vars(obj) - here's how to align (and/or truncate) text within a field - here's how to format numbers - here's how to implicitly invoke strftime And in a more advanced section: - here's how to use __format__() to customise the formatting options for your own class - here's how to use string.Formatter to create a custom formatting variant (e.g.
one that invokes shlex.quote() on interpolated variables by default) Currently we point them directly at the relevant section of the *language spec*, which is written for people trying to create compliant formatting implementations, not anyone that just wants to *use* the thing. However, while I'm personally a fan of the new format() methods and greatly prefer them to the old way of doing things, I agree it would be a bad idea to try to *force* people to switch if they're used to printf-style formatting and don't have any issues with it (particularly given the current expert-friendly state of the documentation). It's not like printf-style formatting is a security risk or poses some kind of huge maintenance burden. I see it as similar to the getopt/optparse/argparse situation. getopt() has value for consistency with C's getopt() API, argparse is the current recommended best practice, while optparse is kept around because it is so prevalent that it isn't worth trying to remove it. Similarly, printf-style formatting has value for its consistency with C and deserves to be kept around due to both that and its current prevalence. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Sat Feb 25 07:31:56 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 25 Feb 2012 16:31:56 +1000 Subject: [Python-Dev] PEP 413: Faster evolution of the Python Standard Library In-Reply-To: References: <20120224184622.5df22ade@pitrou.net> Message-ID: On Sat, Feb 25, 2012 at 1:32 PM, Matt Joiner wrote: > Evidently some modules have to ship with core if they are required (sys), or > expose internals (os, gc). Others are clearly extras (async{ore,hat}, > subprocess, unittest, select). There's a whole raft of additional dependencies introduced by the build process and the regression test suite. The fact that your "others are clearly extras" was only right for 2 out of 5 examples doesn't fill me with confidence that you have any idea of the scale of what you're suggesting. But the real reason this isn't going to happen? It's an absolute ton of work that *won't make Python better at the end*. Anyone that feels otherwise is free to fork CPython and try it for themselves, but I predict they will be thoroughly disappointed with the outcome. (We moved to a DVCS in large part to make it easier for other people to experiment with things - they don't have to ask python-dev's permission, they can just fork, try it for themselves, and see if it works out. If they instead want to *pay* somebody to try it on their behalf, then they can just offer some contract work and see if they get any takers). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From tshepang at gmail.com Sat Feb 25 09:02:04 2012 From: tshepang at gmail.com (Tshepang Lekhonkhobe) Date: Sat, 25 Feb 2012 10:02:04 +0200 Subject: [Python-Dev] Status regarding Old vs. Advanced String Formating In-Reply-To: <4F485124.1020800@nedbatchelder.com> References: <4F480324.7000308@v.loewis.de> <4F485124.1020800@nedbatchelder.com> Message-ID: On Sat, Feb 25, 2012 at 05:10, Ned Batchelder wrote: > Has Python *ever* removed a feature except in X.0 releases? I thought this happens all the time, but with deprecations first. Is that not the case? From tshepang at gmail.com Sat Feb 25 09:06:46 2012 From: tshepang at gmail.com (Tshepang Lekhonkhobe) Date: Sat, 25 Feb 2012 10:06:46 +0200 Subject: [Python-Dev] Status regarding Old vs.
Advanced String Formating In-Reply-To: <20120225012039.Horde.G9vmccL8999PSClXHxABAoA@webmail.df.eu> References: <4F4803A3.7040803@v.loewis.de> <20120225012039.Horde.G9vmccL8999PSClXHxABAoA@webmail.df.eu> Message-ID: On Sat, Feb 25, 2012 at 02:20, wrote: > Quoting Tshepang Lekhonkhobe: >> On Fri, Feb 24, 2012 at 23:39, "Martin v. Löwis" >>> If that issue was getting serious, I would prefer if the .format method >>> was deprecated, and only % formatting was kept. >> >> Why is that? Isn't .format regarded superior? > > I find the .format syntax too complicated and difficult to learn. It has > so many bells and whistles, making it more than just a *mini* language. > So for my own code, I always prefer % formatting for simplicity. I find that strange, especially for an expert Python dev. I, a newbie, find it far friendlier (and easier for a new programmer to grasp). Maybe it's because I use it all the time, and you don't? From tshepang at gmail.com Sat Feb 25 09:14:26 2012 From: tshepang at gmail.com (Tshepang Lekhonkhobe) Date: Sat, 25 Feb 2012 10:14:26 +0200 Subject: [Python-Dev] PEP 413: Faster evolution of the Python Standard Library In-Reply-To: References: <20120224184622.5df22ade@pitrou.net> Message-ID: On Sat, Feb 25, 2012 at 05:32, Matt Joiner wrote: > There are so many third party modules languishing because inferior forms > exist in the stdlib, and there is no centralized method for their recommendation and > discovery. That's interesting. Do you have a list of these? Maybe a blog post somewhere? From ncoghlan at gmail.com Sat Feb 25 09:49:05 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 25 Feb 2012 18:49:05 +1000 Subject: [Python-Dev] PEP 413: Faster evolution of the Python Standard Library In-Reply-To: References: <20120224184622.5df22ade@pitrou.net> Message-ID: On Sat, Feb 25, 2012 at 4:59 AM, Brett Cannon wrote: > On Fri, Feb 24, 2012 at 13:23, Georg Brandl wrote: >> On 24.02.2012 18:46, Antoine Pitrou wrote: >> > Overall, I like the principle of this PEP, but I really dislike the >> > dual version numbering it introduces. Such a numbering scheme will be >> > cryptic and awkward for anyone but Python specialists. >> >> I agree. > > Ditto. And, in contrast, I believe that the free-wheeling minor version number proposed in PEP 407 is a train wreck and PR disaster waiting to happen. I find it interesting that we can so readily agree that using the major version number in any way is impossible due to the ongoing Python 2 -> 3 transition, yet I get so much pushback on the idea that messing with the implications of changing the *minor* version number will unnecessarily confuse or upset users. I spent quite a bit of time thinking about the ways people *use* the CPython version number, and it switched me from mildly preferring a separate version number for the standard library to being a strong *opponent* of increasing the rate of change for the minor version number. Anyway, the PEP now describes the user scenarios that convinced me that a separate version number for the standard library was the right way to go: http://www.python.org/dev/peps/pep-0413/#user-scenarios Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From steve at pearwood.info Sat Feb 25 10:41:51 2012 From: steve at pearwood.info (Steven D'Aprano) Date: Sat, 25 Feb 2012 20:41:51 +1100 Subject: [Python-Dev] Status regarding Old vs.
Advanced String Formating In-Reply-To: References: <4F480324.7000308@v.loewis.de> <4F485124.1020800@nedbatchelder.com> Message-ID: <4F48ACDF.1060608@pearwood.info> Tshepang Lekhonkhobe wrote: > On Sat, Feb 25, 2012 at 05:10, Ned Batchelder wrote: >> Has Python *ever* removed a feature except in X.0 releases? > > I thought this happens all the time, but with deprecations first. Is > that not the case? Hardly "all the time". Only when absolutely necessary, the exception being the 2.x -> 3.x transition which was designed to break backwards compatibility for the sake of "cleaning up" the language. And even there, the changes were very conservative. If there is ever going to be a similar 3.x -> 4.x transition, and there may not be, it will probably be 10 years away. Python is a lot more mature now, and consequently the costs of breaking backwards compatibility are much greater, particularly when it comes to language features like % rather than modules. After all, it is easy for Python users to take a copy of a deprecated module and keep using it, but it's very difficult for them to fork Python if a language feature is removed. -- Steven From martin at v.loewis.de Sat Feb 25 11:20:51 2012 From: martin at v.loewis.de ("Martin v. Löwis") Date: Sat, 25 Feb 2012 11:20:51 +0100 Subject: [Python-Dev] Status regarding Old vs. Advanced String Formating In-Reply-To: References: <4F4803A3.7040803@v.loewis.de> <20120225012039.Horde.G9vmccL8999PSClXHxABAoA@webmail.df.eu> Message-ID: <4F48B603.2000705@v.loewis.de> > I find that strange, especially for an expert Python dev. I, a newbie, > find it far friendlier (and easier for a new programmer to grasp). > Maybe it's because I use it all the time, and you don't? That is most likely the case. You learn by practice. For that very reason, the claim "and easier for a new programmer to grasp" is difficult to prove. It was easier for *you*, since you started using it, and then kept using it. I don't recall any particular obstacles learning % formatting (even though I did for C, not for C++). Generalizing that it is *easier* is invalid: you just didn't try learning that instead first, and now you can't go back in a state where either are new to you. C++ is very similar here: they also introduced a new way of output (iostreams, and << overloading). I used that for a couple of years, primarily because people said that printf is "bad" and "not object-oriented". I then recognized that there is nothing wrong with printf per se, and would avoid std::cout in C++ these days, in favor of std::printf (yes, I know that it does have an issue with type safety). So I think we really should fight the impression that % formatting in Python is "bad", "deprecated", or "old-style". Having both may be considered a slight violation of the Zen; however, I would claim that neither formatting API is that obvious - AFAIR, the biggest hurdle in learning printf was to understand the notion of "placeholder", which I think is the reason why people are coming up with so many templating systems (templating isn't "obvious"). Regards, Martin From songofacandy at gmail.com Sat Feb 25 11:44:35 2012 From: songofacandy at gmail.com (INADA Naoki) Date: Sat, 25 Feb 2012 19:44:35 +0900 Subject: [Python-Dev] Status regarding Old vs.
Advanced String Formating In-Reply-To: <4F48B603.2000705@v.loewis.de> References: <4F4803A3.7040803@v.loewis.de> <20120225012039.Horde.G9vmccL8999PSClXHxABAoA@webmail.df.eu> <4F48B603.2000705@v.loewis.de> Message-ID: I don't feel "similar to other languages" is enough reason for builtins to violate the Zen. Violating the Zen in the standard library, as `getopt` does for compatibility with another language's API, is fine. So, I would prefer moving %-style formatting from the builtin str to a function in the string module in Python 4. On Sat, Feb 25, 2012 at 7:20 PM, "Martin v. Löwis" wrote: >> I find that strange, especially for an expert Python dev. I, a newbie, >> find it far friendlier (and easier for a new programmer to grasp). >> Maybe it's because I use it all the time, and you don't? > > That is most likely the case. You learn by practice. For that very > reason, the claim "and easier for a new programmer to grasp" is > difficult to prove. It was easier for *you*, since you started using > it, and then kept using it. I don't recall any particular obstacles > learning % formatting (even though I did for C, not for C++).
> Generalizing that it is *easier* is invalid: you just didn't try > learning that instead first, and now you can't go back in a state > where either are new to you. > > C++ is very similar here: they also introduced a new way of output > (iostreams, and << overloading). I used that for a couple of years, > primarily because people said that printf is "bad" and "not > object-oriented". I then recognized that there is nothing wrong with printf > per se, and would avoid std::cout in C++ these days, in favor of > std::printf (yes, I know that it does have an issue with type safety). > > So I think we really should fight the impression that % formatting > in Python is "bad", "deprecated", or "old-style". Having both > may be considered a slight violation of the Zen; however, I would > claim that neither formatting API is that obvious - AFAIR, the > biggest hurdle in learning printf was to understand the notion > of "placeholder", which I think is the reason why people are coming > up with so many templating systems (templating isn't "obvious"). > > Regards, > Martin > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/songofacandy%40gmail.com -- INADA Naoki From tshepang at gmail.com Sat Feb 25 11:46:17 2012 From: tshepang at gmail.com (Tshepang Lekhonkhobe) Date: Sat, 25 Feb 2012 12:46:17 +0200 Subject: [Python-Dev] Status regarding Old vs. Advanced String Formating In-Reply-To: <4F48B603.2000705@v.loewis.de> References: <4F4803A3.7040803@v.loewis.de> <20120225012039.Horde.G9vmccL8999PSClXHxABAoA@webmail.df.eu> <4F48B603.2000705@v.loewis.de> Message-ID: On Sat, Feb 25, 2012 at 12:20, "Martin v. Löwis" wrote: >> I find that strange, especially for an expert Python dev. I, a newbie, >> find it far friendlier (and easier for a new programmer to grasp). >> Maybe it's because I use it all the time, and you don't? > > That is most likely the case. You learn by practice. For that very > reason, the claim "and easier for a new programmer to grasp" is > difficult to prove. It was easier for *you*, since you started using > it, and then kept using it. I don't recall any particular obstacles > learning % formatting (even though I did for C, not for C++). > Generalizing that it is *easier* is invalid: you just didn't try > learning that instead first, and now you can't go back in a state > where either are new to you. When I started using Python, Advanced format wasn't yet available, so I was forced to use Old style format. It's not a big issue, especially since I had also used C before then. It's just that when Advanced format was introduced, I fell for it, mainly because I found it more readable (also see the sort of power Nick displayed earlier in this thread) not to mention elegant. For that reason, I would recommend any new Python programmer to ignore Old style altogether. That includes those with C background. PS: By newbie, I meant that I'm at the low ranks (at least as compared to you), not that I only started using Python last year. Sorry for the noise. From breamoreboy at yahoo.co.uk Sat Feb 25 14:13:44 2012 From: breamoreboy at yahoo.co.uk (Mark Lawrence) Date: Sat, 25 Feb 2012 13:13:44 +0000 Subject: [Python-Dev] Status regarding Old vs. Advanced String Formating In-Reply-To: References: <4F480324.7000308@v.loewis.de> Message-ID: On 25/02/2012 05:55, Nick Coghlan wrote: > On Sat, Feb 25, 2012 at 10:23 AM, Mark Lawrence wrote: > >> >> Quoting the docs http://docs.python.org/py3k/library/stdtypes.html >> >> 4.6.2. Old String Formatting Operations >> >> Note >> >> The formatting operations described here are obsolete and may go away in >> future versions of Python. Use the new String Formatting in new code. >> >> >> >> I think this is daft because all of the code has to be supported for >> the ten >> years that MVL has suggested. > > Indeed, that note was written before we decided that getting rid of > "%" formatting altogether would be a bad idea. > > It would be better to update it to say something like: > > "The formatting operations described here are modelled on C's printf() > syntax. They only support formatting of certain builtin types, and the > use of a binary operator means that care may be needed in order to > format tuples and dictionaries correctly. As the new string formatting > syntax is more powerful, flexible, extensible and handles tuples and > dictionaries naturally, it is recommended for new code. However, there > are no current plans to deprecate printf-style formatting." > > Cheers, > Nick. > That's fine by me, it'll save me changing my own code. I'll put this on the issue tracker if you want, but after the pressing needs of the bar and 6 Nations rugby :) -- Cheers. Mark Lawrence. From zachary.ware at gmail.com Sat Feb 25 17:18:29 2012 From: zachary.ware at gmail.com (Zachary Ware) Date: Sat, 25 Feb 2012 10:18:29 -0600 Subject: [Python-Dev] PEP 413: Faster evolution of the Python Standard Library In-Reply-To: References: <20120224184622.5df22ade@pitrou.net> Message-ID: Quick disclaimer: this is the first time I've replied on any Python list, and thus am not entirely sure what I'm doing. Hopefully this message goes as expected :) Anyhow; I have to say I like Nick's idea put forth in PEP 413, but I agree that the extra versioning info could get pretty awkward. Therefore, why not just make stdlib upgrades part of the regular maintenance releases? As long as there is absolutely no change in usage from (for example) 3.3.0 to 3.3.1, what's wrong with adding new (stdlib) features in 3.3.1? Alternately, if we wanted to preserve Nick's thoughts on separate versions with and without the stdlib upgrades, why not just make them 3.3.1 and 3.3.2, bringing us up to around 3.3.6 at 3.4.0 release?
We could take a(n old) page from Linux development's book and make the oddness/evenness of the third number meaningful; say odds have stdlib upgrades, evens are strictly maintenance (or vice versa). My apologies for the noise if these ideas have been shot down elsewhere already, but I hadn't seen it so I thought I'd stick my head in for a bit :) Regards, Zach Ware On Feb 25, 2012 2:50 AM, "Nick Coghlan" wrote: > > On Sat, Feb 25, 2012 at 4:59 AM, Brett Cannon wrote: > > On Fri, Feb 24, 2012 at 13:23, Georg Brandl wrote: > >> On 24.02.2012 18:46, Antoine Pitrou wrote: > >> > Overall, I like the principle of this PEP, but I really dislike the > >> > dual version numbering it introduces. Such a numbering scheme will be > >> > cryptic and awkward for anyone but Python specialists. > >> > >> I agree. > > > > Ditto. > > And, in contrast, I believe that the free-wheeling minor version > number proposed in PEP 407 is a train wreck and PR disaster waiting to > happen. I find it interesting that we can so readily agree that using > the major version number in any way is impossible due to the ongoing > Python 2 -> 3 transition, yet I get so much pushback on the idea that > messing with the implications of changing the *minor* version number > will unnecessarily confuse or upset users. > > I spent quite a bit of time thinking about the ways people *use* the > CPython version number, and it switched me from mildly preferring a > separate version number for the standard library to being a strong > *opponent* of increasing the rate of change for the minor version > number. Anyway, the PEP now describes the user scenarios that > convinced me that a separate version number for the standard library > was the right way to go: > http://www.python.org/dev/peps/pep-0413/#user-scenarios > > Cheers, > Nick. > > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... URL: From eric at trueblade.com Sat Feb 25 17:24:47 2012 From: eric at trueblade.com (Eric V. Smith) Date: Sat, 25 Feb 2012 11:24:47 -0500 Subject: [Python-Dev] PEP 413: Faster evolution of the Python Standard Library In-Reply-To: References: <20120224184622.5df22ade@pitrou.net> Message-ID: <4F490B4F.8030201@trueblade.com> On 2/25/2012 11:18 AM, Zachary Ware wrote: > Anyhow; I have to say I like Nick's idea put forth in PEP 413, but I > agree that the extra versioning info could get pretty awkward. > Therefore, why not just make stdlib upgrades part of the regular > maintenance releases? As long as there is absolutely no change in usage > from (for example) 3.3.0 to 3.3.1, what's wrong with adding new (stdlib) > features in 3.3.1? The problem is that you can't say "my code works on Python 3.3". You now have to specify the micro version number as well: "my code works on Python 3.3.1+". We've made this mistake before; I can't see it happening again. Eric. From solipsis at pitrou.net Sat Feb 25 17:50:07 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sat, 25 Feb 2012 17:50:07 +0100 Subject: [Python-Dev] PEP 413: Faster evolution of the Python Standard Library References: <20120224184622.5df22ade@pitrou.net> <4F490B4F.8030201@trueblade.com> Message-ID: <20120225175007.52e87e09@pitrou.net> On Sat, 25 Feb 2012 11:24:47 -0500 "Eric V. Smith" wrote: > On 2/25/2012 11:18 AM, Zachary Ware wrote: > > Anyhow; I have to say I like Nick's idea put forth in PEP 413, but I > > agree that the extra versioning info could get pretty awkward.
> > Therefore, why not just make stdlib upgrades part of the regular > > maintenance releases? As long as there is absolutely no change in usage > > from (for example) 3.3.0 to 3.3.1, what's wrong with adding new (stdlib) > > features in 3.3.1? > > The problem is that you can't say "my code works on Python 3.3". You now > have to specify the micro version number as well: "my code works on > Python 3.3.1+". We've made this mistake before; I can't see it happening > again. I don't see how it's a mistake. It's only a mistake if it breaks the convention on version numbers, which is precisely what we are discussing changing. Regards Antoine. From solipsis at pitrou.net Sat Feb 25 17:55:32 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sat, 25 Feb 2012 17:55:32 +0100 Subject: [Python-Dev] PEP 413: Faster evolution of the Python Standard Library References: <20120224184622.5df22ade@pitrou.net> Message-ID: <20120225175532.41c8b392@pitrou.net> On Fri, 24 Feb 2012 19:23:36 +0100 Georg Brandl wrote: > > > I also think the branches and releases management should be even > > simpler: > > > > - 2.7: as today > > - 3.3: bug fixes + stdlib enhancements > > - default: language enhancements / ABI-breaking changes > > > > Every 6 months, a new stdlib + bugfix release would be cut (3.3.1, > > 3.3.2, etc.), while language enhancement releases (3.4, 3.5...) would > > still happen every 18 months. > > Sorry, I don't think that's feasible at all. For one, it removes the > possibility to target a stable set of features for a longer time. Why does it? You can target the 3.3.0 set of features and be (reasonably) confident that your code will still work with 3.3.1. Regards Antoine. From g.brandl at gmx.net Sat Feb 25 18:21:40 2012 From: g.brandl at gmx.net (Georg Brandl) Date: Sat, 25 Feb 2012 18:21:40 +0100 Subject: [Python-Dev] PEP 413: Faster evolution of the Python Standard Library In-Reply-To: <20120225175532.41c8b392@pitrou.net> References: <20120224184622.5df22ade@pitrou.net> <20120225175532.41c8b392@pitrou.net> Message-ID: On 02/25/2012 05:55 PM, Antoine Pitrou wrote: > On Fri, 24 Feb 2012 19:23:36 +0100 > Georg Brandl wrote: >> >> > I also think the branches and releases management should be even >> > simpler: >> > >> > - 2.7: as today >> > - 3.3: bug fixes + stdlib enhancements >> > - default: language enhancements / ABI-breaking changes >> > >> > Every 6 months, a new stdlib + bugfix release would be cut (3.3.1, >> > 3.3.2, etc.), while language enhancement releases (3.4, 3.5...) would >> > still happen every 18 months. >> >> Sorry, I don't think that's feasible at all. For one, it removes the >> possibility to target a stable set of features for a longer time. > > Why does it? You can target the 3.3.0 set of features and be > (reasonably) confident that your code will still work with 3.3.1. Yes, but anybody developing for 3.3.1 will have to specify "3.3.1+". Which is kind of defeating the point of giving them micro versions at all. Frankly, the longer we discuss this, the more I get the impression that all of the different proposed changes will result in grievous mental confusion to the Great British Public^W^W^W Python community. Georg From eric at trueblade.com Sat Feb 25 18:24:12 2012 From: eric at trueblade.com (Eric V.
Smith) Date: Sat, 25 Feb 2012 12:24:12 -0500 Subject: [Python-Dev] PEP 413: Faster evolution of the Python Standard Library In-Reply-To: <20120225175007.52e87e09@pitrou.net> References: <20120224184622.5df22ade@pitrou.net> <4F490B4F.8030201@trueblade.com> <20120225175007.52e87e09@pitrou.net> Message-ID: <4F49193C.2070103@trueblade.com> On 2/25/2012 11:50 AM, Antoine Pitrou wrote: >> The problem is that you can't say "my code works on Python 3.3". You now >> have to specify the micro version number as well: "my code works on >> Python 3.3.1+". We've made this mistake before; I can't see it happening >> again. > > I don't see how it's a mistake. It's only a mistake if it breaks the > convention on version numbers, which is precisely what we are > discussing changing. I was thinking of language changes, so maybe it doesn't apply to this discussion. I thought there was some non-trivial addition to the language in a micro release, but searching for it I don't come up with anything. Maybe it was adding True and False in 2.2.1. From solipsis at pitrou.net Sat Feb 25 18:39:58 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sat, 25 Feb 2012 18:39:58 +0100 Subject: [Python-Dev] PEP 413: Faster evolution of the Python Standard Library References: <20120224184622.5df22ade@pitrou.net> <20120225175532.41c8b392@pitrou.net> Message-ID: <20120225183958.07f03f78@pitrou.net> On Sat, 25 Feb 2012 18:21:40 +0100 Georg Brandl wrote: > > Yes, but anybody developing for 3.3.1 will have to specify "3.3.1+". > Which is kind of defeating the point of giving them micro versions > at all. > > Frankly, the longer we discuss this, the more I get the > impression that all of the different proposed changes will result in > grievous mental confusion to the Great British Public^W^W^W Python > community. Well, the main reason we are discussing this is that there is some opposition to making the release schedule faster, which is the simple and obvious solution. Regards Antoine. From benjamin at python.org Sat Feb 25 18:56:15 2012 From: benjamin at python.org (Benjamin Peterson) Date: Sat, 25 Feb 2012 12:56:15 -0500 Subject: [Python-Dev] [RELEASED] Release candidates for Python 2.6.8, 2.7.3, 3.1.5, and 3.2.3 Message-ID: We're pleased to announce the immediate availability of release candidates for Python 2.6.8, 2.7.3, 3.1.5, and 3.2.3. The main impetus for these releases is fixing a security issue in Python's hash-based types, dict and set, as described below. Python 2.7.3 and 3.2.3 include the security patch and the normal set of bug fixes. Since Python 2.6 and 3.1 are maintained only for security issues, 2.6.8 and 3.1.5 contain only various security patches. The attack exploits Python's dict and set implementations. Carefully crafted input can lead to extremely long computation times and denials of service. [1] Python dict and set types use hash tables to provide amortized constant time operations. Hash tables require a well-distributed hash function to spread data evenly across the hash table. The security issue is that an attacker could compute thousands of keys with colliding hashes; this causes quadratic algorithmic complexity when the hash table is constructed. To alleviate the problem, the new releases add randomization to the hashing of Python's string types (bytes/str in Python 3 and str/unicode in Python 2), datetime.date, and datetime.datetime. This prevents an attacker from computing colliding keys of these types without access to the Python process.
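To illustrate, with randomization enabled the hash of a given string, and therefore the iteration order of any dict or set containing it, varies from one interpreter run to the next (the output values below are invented, since the seed is random; the point is only that they differ):

    $ python -R -c "print(hash('abc'))"
    -1600925533
    $ python -R -c "print(hash('abc'))"
    1453079729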
Hash randomization causes the iteration order of dicts and sets to be unpredictable and differ across Python runs. Python has never guaranteed iteration order of keys in a dict or set, and applications are advised to never rely on it. Historically, dict iteration order has not changed very often across releases and has always remained consistent between successive executions of Python. Thus, some existing applications may be relying on dict or set ordering. Because of this and the fact that many Python applications which don't accept untrusted input are not vulnerable to this attack, in all stable Python releases mentioned here, HASH RANDOMIZATION IS DISABLED BY DEFAULT. There are two ways to enable it. The -R commandline option can be passed to the python executable. It can also be enabled by setting the environment variable PYTHONHASHSEED to "random". (Other values are accepted, too; pass -h to python for a complete description.) More details about the issue and the patch can be found in the oCERT advisory [1] and the Python bug tracker [2]. These releases are release candidates and thus not recommended for production use. Please test your applications and libraries with them, and report any bugs you encounter. We are especially interested in any buggy behavior observed using hash randomization. Excepting major calamity, final versions should appear after several weeks. Downloads are at http://python.org/download/releases/2.6.8/ http://python.org/download/releases/2.7.3/ http://python.org/download/releases/3.1.5/ http://python.org/download/releases/3.2.3/ Please test these candidates and report bugs to http://bugs.python.org/ With regards, The Python release team Barry Warsaw (2.6), Georg Brandl (3.2), Benjamin Peterson (2.7 and 3.1) [1] http://www.ocert.org/advisories/ocert-2011-003.html [2] http://bugs.python.org/issue13703 From breamoreboy at yahoo.co.uk Sat Feb 25 21:16:13 2012 From: breamoreboy at yahoo.co.uk (Mark Lawrence) Date: Sat, 25 Feb 2012 20:16:13 +0000 Subject: [Python-Dev] Status regarding Old vs. Advanced String Formating In-Reply-To: References: <4F480324.7000308@v.loewis.de> Message-ID: On 25/02/2012 13:13, Mark Lawrence wrote: > On 25/02/2012 05:55, Nick Coghlan wrote: >> On Sat, Feb 25, 2012 at 10:23 AM, Mark >> Lawrence wrote: >> >>> >>> Quoting the docs http://docs.python.org/py3k/library/stdtypes.html >>> >>> 4.6.2. Old String Formatting Operations >>> >>> Note >>> >>> The formatting operations described here are obsolete and may go away in >>> future versions of Python. Use the new String Formatting in new code. >>> >>> >>> >>> I think this is daft because all of the code has to be supported for >>> the ten >>> years that MVL has suggested. >> >> Indeed, that note was written before we decided that getting rid of >> "%" formatting altogether would be a bad idea. >> >> It would be better to update it to say something like: >> >> "The formatting operations described here are modelled on C's printf() >> syntax. They only support formatting of certain builtin types, and the >> use of a binary operator means that care may be needed in order to >> format tuples and dictionaries correctly. As the new string formatting >> syntax is more powerful, flexible, extensible and handles tuples and >> dictionaries naturally, it is recommended for new code. However, there >> are no current plans to deprecate printf-style formatting." >> >> Cheers, >> Nick. >> > > That's fine by me, it'll save me changing my own code. I'll put this on > the issue tracker if you want, but after the pressing needs of the bar > and 6 Nations rugby :) >
I'll put this on > the issue tracker if you want, but after the pressing needs of the bar > and 6 Nations rugby :) > I would raise this on the issue tracker but it won't let me login. Guess I'm not wanted. :( -- Cheers. Mark Lawrence. From armin.ronacher at active-4.com Sat Feb 25 21:23:39 2012 From: armin.ronacher at active-4.com (Armin Ronacher) Date: Sat, 25 Feb 2012 20:23:39 +0000 Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 Message-ID: <4F49434B.6050604@active-4.com> Hi, I just uploaded PEP 414 which proposes am optional 'u' prefix for string literals for Python 3. You can read the PEP online: http://www.python.org/dev/peps/pep-0414/ This is a followup to the discussion about this topic here on the mailinglist and on twitter/IRC over the last few weeks. Regards, Armin From breamoreboy at yahoo.co.uk Sat Feb 25 21:49:59 2012 From: breamoreboy at yahoo.co.uk (Mark Lawrence) Date: Sat, 25 Feb 2012 20:49:59 +0000 Subject: [Python-Dev] Status regarding Old vs. Advanced String Formating In-Reply-To: References: <4F480324.7000308@v.loewis.de> Message-ID: On 25/02/2012 20:16, Mark Lawrence wrote: > On 25/02/2012 13:13, Mark Lawrence wrote: >> On 25/02/2012 05:55, Nick Coghlan wrote: >>> On Sat, Feb 25, 2012 at 10:23 AM, Mark >>> Lawrence wrote: >>> >>>> >>>> Quoting the docs http://docs.python.org/py3k/library/stdtypes.html >>>> >>>> 4.6.2. Old String Formatting Operations >>>> >>>> Note >>>> >>>> The formatting operations described here are obsolete and may go >>>> away in >>>> future versions of Python. Use the new String Formatting in new code. >>>> >>>> >>>> >>>> I think this is daft because all of the code has to be supported for >>>> the ten >>>> years that MVL has suggested. >>> >>> Indeed, that note was written before we decided that getting rid of >>> "%" formatting altogether would be a bad idea. >>> >>> It would be better to update it to say something like: >>> >>> "The formatting operations described here are modelled on C's printf() >>> syntax. They only support formatting of certain builtin types, and the >>> use of a binary operator means that care may be needed in order to >>> format tuples and dictionaries correctly. As the new string formatting >>> syntax is more powerful, flexible, extensible and handles tuples and >>> dictionaries naturally, it is recommended for new code. However, there >>> are no current plans to deprecate printf-style formatting." >>> >>> Cheers, >>> Nick. >>> >> >> That's fine by me, it'll save me changing my own code. I'll put this on >> the issue tracker if you want, but after the pressing needs of the bar >> and 6 Nations rugby :) >> > > I would raise this on the issue tracker but it won't let me login. Guess > I'm not wanted. :( > But there's more than one way of skinning a cat. http://bugs.python.org/issue14123 -- Cheers. Mark Lawrence. From barry at python.org Sat Feb 25 22:31:29 2012 From: barry at python.org (Barry Warsaw) Date: Sat, 25 Feb 2012 16:31:29 -0500 Subject: [Python-Dev] Proposing an alternative to PEP 410 In-Reply-To: <4F46AF6E.2030300@hastings.org> References: <4F46AF6E.2030300@hastings.org> Message-ID: <20120225163129.3a104cdd@resist.wooz.org> On Feb 23, 2012, at 01:28 PM, Larry Hastings wrote: >* Improve datetime.datetime objects so they support nanosecond resolution, > in such a way that it's 100% painless to make them even more precise in > the future. +1 >* Add support to datetime objects that allows adding and subtracting ints > and floats as seconds. 
This behavior is controllable with a flag on the > object--by default this behavior is off. Why conditionalize this behavior? It should either be enabled or not, but making it switchable on a per-object basis seems like asking for trouble. >* Support accepting naive datetime.datetime objects in all functions that > accept a timestamp in os (utime etc). +1 >* Change the result of os.stat to be a custom class rather than a > PyStructSequence. Support the sequence protocol on the custom class but > mark it PendingDeprecation, to be removed completely in 3.5. (I can't > take credit for this idea; MvL suggested it to me once while we were > talking about this issue. Now that the os.stat object has named fields, > who uses the struct unpacking anymore?) +1 >* Add support for setting "stat_float_times=2" (or perhaps > "stat_float_times=datetime.datetime" ?) to enable returning st_[acm]time as > naive datetime.datetime objects--specifically, ones that allow addition and > subtraction of ints and floats. The value would be similar to calling > datetime.datetime.fromtimestamp() on the current float timestamp, but > would preserve all available precision. I personally don't much like the global state represented by os.stat_float_times() in the first place. Even though it also could be considered somewhat un-Pythonic, I think it probably would have been better to accept an optional argument in os.stat() to determine the return value. Or maybe it would have been more acceptable to have os.stat(), os.stat_float(), and os.stat_datetime() methods. >* Add a new parameter to functions that produce stat-like timestamps to > explicitly specify the type of the timestamps (float or datetime), > as proposed in PEP 410. +1 >I disagree with PEP 410's conclusions about the suitability of datetime as >a timestamp object. I think "naive" datetime objects are a perfect fit. >Specifically addressing PEP 410's concerns: > > * I don't propose doing anything about the other functions that have no > explicit start time; I'm only proposing changing the functions that deal > with timestamps. (Perhaps the right thing for epoch-less times like > time.clock would be timedelta? But I think we can table this discussion > for now.) +1, and yeah, I think we've had general agreement about using timedeltas for epoch-less times. > * "You can't compare naive and non-naive datetimes." So what? The > existing timestamp from os.stat is a float, and you can't compare floats > and non-naive datetimes. How is this an issue? Exactly. >Perhaps someone else can propose something even better, If we really feel like we need to make a change to support higher resolution timestamps, this comes pretty darn close to what I'd like to see. -Barry From guido at python.org Sun Feb 26 00:31:56 2012 From: guido at python.org (Guido van Rossum) Date: Sat, 25 Feb 2012 15:31:56 -0800 Subject: [Python-Dev] Proposing an alternative to PEP 410 In-Reply-To: <20120225163129.3a104cdd@resist.wooz.org> References: <4F46AF6E.2030300@hastings.org> <20120225163129.3a104cdd@resist.wooz.org> Message-ID: On Sat, Feb 25, 2012 at 1:31 PM, Barry Warsaw wrote: > On Feb 23, 2012, at 01:28 PM, Larry Hastings wrote: > >>* Improve datetime.datetime objects so they support nanosecond resolution, >> in such a way that it's 100% painless to make them even more precise in >> the future. > > +1 And how would you do that? Given the way the API currently works you pretty much have to add a separate field 'nanosecond' with a range of 0-999, leaving the microseconds field the same.
(There are no redundant fields.) This is possible but makes it quite awkward by the time we've added picosecond and femtosecond. >>* Add support to datetime objects that allows adding and subtracting ints >> and floats as seconds. This behavior is controllable with a flag on the >> object--by default this behavior is off. > > Why conditionalize this behavior? It should either be enabled or not, but > making it switchable on a per-object basis seems like asking for trouble. I am guessing that Larry isn't convinced that this is always a good idea, but I agree with Barry that making it conditional is just too complex. >>* Support accepting naive datetime.datetime objects in all functions that >> accept a timestamp in os (utime etc). > > +1 What timezone would it assume? Timestamps are traditionally linked to UTC -- but naive timestamps are most frequently used for local time. Local time is awkward due to the ambiguities around DST transitions. I do think we should support APIs for going back and forth between timezone-aware datetime and timestamps. >>* Change the result of os.stat to be a custom class rather than a >> PyStructSequence. Support the sequence protocol on the custom class but >> mark it PendingDeprecation, to be removed completely in 3.5. (I can't >> take credit for this idea; MvL suggested it to me once while we were >> talking about this issue. Now that the os.stat object has named fields, >> who uses the struct unpacking anymore?) > > +1 Yeah, the sequence protocol is outdated here. Would this be a mutable or an immutable object? >>* Add support for setting "stat_float_times=2" (or perhaps >> "stat_float_times=datetime.datetime" ?) to enable returning st_[acm]time as >> naive datetime.datetime objects--specifically, ones that allow addition and >> subtraction of ints and floats. The value would be similar to calling >> datetime.datetime.fromtimestamp() on the current float timestamp, but >> would preserve all available precision. > > I personally don't much like the global state represented by > os.stat_float_times() in the first place. Agreed. We should just deprecate stat_float_times(). > Even though it also could be > considered somewhat un-Pythonic, I think it probably would have been better > to accept an optional argument in os.stat() to determine the return value. I still really don't like this. > Or maybe it would have been more acceptable to have os.stat(), os.stat_float(), > and os.stat_datetime() methods. But I also don't like a proliferation of functions, especially since there are already so many stat() functions: stat(), fstat(), fstatat(). My proposal: add extra fields that represent the time in different types. E.g. st_atime_nsec could be an integer expressing the entire timestamp in nanoseconds; st_atime_decimal could give as much precision as happens to be available as a Decimal; st_atime_datetime could be a UTC-based datetime; and in the future we could have other forms. Plain st_atime would be a float. (It can change if and when the default floating point type changes.) We could make these fields lazily computed so that if you never touch st_atime_decimal, the decimal module doesn't get loaded. (It would be awkward if "import os" would imply "import decimal", since the latter is huge.) >>* Add a new parameter to functions that produce stat-like timestamps to >> explicitly specify the type of the timestamps (float or datetime), >> as proposed in PEP 410. > > +1 No.
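To make the lazily-computed-fields idea above concrete, here is a rough sketch; the names follow the proposal above, but nothing here is a settled API, it's purely illustrative:

    class stat_result:
        def __init__(self, st_atime_nsec):
            self._atime_nsec = st_atime_nsec  # integer nanoseconds from the OS

        @property
        def st_atime(self):  # plain float, as today
            return self._atime_nsec * 1e-9

        @property
        def st_atime_nsec(self):  # exact integer nanoseconds
            return self._atime_nsec

        @property
        def st_atime_decimal(self):
            # decimal is only imported if somebody actually asks for it,
            # so "import os" doesn't drag in the decimal module
            from decimal import Decimal
            return Decimal(self._atime_nsec).scaleb(-9)

        @property
        def st_atime_datetime(self):
            # datetime only has microsecond resolution, so this rounds
            from datetime import datetime
            return datetime.utcfromtimestamp(self._atime_nsec * 1e-9)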
>>I disagree with PEP 410's conclusions about the suitability of datetime as >>a timestamp object. I think "naive" datetime objects are a perfect fit. >>Specifically addressing PEP 410's concerns: >> >> * I don't propose doing anything about the other functions that have no >> explicit start time; I'm only proposing changing the functions that deal >> with timestamps. (Perhaps the right thing for epoch-less times like >> time.clock would be timedelta? But I think we can table this discussion >> for now.) > > +1, and yeah, I think we've had general agreement about using timedeltas for > epoch-less times. Scratch that, *I* don't agree. timedelta is a pretty clumsy type to use. Have you ever tried to compute the number of seconds between two datetimes? You can't just use the .seconds field, you have to combine the .days and .seconds fields. And negative timedeltas are even harder due to the requirement that seconds and microseconds are never negative; e.g. -1 second is represented as -1 days plus 86399 seconds. For fixed-epoch timestamps, *maybe* UTC datetime makes some sense. (We did add the UTC timezone to the stdlib, right?) But still I think the flexibility of floating point wins, and there are no worries about ambiguities. >> * "You can't compare naive and non-naive datetimes." So what? The >> existing timestamp from os.stat is a float, and you can't compare floats >> and non-naive datetimes. How is this an issue? > > Exactly. The problem is with the ambiguity of naive datetimes. >>Perhaps someone else can propose something even better, > > If we really feel like we need to make a change to support higher resolution > timestamps, this comes pretty darn close to what I'd like to see. I'm currently also engaged in an off-list discussion with Victor. I still think that when you are actually interested in *using* times, the current float format is absolutely fine. Anybody who thinks they need to accurately know the absolute time that something happened with nanosecond accuracy is out of their mind; given relativity such times have an incredibly local significance anyway. So I don't worry about not being able to represent a timestamp with nsec precision. For *relative* times, nanoseconds may be useful, and a float has no trouble representing them. (A float can represent time intervals of many millions of seconds with nanosecond precision. There are probably only a few clocks in the world whose drift is less than a nanosecond over such a timespan.) The one exception here is making accurate copies of filesystem metadata. This can be dealt with by making certain changes to os.stat() and os.utime(). For os.stat(), adding extra fields like I suggested above should work. For os.utime(), we could use keyword arguments, or some other API hack. -- --Guido van Rossum (python.org/~guido) From tjreedy at udel.edu Sun Feb 26 03:05:40 2012 From: tjreedy at udel.edu (Terry Reedy) Date: Sat, 25 Feb 2012 21:05:40 -0500 Subject: [Python-Dev] Versioning proposal: syntax.stdlib.bugfix Message-ID: We have two similar proposals, PEPs 407 and 413, to speed up the release of at least library changes. To me, both have major problems with version numbering. I think the underlying problem is starting with a long-term fixed leading '3', which conveys no information about current and future changes (at least for another decade). So I propose for consideration that we use the first digit to indicate a version of core python with fixed grammar/syntax and corresponding semantics.
I would have this be stable for at least two years. It seems that most current syntax proposals amount to duplication of current function to suit someone's or some people's stylistic preference. My current view is that current syntax is mostly good enough, the implementation thereof is close to bug-free, and we should think carefully about changes. We could then use the second digit to indicate library version. The .0 library version would be for a long-term support version. The library version could change every six months, but I would not necessarily fix it at any particular interval. If we have some important addition or upgrade at four months, release it. If we need another month to include an important change, perhaps wait. The third digit would be for initial (.0) and bugfix releases, as at present. Non .0 bugfix releases would mostly be for x.0 long-term syntax+library versions. x.(y!=0).0 library-change-only releases would only get x.(y!=0).1 bugfix releases on an 'emergency' basis. How this would work: Instead of 3.3.0, release 4.0.0. That would be followed by 4.0.1, 4.0.2, etc, bugfixes, however often we feel like it, until 5.0.0 is released. 4.0.0 would also be followed by 4.1.0 with updated stdlib in about 6 months, then barring mistakes, 4.2.0, etc, again until 5.0.0. A variation of this proposal would be to prefix '3.' to core.lib.fix. I disfavor that for 3 reasons. 1. It is not needed to indicate 'not Python 2' as *any* leading digit greater than 2 says the same. 2. It makes for a more awkward 4 level number. 3. It presupposes a 3 to 4 transition something like the 2 to 3 transition. However, I am dubious about that for more than one reason (another topic for another post). -- Terry Jan Reedy From guido at python.org Sun Feb 26 04:04:41 2012 From: guido at python.org (Guido van Rossum) Date: Sat, 25 Feb 2012 19:04:41 -0800 Subject: [Python-Dev] Rejecting PEP 410 (Use decimal.Decimal type for timestamps) Message-ID: After an off-list discussion with Victor I have decided to reject PEP 410. Here are my reasons for rejecting the PEP. (Someone please copy this to the PEP or reference this message in the archives on mail.python.org.) 1. I have a general dislike of APIs that take a flag parameter which modifies the return type. But I also don't like having to add variants that return Decimal for over a dozen API functions (stat(), fstat(), etc.). I really think that this PEP would add a lot of complexity that we don't need. 2. The Decimal type is somewhat awkward to use; it doesn't mix with floats, there's a context that sets things like precision and rounding, it's still a floating point type that may lose precision (something which many people don't get when they first see it). 3. There are *very* few clocks in existence (if any) that actually measure time with more than 56 bits of accuracy. Sure, for short time periods we can measure nanoseconds. But a Python (64-bit) float can represent quite a large number of nanoseconds exactly (many millions of seconds), so if you're using some kind of real-time timer that was reset e.g. at the start of the current process you should be fine using a float to represent the time with great precision and accuracy. On the other hand, if you're measuring the time of day expressed in seconds (and fractions) since 1/1/1970, you should consider yourself lucky if your clock is accurate within 1 second.
(Especially since POSIX systems aren't allowed to admit the existence of leap seconds in their timestamps -- around a leap second you must adjust your clock, either gradually or abruptly.) I'll give you that some people might have clocks accurate to a microsecond. Such timestamps can be represented exactly as floats (at least until some point in the very distant future, when hopefully we'll have 128-bit floats). 4. I don't expect that timestamps with even greater precision than nanoseconds will ever become fashionable. Given that light travels about 30 cm in a nanosecond, there's not much use for more accurate time measurements in daily life. Given relativity theory, at such a timescale simultaneity of events is iffy at best. 5. I see only one real use case for nanosecond precision: faithful copying of the mtime/atime recorded by filesystems, in cases where the filesystem (like e.g. ext4) records these times with nanosecond precision. Even if such timestamps can't be trusted to be accurate, converting them to floats and back loses precision, and verification using tools not written in Python will flag the difference. But for this specific use case a much simpler set of API changes will suffice; only os.stat() and os.utime() need to change slightly (and variants of os.stat() like os.fstat()). 6. If you worry about systems where float has fewer bits: I don't think there are any relevant systems that have both a smaller float type and nanosecond clocks. So far the rejection note. As to the changes alluded to in #5: Let os.stat() and friends return extra fields st_atime_ns (etc.) that give the timestamps in nanoseconds as a Python (long) integer, such that e.g. (in close approximation) st_atime == st_atime_ns * 1e-9. Let os.utime() take an optional keyword argument ns=(atime_ns, mtime_ns). Details of the actual design of this API, such as the actual field and parameter names, may change; this is just a suggestion. We don't need a PEP for this proposal; we can just open a tracker issue and hash out the details during the code review. I'm also in favor of getting rid of os.stat_float_times(). As proposed in another thread, we may deprecate it in 3.3 and delete it in 3.5. I'm not excited about adding more precision to datetime and timedelta. -- --Guido van Rossum (python.org/~guido) From guido at python.org Sun Feb 26 04:13:26 2012 From: guido at python.org (Guido van Rossum) Date: Sat, 25 Feb 2012 19:13:26 -0800 Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 In-Reply-To: <4F49434B.6050604@active-4.com> References: <4F49434B.6050604@active-4.com> Message-ID: If this can encourage more projects to support Python 3 (even if it's only 3.3 and later) and hence improve adoption of Python 3, I'm all for it. A small quibble: I'd like to see a benchmark of a 'u' function implemented in C. --Guido On Sat, Feb 25, 2012 at 12:23 PM, Armin Ronacher wrote: > Hi, > > I just uploaded PEP 414 which proposes an optional 'u' prefix for string > literals for Python 3. > > You can read the PEP online: http://www.python.org/dev/peps/pep-0414/ > > This is a followup to the discussion about this topic here on the > mailing list and on twitter/IRC over the last few weeks.
> > Regards, > Armin > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/guido%40python.org -- --Guido van Rossum (python.org/~guido) From anacrolix at gmail.com Sun Feb 26 04:22:54 2012 From: anacrolix at gmail.com (Matt Joiner) Date: Sun, 26 Feb 2012 11:22:54 +0800 Subject: [Python-Dev] Versioning proposal: syntax.stdlib.bugfix In-Reply-To: References: Message-ID: Chrome does something similar. All digits keep rising in that scheme. However, in your examples you can't identify whether bug fixes are to stdlib or interpreter? On Feb 26, 2012 10:07 AM, "Terry Reedy" wrote: > We have two similar proposals, PEPs 407 and 413, to speed up the release > of at least library changes. To me, both have major problems with version > numbering. > > I think the underlying problem is starting with a long-term fixed leading > '3', which conveys no information about current and future changes (at > least for another decade). > > So I propose for consideration that we use the first digit to indicate a > version of core python with fixed grammar/syntax and corresponding > semantics. I would have this be stable for at least two years. It seems > that most current syntax proposals amount to duplication of current > function to suit someone's or some people's stylistic preference. My > current view is that current syntax is mostly good enough, the > implementation thereof is close to bug-free, and we should think carefully > about changes. > > We could then use the second digit to indicate library version. The .0 > library version would be for a long-term support version. The library > version could change every six months, but I would not necessarily fix it > at any particular interval. If we have some important addition or upgrade > at four months, release it. If we need another month to include an > important change, perhaps wait. > > The third digit would be for initial (.0) and bugfix releases, as at > present. Non .0 bugfix releases would mostly be for x.0 long-term > syntax+library versions. x.(y!=0).0 library-change-only releases would only > get x.(y!=0).1 bugfix releases on an 'emergency' basis. > > How this would work: > > Instead of 3.3.0, release 4.0.0. That would be followed by 4.0.1, 4.0.2, > etc, bugfixes, however often we feel like it, until 5.0.0 is released. > > 4.0.0 would also be followed by 4.1.0 with updated stdlib in about 6 > months, then barring mistakes, 4.2.0, etc, again until 5.0.0. > > A variation of this proposal would be to prefix '3.' to core.lib.fix. I > disfavor that for 3 reasons. > 1. It is not needed to indicate 'not Python 2' as *any* leading digit > greater than 2 says the same. > 2. It makes for a more awkward 4 level number. > 3. It presupposes a 3 to 4 transition something like the 2 to 3 > transition. However, I am dubious about that for more than one reason (another > topic for another post). > > -- > Terry Jan Reedy > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/ > anacrolix%40gmail.com > -------------- next part -------------- An HTML attachment was scrubbed...
URL: From ncoghlan at gmail.com Sun Feb 26 06:16:14 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 26 Feb 2012 15:16:14 +1000 Subject: [Python-Dev] PEP 413 updated (with improved rationale and simpler stdlib versioning scheme) Message-ID: After working through some additional scenarios (primarily the question of handling security fixes), I have simplified the proposed versioning scheme in PEP 413. New version included below, or you can read the nicely formatted version: http://www.python.org/dev/peps/pep-0413/ ========================== PEP: 413 Title: Faster evolution of the Python Standard Library Version: $Revision$ Last-Modified: $Date$ Author: Nick Coghlan Status: Draft Type: Process Content-Type: text/x-rst Created: 2012-02-24 Post-History: 2012-02-24, 2012-02-25 Resolution: TBD Abstract ======== This PEP proposes the adoption of a separate versioning scheme for the standard library (distinct from, but coupled to, the existing language versioning scheme) that allows accelerated releases of the Python standard library, while maintaining (or even slowing down) the current rate of change in the core language definition. Like PEP 407, it aims to adjust the current balance between measured change that allows the broader community time to adapt and being able to keep pace with external influences that evolve more rapidly than the current release cycle can handle (this problem is particularly notable for standard library elements that relate to web technologies). However, it's more conservative in its aims than PEP 407, seeking to restrict the increased pace of development to builtin and standard library interfaces, without affecting the rate of change for other elements such as the language syntax and version numbering as well as the CPython binary API and bytecode format. Rationale ========= To quote the PEP 407 abstract: Finding a release cycle for an open-source project is a delicate exercise in managing mutually contradicting constraints: developer manpower, availability of release management volunteers, ease of maintenance for users and third-party packagers, quick availability of new features (and behavioural changes), availability of bug fixes without pulling in new features or behavioural changes. The current release cycle errs on the conservative side. It is adequate for people who value stability over reactivity. This PEP is an attempt to keep the stability that has become a Python trademark, while offering a more fluid release of features, by introducing the notion of long-term support versions. I agree with the PEP 407 authors that the current release cycle of the *standard library* is too slow to effectively cope with the pace of change in some key programming areas (specifically, web protocols and related technologies, including databases, templating and serialisation formats). However, I have written this competing PEP because I believe that the approach proposed in PEP 407 of offering full, potentially binary incompatible releases of CPython every 6 months places too great a burden on the wider Python ecosystem. Under the current CPython release cycle, distributors of key binary extensions will often support Python releases even after the CPython branches enter "security fix only" mode (for example, Twisted currently ships binaries for 2.5, 2.6 and 2.7, NumPy and SciPy support those 3 along with 3.1 and 3.2, PyGame adds a 2.4 binary release, wxPython provides both 32-bit and 64-bit binaries for 2.6 and 2.7, etc).
If CPython were to triple (or more) its rate of releases, the developers of those libraries (many of which are even more resource starved than CPython) would face an unpalatable choice: either adopt the faster release cycle themselves (up to 18 simultaneous binary releases for PyGame!), drop older Python versions more quickly, or else tell their users to stick to the CPython LTS releases (thus defeating the entire point of speeding up the CPython release cycle in the first place). Similarly, many support tools for Python (e.g. syntax highlighters) can take quite some time to catch up with language level changes. At a cultural level, the Python community is also accustomed to a certain meaning for Python version numbers - they're linked to deprecation periods, support periods, all sorts of things. PEP 407 proposes that collective knowledge all be swept aside, without offering a compelling rationale for why such a course of action is actually *necessary* (aside from, perhaps, making the lives of the CPython core developers a little easier at the expense of everyone else). However, if we go back to the primary rationale for increasing the pace of change (i.e. more timely support for web protocols and related technologies), we can note that those only require *standard library* changes. That means many (perhaps even most) of the negative effects on the wider community can be avoided by explicitly limiting which parts of CPython are affected by the new release cycle, and allowing other parts to evolve at their current, more sedate, pace. Proposal ======== This PEP proposes the introduction of a new kind of CPython release: "standard library releases". As with PEP 407, this will give CPython 3 kinds of release: * Language release: "x.y.0" * Maintenance release: "x.y.z" (where z > 0) * Standard library release: "x.y (xy.z)" (where z > 0) Under this scheme, an unqualified version reference (such as "3.3") would always refer to the most recent corresponding language or maintenance release. It will never be used without qualification to refer to a standard library release (at least, not by python-dev - obviously, we can only set an example, not force the rest of the Python ecosystem to go along with it). Language releases will continue as they are now, as new versions of the Python language definition, along with a new version of the CPython interpreter and the Python standard library. Accordingly, a language release may contain any and all of the following changes: * new language syntax * new standard library changes (see below) * new deprecation warnings * removal of previously deprecated features * changes to the emitted bytecode * changes to the AST * any other significant changes to the compilation toolchain * changes to the core interpreter eval loop * binary incompatible changes to the C ABI (although the PEP 384 stable ABI must still be preserved) * bug fixes Maintenance releases will also continue as they do today, being strictly limited to bug fixes for the corresponding language release. No new features or radical internal changes are permitted. The new standard library releases will occur in parallel with each maintenance release and will be qualified with a new version identifier documenting the standard library version. 
Standard library releases may include the following changes: * new features in pure Python modules * new features in C extension modules (subject to PEP 399 compatibility requirements) * new features in language builtins (provided the C ABI remains unaffected) * bug fixes from the corresponding maintenance release Standard library version identifiers are constructed by combining the major and minor version numbers for the Python language release into a single two digit number and then appending a sequential standard library version identifier. Release Cycle ------------- When maintenance releases are created, *two* new versions of Python would actually be published on python.org (using the first 3.3 maintenance release, planned for February 2013 as an example):: 3.3.1 # Maintenance release 3.3 (33.1) # Standard library release A further 6 months later, the next 3.3 maintenance release would again be accompanied by a new standard library release:: 3.3.2 # Maintenance release 3.3 (33.2) # Standard library release Again, the standard library release would be binary compatible with the previous language release, merely offering additional features at the Python level. Finally, 18 months after the release of 3.3, a new language release would be made around the same time as the final 3.3 maintenance and standard library releases:: 3.3.3 # Maintenance release 3.3 (33.3) # Standard library release 3.4.0 # Language release The 3.4 release cycle would then follow a similar pattern to that for 3.3:: 3.4.1 # Maintenance release 3.4 (34.1) # Standard library release 3.4.2 # Maintenance release 3.4 (34.2) # Standard library release 3.4.3 # Maintenance release 3.4 (34.3) # Standard library release 3.5.0 # Language release Programmatic Version Identification ----------------------------------- To expose the new version details programmatically, this PEP proposes the addition of a new ``sys.stdlib_info`` attribute that records the new standard library version above and beyond the underlying interpreter version. Using the initial Python 3.3 release as an example:: sys.stdlib_info(python=33, version=0, releaselevel='final', serial=0) This information would also be included in the ``sys.version`` string:: Python 3.3.0 (33.0, default, Feb 17 2012, 23:03:41) [GCC 4.6.1] Security Fixes and Other "Out of Cycle" Releases ------------------------------------------------ For maintenance releases the process of handling out-of-cycle releases (for example, to fix a security issue or resolve a critical bug in a new release), remains the same as it is now: the minor version number is incremented and a new release is made incorporating the required bug fixes, as well as any other bug fixes that have been committed since the previous release. For standard library releases, the process is essentially the same, but the corresponding "What's New?" document may require some tidying up for the release (as the standard library release may incorporate new features, not just bug fixes). User Scenarios ============== The versioning scheme proposed above is based on a number of user scenarios that are likely to be encountered if this scheme is adopted. In each case, the scenario is described for the status quo (i.e. the slow release cycle), the versioning scheme in this PEP, and the free-wheeling minor version number scheme proposed in PEP 407. To give away the ending, the point of using a separate version number is that for almost all scenarios, the important number is the *language* version, not the standard library version.
Most users won't even need to care that the standard library version number exists. In the two identified cases where it matters, providing it as a separate number is actually clearer and more explicit than embedding the two different kinds of number into a single sequence and then tagging some of the numbers in the unified sequence as special. Novice user, downloading Python from python.org in March 2013 ------------------------------------------------------------- **Status quo:** must choose between 3.3 and 2.7 **This PEP:** must choose between 3.3 (33.1), 3.3 and 2.7. **PEP 407:** must choose between 3.4, 3.3 (LTS) and 2.7. **Verdict:** explaining the meaning of a Long Term Support release is about as complicated as explaining the meaning of the proposed standard library release version numbers. I call this a tie. Novice user, attempting to judge currency of third party documentation ---------------------------------------------------------------------- **Status quo:** minor version differences indicate 18-24 months of language evolution **This PEP:** same as status quo for language core, standard library version numbers indicate 6 months of standard library evolution. **PEP 407:** minor version differences indicate 18-24 months of language evolution up to 3.3, then 6 months of language evolution thereafter. **Verdict:** Since language changes and deprecations can have a much bigger effect on the accuracy of third party documentation than the addition of new features to the standard library, I'm calling this a win for the scheme in this PEP. Novice user, looking for an extension module binary release ----------------------------------------------------------- **Status quo:** look for the binary corresponding to the Python version you are running. **This PEP:** same as status quo. **PEP 407 (full releases):** same as status quo, but corresponding binary version is more likely to be missing (or, if it does exist, has to be found amongst a much larger list of alternatives). **PEP 407 (ABI updates limited to LTS releases):** all binary release pages will need to tell users that Python 3.3, 3.4 and 3.5 all need the 3.3 binary. **Verdict:** I call this a clear win for the scheme in this PEP. Absolutely nothing changes from the current situation, since the standard library version is actually irrelevant in this case (only binary extension compatibility is important). Extension module author, deciding whether or not to make a binary release ------------------------------------------------------------------------- **Status quo:** unless using the PEP 384 stable ABI, a new binary release is needed every time the minor version number changes. **This PEP:** same as status quo. **PEP 407 (full releases):** same as status quo, but becomes a far more frequent occurrence. **PEP 407 (ABI updates limited to LTS releases):** before deciding, must first look up whether the new release is an LTS release or an interim release. If it is an LTS release, then a new build is necessary. **Verdict:** I call this another clear win for the scheme in this PEP. As with the end user facing side of this problem, the standard library version is actually irrelevant in this case. Moving that information out to a separate number avoids creating unnecessary confusion. 
Python developer, deciding priority of eliminating a Deprecation Warning ------------------------------------------------------------------------ **Status quo:** code that triggers deprecation warnings is not guaranteed to run on a version of Python with a higher minor version number. **This PEP:** same as status quo **PEP 407:** unclear, as the PEP doesn't currently spell this out. Assuming the deprecation cycle is linked to LTS releases, then upgrading to a non-LTS release is safe but upgrading to the next LTS release may require avoiding the deprecated construct. **Verdict:** another clear win for the scheme in this PEP since, once again, the standard library version is irrelevant in this scenario. Alternative interpreter implementor, updating with new features --------------------------------------------------------------- **Status quo:** new Python versions arrive infrequently, but are a mish-mash of standard library updates and core language definition and interpreter changes. **This PEP:** standard library updates, which are easier to integrate, are made available more frequently in a form that is clearly and explicitly compatible with the previous version of the language definition. This means that, once an alternative implementation catches up to Python 3.3, they should have a much easier time incorporating standard library features as they happen (especially pure Python changes), leaving minor version number updates as the only task that requires updates to their core compilation and execution components. **PEP 407 (full releases):** same as status quo, but becomes a far more frequent occurrence. **PEP 407 (language updates limited to LTS releases):** unclear, as the PEP doesn't currently spell out a specific development strategy. Assuming a 3.3 compatibility branch is adopted (as proposed in this PEP), then the outcome would be much the same, but the version number signalling would be slightly less clear (since you would have to check to see if a particular release was an LTS release or not). **Verdict:** while not as clear cut as some previous scenarios, I'm still calling this one in favour of the scheme in this PEP. Explicit is better than implicit, and the scheme in this PEP makes a clear split between the two different kinds of update rather than adding a separate "LTS" tag to an otherwise ordinary release number. Tagging a particular version as being special is great for communicating with version control systems and associated automated tools, but it's a lousy way to communicate information to other humans. Python developer, deciding their minimum version dependency ----------------------------------------------------------- **Status quo:** look for "version added" or "version changed" markers in the documentation, check against ``sys.version_info`` **This PEP:** look for "version added" or "version changed" markers in the documentation. If written as a bare Python version, such as "3.3", check against ``sys.version_info``. If qualified with a standard library version, such as "3.3 (33.1)", check against ``sys.stdlib_info``. **PEP 407:** same as status quo **Verdict:** the scheme in this PEP actually allows third party libraries to be more explicit about their rate of adoption of standard library features. More conservative projects will likely pin their dependency to the language version and avoid features added in the standard library releases. Faster moving projects could instead declare their dependency on a particular standard library version. 
However, since PEP 407 does have the advantage of preserving the status quo, I'm calling this one for PEP 407 (albeit with a slim margin). Python developers, attempting to reproduce a tracker issue ---------------------------------------------------------- **Status quo:** if not already provided, ask the reporter which version of Python they're using. This is often done by asking for the first two lines displayed by the interactive prompt or the value of ``sys.version``. **This PEP:** same as the status quo (as ``sys.version`` will be updated to also include the standard library version), but may be needed on additional occasions (where the user knew enough to state their Python version, but that proved to be insufficient to reproduce the fault). **PEP 407:** same as the status quo **Verdict:** another marginal win for PEP 407. The new standard library version *is* an extra piece of information that users may need to pass back to developers when reporting issues with Python libraries (or Python itself, on our own tracker). However, by including it in ``sys.version``, many fault reports will already include it, and it is easy to request if needed. CPython release managers, handling a security fix ------------------------------------------------- **Status quo:** create a new maintenance release incorporating the security fix and any other bug fixes under source control. Also create source releases for any branches open solely for security fixes. **This PEP:** same as the status quo for maintenance branches. Also create a new standard library release (potentially incorporating new features along with the security fix). For security branches, create source releases for both the former maintenance branch and the standard library update branch. **PEP 407:** same as the status quo for maintenance and security branches, but handling security fixes for non-LTS releases is currently an open question. **Verdict:** until PEP 407 is updated to actually address this scenario, a clear win for this PEP. Effects ======= Effect on development cycle --------------------------- Similar to PEP 407, this PEP will break up the delivery of new features into more discrete chunks. Instead of a whole raft of changes landing all at once in a language release, each language release will be limited to 6 months worth of standard library changes, as well as any changes associated with new syntax. Effect on workflow ------------------ This PEP proposes the creation of a single additional branch for use in the normal workflow. After the release of 3.3, the following branches would be in use:: 2.7 # Maintenance branch, no change 3.3 # Maintenance branch, as for 3.2 3.3-compat # New branch, backwards compatible changes default # Language changes, standard library updates that depend on them When working on a new feature, developers will need to decide whether or not it is an acceptable change for a standard library release. If so, then it should be checked in on ``3.3-compat`` and then merged to ``default``. Otherwise it should be checked in directly to ``default``. The "version added" and "version changed" markers for any changes made on the ``3.3-compat`` branch would need to be flagged with both the language version and the standard library version. For example: "3.3 (33.1)". Any changes made directly on the ``default`` branch would just be flagged with "3.4" as usual. The ``3.3-compat`` branch would be closed to normal development at the same time as the ``3.3`` maintenance branch. 
The ``3.3-compat`` branch would remain open for security fixes for the same period of time as the ``3.3`` maintenance branch. Effect on bugfix cycle ---------------------- The effect on the bug fix workflow is essentially the same as that on the workflow for new features - there is one additional branch to pass through before the change reaches the ``default`` branch. If critical bugs are found in a maintenance release, then new maintenance and standard library releases will be created to resolve the problem. The final part of the version number will be incremented for both the language version and the standard library version. If critical bugs are found in a standard library release that do not affect the associated maintenance release, then only a new standard library release will be created and only the standard library's version number will be incremented. Note that in these circumstances, the standard library release *may* include additional features, rather than just containing the bug fix. It is assumed that anyone that cares about receiving *only* bug fixes without any new features mixed in will already be relying strictly on the maintenance releases rather than using the new standard library releases. Effect on the community ----------------------- PEP 407 has this to say about the effects on the community: People who value stability can just synchronize on the LTS releases which, with the proposed figures, would give a similar support cycle (both in duration and in stability). I believe this statement is just plain wrong. Life isn't that simple. Instead, developers of third party modules and frameworks will come under pressure to support the full pace of the new release cycle with binary updates, teachers and book authors will receive complaints that they're only covering an "old" version of Python ("You're only using 3.3, the latest is 3.5!"), etc. As the minor version number starts climbing 3 times faster than it has in the past, I believe perceptions of language stability would also fall (whether such opinions were justified or not). I believe isolating the increased pace of change to the standard library, and clearly delineating it with a separate version number will greatly reassure the rest of the community that no, we're not suddenly asking them to triple their own rate of development. Instead, we're merely going to ship standard library updates for the next language release in 6-monthly installments rather than delaying them all until the next language definition update, even those changes that are backwards compatible with the previously released version of Python. The community benefits listed in PEP 407 are equally applicable to this PEP, at least as far as the standard library is concerned: People who value reactivity and access to new features (without taking the risk to install alpha versions or Mercurial snapshots) would get much more value from the new release cycle than currently. People who want to contribute new features or improvements would be more motivated to do so, knowing that their contributions will be more quickly available to normal users. If the faster release cycle encourages more people to focus on contributing to the standard library rather than proposing changes to the language definition, I don't see that as a bad thing. Handling News Updates ===================== What's New? ----------- The "What's New" documents would be split out into separate documents for standard library releases and language releases. 
So, during the 3.3 release cycle, we would see: * What's New in Python 3.3? * What's New in the Python Standard Library 33.1? * What's New in the Python Standard Library 33.2? * What's New in the Python Standard Library 33.3? And then finally, we would see the next language release: * What's New in Python 3.4? For the benefit of users that ignore standard library releases, the 3.4 What's New would link back to the What's New documents for each of the standard library releases in the 3.3 series. NEWS ---- Merge conflicts on the NEWS file are already a hassle. Since this PEP proposes introduction of an additional branch into the normal workflow, resolving this becomes even more critical. While Mercurial phases may help to some degree, it would be good to eliminate the problem entirely. One suggestion from Barry Warsaw is to adopt a non-conflicting separate-files-per-change approach, similar to that used by Twisted [2_]. Given that the current manually updated NEWS file will be used for the 3.3.0 release, one possible layout for such an approach might look like:: Misc/ NEWS # Now autogenerated from news_entries news_entries/ 3.3/ NEWS # Original 3.3 NEWS file maint.1/ # Maintenance branch changes core/ builtins/ extensions/ library/ documentation/ tests/ compat.1/ # Compatibility branch changes builtins/ extensions/ library/ documentation/ tests/ # Add maint.2, compat.2 etc as releases are made 3.4/ core/ builtins/ extensions/ library/ documentation/ tests/ # Add maint.1, compat.1 etc as releases are made Putting the version information in the directory hierarchy isn't strictly necessary (since the NEWS file generator could figure it out from the version history), but does make it easier for *humans* to keep the different versions in order. Other benefits of reduced version coupling ========================================== Slowing down the language release cycle --------------------------------------- The current release cycle is a compromise between the desire for stability in the core language definition and C extension ABI, and the desire to get new features (most notably standard library updates) into users' hands more quickly. With the standard library release cycle decoupled (to some degree) from that of the core language definition, it provides an opportunity to actually *slow down* the rate of change in the language definition. The language moratorium for Python 3.2 effectively slowed that cycle down to *more than 3 years* (3.1: June 2009, 3.3: August 2012) without causing any major problems or complaints. The NEWS file management scheme described above is actually designed to allow us the flexibility to slow down language releases at the same time as standard library releases become more frequent. As a simple example, if a full two years was allowed between 3.3 and 3.4, the 3.3 release cycle would end up looking like:: 3.2.4 # Maintenance release 3.3.0 # Language release 3.3.1 # Maintenance release 3.3 (33.1) # Standard library release 3.3.2 # Maintenance release 3.3 (33.2) # Standard library release 3.3.3 # Maintenance release 3.3 (33.3) # Standard library release 3.3.4 # Maintenance release 3.3 (33.4) # Standard library release 3.4.0 # Language release The elegance of the proposed branch structure and NEWS entry layout is that this decision wouldn't really need to be made until shortly before the planned 3.4 release date.
At that point, the decision could be made to postpone the 3.4 release and keep the ``3.3`` and ``3.3-compat`` branches open after the 3.3.3 maintenance release and the 3.3 (33.3) standard library release, thus adding another standard library release to the cycle. The choice between another standard library release or a full language release would then be available every 6 months after that. Further increasing the pace of standard library development ----------------------------------------------------------- As noted in the previous section, one benefit of the scheme proposed in this PEP is that it largely decouples the language release cycle from the standard library release cycle. The standard library could be updated every 3 months, or even once a month, without having any flow on effects on the language version numbering or the perceived stability of the core language. While that pace of development isn't practical as long as the binary installer creation for Windows and Mac OS X involves several manual steps (including manual testing) and for as long as we don't have separate "-release" trees that only receive versions that have been marked as good by the stable buildbots, it's still a useful criterion to keep in mind when considering proposed new versioning schemes: what if we eventually want to make standard library releases even *faster* than every 6 months? If the practical issues were ever resolved, then the separate standard library versioning scheme in this PEP could handle it. The tagged version number approach proposed in PEP 407 could not (at least, not without a lot of user confusion and uncertainty). Other Questions =============== Why not a date-based versioning scheme? --------------------------------------- Earlier versions of this PEP proposed a date-based versioning scheme for the standard library. However, such a scheme made it very difficult to handle out-of-cycle releases to fix security issues and other critical bugs in standard library releases, as it required the following steps: 1. Change the release version number to the date of the current month. 2. Update the What's New, NEWS and documentation to refer to the new release number. 3. Make the new release. With the sequential scheme now proposed, such releases should at most require a little tidying up of the What's New document before making the release. Why isn't PEP 384 enough? ------------------------- PEP 384 introduced the notion of a "Stable ABI" for CPython, a limited subset of the full C ABI that is guaranteed to remain stable. Extensions built against the stable ABI should be able to support all subsequent Python versions with the same binary. This will help new projects to avoid coupling their C extension modules too closely to a specific version of CPython. For existing modules, however, migrating to the stable ABI can involve quite a lot of work (especially for extension modules that define a lot of classes). With limited development resources available, any time spent on such a change is time that could otherwise have been spent working on features that offer more direct benefits to end users. There are also other benefits to separate versioning (as described above) that are not directly related to the question of binary compatibility with third party C extensions. Why no binary compatible additions to the C ABI in standard library releases? 
----------------------------------------------------------------------------- There's a case to be made that *additions* to the CPython C ABI could reasonably be permitted in standard library releases. This would give C extension authors the same freedom as any other package or module author to depend either on a particular language version or on a standard library version. The PEP currently associates the interpreter version with the language version, and therefore limits major interpreter changes (including C ABI additions) to the language releases. An alternative, internally consistent, approach would be to link the interpreter version with the standard library version, with only changes that may affect backwards compatibility limited to language releases. Under such a scheme, the following changes would be acceptable in standard library releases: * Standard library updates * new features in pure Python modules * new features in C extension modules (subject to PEP 399 compatibility requirements) * new features in language builtins * Interpreter implementation updates * binary compatible additions to the C ABI * changes to the compilation toolchain that do not affect the AST or alter the bytecode magic number * changes to the core interpreter eval loop * bug fixes from the corresponding maintenance release And the following changes would be acceptable in language releases: * new language syntax * any updates acceptable in a standard library release * new deprecation warnings * removal of previously deprecated features * changes to the AST * changes to the emitted bytecode that require altering the magic number * binary incompatible changes to the C ABI (although the PEP 384 stable ABI must still be preserved) While such an approach could probably be made to work, there does not appear to be a compelling justification for it, and the approach currently described in the PEP is simpler and easier to explain. Why not separate out the standard library entirely? --------------------------------------------------- A concept that is occasionally discussed is the idea of making the standard library truly independent from the CPython reference implementation. My personal opinion is that actually making such a change would involve a lot of work for next to no pay-off. CPython without the standard library is useless (the build chain won't even run, let alone the test suite). You also can't create a standalone pure Python standard library either, because too many "standard library modules" are actually tightly linked in to the internal details of their respective interpreters (for example, the builtins, ``weakref``, ``gc``, ``sys``, ``inspect``, ``ast``). Creating a separate CPython development branch that is kept compatible with the previous language release, and making releases from that branch that are identified with a separate standard library version number should provide most of the benefits of a separate standard library repository with only a fraction of the pain. Acknowledgements ================ Thanks go to the PEP 407 authors for starting this discussion, as well as to those authors and Larry Hastings for initial discussions of the proposal made in this PEP. References ========== .. [1] PEP 407: New release cycle and introducing long-term support versions http://www.python.org/dev/peps/pep-0407/ .. [2] Twisted's "topfiles" approach to NEWS generation http://twistedmatrix.com/trac/wiki/ReviewProcess#Newsfiles Copyright ========= This document has been placed in the public domain. 
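To make the two kinds of dependency check described in the PEP's user scenarios concrete, a third party library might gate its imports along these lines. This is only an illustrative sketch: sys.stdlib_info is the attribute *proposed* by the PEP and does not exist in any released Python:

    import sys

    # Conservative project: depend only on the language version.
    if sys.version_info[:2] < (3, 3):
        raise ImportError("Python 3.3 or later is required")

    # Faster-moving project: depend on a standard library release,
    # e.g. "3.3 (33.1)", via the proposed sys.stdlib_info attribute.
    stdlib = getattr(sys, "stdlib_info", None)
    if stdlib is None or (stdlib.python, stdlib.version) < (33, 1):
        raise ImportError("standard library release 33.1 or later is required")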
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Sun Feb 26 06:20:06 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 26 Feb 2012 15:20:06 +1000 Subject: [Python-Dev] Versioning proposal: syntax.stdlib.bugfix In-Reply-To: References: Message-ID: On Sun, Feb 26, 2012 at 12:05 PM, Terry Reedy wrote: > We have two similar proposals, PEPs 407 and 413, to speed up the release of > at least library changes. To me, both have major problems with version > numbering. > > I think the underlying problem is starting with a long-term fixed leading > '3', which conveys no information about current and future changes (at least > for another decade). Correct, but the still ongoing challenges of the 2 -> 3 transition make that approach, as logical as it may be, entirely unworkable from a PR point of view. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Sun Feb 26 07:06:13 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 26 Feb 2012 16:06:13 +1000 Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 In-Reply-To: References: <4F49434B.6050604@active-4.com> Message-ID: On Sun, Feb 26, 2012 at 1:13 PM, Guido van Rossum wrote: > A small quibble: I'd like to see a benchmark of a 'u' function implemented in C. Even if it was quite fast, I don't think such a function would bring the same benefits as restoring support for u'' literals. Using myself as an example, my work projects (such as PulpDist [1]) are currently written to target Python 2.6, since that's the system Python on RHEL 6. As a web application, PulpDist has unicode literals *everywhere*, but (as Armin pointed out to me), turning on "from __future__ import unicode_literals" in every file would be incorrect, since many of them also include native strings (mostly related to attribute names and subprocess invocation, but probably a few WSGI related ones as well). The action-at-a-distance of that future import can also make the code hard to read and review (in particular, a diff doesn't tell you whether or not the future import is present in the original file). It's going to be quite some time before I look at porting that code to Python 3, but, given the style of forward compatible code that I write (e.g. "print (X)", never "print X" or "print (X, Y)"; "except A as B:", never "except A, B:"), the lack of unicode literals in 3.x is the only significant sticking point I expect to encounter. If 3.3+ has Unicode literals, I expect that PulpDist *right now* would be awfully close to being source compatible (and any other discrepancies would just be simple fixes like adding conditional imports from new locations). IIRC, I've previously opposed the restoration of unicode literals as a retrograde step. Looking at the implications for the future migration of PulpDist has changed my mind. Regards, Nick. [1] https://fedorahosted.org/pulpdist/ -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Sun Feb 26 07:14:51 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 26 Feb 2012 16:14:51 +1000 Subject: [Python-Dev] Versioning proposal: syntax.stdlib.bugfix In-Reply-To: References: Message-ID: On Sun, Feb 26, 2012 at 12:05 PM, Terry Reedy wrote: > I think the underlying problem is starting with a long-term fixed leading > '3', which conveys no information about current and future changes (at least > for another decade).
In updating PEP 413 to include an explanation for why the simple major.minor.micro = language.stdlib.maintenance approach doesn't work due to the ongoing 2->3 transition [1], I realised that there *is* a way to make it work: Instead of making 3.3 version 4.0, we make it version 33.0 That's essentially what PEP 413 currently proposes for the standard library anyway, but it would actually work just as well for the existing sys.version_info structure. (it would break "sys.version_info.major == 3" checks, but such checks are badly written) [1] http://www.python.org/dev/peps/pep-0413/#why-not-use-the-major-version-number Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From stefan_ml at behnel.de Sun Feb 26 08:47:36 2012 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sun, 26 Feb 2012 08:47:36 +0100 Subject: [Python-Dev] C-API functions for reading/writing tstate->exc_* ? In-Reply-To: References: <4F4176B9.4080403@v.loewis.de> Message-ID: Stefan Behnel, 23.02.2012 09:01: > "Martin v. Löwis", 19.02.2012 23:24: >>> When compiling for PyPy, Cython therefore needs a way to tell PyPy about >>> any changes. For the tstate->curexc_* fields, there are the two functions >>> PyErr_Fetch() and PyErr_Restore(). Could we have two similar "official" >>> functions for the exc_* fields? Maybe PyErr_FetchLast() and >>> PyErr_RestoreLast()? >> >> I wouldn't call the functions *Last, as this may cause confusion with >> sys.last_*. I'm also unsure why the current API uses this Fetch/Restore >> pair of functions where Fetch clears the variables. A Get/Set pair of >> functions would be more natural, IMO (where Get returns "new" >> references). This would give PyErr_GetExcInfo/PyErr_SetExcInfo. > > Ok, I added a tracker ticket and I'm working on a patch. > > http://bugs.python.org/issue14098 The patch is attached to the ticket, including documentation and test. I'd be happy if someone could review it and apply it. Thanks! Stefan From eliben at gmail.com Sun Feb 26 09:40:03 2012 From: eliben at gmail.com (Eli Bendersky) Date: Sun, 26 Feb 2012 10:40:03 +0200 Subject: [Python-Dev] Status regarding Old vs. Advanced String Formatting In-Reply-To: <4F48B603.2000705@v.loewis.de> References: <4F4803A3.7040803@v.loewis.de> <20120225012039.Horde.G9vmccL8999PSClXHxABAoA@webmail.df.eu> <4F48B603.2000705@v.loewis.de> Message-ID: On Sat, Feb 25, 2012 at 12:20, "Martin v. Löwis" wrote: > > I find that strange, especially for an expert Python dev. I, a newbie, > > find it far friendlier (and easier for a new programmer to grasp). > > Maybe it's because I use it all the time, and you don't? > > That is most likely the case. You learn by practice. For that very > reason, the claim "and easier for a new programmer to grasp" is > difficult to prove. It was easier for *you*, since you started using > it, and then kept using it. I don't recall any particular obstacles > learning % formatting (even though I did for C, not for C++). > Generalizing that it is *easier* is invalid: you just didn't try > learning that instead first, and now you can't go back to a state > where either is new to you. > > C++ is very similar here: they also introduced a new way of output > (iostreams, and << overloading). I used that for a couple of years, > primarily because people said that printf is "bad" and "not object-oriented".
I then recognized that there is nothing wrong with printf > per se, and would avoid std::cout in C++ these days, in favor of > std::printf (yes, I know that it does have an issue with type safety). > Not to mention that the performance of iostreams is pretty bad, to the extent that some projects actively discourage using them in favor of either C-style IO (fgets, printf, etc.) or custom IO implementations. This is marginally off-topic, although it does show that an initial thought of deprecating existing functionality for a new one doesn't always work out in the long run, even for super-popular languages like C++. Eli -------------- next part -------------- An HTML attachment was scrubbed... URL: From vinay_sajip at yahoo.co.uk Sun Feb 26 10:05:14 2012 From: vinay_sajip at yahoo.co.uk (Vinay Sajip) Date: Sun, 26 Feb 2012 09:05:14 +0000 (UTC) Subject: [Python-Dev] PEP 414 References: <4F49434B.6050604@active-4.com> Message-ID: The PEP does not consider an alternative idea such as using "from __future__ import unicode_literals" in code which needs to run on 2.x, together with e.g. a callable n('xxx') which can be used where native strings are needed. This avoids the need to reintroduce the u'xxx' literal syntax, makes it explicit where native strings are needed, is less obtrusive than u('xxx') or u'xxx' because typically there will be vastly fewer places where you need native strings, and is unlikely to impose a major runtime penalty when compared with u('xxx') (again, because of the lower frequency of occurrence). Even if you have arguments against this idea, I think it's at least worth mentioning in the PEP with any counter-arguments you have. Regards, Vinay Sajip From anacrolix at gmail.com Sun Feb 26 10:16:44 2012 From: anacrolix at gmail.com (Matt Joiner) Date: Sun, 26 Feb 2012 17:16:44 +0800 Subject: [Python-Dev] Status regarding Old vs. Advanced String Formatting In-Reply-To: References: <4F4803A3.7040803@v.loewis.de> <20120225012039.Horde.G9vmccL8999PSClXHxABAoA@webmail.df.eu> <4F48B603.2000705@v.loewis.de> Message-ID: Big +1 On Feb 26, 2012 4:41 PM, "Eli Bendersky" wrote: > > On Sat, Feb 25, 2012 at 12:20, "Martin v. Löwis" wrote: > >> > I find that strange, especially for an expert Python dev. I, a newbie, >> > find it far friendlier (and easier for a new programmer to grasp).
>> Maybe it's because I use it all the time, and you don't? >> >> That is most likely the case. You learn by practice. For that very >> reason, the claim "and easier for a new programmer to grasp" is >> difficult to prove. It was easier for *you*, since you started using >> it, and then kept using it. I don't recall any particular obstacles >> learning % formatting (even though I did for C, not for C++). >> Generalizing that it is *easier* is invalid: you just didn't try >> learning that instead first, and now you can't go back to a state >> where either is new to you. >> >> C++ is very similar here: they also introduced a new way of output >> (iostreams, and << overloading). I used that for a couple of years, >> primarily because people said that printf is "bad" and "not object-oriented". I then recognized that there is nothing wrong with printf >> per se, and would avoid std::cout in C++ these days, in favor of >> std::printf (yes, I know that it does have an issue with type safety). >> > > Not to mention that the performance of iostreams is pretty bad, to the > extent that some projects actively discourage using them in favor of either > C-style IO (fgets, printf, etc.) or custom IO implementations. This is > marginally off-topic, although it does show that an initial thought of > deprecating existing functionality for a new one doesn't always work out > in the long run, even for super-popular languages like C++. > > Eli > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > http://mail.python.org/mailman/options/python-dev/anacrolix%40gmail.com > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From larry at hastings.org Sun Feb 26 10:50:11 2012 From: larry at hastings.org (Larry Hastings) Date: Sun, 26 Feb 2012 01:50:11 -0800 Subject: [Python-Dev] Proposing an alternative to PEP 410 In-Reply-To: References: <4F46AF6E.2030300@hastings.org> <20120225163129.3a104cdd@resist.wooz.org> Message-ID: <4F4A0053.5030508@hastings.org> On 02/25/2012 03:31 PM, Guido van Rossum wrote: > On Sat, Feb 25, 2012 at 1:31 PM, Barry Warsaw wrote: >> On Feb 23, 2012, at 01:28 PM, Larry Hastings wrote: >>> * Change the result of os.stat to be a custom class rather than a >>> PyStructSequence. Support the sequence protocol on the custom class but >>> mark it PendingDeprecation [...] >> +1 > Yeah, the sequence protocol is outdated here. > > Would this be a mutable or an immutable object? Immutable, just like the current PyStructSequence object. //arry/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From larry at hastings.org Sun Feb 26 10:50:32 2012 From: larry at hastings.org (Larry Hastings) Date: Sun, 26 Feb 2012 01:50:32 -0800 Subject: [Python-Dev] Rejecting PEP 410 (Use decimal.Decimal type for timestamps) In-Reply-To: References: Message-ID: <4F4A0068.9030909@hastings.org> On 02/25/2012 07:04 PM, Guido van Rossum wrote: > As to the changes alluded to in #5: Let os.stat() and friends return > extra fields st_atime_ns [...] We don't need a PEP for > this proposal; we can just open a tracker issue and hash out the > details during the code review. http://bugs.python.org/issue14127 //arry/ -------------- next part -------------- An HTML attachment was scrubbed... URL:
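As a rough sketch of the os.stat() result type Larry describes above - an immutable object with named attributes, where tuple-style indexing still works but triggers PendingDeprecationWarning - the shape might be something like the following. The class and field list are illustrative only, not the actual implementation:

    import warnings

    class StatResult(object):
        _fields = ('st_mode', 'st_ino', 'st_dev', 'st_nlink', 'st_uid',
                   'st_gid', 'st_size', 'st_atime', 'st_mtime', 'st_ctime')

        def __init__(self, *values):
            # Bypass the immutability guard below while populating fields.
            for name, value in zip(self._fields, values):
                object.__setattr__(self, name, value)

        def __setattr__(self, name, value):
            raise AttributeError('stat result is immutable')

        def __len__(self):
            return len(self._fields)

        def __getitem__(self, index):
            warnings.warn('indexing stat results is pending deprecation; '
                          'use the named attributes instead',
                          PendingDeprecationWarning, stacklevel=2)
            return getattr(self, self._fields[index])

New code would keep using st.st_mtime, while legacy st[8]-style access would keep working but warn.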
From storchaka at gmail.com Sun Feb 26 11:18:21 2012 From: storchaka at gmail.com (Serhiy Storchaka) Date: Sun, 26 Feb 2012 12:18:21 +0200 Subject: [Python-Dev] PEP 414 In-Reply-To: References: <4F49434B.6050604@active-4.com> Message-ID: On 26.02.12 11:05, Vinay Sajip wrote: > The PEP does not consider an alternative idea such as using "from __future__ > import unicode_literals" in code which needs to run on 2.x, together with e.g. a > callable n('xxx') which can be used where native strings are needed. This avoids > the need to reintroduce the u'xxx' literal syntax, makes it explicit where > native strings are needed, is less obtrusive than u('xxx') or u'xxx' because > typically there will be vastly fewer places where you need native strings, and > is unlikely to impose a major runtime penalty when compared with u('xxx') > (again, because of the lower frequency of occurrence). n = str From vinay_sajip at yahoo.co.uk Sun Feb 26 11:36:41 2012 From: vinay_sajip at yahoo.co.uk (Vinay Sajip) Date: Sun, 26 Feb 2012 10:36:41 +0000 (UTC) Subject: [Python-Dev] PEP 414 References: <4F49434B.6050604@active-4.com> Message-ID: Serhiy Storchaka writes: > n = str Well, n to indicate that native string is required. Regards, Vinay Sajip From eliben at gmail.com Sun Feb 26 11:45:30 2012 From: eliben at gmail.com (Eli Bendersky) Date: Sun, 26 Feb 2012 12:45:30 +0200 Subject: [Python-Dev] folding cElementTree behind ElementTree in 3.3 In-Reply-To: <20120221230622.2a696bc4@pitrou.net> References: <4F34E554.7090600@v.loewis.de> <4F3CC8C3.8070103@v.loewis.de> <4F4181E1.9040909@v.loewis.de> <20120220165511.Horde.c-6iIaGZi1VPQmzfA-JBT4A@webmail.df.eu> <20120221230622.2a696bc4@pitrou.net> Message-ID: > It probably wouldn't be very difficult to make element_new() the tp_new > of Element_Type, and expose that type as "Element". > That would settle the issue nicely and avoid compatibility concerns :) > > Regards > I guess it's not as simple as that. element_new doesn't quite have the signature required for tp_new. Besides, a constructor would also be needed (since a subclass may be interested in calling Element.__init__) and there's no natural function to serve as the constructor. I've opened issue 14128 to track this. I plan to implement standard tp_new and tp_init functions for Element to expose it as a class from the module. element_new also happens to be used internally - I'll try to refactor to avoid code duplication as much as possible. Eli -------------- next part -------------- An HTML attachment was scrubbed... URL: From steve at pearwood.info Sun Feb 26 12:00:17 2012 From: steve at pearwood.info (Steven D'Aprano) Date: Sun, 26 Feb 2012 22:00:17 +1100 Subject: [Python-Dev] PEP 414 In-Reply-To: References: <4F49434B.6050604@active-4.com> Message-ID: <4F4A10C1.6040806@pearwood.info> Vinay Sajip wrote: > Serhiy Storchaka writes: > > >> n = str > > Well, n to indicate that native string is required. str indicates the native string type, because it *is* the native string type. By definition, str = str in both Python 2.x and Python 3.x. There's no point in aliasing it to "n". Besides, "n" is commonly used for ints. It would be disturbing for me to read code with n as a function or type, particularly one that returns a string. I think your suggestion is not well explained. You suggested a function n, expected to take a string literal. The example you gave earlier was: n('xxx') But it seems to me that this is a no-op, because 'xxx' is already the native string type.
In Python 2, it gives a str (byte-string), which the n() function converts to a byte-string. In Python 3, it gives a str (unicode-string), which the n() function converts to a unicode-string. -- Steven From ncoghlan at gmail.com Sun Feb 26 12:14:57 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 26 Feb 2012 21:14:57 +1000 Subject: [Python-Dev] PEP 414 In-Reply-To: References: <4F49434B.6050604@active-4.com> Message-ID: On Sun, Feb 26, 2012 at 7:05 PM, Vinay Sajip wrote: > The PEP does not consider an alternative idea such as using "from __future__ > import unicode_literals" in code which needs to run on 2.x, together with e.g. a > callable n('xxx') which can be used where native strings are needed. This avoids > the need to reintroduce the u'xxx' literal syntax, makes it explicit where > native strings are needed, is less obtrusive than u('xxx') or u'xxx' because > typically there will be vastly fewer places where you need native strings, and > is unlikely to impose a major runtime penalty when compared with u('xxx') > (again, because of the lower frequency of occurrence). > > Even if you have arguments against this idea, I think it's at least worth > mentioning in the PEP with any counter-arguments you have. The PEP already mentions that. In fact, all bar the first paragraph in the "Rationale and Goals" section discusses it. However, it's the last paragraph that explains why using that particular future import is, in and of itself, a bad idea: ============ Additionally, the vast majority of people who maintain Python 2.x codebases are more familiar with Python 2.x semantics, and a per-file difference in literal meanings will be very annoying for them in the long run. A quick poll on Twitter about the use of the division future import supported my suspicions that people opt out of behaviour-changing future imports because they are a maintenance burden. Every time you review code you have to check the top of the file to see if the behaviour was changed. Obviously that was an unscientific informal poll, but it might be something worth considering. ============ As soon as you allow the use of "from __future__ import unicode_literals" or a module level "__metaclass__ = type", you can't review diffs in isolation any more - whether the diff is correct or not will depend on the presence or absence of a module level tweak to the language semantics. Future imports work well for things like absolute imports, new keywords, or statements becoming functions - a future import that is missing when you expected it to be present (or vice-versa) will result in a quick SyntaxError or ImportError that will point you directly to the offending code. Unicode literals and implicitly creating new-style classes are a different matter - for those, if the module level modification takes place (or doesn't take place when you expected it to be there), you get unexpected changes in behaviour instead of a clear exception that refers directly to the source of the problem. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Sun Feb 26 12:20:16 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 26 Feb 2012 21:20:16 +1000 Subject: [Python-Dev] PEP 414 In-Reply-To: <4F4A10C1.6040806@pearwood.info> References: <4F49434B.6050604@active-4.com> <4F4A10C1.6040806@pearwood.info> Message-ID: On Sun, Feb 26, 2012 at 9:00 PM, Steven D'Aprano wrote: > I think your suggestion is not well explained. You suggested a function n, > expected to take a string literal.
The example you gave earlier was: > > n('xxx') > > But it seems to me that this is a no-op, because 'xxx' is already the native > string type. Vinay's suggestion was that it be used in conjunction with the "from __future__ import unicode_literals" import, so that you could write: b"" # Binary data "" # Text (unicode) data str("") # Native string type It reduces the problem (compared to omitting the import and using a u() function), but it's still ugly and still involves the "action at a distance" of the unicode literals import. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From pmon.mail at gmail.com Sun Feb 26 11:33:28 2012 From: pmon.mail at gmail.com (pmon mail) Date: Sun, 26 Feb 2012 12:33:28 +0200 Subject: [Python-Dev] struct.pack inconsistencies between platforms Message-ID: Hi I have found myself in the following troubling situation. I'm running the following code on Python 2.6.5 on Linux x86: Python 2.6.5 (r265:79063, Apr 16 2010, 13:09:56) [GCC 4.4.3] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> import struct >>> len(struct.pack('L',0)) 4 Works as expected and documented (http://docs.python.org/library/struct.html ). I'm running the same code on a MacPro (OS X 10.7.3) and I'm getting the following: Python 2.7.1 (r271:86832, Jun 16 2011, 16:59:05) [GCC 4.2.1 (Based on Apple Inc. build 5658) (LLVM build 2335.15.00)] on darwin Type "help", "copyright", "credits" or "license" for more information. >>> import struct >>> len(struct.pack('L',0)) 8 Documentation clearly states that the 'L' is a 4-byte integer. Is this a bug? Am I missing something? Thanks PMon -------------- next part -------------- An HTML attachment was scrubbed... URL: From vinay_sajip at yahoo.co.uk Sun Feb 26 13:28:08 2012 From: vinay_sajip at yahoo.co.uk (Vinay Sajip) Date: Sun, 26 Feb 2012 12:28:08 +0000 (UTC) Subject: [Python-Dev] PEP 414 References: <4F49434B.6050604@active-4.com> Message-ID: Nick Coghlan writes: > The PEP already mentions that. In fact, all bar the first paragraph in > the "Rationale and Goals" section discusses it. However, it's the last I didn't mean the __future__ import bit, but a discussion re. alternatives to u('xxx'). > Future imports work well for things like absolute imports, new > keywords, or statements becoming functions - a future import that is > missing when you expected it to be present (or vice-versa) will result > in a quick SyntaxError or ImportError that will point you directly to > the offending code. Unicode literals and implicitly creating new-style > classes are a different matter - for those, if the module level > modification takes place (or doesn't take place when you expected it > to be there), you get unexpected changes in behaviour instead of a > clear exception that refers directly to the source of the problem. I don't disagree with anything you said here. Perhaps I've been doing too much work recently with single 2.x/3.x codebase projects, so I've just gotten to like using Unicode literals without the u prefix. However, as the proposal doesn't force one to use u prefixes, I'm not really objecting, especially if it speeds transition to 3.x. Regards, Vinay Sajip
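A short sketch of the convention being discussed, for a single 2.x/3.x codebase. The helper name n is the project-level convention from Vinay's suggestion (not a standard function), and with unicode_literals in effect the rare native-string cases are the only ones that need marking:

    from __future__ import unicode_literals

    n = str  # native string type: byte string on 2.x, unicode string on 3.x

    text = "just some text"      # unicode on both 2.x and 3.x
    data = b"\x00\x01"           # bytes on both
    attr = n("name")             # native str, e.g. for getattr()/setattr()

    # Keyword argument names must be native strings on 2.x, so this is
    # one of the places where the marker genuinely matters:
    # some_function(**{n("encoding"): "utf-8"})

Note that on 2.x, str(u"name") only works without real conversion because such identifier-like literals are pure ASCII; non-ASCII text is a different problem that this marker doesn't address.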
Regards, Vinay Sajip From eliben at gmail.com Sun Feb 26 13:34:24 2012 From: eliben at gmail.com (Eli Bendersky) Date: Sun, 26 Feb 2012 14:34:24 +0200 Subject: [Python-Dev] struct.pack inconsistencies between platforms In-Reply-To: References: Message-ID: On Sun, Feb 26, 2012 at 12:33, pmon mail wrote: > Hi > > I have found myself in the following troubling situation. > > I'm running the following code on a Python 2.6.5 on Linux x86: > Python 2.6.5 (r265:79063, Apr 16 2010, 13:09:56) > [GCC 4.4.3] on linux2 > Type "help", "copyright", "credits" or "license" for more information. > >>> import struct > >>> len(struct.pack('L',0)) > 4 > > Works as expected and documented ( > http://docs.python.org/library/struct.html). > > I'm running the same code on a MacPro (OS X 10.7.3) and I'm getting the > following: > Python 2.7.1 (r271:86832, Jun 16 2011, 16:59:05) > [GCC 4.2.1 (Based on Apple Inc. build 5658) (LLVM build 2335.15.00)] on > darwin > Type "help", "copyright", "credits" or "license" for more information. > >>> import struct > >>> len(struct.pack('L',0)) > 8 > > Documentation clearly states that the 'L' is a 4 byte integer. > > Is this a bug? I'm I missing something? > > By default pack uses native size, not standard size. On a 64-bit machine: >>> struct.pack('=L', 0) '\x00\x00\x00\x00' >>> struct.pack('L', 0) '\x00\x00\x00\x00\x00\x00\x00\x00' -------------- next part -------------- An HTML attachment was scrubbed... URL: From ned at nedbatchelder.com Sun Feb 26 13:34:59 2012 From: ned at nedbatchelder.com (Ned Batchelder) Date: Sun, 26 Feb 2012 07:34:59 -0500 Subject: [Python-Dev] PEP 414 In-Reply-To: References: <4F49434B.6050604@active-4.com> Message-ID: <4F4A26F3.6080801@nedbatchelder.com> On 2/26/2012 6:14 AM, Nick Coghlan wrote: > As soon as you allow the use of "from __future__ import > unicode_literals" or a module level "__metaclass__ = type", you can't > review diffs in isolation any more - whether the diff is correct or > not will depend on the presence or absence of module level tweak to > the language semantics. > > Future imports work well for things like absolute imports, new > keywords, or statements becoming functions - if the future import is > missing when you expected it to be present (or vice-versa) will result > in a quick SyntaxError or ImportError that will point you directly to > the offending code. Unicode literals and implicitly creating new-style > classes are a different matter - for those, if the module level > modification takes place (or doesn't take place when you expected it > to be there), you get unexpected changes in behaviour instead of a > clear exception that refers directly to the source of the problem. There are already __future__ imports that violate this principle: from __future__ import division. That doesn't mean I'm in favor of this new __future__, just keeping a wide angle on the viewfinder. --Ned. 
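(To see the review problem in miniature, here is a two-file sketch -- file names are illustrative, Python 2 semantics assumed:

    # a.py -- no future import
    print 7 / 2                     # floor division: prints 3

    # b.py -- the identical expression, different meaning
    from __future__ import division
    print 7 / 2                     # true division: prints 3.5

A diff touching only the last line of either file looks the same in both cases; only the import at the top of b.py tells a reviewer which semantics apply.)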
From storchaka at gmail.com Sun Feb 26 13:35:25 2012 From: storchaka at gmail.com (Serhiy Storchaka) Date: Sun, 26 Feb 2012 14:35:25 +0200 Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 In-Reply-To: <4F49434B.6050604@active-4.com> References: <4F49434B.6050604@active-4.com> Message-ID: Some microbenchmarks: $ python -m timeit -n 10000 -r 100 -s "x = 123" "'foobarbaz_%d' % x" 10000 loops, best of 100: 1.24 usec per loop $ python -m timeit -n 10000 -r 100 -s "x = 123" "str('foobarbaz_%d') % x" 10000 loops, best of 100: 1.59 usec per loop $ python -m timeit -n 10000 -r 100 -s "x = 123" "str(u'foobarbaz_%d') % x" 10000 loops, best of 100: 1.58 usec per loop $ python -m timeit -n 10000 -r 100 -s "x = 123; n = lambda s: s" "n('foobarbaz_%d') % x" 10000 loops, best of 100: 1.41 usec per loop $ python -m timeit -n 10000 -r 100 -s "x = 123; s = 'foobarbaz_%d'" "s % x" 10000 loops, best of 100: 1.22 usec per loop There is no significant overhead to using converters. From ncoghlan at gmail.com Sun Feb 26 13:40:35 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 26 Feb 2012 22:40:35 +1000 Subject: [Python-Dev] PEP 414 In-Reply-To: <4F4A26F3.6080801@nedbatchelder.com> References: <4F49434B.6050604@active-4.com> <4F4A26F3.6080801@nedbatchelder.com> Message-ID: On Sun, Feb 26, 2012 at 10:34 PM, Ned Batchelder wrote: > There are already __future__ imports that violate this principle: from > __future__ import division. That doesn't mean I'm in favor of this new > __future__, just keeping a wide angle on the viewfinder. Armin's straw poll was actually about whether or not people used the future import for division, rather than unicode literals. It is indeed the same problem - and several of us had a strong preference for forcing float division with "float(x) / y" over relying on the long distance effect of the future import (although it was only in this thread that I figured out exactly *why* I don't like those two, but happily used many of the other future imports when they were necessary). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From vinay_sajip at yahoo.co.uk Sun Feb 26 13:42:44 2012 From: vinay_sajip at yahoo.co.uk (Vinay Sajip) Date: Sun, 26 Feb 2012 12:42:44 +0000 (UTC) Subject: [Python-Dev] PEP 414 References: <4F49434B.6050604@active-4.com> <4F4A10C1.6040806@pearwood.info> Message-ID: Nick Coghlan gmail.com> writes: > It reduces the problem (compared to omitting the import and using a > u() function), but it's still ugly and still involves the "action at a > distance" of the unicode literals import. I agree about the action-at-a-distance leading to non-obvious bugs and the wasted head-scratching time they cause. It could be mitigated somewhat by project-level conventions, e.g. that all string literals are Unicode on that project. Then, if you put yourself in the relevant mindset when working on that project, there are fewer surprises. It's probably a matter of choosing the lesser among evils, since the proposal seems to allow mixing of literals with and without u prefixes in 3.x code - doesn't that also seem ugly? When this came up earlier (when I think Chris McDonough raised it) the issue of what to do on 3.2 came up, and though it has been addressed somewhat in the PEP, it would be nice to see the suggested on-installation hook fleshed out a little more.
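(On the mixing point above: under the PEP, both spellings would be legal in the same Python 3.3 module and would denote exactly the same object, e.g.

    >>> u'spam' == 'spam'       # proposed 3.3 semantics: the u prefix is a no-op
    True
    >>> type(u'spam') is str
    True

so the mixing would be purely cosmetic, with no behavioural difference.)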
Regards, Vinay Sajip From jnoller at gmail.com Sun Feb 26 13:44:21 2012 From: jnoller at gmail.com (Jesse Noller) Date: Sun, 26 Feb 2012 07:44:21 -0500 Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 In-Reply-To: References: <4F49434B.6050604@active-4.com> Message-ID: <039E7AF2BB9B459EBE2114E7AE682F80@gmail.com> On Saturday, February 25, 2012 at 10:13 PM, Guido van Rossum wrote: > If this can encourage more projects to support Python 3 (even if it's > only 3.3 and later) and hence improve adoption of Python 3, I'm all > for it. > > A small quibble: I'd like to see a benchmark of a 'u' function implemented in C. > > --Guido > After having this explained quite a bit to me by the more web-savvy folks such as Armin and Chris M/etc, I am a +1, the rationale makes sense, and much for the same reason that Guido cites, I think this will help with code bases using the single code base approach, and assist with overall adoption. +1 jesse From armin.ronacher at active-4.com Sun Feb 26 13:44:48 2012 From: armin.ronacher at active-4.com (Armin Ronacher) Date: Sun, 26 Feb 2012 12:44:48 +0000 Subject: [Python-Dev] PEP 414 In-Reply-To: <4F4A26F3.6080801@nedbatchelder.com> References: <4F49434B.6050604@active-4.com> <4F4A26F3.6080801@nedbatchelder.com> Message-ID: <4F4A2940.10002@active-4.com> Hi, On 2/26/12 12:34 PM, Ned Batchelder wrote: > There are already __future__ imports that violate this principle: from > __future__ import division. That doesn't mean I'm in favor of this new > __future__, just keeping a wide angle on the viewfinder. That's actually mentioned in the PEP :-) > A quick poll on Twitter about the use of the division future import > supported my suspicions that people opt out of behaviour-changing > future imports because they are a maintenance burden. Every time you > review code you have to check the top of the file to see if the > behaviour was changed. Regards, Armin From armin.ronacher at active-4.com Sun Feb 26 13:46:53 2012 From: armin.ronacher at active-4.com (Armin Ronacher) Date: Sun, 26 Feb 2012 12:46:53 +0000 Subject: [Python-Dev] PEP 414 In-Reply-To: References: <4F49434B.6050604@active-4.com> <4F4A10C1.6040806@pearwood.info> Message-ID: <4F4A29BD.2090607@active-4.com> Hi, On 2/26/12 12:42 PM, Vinay Sajip wrote: > When this came up earlier (when I think Chris McDonough raised it) the issue of > what to do on 3.2 came up, and though it has been addressed somewhat in the PEP, > it would be nice to see the suggested on-installation hook fleshed out a little > more. I wanted to do that but the tokenizer module is quite ugly to customize in order to allow "u" prefixes to strings, which is why I postponed that. It would work similarly to how 2to3 is invoked, however. In case this PEP gets approved I will refactor the tokenize module while adding support for "u" prefixes and use that as the basis for an installation hook for older Python 3 versions.
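(A rough sketch of what such a hook might do -- the helper name is hypothetical, and a real hook would also need to be wired into the install step, roughly the way 2to3 is today:

    import io
    import tokenize

    def strip_u_prefixes(source):
        # Rewrite u'...' / U'...' literals so that 3.1/3.2 can compile them.
        result = []
        for tok in tokenize.generate_tokens(io.StringIO(source).readline):
            toknum, tokval = tok[:2]
            if toknum == tokenize.STRING and tokval[:1] in ('u', 'U'):
                tokval = tokval[1:]
            result.append((toknum, tokval))
        return tokenize.untokenize(result)

)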
Regards, Armin From armin.ronacher at active-4.com Sun Feb 26 13:42:53 2012 From: armin.ronacher at active-4.com (Armin Ronacher) Date: Sun, 26 Feb 2012 12:42:53 +0000 Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 In-Reply-To: References: <4F49434B.6050604@active-4.com> Message-ID: <4F4A28CD.5070903@active-4.com> Hi, On 2/26/12 12:35 PM, Serhiy Storchaka wrote: > Some microbenchmarks: > > $ python -m timeit -n 10000 -r 100 -s "x = 123" "'foobarbaz_%d' % x" > 10000 loops, best of 100: 1.24 usec per loop > $ python -m timeit -n 10000 -r 100 -s "x = 123" "str('foobarbaz_%d') % x" > 10000 loops, best of 100: 1.59 usec per loop > $ python -m timeit -n 10000 -r 100 -s "x = 123" "str(u'foobarbaz_%d') % x" > 10000 loops, best of 100: 1.58 usec per loop > $ python -m timeit -n 10000 -r 100 -s "x = 123; n = lambda s: s" "n('foobarbaz_%d') % x" > 10000 loops, best of 100: 1.41 usec per loop > $ python -m timeit -n 10000 -r 100 -s "x = 123; s = 'foobarbaz_%d'" "s % x" > 10000 loops, best of 100: 1.22 usec per loop > > There are no significant overhead to use converters. That's because what you're benchmarking here more than anything is the overhead of eval() :-) See the benchmark linked in the PEP for one that measures the actual performance of the string literal / wrapper. Regards, Armin From storchaka at gmail.com Sun Feb 26 14:03:36 2012 From: storchaka at gmail.com (Serhiy Storchaka) Date: Sun, 26 Feb 2012 15:03:36 +0200 Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 In-Reply-To: <4F4A28CD.5070903@active-4.com> References: <4F49434B.6050604@active-4.com> <4F4A28CD.5070903@active-4.com> Message-ID: 26.02.12 14:42, Armin Ronacher wrote: > On 2/26/12 12:35 PM, Serhiy Storchaka wrote: >> Some microbenchmarks: >> >> $ python -m timeit -n 10000 -r 100 -s "x = 123" "'foobarbaz_%d' % x" >> 10000 loops, best of 100: 1.24 usec per loop >> $ python -m timeit -n 10000 -r 100 -s "x = 123" "str('foobarbaz_%d') % x" >> 10000 loops, best of 100: 1.59 usec per loop >> $ python -m timeit -n 10000 -r 100 -s "x = 123" "str(u'foobarbaz_%d') % x" >> 10000 loops, best of 100: 1.58 usec per loop >> $ python -m timeit -n 10000 -r 100 -s "x = 123; n = lambda s: s" > "n('foobarbaz_%d') % x" >> 10000 loops, best of 100: 1.41 usec per loop >> $ python -m timeit -n 10000 -r 100 -s "x = 123; s = 'foobarbaz_%d'" "s > % x" >> 10000 loops, best of 100: 1.22 usec per loop >> >> There are no significant overhead to use converters. > That's because what you're benchmarking here more than anything is the > overhead of eval() :-) See the benchmark linked in the PEP for one that > measures the actual performance of the string literal / wrapper. $ python -m timeit -n 10000 -r 100 "" 10000 loops, best of 100: 0.087 usec per loop The overhead of eval is 5%. Real code is not a single string literal; every string literal occurs together with a lot of code (getting and setting variables, attribute access, function calls, binary operators, unconditional and conditional jumps, etc.), and the total effect of using a simple converter will be insignificant. From eliben at gmail.com Sun Feb 26 14:05:45 2012 From: eliben at gmail.com (Eli Bendersky) Date: Sun, 26 Feb 2012 15:05:45 +0200 Subject: [Python-Dev] [Python-checkins] cpython (3.2): Issue #14123: Explicitly mention that old style % string formatting has caveats In-Reply-To: References: Message-ID: > > - The formatting operations described here are obsolete and may go away in future > - versions of Python. Use the new :ref:`string-formatting` in new code.
> + The formatting operations described here are modelled on C's printf() > + syntax. They only support formatting of certain builtin types. The > + use of a binary operator means that care may be needed in order to > + format tuples and dictionaries correctly. As the new > + :ref:`string-formatting` syntax is more flexible and handles tuples and > + dictionaries naturally, it is recommended for new code. However, there > + are no current plans to deprecate printf-style formatting. > Please consider just deleting the last sentence. Documentation is meant for users (often new users) and not core devs. As such, I just don't see what it adds. If the aim is to document this intent somewhere, a PEP would be a better place than the formal documentation. Eli -------------- next part -------------- An HTML attachment was scrubbed... URL: From p.f.moore at gmail.com Sun Feb 26 14:09:33 2012 From: p.f.moore at gmail.com (Paul Moore) Date: Sun, 26 Feb 2012 13:09:33 +0000 Subject: [Python-Dev] struct.pack inconsistencies between platforms In-Reply-To: References: Message-ID: On 26 February 2012 12:34, Eli Bendersky wrote: > On Sun, Feb 26, 2012 at 12:33, pmon mail wrote: >> Documentation clearly states that the 'L' is a 4 byte integer. >> >> Is this a bug? I'm I missing something? >> > > By default pack uses native size, not standard size. On a 64-bit machine: As the OP points out, the documentation says that the "Standard Size" is 4 bytes (http://docs.python.org/library/struct.html). While "Standard Size" doesn't appear to be defined in the documentation, and the start of the previous section (7.3.2.1. Byte Order, Size, and Alignment) clearly states that C types are represented in native format by default, the documentation could probably do with some clarification. Paul. From eliben at gmail.com Sun Feb 26 14:16:18 2012 From: eliben at gmail.com (Eli Bendersky) Date: Sun, 26 Feb 2012 15:16:18 +0200 Subject: [Python-Dev] struct.pack inconsistencies between platforms In-Reply-To: References: Message-ID: On Sun, Feb 26, 2012 at 15:09, Paul Moore wrote: > On 26 February 2012 12:34, Eli Bendersky wrote: > > On Sun, Feb 26, 2012 at 12:33, pmon mail wrote: > >> Documentation clearly states that the 'L' is a 4 byte integer. > >> > >> Is this a bug? I'm I missing something? > >> > > > > By default pack uses native size, not standard size. On a 64-bit machine: > > As the OP points out, the documentation says that the "Standard Size" > is 4 bytes (http://docs.python.org/library/struct.html). While > "Standard Size" doesn't appear to be defined in the documentation, and > the start of the previous section (7.3.2.1. Byte Order, Size, and > Alignment) clearly states that C types are represented in native > format by default, the documentation could probably do with some > clarification. > > 7.3.2.1 says, shortly after the first table: " Native size and alignment are determined using the C compiler's sizeof expression. This is always combined with native byte order. Standard size depends only on the format character; see the table in the *Format Characters* section. " To me this appears to be a reasonable definition of what "standard size" is. 7.3.2.2 says before the size table: "Format characters have the following meaning; the conversion between C and Python values should be obvious given their types. The "Standard size" column refers to the size of the packed value in bytes when using standard size; that is, when the format string starts with one of '<', '>', '!' or '='.
When using native size, the size of the packed value is platform-dependent." Again, taken together with the previous quote, IMHO this defines the difference between standard and native sizes clearly. If you feel differently, feel free to open an issue suggesting a better explanation. Eli -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefan at bytereef.org Sun Feb 26 14:27:21 2012 From: stefan at bytereef.org (Stefan Krah) Date: Sun, 26 Feb 2012 14:27:21 +0100 Subject: [Python-Dev] State of PEP-3118 (memoryview part) Message-ID: <20120226132721.GA1422@sleipnir.bytereef.org> State of PEP-3118 (memoryview part) Hello, In Python 3.3 most issues with the memoryview object have been fixed in a recent commit (3f9b3b6f7ff0). Many features have been added, see: http://docs.python.org/dev/whatsnew/3.3.html The underlying problems with memoryview were intricate and required a long discussion (issue #10181) that led to a complete rewrite of memoryobject.c. We have several options with regard to 2.7 and 3.2: 1) Won't fix. 2) Backport the changes and disable as much of the new functionality as possible. 3) Backport all of it (this would be the least amount of work and could be done relatively quickly). 4) Nick suggested another option: put a module with the new functionality on PyPI. This would be quite a bit of work, and personally I don't have time for that. Options 2) and 3) would ideally entail one backwards incompatible bugfix: In 2.7 and 3.2 assignment to a memoryview with format 'B' rejects integers but accepts byte objects, but according to the struct syntax mandated by the PEP it should be the other way round. It would be nice to get some opinions and ideas, especially of course from the release managers. Stefan Krah From solipsis at pitrou.net Sun Feb 26 14:41:06 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sun, 26 Feb 2012 14:41:06 +0100 Subject: [Python-Dev] State of PEP-3118 (memoryview part) References: <20120226132721.GA1422@sleipnir.bytereef.org> Message-ID: <20120226144106.0bfd38ff@pitrou.net> On Sun, 26 Feb 2012 14:27:21 +0100 Stefan Krah wrote: > > The underlying problems with memoryview were intricate and required > a long discussion (issue #10181) that led to a complete rewrite > of memoryobject.c. > > > We have several options with regard to 2.7 and 3.2: > > 1) Won't fix. Given the extent of the rewrite, this one has my preference. Regards Antoine. From pmon.mail at gmail.com Sun Feb 26 14:45:52 2012 From: pmon.mail at gmail.com (pmon mail) Date: Sun, 26 Feb 2012 15:45:52 +0200 Subject: [Python-Dev] struct.pack inconsistencies between platforms In-Reply-To: References: Message-ID: Sounds reasonable for me. Thanks! On Sun, Feb 26, 2012 at 3:16 PM, Eli Bendersky wrote: > > > On Sun, Feb 26, 2012 at 15:09, Paul Moore wrote: > >> On 26 February 2012 12:34, Eli Bendersky wrote: >> > On Sun, Feb 26, 2012 at 12:33, pmon mail wrote: >> >> Documentation clearly states that the 'L' is a 4 byte integer. >> >> >> >> Is this a bug? I'm I missing something? >> >> >> > >> > By default pack uses native size, not standard size. On a 64-bit >> machine: >> >> As the OP points out, the documentation says that the "Standard Size" >> is 4 bytes (http://docs.python.org/library/struct.html). While >> "Standard Size" doesn't appear to be defined in the documentation, and >> the start of the previous section (7.3.2.1. 
Byte Order, Size, and >> Alignment) clearly states that C types are represented in native >> format by default, the documentation could probably do with some >> clarification. >> >> > 7.2.3.1 says, shortly after the first table: > > " > > Native size and alignment are determined using the C compiler?s sizeofexpression. This is always combined with native byte order. > > Standard size depends only on the format character; see the table in the *Format > Characters* section. > " > > To me this appears to be a reasonable definition of what "standard size" > is. > > 7.3.2.2 says before the size table: > > "Format characters have the following meaning; the conversion between C > and Python values should be obvious given their types. The ?Standard size? > column refers to the size of the packed value in bytes when using standard > size; that is, when the format string starts with one of '<', '>', '!' or > '='. When using native size, the size of the packed value is > platform-dependent." > > Again, taken together with the previous quote, IMHO this defines the > difference between standard and native sizes clearly. If you feel > differently, feel free to open an issue suggesting a better explanation. > > Eli > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From solipsis at pitrou.net Sun Feb 26 14:54:21 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sun, 26 Feb 2012 14:54:21 +0100 Subject: [Python-Dev] cpython: Close issue #6210: Implement PEP 409 References: Message-ID: <20120226145421.6bff8bc7@pitrou.net> On Sun, 26 Feb 2012 09:02:59 +0100 nick.coghlan wrote: > + > +No debugging capability is lost, as the original exception context remains > +available if needed (for example, if an intervening library has incorrectly > +suppressed valuable underlying details):: That's debatable, since you now have to *add* code if you want to print the original exception as well. The new capability improves the terseness and clarity of error messages at the expense of debuggability. > + def prepare_subprocess(): > + # don't create core file > + try: > + setrlimit(RLIMIT_CORE, (0, 0)) > + except (ValueError, resource_error): > + pass Really? This sounds quite wrong, but it should *at least* explain why a test of the "raise" statement would produce a core file! (but I think you should consider removing this part) > + def get_output(self, code, filename=None): > + """ > + Run the specified code in Python (in a new child process) > and read the > + output from the standard error or from a file (if filename > is set). > + Return the output lines as a list. > + """ We already have assert_python_ok and friends. It's not obvious what this additional function achieves. Also, the "filename" argument is never used. > + output = re.sub('Current thread 0x[0-9a-f]+', > + 'Current thread XXX', > + output) This looks like output from the faulthandler module. Why would faulthandler kick in here? Regards Antoine. From victor.stinner at gmail.com Sun Feb 26 15:04:45 2012 From: victor.stinner at gmail.com (Victor Stinner) Date: Sun, 26 Feb 2012 15:04:45 +0100 Subject: [Python-Dev] Proposing an alternative to PEP 410 In-Reply-To: References: <4F46AF6E.2030300@hastings.org> <20120225163129.3a104cdd@resist.wooz.org> Message-ID: > Scratch that, *I* don't agree. timedelta is a pretty clumsy type to > use. Have you ever tried to compute the number of seconds between two > datetimes? You can't just use the .seconds field, you have to combine > the .days and .seconds fields. 
And negative timedeltas are even harder > due to the requirement that seconds and microseconds are never > negative; e.g -1 second is represented as -1 days plus 86399 seconds. Guido, you should switch to Python3! timedelta has a new total_seconds() method since Python 3.2. http://docs.python.org/py3k/library/datetime.html#datetime.timedelta.total_seconds >>> datetime.timedelta(1).total_seconds() 86400.0 >>> datetime.timedelta(seconds=-1).total_seconds() -1.0 Victor From hodgestar+pythondev at gmail.com Sun Feb 26 15:51:08 2012 From: hodgestar+pythondev at gmail.com (Simon Cross) Date: Sun, 26 Feb 2012 16:51:08 +0200 Subject: [Python-Dev] Proposing an alternative to PEP 410 In-Reply-To: References: <4F46AF6E.2030300@hastings.org> <20120225163129.3a104cdd@resist.wooz.org> Message-ID: On Sun, Feb 26, 2012 at 1:31 AM, Guido van Rossum wrote: > I still think that when you are actually interested in *using* times, > the current float format is absolutely fine. Anybody who thinks they > need to accurately know the absolute time that something happened with > nanosecond accuracy is out of their mind; given relativity such times > have an incredibly local significance anyway. There are good scientific use cases for nanosecond time resolution (e.g. radio astronomy) where one is actually measuring time down to that level and taking into account propagation delays.
I have first hand experience of at least one radio telescope (MeerKAT) that is using Python to process these sorts of timestamps (Maciej even gave a talk on MeerKAT at PyCon 2011 :). Often these sorts of applications just use a large integer to hold the time. Higher-level constructs like datetime tend to be too bulky and provide functionality that is not particularly relevant. There is also a lot of pressure to have all the details coded by an in-house expert (because you need complete control and understanding of them, so you might as well do it yourself rather than continually patch, say, Python, to match your instrument's view of how this should all work). Hardware capable of generating nanosecond-accurate timestamps is, however, becoming fairly easy to get hold of (a suitable crystalline clock slaved to a decent GPS unit can get you a lot of the way) and there are probably quite a few applications where it might become relevant. I'm not sure whether any of this is intended to be for or against any side in the current discussion. :D Schiavo Simon From anacrolix at gmail.com Sun Feb 26 16:11:42 2012 From: anacrolix at gmail.com (Matt Joiner) Date: Sun, 26 Feb 2012 23:11:42 +0800 Subject: [Python-Dev] State of PEP-3118 (memoryview part) In-Reply-To: <20120226144106.0bfd38ff@pitrou.net> References: <20120226132721.GA1422@sleipnir.bytereef.org> <20120226144106.0bfd38ff@pitrou.net> Message-ID: +1 for won't fix. On Feb 26, 2012 9:46 PM, "Antoine Pitrou" wrote: > On Sun, 26 Feb 2012 14:27:21 +0100 > Stefan Krah wrote: > > > > The underlying problems with memoryview were intricate and required > > a long discussion (issue #10181) that led to a complete rewrite > > of memoryobject.c. > > > > > > We have several options with regard to 2.7 and 3.2: > > > > 1) Won't fix. > > Given the extent of the rewrite, this one has my preference. > > Regards > > Antoine. > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > http://mail.python.org/mailman/options/python-dev/anacrolix%40gmail.com > -------------- next part -------------- An HTML attachment was scrubbed... URL: From larry at hastings.org Sun Feb 26 17:12:42 2012 From: larry at hastings.org (Larry Hastings) Date: Sun, 26 Feb 2012 08:12:42 -0800 Subject: [Python-Dev] Proposing an alternative to PEP 410 In-Reply-To: References: <4F46AF6E.2030300@hastings.org> <20120225163129.3a104cdd@resist.wooz.org> Message-ID: <4F4A59FA.6040802@hastings.org> On 02/26/2012 06:51 AM, Simon Cross wrote: > There are good scientific use cases for nanosecond time resolution > (e.g. radio astronomy) where one is actually measuring time down to > that level and taking into account propagation delays. I have first > hand experience [...] > I'm not sure whether any of this is intended to be for or against any > side in the current discussion. :D It's probably neutral. But I do have one question: can you foresee the scientific community moving to a finer resolution than nanoseconds in our lifetimes?
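(Whatever the answer, the int type itself imposes no ceiling here, since Python integers are arbitrary precision -- a small illustration:

    import time
    ns = int(time.time() * 10**9)   # illustrative only: the float from
                                    # time.time() really carries about
                                    # microsecond precision, which is the
                                    # point of this whole thread
    fs = ns * 10**6                 # the same instant, as femtoseconds

so finer resolutions are a question of APIs, clocks and OS support, not of representation.)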
//arry/ From p.f.moore at gmail.com Sun Feb 26 17:21:44 2012 From: p.f.moore at gmail.com (Paul Moore) Date: Sun, 26 Feb 2012 16:21:44 +0000 Subject: [Python-Dev] State of PEP-3118 (memoryview part) In-Reply-To: <20120226144106.0bfd38ff@pitrou.net> References: <20120226132721.GA1422@sleipnir.bytereef.org> <20120226144106.0bfd38ff@pitrou.net> Message-ID: On 26 February 2012 13:41, Antoine Pitrou wrote: >> We have several options with regard to 2.7 and 3.2: >> >> ? 1) Won't fix. > > Given the extent of the rewrite, this one has my preference. +1 (although I'd word it as "fixed in 3.3" rather than "won't fix"). Paul. From tkoker at gmail.com Sun Feb 26 17:34:28 2012 From: tkoker at gmail.com (Tony Koker) Date: Sun, 26 Feb 2012 11:34:28 -0500 Subject: [Python-Dev] Proposing an alternative to PEP 410 In-Reply-To: <4F4A59FA.6040802@hastings.org> References: <4F46AF6E.2030300@hastings.org> <20120225163129.3a104cdd@resist.wooz.org> <4F4A59FA.6040802@hastings.org> Message-ID: my 2 cents... being in electronics for over 30 years, it is forever expanding in both directions, bigger mega, giga, tera, peta, etc. AND smaller nano, pico, femto, atto. but, I agree that it is moot, as it is not the range, which is usually expressed in an exponential component of the system being used (decimal, hex., etc), and it is more a matter of significant number of digits being operated on, at that point in time. Basically the zeroes are removed and tracked separately. Tony On Sun, Feb 26, 2012 at 11:12 AM, Larry Hastings wrote: > > On 02/26/2012 06:51 AM, Simon Cross wrote: > >> There are good scientific use cases for nanosecond time resolution >> (e.g. radio astronomy) where one is actually measuring time down to >> that level and taking into account propagation delays. I have first >> hand experience [...] >> >> I'm not sure whether any of this is intended to be for or against any >> side in the current discussion. :D >> > > It's probably neutral. But I do have one question: can you foresee the > scientific community moving to a finer resolution than nanoseconds in our > lifetimes? > > > //arry/ > > ______________________________**_________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/**mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/**mailman/options/python-dev/** > tkoker%40gmail.com > -------------- next part -------------- An HTML attachment was scrubbed... URL: From tkoker at gmail.com Sun Feb 26 17:37:34 2012 From: tkoker at gmail.com (Tony Koker) Date: Sun, 26 Feb 2012 11:37:34 -0500 Subject: [Python-Dev] Proposing an alternative to PEP 410 In-Reply-To: References: <4F46AF6E.2030300@hastings.org> <20120225163129.3a104cdd@resist.wooz.org> <4F4A59FA.6040802@hastings.org> Message-ID: Also, data collection will almost always be done by specialized hardware and the data stored off for deferred processing and analysis. Tony On Sun, Feb 26, 2012 at 11:34 AM, Tony Koker wrote: > my 2 cents... > > being in electronics for over 30 years, it is forever expanding in both > directions, bigger mega, giga, tera, peta, etc. AND smaller nano, pico, > femto, atto. > > but, I agree that it is moot, as it is not the range, which is usually > expressed in an exponential component of the system being used (decimal, > hex., etc), and it is more a matter of significant number of digits being > operated on, at that point in time. Basically the zeroes are removed and > tracked separately. 
> > Tony > > > > On Sun, Feb 26, 2012 at 11:12 AM, Larry Hastings wrote: > >> >> On 02/26/2012 06:51 AM, Simon Cross wrote: >> >>> There are good scientific use cases for nanosecond time resolution >>> (e.g. radio astronomy) where one is actually measuring time down to >>> that level and taking into account propagation delays. I have first >>> hand experience [...] >>> >>> I'm not sure whether any of this is intended to be for or against any >>> side in the current discussion. :D >>> >> >> It's probably neutral. But I do have one question: can you foresee the >> scientific community moving to a finer resolution than nanoseconds in our >> lifetimes? >> >> >> //arry/ >> >> ______________________________**_________________ >> Python-Dev mailing list >> Python-Dev at python.org >> http://mail.python.org/**mailman/listinfo/python-dev >> Unsubscribe: http://mail.python.org/**mailman/options/python-dev/** >> tkoker%40gmail.com >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From barry at python.org Sun Feb 26 18:31:18 2012 From: barry at python.org (Barry Warsaw) Date: Sun, 26 Feb 2012 12:31:18 -0500 Subject: [Python-Dev] PEP 414 In-Reply-To: References: <4F49434B.6050604@active-4.com> Message-ID: <20120226123118.6e36609e@resist.wooz.org> This seems like too strong a statement: "Python 2.6 and Python 2.7 support syntax features from Python 3 which for the most part make a unified code base possible. Many thought that the unicode_literals future import might make a common source possible, but it turns out that it's doing more harm than good." While it may be true for *some* problem domains, such as WSGI apps, it is not true in general, IMO. I use this future import all the time in both libraries and applications and it's almost always helpful. Cheers, -Barry From solipsis at pitrou.net Sun Feb 26 18:45:47 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sun, 26 Feb 2012 18:45:47 +0100 Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 References: <4F49434B.6050604@active-4.com> Message-ID: <20120226184547.1f415ad3@pitrou.net> Hi, On Sat, 25 Feb 2012 20:23:39 +0000 Armin Ronacher wrote: > > I just uploaded PEP 414 which proposes am optional 'u' prefix for string > literals for Python 3. > > You can read the PEP online: http://www.python.org/dev/peps/pep-0414/ I don't understand this sentence: > The automatic upgrading of binary strings to unicode strings that > would be enabled by this proposal would make it much easier to port > such libraries over. What "automatic upgrading" is that talking about? > For instance, the urllib module in Python 2 is using byte strings, > and the one in Python 3 is using unicode strings. Are you talking about urllib.parse perhaps? > By leveraging a native string, users can avoid having to adjust for > that. What does "leveraging a native string" mean here? > The following is an incomplete list of APIs and general concepts that > use native strings and need implicit upgrading to unicode in Python > 3, and which would directly benefit from this support I'm confused. This PEP talks about unicode literals, not native string literals, so why would these APIs "directly benefit from this support"? Thanks Antoine. 
From solipsis at pitrou.net Sun Feb 26 18:53:31 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sun, 26 Feb 2012 18:53:31 +0100 Subject: [Python-Dev] Performance of u() References: <4F49434B.6050604@active-4.com> Message-ID: <20120226185331.4407f981@pitrou.net> On Sat, 25 Feb 2012 19:13:26 -0800 Guido van Rossum wrote: > If this can encourage more projects to support Python 3 (even if it's > only 3.3 and later) and hence improve adoption of Python 3, I'm all > for it. > > A small quibble: I'd like to see a benchmark of a 'u' function implemented in C. Even without implementing it in C, caching the results makes it much less prohibitive in tight loops: if sys.version_info >= (3, 0): def u(value): return value else: def u(value, _lit_cache={}): if value in _lit_cache: return _lit_cache[value] s = _lit_cache[value] = unicode(value, 'unicode-escape') return s u'\N{SNOWMAN}barbaz' -> 100000000 loops, best of 3: 0.00928 usec per loop u('\N{SNOWMAN}barbaz') -> 10000000 loops, best of 3: 0.15 usec per loop u'foobarbaz_%d' % x -> 1000000 loops, best of 3: 0.424 usec per loop u('foobarbaz_%d') % x -> 1000000 loops, best of 3: 0.598 usec per loop Regards Antoine. From jsbueno at python.org.br Sun Feb 26 19:03:06 2012 From: jsbueno at python.org.br (Joao S. O. Bueno) Date: Sun, 26 Feb 2012 15:03:06 -0300 Subject: [Python-Dev] Status regarding Old vs. Advanced String Formating In-Reply-To: <20120225012039.Horde.G9vmccL8999PSClXHxABAoA@webmail.df.eu> References: <4F4803A3.7040803@v.loewis.de> <20120225012039.Horde.G9vmccL8999PSClXHxABAoA@webmail.df.eu> Message-ID: On 24 February 2012 22:20, wrote: > I find the .format syntax too complicated and difficult to learn. It has > so many bells and whistles, making it more than just a *mini* language. > So for my own code, I always prefer % formatting for simplicity. > > +1 > Regards, > Martin > -------------- next part -------------- An HTML attachment was scrubbed... URL: From hodgestar+pythondev at gmail.com Sun Feb 26 19:11:54 2012 From: hodgestar+pythondev at gmail.com (Simon Cross) Date: Sun, 26 Feb 2012 20:11:54 +0200 Subject: [Python-Dev] Proposing an alternative to PEP 410 In-Reply-To: <4F4A59FA.6040802@hastings.org> References: <4F46AF6E.2030300@hastings.org> <20120225163129.3a104cdd@resist.wooz.org> <4F4A59FA.6040802@hastings.org> Message-ID: On Sun, Feb 26, 2012 at 6:12 PM, Larry Hastings wrote: > It's probably neutral. But I do have one question: can you foresee the > scientific community moving to a finer resolution than nanoseconds in our > lifetimes? I think we're already there. Even just in radio astronomy new arrays like ALMA which operate at terahertz frequencies are looking at picosecond or possibly femtosecond timing accuracy (ALMA operates at ~1000 times higher frequency than MeerKAT so they need ~1000 times more accurate timing). E.g. http://www.guardian.co.uk/science/2012/jan/29/alma-radio-telescope-chile-astronomy Schiavo Simon From solipsis at pitrou.net Sun Feb 26 19:54:00 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sun, 26 Feb 2012 19:54:00 +0100 Subject: [Python-Dev] cpython (3.2): Issue #14123: Explicitly mention that old style % string formatting has caveats References: <20120226195010.Horde.M75bFKGZi1VPSn7ineISvFA@webmail.df.eu> Message-ID: <20120226195400.6f162e64@pitrou.net> On Sun, 26 Feb 2012 19:50:10 +0100 martin at v.loewis.de wrote: > >> - The formatting operations described here are obsolete and may go away > >> in future > >> - versions of Python. Use the new :ref:`string-formatting` in new code.
> >> + The formatting operations described here are modelled on C's printf() > >> + syntax. They only support formatting of certain builtin types. The > >> + use of a binary operator means that care may be needed in order to > >> + format tuples and dictionaries correctly. As the new > >> + :ref:`string-formatting` syntax is more flexible and handles tuples and > >> + dictionaries naturally, it is recommended for new code. However, there > >> + are no current plans to deprecate printf-style formatting. > >> > > > > Please consider just deleting the last sentence. Documentation is meant for > > users (often new users) and not core devs. As such, I just don't see what > > it adds. If the aim to to document this intent somewhere, a PEP would be a > > better place than the formal documentation. > > I'd rather leave the last sentence, and delete the penultimate sentence. > The last sentence is useful information to the end user ("we will not > deprecate printf-style formatting, so there is no need to change existing > code"). I'd drop the penultimate sentence because there is no consensus > that it is a useful recommendation (and it is certainly not a statement > of fact). It would be nice to call it something else than "printf-style formatting". While it is certainly modelled on printf(), knowledge of C or printf is not required to understand %-style formatting, nor even to appreciate it. Regards Antoine. From guido at python.org Sun Feb 26 20:02:30 2012 From: guido at python.org (Guido van Rossum) Date: Sun, 26 Feb 2012 11:02:30 -0800 Subject: [Python-Dev] Proposing an alternative to PEP 410 In-Reply-To: References: <4F46AF6E.2030300@hastings.org> <20120225163129.3a104cdd@resist.wooz.org> <4F4A59FA.6040802@hastings.org> Message-ID: On Sun, Feb 26, 2012 at 10:11 AM, Simon Cross wrote: > On Sun, Feb 26, 2012 at 6:12 PM, Larry Hastings wrote: >> It's probably neutral. ?But I do have one question: can you foresee the >> scientific community moving to a finer resolution than nanoseconds in our >> lifetimes? > > I think we're already there. Even just in radio astronomy new arrays > like ALMA which operate a terahertz frequencies are looking at > picosecond or possibly femtosecond timing accuracy (ALMA operates at > ~1000 times higher frequency than MeerKAT so they need ~1000 times > more accurate timing). > > E.g. http://www.guardian.co.uk/science/2012/jan/29/alma-radio-telescope-chile-astronomy None of that bears any relation on the precision of the timers available in the OS through Python's time and os APIs. -- --Guido van Rossum (python.org/~guido) From fuzzyman at voidspace.org.uk Sun Feb 26 20:07:18 2012 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Sun, 26 Feb 2012 19:07:18 +0000 Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 In-Reply-To: <20120226184547.1f415ad3@pitrou.net> References: <4F49434B.6050604@active-4.com> <20120226184547.1f415ad3@pitrou.net> Message-ID: <32793A9D-A031-4CC3-BFE4-9BED13AB0E97@voidspace.org.uk> On 26 Feb 2012, at 17:45, Antoine Pitrou wrote: > > Hi, > > On Sat, 25 Feb 2012 20:23:39 +0000 > Armin Ronacher wrote: >> >> I just uploaded PEP 414 which proposes am optional 'u' prefix for string >> literals for Python 3. >> >> You can read the PEP online: http://www.python.org/dev/peps/pep-0414/ > > I don't understand this sentence: > >> The automatic upgrading of binary strings to unicode strings that >> would be enabled by this proposal would make it much easier to port >> such libraries over. 
> > What "automatic upgrading" is that talking about? If you use native string syntax (no prefix) then moving from Python 2 to Python 3 automatically "upgrades" (I agree an odd choice of word) byte string literals to unicode string literals. > >> For instance, the urllib module in Python 2 is using byte strings, >> and the one in Python 3 is using unicode strings. > > Are you talking about urllib.parse perhaps? > >> By leveraging a native string, users can avoid having to adjust for >> that. > > What does "leveraging a native string" mean here? By using native string syntax (without the unicode literals future import) then apis that take a binary string in Python 2 and a unicode string in Python 3 "just work" with the same syntax. You are "leveraging" native syntax to use the same apis with different types across the different version of Python. > >> The following is an incomplete list of APIs and general concepts that >> use native strings and need implicit upgrading to unicode in Python >> 3, and which would directly benefit from this support > > I'm confused. This PEP talks about unicode literals, not native string > literals, so why would these APIs "directly benefit from this support"? Because sometimes in your code you want to specify "native strings" and sometimes you want to specify Unicode strings. There is no single *syntax* that is compatible with both Python 2 and Python 3 that permits this. (If you use "u" for Unicode in Python 2 and no prefix for native strings then your code is Python 3 incompatible, if you use the future import so that your strings are unicode in both Python 2 and Python 3 then you lose the syntax for native strings.) Michael > > Thanks > > Antoine. > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/fuzzyman%40voidspace.org.uk > -- http://www.voidspace.org.uk/ May you do good and not evil May you find forgiveness for yourself and forgive others May you share freely, never taking more than you give. -- the sqlite blessing http://www.sqlite.org/different.html From eliben at gmail.com Sun Feb 26 20:23:23 2012 From: eliben at gmail.com (Eli Bendersky) Date: Sun, 26 Feb 2012 21:23:23 +0200 Subject: [Python-Dev] cpython (3.2): Issue #14123: Explicitly mention that old style % string formatting has caveats In-Reply-To: <20120226195400.6f162e64@pitrou.net> References: <20120226195010.Horde.M75bFKGZi1VPSn7ineISvFA@webmail.df.eu> <20120226195400.6f162e64@pitrou.net> Message-ID: > It would be nice to call it something else than "printf-style > formatting". While it is certainly modelled on printf(), knowledge of C > or printf is not required to understand %-style formatting, nor even to > appreciate it. > +1. The section is already titled "old string formatting operations" so if this name is acceptable it should be reused. If it's not, it should then be consistently changed everywhere. Eli -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From tjreedy at udel.edu Sun Feb 26 20:22:56 2012 From: tjreedy at udel.edu (Terry Reedy) Date: Sun, 26 Feb 2012 14:22:56 -0500 Subject: [Python-Dev] [Python-checkins] cpython (3.2): Issue #14123: Explicitly mention that old style % string formatting has caveats In-Reply-To: <20120226195010.Horde.M75bFKGZi1VPSn7ineISvFA@webmail.df.eu> References: <20120226195010.Horde.M75bFKGZi1VPSn7ineISvFA@webmail.df.eu> Message-ID: <4F4A8690.1010801@udel.edu> On 2/26/2012 1:50 PM, martin at v.loewis.de wrote: > > Zitat von Eli Bendersky : > >>> >>> - The formatting operations described here are obsolete and may go away >>> in future >>> - versions of Python. Use the new :ref:`string-formatting` in new code. >>> + The formatting operations described here are modelled on C's printf() >>> + syntax. They only support formatting of certain builtin types. The >>> + use of a binary operator means that care may be needed in order to >>> + format tuples and dictionaries correctly. As the new >>> + :ref:`string-formatting` syntax is more flexible and handles tuples >>> and >>> + dictionaries naturally, it is recommended for new code. However, there >>> + are no current plans to deprecate printf-style formatting. >>> >> >> Please consider just deleting the last sentence. Documentation is >> meant for >> users (often new users) and not core devs. As such, I just don't see what >> it adds. If the aim to to document this intent somewhere, a PEP would >> be a >> better place than the formal documentation. > > I'd rather leave the last sentence, and delete the penultimate sentence. > The last sentence is useful information to the end user ("we will not > deprecate printf-style formatting, so there is no need to change existing > code"). I'd drop the penultimate sentence because there is no consensus > that it is a useful recommendation (and it is certainly not a statement > of fact). I agree that the 'recommendation' is subjective, even though I strongly agree with it *for new Python programmers who are not already familiar with printf style formatting*. However, that sort of nuanced recommendation goes better in a HowTo. Statements about non-deprecation are also out of place as that is the default. So I agree with both of you. Let us drop both of the last two sentences. Then we can all be happy. There is a difference between 'There are no current plans to ...' and 'We will never ...'. However, '...' should not be discussed or even proposed or even mentioned until there is a bug-free automatic converter. I think the recent rehashing was mostly a needless irritation except as it prompted a doc update. --- Terry Jan Reedy From armin.ronacher at active-4.com Sun Feb 26 21:47:39 2012 From: armin.ronacher at active-4.com (Armin Ronacher) Date: Sun, 26 Feb 2012 20:47:39 +0000 Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 In-Reply-To: <20120226184547.1f415ad3@pitrou.net> References: <4F49434B.6050604@active-4.com> <20120226184547.1f415ad3@pitrou.net> Message-ID: <4F4A9A6B.8010100@active-4.com> Hi, On 2/26/12 5:45 PM, Antoine Pitrou wrote: >> The automatic upgrading of binary strings to unicode strings that >> would be enabled by this proposal would make it much easier to port >> such libraries over. > > What "automatic upgrading" is that talking about? The word "upgrade" is probably something that should be changed. It refers to the fact that 'foo' is a bytestring in 2.x and the same syntax means a unicode string in Python 3. 
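(A quick illustrative Python 2.7 session makes the disappearance concrete; future imports take effect for subsequent statements at the interactive prompt:

    >>> type('x')
    <type 'str'>
    >>> from __future__ import unicode_literals
    >>> type('x')
    <type 'unicode'>
    >>> type(str('x'))      # the native type is now only reachable via str()
    <type 'str'>

)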
This is exactly what is necessary for interfaces that were promoted to unicode interfaces in Python 3 (for instance Python identifiers, URLs etc.) > Are you talking about urllib.parse perhaps? Not only the parsing module. Headers on the urllib.request module are unicode as well. What the PEP is referring to is the urllib/urlparse and cgi module which was largely consolidated to the urllib package in Python 3. > What does "leveraging a native string" mean here? It means by using a native string to achieve the automatic upgrading which "does the right thing" in a lot of situations. > I'm confused. This PEP talks about unicode literals, not native string > literals, so why would these APIs "directly benefit from this support"? The native string literal already exists. It disappears if `unicode_literals` are future imported which is why this is relevant since the unicode literals future import in 2.x is recommended by some for making libraries run in both 2.x and 3.x. Regards, Armin From rosuav at gmail.com Sun Feb 26 21:48:55 2012 From: rosuav at gmail.com (Chris Angelico) Date: Mon, 27 Feb 2012 07:48:55 +1100 Subject: [Python-Dev] cpython (3.2): Issue #14123: Explicitly mention that old style % string formatting has caveats In-Reply-To: <20120226195400.6f162e64@pitrou.net> References: <20120226195010.Horde.M75bFKGZi1VPSn7ineISvFA@webmail.df.eu> <20120226195400.6f162e64@pitrou.net> Message-ID: On Mon, Feb 27, 2012 at 5:54 AM, Antoine Pitrou wrote: > It would be nice to call it something else than "printf-style > formatting". While it is certainly modelled on printf(), knowledge of C > or printf is not required to understand %-style formatting, nor even to > appreciate it. -1. Calling it "printf-style" ties it in with its origins just as the term "regex" does for the 're' module. There are printf-derived features in quite a few high level languages; they may differ somewhat (Pike's sprintf() can do columnar displays; PHP's vsprintf takes an array, not some weird and mythical varargs token), but in their basics they will be similar. The name is worth keeping. Chris Angelico From barry at python.org Sun Feb 26 22:06:57 2012 From: barry at python.org (Barry Warsaw) Date: Sun, 26 Feb 2012 16:06:57 -0500 Subject: [Python-Dev] PEP 414 In-Reply-To: References: <4F49434B.6050604@active-4.com> <4F4A10C1.6040806@pearwood.info> Message-ID: <20120226160657.5e5d0bff@resist.wooz.org> On Feb 26, 2012, at 09:20 PM, Nick Coghlan wrote: >It reduces the problem (compared to omitting the import and using a >u() function), but it's still ugly and still involves the "action at a >distance" of the unicode literals import. Frankly, that doesn't bother me at all. I've been using the future import in all my code pretty successfully for a long while now. It's much more important for a project to use or not use the future import consistently, and then there really should be no confusion when looking at the code for that project. I'm not necessarily saying I'm opposed to the purpose of the PEP. I do think it's unnecessary for most Python problem domains, but can appreciate that WSGI apps are feeling a special pain here that should be addressed somehow. It would be nice however if the solution were in the form of a separate module that could be used in earlier Python versions. 
-Barry From ncoghlan at gmail.com Sun Feb 26 22:13:11 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 27 Feb 2012 07:13:11 +1000 Subject: [Python-Dev] cpython (3.2): Issue #14123: Explicitly mention that old style % string formatting has caveats In-Reply-To: References: <20120226195010.Horde.M75bFKGZi1VPSn7ineISvFA@webmail.df.eu> <20120226195400.6f162e64@pitrou.net> Message-ID: On Mon, Feb 27, 2012 at 5:23 AM, Eli Bendersky wrote: > >> It would be nice to call it something else than "printf-style >> formatting". While it is certainly modelled on printf(), knowledge of C >> or printf is not required to understand %-style formatting, nor even to >> appreciate it. > > > +1. The section is already titled "old string formatting operations" so if > this name is acceptable it should be reused. If it's not, it should then be > consistently changed everywhere. I deliberately chose printf-style as being value neutral (whereas old-style vs new-style carries a heavier recommendation that you should be using the new one). Sure you don't need to know printf to understand it, but it needs *some* kind of name, and "printf-style" acknowledges its roots. Another value-neutral term is "mod-style", which describes how it is invoked (and I believe we do use that in a few places already). I didn't actually expect that paragraph to be incorporated wholesale into the docs - it was intended as a discussion starter, not a finished product. Aside from the last two sentences, the other big problem with it is that print-style formatting *does* support formatting arbitrary objects, they're just forced to go through type coercions whereas .format() allows objects to define their own formatting specifiers (such as datetime with strftime strings). Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From brett at python.org Sun Feb 26 22:18:54 2012 From: brett at python.org (Brett Cannon) Date: Sun, 26 Feb 2012 16:18:54 -0500 Subject: [Python-Dev] [Python-checkins] cpython: Issue #14080: fix sporadic test_imp failure. Patch by Stefan Krah. In-Reply-To: References: Message-ID: On Sun, Feb 26, 2012 at 12:13, antoine.pitrou wrote: > http://hg.python.org/cpython/rev/1d7472b015f0 > changeset: 75296:1d7472b015f0 > user: Antoine Pitrou > date: Sun Feb 26 18:09:50 2012 +0100 > summary: > Issue #14080: fix sporadic test_imp failure. Patch by Stefan Krah. > > files: > Lib/test/test_imp.py | 1 + > 1 files changed, 1 insertions(+), 0 deletions(-) > > > diff --git a/Lib/test/test_imp.py b/Lib/test/test_imp.py > --- a/Lib/test/test_imp.py > +++ b/Lib/test/test_imp.py > @@ -325,6 +325,7 @@ > self.addCleanup(cleanup) > # Touch the __init__.py file. > support.create_empty_file('pep3147/__init__.py') > + importlib.invalidate_caches() > expected___file__ = os.sep.join(('.', 'pep3147', '__init__.py')) > m = __import__('pep3147') > self.assertEqual(m.__file__, expected___file__, (m.__file__, > m.__path__)) Should that just go into support.create_empty_file()? Since it's just a performance issue I don't see it causing unexpected test failures and it might help with any future issues. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From greg.ewing at canterbury.ac.nz Sun Feb 26 22:23:53 2012 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Mon, 27 Feb 2012 10:23:53 +1300 Subject: [Python-Dev] State of PEP-3118 (memoryview part) In-Reply-To: <20120226132721.GA1422@sleipnir.bytereef.org> References: <20120226132721.GA1422@sleipnir.bytereef.org> Message-ID: <4F4AA2E9.1070901@canterbury.ac.nz> Stefan Krah wrote: > Options 2) and 3) would ideally entail one backwards incompatible > bugfix: In 2.7 and 3.2 assignment to a memoryview with format 'B' > rejects integers but accepts byte objects, but according to the > struct syntax mandated by the PEP it should be the other way round. Maybe a compromise could be made to accept both in the backport? That would avoid breaking old code while allowing code that does the right thing to work. -- Greg From chrism at plope.com Sun Feb 26 22:29:26 2012 From: chrism at plope.com (Chris McDonough) Date: Sun, 26 Feb 2012 16:29:26 -0500 Subject: [Python-Dev] PEP 414 In-Reply-To: <20120226160657.5e5d0bff@resist.wooz.org> References: <4F49434B.6050604@active-4.com> <4F4A10C1.6040806@pearwood.info> <20120226160657.5e5d0bff@resist.wooz.org> Message-ID: <1330291766.12046.26.camel@thinko> On Sun, 2012-02-26 at 16:06 -0500, Barry Warsaw wrote: > On Feb 26, 2012, at 09:20 PM, Nick Coghlan wrote: > > >It reduces the problem (compared to omitting the import and using a > >u() function), but it's still ugly and still involves the "action at a > >distance" of the unicode literals import. > > Frankly, that doesn't bother me at all. I've been using the future import in > all my code pretty successfully for a long while now. It's much more > important for a project to use or not use the future import consistently, and > then there really should be no confusion when looking at the code for that > project. That's completely reasonable in a highly controlled project with relatively few highly-bought-in contributors. In projects with lots of hit-and-run contributors, though, it's more desirable to have things meet a rule of least surprise. Much of the software I work on is Python 3 compatible, but it's still used primarily on Python 2. Because most people still care primarily about Python 2, and most don't have a lot of Python 3 experience, it's extremely common to see folks submitting patches with u'' literals in them. > I'm not necessarily saying I'm opposed to the purpose of the PEP. I do think > it's unnecessary for most Python problem domains, but can appreciate that WSGI > apps are feeling a special pain here that should be addressed somehow. It > would be nice however if the solution were in the form of a separate module > that could be used in earlier Python versions. If we use the unicode_literals future import, or some other exernal module strategy, it doesn't help much with the hitnrun contributor thing, I fear. - C From cs at zip.com.au Sun Feb 26 22:39:32 2012 From: cs at zip.com.au (Cameron Simpson) Date: Mon, 27 Feb 2012 08:39:32 +1100 Subject: [Python-Dev] cpython (3.2): Issue #14123: Explicitly mention that old style % string formatting has caveats In-Reply-To: References: Message-ID: <20120226213932.GA20329@cskk.homeip.net> On 27Feb2012 07:13, Nick Coghlan wrote: | On Mon, Feb 27, 2012 at 5:23 AM, Eli Bendersky wrote: | >> It would be nice to call it something else than "printf-style | >> formatting". While it is certainly modelled on printf(), knowledge of C | >> or printf is not required to understand %-style formatting, nor even to | >> appreciate it. | > | > | > +1. 
The section is already titled "old string formatting operations" so if | > this name is acceptable it should be reused. If it's not, it should then be | > consistently changed everywhere. | | I deliberately chose printf-style as being value neutral (whereas | old-style vs new-style carries a heavier recommendation that you | should be using the new one). Sure you don't need to know printf to | understand it, but it needs *some* kind of name, and "printf-style" | acknowledges its roots. +1 here from me too: it _is_ printf in roots and several format specifiers (%d, %s etc). If you know printf you _immediately_ know a lot about what you can expect, and if you don't you know know a little about its roots. | Another value-neutral term is "mod-style", | which describes how it is invoked (and I believe we do use that in a | few places already). A -1 on "mod-style" from me. While it does use the "%" operator symbol, in no other way is it like the "mod" arithmetic operation. I think docs _should_ occasionally hint at preferred approaches. The new new formatting is a deliberate Python change. Without some rationale/editorial it flies in the face of the "one obvious way to do things" notion. It shouldn't be overdone, but neither should it be absent. Cheers, -- Cameron Simpson DoD#743 http://www.cskk.ezoshosting.com/cs/ Ignorance is preferable to error; and he is less remote from the truth who believes nothing, than he who believes what is wrong. - Thomas Jefferson From tjreedy at udel.edu Sun Feb 26 22:47:41 2012 From: tjreedy at udel.edu (Terry Reedy) Date: Sun, 26 Feb 2012 16:47:41 -0500 Subject: [Python-Dev] State of PEP-3118 (memoryview part) In-Reply-To: References: <20120226132721.GA1422@sleipnir.bytereef.org> <20120226144106.0bfd38ff@pitrou.net> Message-ID: Stefan, thank you for the massive rewrite. On 2/26/2012 11:21 AM, Paul Moore wrote: > On 26 February 2012 13:41, Antoine Pitrou wrote: >>> We have several options with regard to 2.7 and 3.2: >>> >>> 1) Won't fix. >> >> Given the extent of the rewrite, this one has my preference. > > +1 (although I'd word it as "fixed in 3.3" rather than "won't fix"). I agree with 3.3 only. My suggestion: when you close the issues, change Versions to 3.3 and Resolution to 'fixed'. On the main issue, change Type to 'enhancement'. Add a message to others saying something like "This was fixed for 3.3 in #xxxxx by rewriting and enhancing memoryview. The new version was not backported because it would either add new features, which is forbidden for bugfix releases, or would require substantial work to try to disable them without introducing bugs, without a guarantee that that would work" -- Terry Jan Reedy From g.brandl at gmx.net Sun Feb 26 22:50:41 2012 From: g.brandl at gmx.net (Georg Brandl) Date: Sun, 26 Feb 2012 22:50:41 +0100 Subject: [Python-Dev] cpython (3.2): Issue #14123: Explicitly mention that old style % string formatting has caveats In-Reply-To: References: <20120226195010.Horde.M75bFKGZi1VPSn7ineISvFA@webmail.df.eu> <20120226195400.6f162e64@pitrou.net> Message-ID: On 02/26/2012 10:13 PM, Nick Coghlan wrote: > On Mon, Feb 27, 2012 at 5:23 AM, Eli Bendersky wrote: >> >>> It would be nice to call it something else than "printf-style >>> formatting". While it is certainly modelled on printf(), knowledge of C >>> or printf is not required to understand %-style formatting, nor even to >>> appreciate it. >> >> >> +1. The section is already titled "old string formatting operations" so if >> this name is acceptable it should be reused. 
If it's not, it should then be >> consistently changed everywhere. > > I deliberately chose printf-style as being value neutral (whereas > old-style vs new-style carries a heavier recommendation that you > should be using the new one). Sure you don't need to know printf to > understand it, but it needs *some* kind of name, and "printf-style" > acknowledges its roots. Another value-neutral term is "mod-style", > which describes how it is invoked (and I believe we do use that in a > few places already). I've seen "percent-formatting", which is neutral, accurate and doesn't require any previous knowledge. (The new one could be "format-formatting" then, which is a tad awkward. :) Georg From solipsis at pitrou.net Sun Feb 26 22:56:15 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sun, 26 Feb 2012 22:56:15 +0100 Subject: [Python-Dev] State of PEP-3118 (memoryview part) References: <20120226132721.GA1422@sleipnir.bytereef.org> Message-ID: <20120226225615.7100e78e@pitrou.net> On Sun, 26 Feb 2012 14:27:21 +0100 Stefan Krah wrote: > State of PEP-3118 (memoryview part) > > Hello, > > In Python 3.3 most issues with the memoryview object have been fixed > in a recent commit (3f9b3b6f7ff0). Oh and congrats for doing this, of course. Regards Antoine. From solipsis at pitrou.net Sun Feb 26 23:00:06 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sun, 26 Feb 2012 23:00:06 +0100 Subject: [Python-Dev] cpython: Issue #14080: fix sporadic test_imp failure. Patch by Stefan Krah. References: Message-ID: <20120226230006.0f5fea49@pitrou.net> On Sun, 26 Feb 2012 16:18:54 -0500 Brett Cannon wrote: > > > > diff --git a/Lib/test/test_imp.py b/Lib/test/test_imp.py > > --- a/Lib/test/test_imp.py > > +++ b/Lib/test/test_imp.py > > @@ -325,6 +325,7 @@ > > self.addCleanup(cleanup) > > # Touch the __init__.py file. > > support.create_empty_file('pep3147/__init__.py') > > + importlib.invalidate_caches() > > expected___file__ = os.sep.join(('.', 'pep3147', '__init__.py')) > > m = __import__('pep3147') > > self.assertEqual(m.__file__, expected___file__, (m.__file__, > > m.__path__)) > > > Should that just go into support.create_empty_file()? Since it's just a > performance issue I don't see it causing unexpected test failures and it > might help with any future issues. I don't think adding import-specific workarounds in create_empty_file() is a very good idea. (I'm also not sure why that function exists) Regards Antoine. From ncoghlan at gmail.com Sun Feb 26 23:38:20 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 27 Feb 2012 08:38:20 +1000 Subject: [Python-Dev] cpython (3.2): Issue #14123: Explicitly mention that old style % string formatting has caveats In-Reply-To: References: <20120226195010.Horde.M75bFKGZi1VPSn7ineISvFA@webmail.df.eu> <20120226195400.6f162e64@pitrou.net> Message-ID: Ah, thanks, I knew there was another term that had a new-style counterpart: percent formatting vs brace formatting. -- Sent from my phone, thus the relative brevity :) On Feb 27, 2012 7:53 AM, "Georg Brandl" wrote: > On 02/26/2012 10:13 PM, Nick Coghlan wrote: > >> On Mon, Feb 27, 2012 at 5:23 AM, Eli Bendersky wrote: >> >>> >>> It would be nice to call it something else than "printf-style >>>> formatting". While it is certainly modelled on printf(), knowledge of C >>>> or printf is not required to understand %-style formatting, nor even to >>>> appreciate it. >>>> >>> >>> >>> +1. The section is already titled "old string formatting operations" so >>> if >>> this name is acceptable it should be reused. 
If it's not, it should >>> then be >>> consistently changed everywhere. >>> >> >> I deliberately chose printf-style as being value neutral (whereas >> old-style vs new-style carries a heavier recommendation that you >> should be using the new one). Sure you don't need to know printf to >> understand it, but it needs *some* kind of name, and "printf-style" >> acknowledges its roots. Another value-neutral term is "mod-style", >> which describes how it is invoked (and I believe we do use that in a >> few places already). >> > > I've seen "percent-formatting", which is neutral, accurate and doesn't > require any previous knowledge. (The new one could be "format-formatting" > then, which is a tad awkward. :) > > Georg > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/ncoghlan%40gmail.com From brett at python.org Sun Feb 26 23:44:29 2012 From: brett at python.org (Brett Cannon) Date: Sun, 26 Feb 2012 17:44:29 -0500 Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 In-Reply-To: References: <4F49434B.6050604@active-4.com> Message-ID: On Sat, Feb 25, 2012 at 22:13, Guido van Rossum wrote: > If this can encourage more projects to support Python 3 (even if it's > only 3.3 and later) and hence improve adoption of Python 3, I'm all > for it. > > +1 from me for the same reasons. If this were to go in then for Python 3.3 the section of the porting HOWTO on what to do when you support Python 2.6 and later ( http://docs.python.org/howto/pyporting.html#python-2-3-compatible-source) would change to:

* Use ``from __future__ import print_function`` OR use ``print(x)`` but always with a single argument OR use six
* Use ``from __future__ import unicode_literals`` OR make sure to use the 'u' prefix for all Unicode strings (and then mention the concept of native strings) OR use six
* Use the 'b' prefix for byte literals OR use six

All understandable, and with either a __future__ import solution or a syntactic solution for every issue, giving people the choice of whichever approach they prefer for each issue. I would also be willing to move the Python 2/3 compatible source section to the top, and thus implicitly make it the preferred way to port, since people in the community have seemingly been gravitating towards that approach even without this help. -Brett A small quibble: I'd like to see a benchmark of a 'u' function implemented > in C. > > --Guido > > On Sat, Feb 25, 2012 at 12:23 PM, Armin Ronacher > wrote: > > Hi, > > > > I just uploaded PEP 414 which proposes an optional 'u' prefix for string > > literals for Python 3. > > > > You can read the PEP online: http://www.python.org/dev/peps/pep-0414/ > > > > This is a followup to the discussion about this topic here on the > > mailinglist and on twitter/IRC over the last few weeks.
> > > > > > Regards, > > Armin > > _______________________________________________ > > Python-Dev mailing list > > Python-Dev at python.org > > http://mail.python.org/mailman/listinfo/python-dev > > Unsubscribe: > http://mail.python.org/mailman/options/python-dev/guido%40python.org > > > > -- > --Guido van Rossum (python.org/~guido) > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > http://mail.python.org/mailman/options/python-dev/brett%40python.org > -------------- next part -------------- An HTML attachment was scrubbed... URL: From vinay_sajip at yahoo.co.uk Mon Feb 27 00:06:38 2012 From: vinay_sajip at yahoo.co.uk (Vinay Sajip) Date: Sun, 26 Feb 2012 23:06:38 +0000 (UTC) Subject: [Python-Dev] PEP 414 References: <4F49434B.6050604@active-4.com> <4F4A10C1.6040806@pearwood.info> <20120226160657.5e5d0bff@resist.wooz.org> <1330291766.12046.26.camel@thinko> Message-ID: Chris McDonough plope.com> writes: > If we use the unicode_literals future import, or some other exernal > module strategy, it doesn't help much with the hitnrun contributor > thing, I fear. Surely some curating of hit-and-run contributions takes place? If you accept contributions from hit-and-run contributors without changes, ISTM that could compromise the quality of the codebase somewhat. Also, is not the overall impact on the codebase of hit-and-run contributors small compared to more the impact from involved contributors? Regards, Vinay Sajip From barry at python.org Mon Feb 27 00:07:58 2012 From: barry at python.org (Barry Warsaw) Date: Sun, 26 Feb 2012 18:07:58 -0500 Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 In-Reply-To: References: <4F49434B.6050604@active-4.com> Message-ID: <20120226180758.78e9a779@resist.wooz.org> On Feb 26, 2012, at 05:44 PM, Brett Cannon wrote: >On Sat, Feb 25, 2012 at 22:13, Guido van Rossum wrote: > >> If this can encourage more projects to support Python 3 (even if it's >> only 3.3 and later) and hence improve adoption of Python 3, I'm all >> for it. >> >> >+1 from me for the same reasons. Just to be clear, I'm solidly +1 on anything we can do to increase the pace of Python 3 migration. -Barry From tjreedy at udel.edu Mon Feb 27 00:14:52 2012 From: tjreedy at udel.edu (Terry Reedy) Date: Sun, 26 Feb 2012 18:14:52 -0500 Subject: [Python-Dev] cpython (3.2): Issue #14123: Explicitly mention that old style % string formatting has caveats In-Reply-To: References: <20120226195010.Horde.M75bFKGZi1VPSn7ineISvFA@webmail.df.eu> <20120226195400.6f162e64@pitrou.net> Message-ID: On 2/26/2012 5:38 PM, Nick Coghlan wrote: > Ah, thanks, I knew there was another term that had a new-style > counterpart: percent formatting vs brace formatting. Hooray! Exact parallel and value-neutral. -- Terry Jan Reedy From guido at python.org Mon Feb 27 00:33:51 2012 From: guido at python.org (Guido van Rossum) Date: Sun, 26 Feb 2012 15:33:51 -0800 Subject: [Python-Dev] cpython (3.2): Issue #14123: Explicitly mention that old style % string formatting has caveats In-Reply-To: References: <20120226195010.Horde.M75bFKGZi1VPSn7ineISvFA@webmail.df.eu> <20120226195400.6f162e64@pitrou.net> Message-ID: On Sun, Feb 26, 2012 at 3:14 PM, Terry Reedy wrote: > On 2/26/2012 5:38 PM, Nick Coghlan wrote: >> >> Ah, thanks, I knew there was another term that had a new-style >> counterpart: percent formatting vs brace formatting. > > Hooray! > Exact parallel and value-neutral. 
Can we stop it with the "political correctness" already? The old style is best named printf-style formatting because that's the origin of the format language, and there are many other programming languages that support the same formatting language (with minor variations). I care less about what we call the new style -- "new style" or "format method" both work for me. I also would like to suggest that, even if the reality is that we can't deprecate it today, *eventually*, at *some* *distant* point in the future we ought to start deprecating printf-style formatting -- it really does have a couple of nasty traps that keep catching people unawares. In the mean time it doesn't hurt to use terms that make people ever so slightly uneasy with using the old style for new code, while also committing to not throwing it out until Python 4 comes around. That said, for consistency's sake, if you add formatting code to an existing module that uses the old style, please stick to the old style. And to avoid disasters, also please don't go on a library-wide rampage of wholesale conversions. The time to start using the new formatting is when writing new modules or packages, or possibly when doing a major refactoring/upgrade of an existing module or package. One thing I'd like to see happening regardless is support for new-style formatting in the logging module. It's a little tricky to think how that would work, alas -- should this be a property of the logger or of the call? -- --Guido van Rossum (python.org/~guido) From chrism at plope.com Mon Feb 27 00:37:52 2012 From: chrism at plope.com (Chris McDonough) Date: Sun, 26 Feb 2012 18:37:52 -0500 Subject: [Python-Dev] PEP 414 In-Reply-To: References: <4F49434B.6050604@active-4.com> <4F4A10C1.6040806@pearwood.info> <20120226160657.5e5d0bff@resist.wooz.org> <1330291766.12046.26.camel@thinko> Message-ID: <1330299472.12046.32.camel@thinko> On Sun, 2012-02-26 at 23:06 +0000, Vinay Sajip wrote: > Chris McDonough plope.com> writes: > > > If we use the unicode_literals future import, or some other exernal > > module strategy, it doesn't help much with the hitnrun contributor > > thing, I fear. > > Surely some curating of hit-and-run contributions takes place? If you accept > contributions from hit-and-run contributors without changes, ISTM that could > compromise the quality of the codebase somewhat. Nah. Real developers just accept all pull requests and let god sort it out. ;-) But seriously, the less time it takes me to review and fix a pull request from a casual contributor, the better. - C From cs at zip.com.au Mon Feb 27 00:44:57 2012 From: cs at zip.com.au (Cameron Simpson) Date: Mon, 27 Feb 2012 10:44:57 +1100 Subject: [Python-Dev] cpython (3.2): Issue #14123: Explicitly mention that old style % string formatting has caveats In-Reply-To: References: Message-ID: <20120226234456.GA6542@cskk.homeip.net> On 26Feb2012 15:33, Guido van Rossum wrote: | One thing I'd like to see happening regardless is support for | new-style formatting in the logging module. It's a little tricky to | think how that would work, alas -- should this be a property of the | logger or of the call? Surely the call? The caller doesn't necessarily know anything about the loggers in play, so if a logger sprouts different message formating syntax the caller is hosed. -- Cameron Simpson DoD#743 http://www.cskk.ezoshosting.com/cs/ A Master is someone who started before you did. 
- Gary Zukav From breamoreboy at yahoo.co.uk Mon Feb 27 01:41:51 2012 From: breamoreboy at yahoo.co.uk (Mark Lawrence) Date: Mon, 27 Feb 2012 00:41:51 +0000 Subject: [Python-Dev] cpython (3.2): Issue #14123: Explicitly mention that old style % string formatting has caveats In-Reply-To: References: <20120226195010.Horde.M75bFKGZi1VPSn7ineISvFA@webmail.df.eu> <20120226195400.6f162e64@pitrou.net> Message-ID: On 26/02/2012 23:33, Guido van Rossum wrote: > On Sun, Feb 26, 2012 at 3:14 PM, Terry Reedy wrote: >> On 2/26/2012 5:38 PM, Nick Coghlan wrote: >>> >>> Ah, thanks, I knew there was another term that had a new-style >>> counterpart: percent formatting vs brace formatting. >> >> Hooray! >> Exact parallel and value-neutral. > > Can we stop it with the "political correctness" already? The old style > is best named printf-style formatting because that's the origin of the > format language, and there are many other programming languages that > support the same formatting language (with minor variations). I care > less about what we call the new style -- "new style" or "format > method" both work for me. > > I also would like to suggest that, even if the reality is that we > can't deprecate it today, *eventually*, at *some* *distant* point in > the future we ought to start deprecating printf-style formatting -- it > really does have a couple of nasty traps that keep catching people > unawares. In the mean time it doesn't hurt to use terms that make > people ever so slightly uneasy with using the old style for new code, > while also committing to not throwing it out until Python 4 comes > around. > > That said, for consistency's sake, if you add formatting code to an > existing module that uses the old style, please stick to the old > style. And to avoid disasters, also please don't go on a library-wide > rampage of wholesale conversions. The time to start using the new > formatting is when writing new modules or packages, or possibly when > doing a major refactoring/upgrade of an existing module or package. > > One thing I'd like to see happening regardless is support for > new-style formatting in the logging module. It's a little tricky to > think how that would work, alas -- should this be a property of the > logger or of the call? > Just thinking out loud that a tool along the lines of 2to3 aimed specifically at changing string formatting would be some encouragement for people to switch. Maybe a project for someone being looked after on the mentors ml? -- Cheers. Mark Lawrence. From larry at hastings.org Mon Feb 27 01:44:50 2012 From: larry at hastings.org (Larry Hastings) Date: Sun, 26 Feb 2012 16:44:50 -0800 Subject: [Python-Dev] cpython (3.2): Issue #14123: Explicitly mention that old style % string formatting has caveats In-Reply-To: References: <20120226195010.Horde.M75bFKGZi1VPSn7ineISvFA@webmail.df.eu> <20120226195400.6f162e64@pitrou.net> Message-ID: <4F4AD202.8080801@hastings.org> On 02/26/2012 03:33 PM, Guido van Rossum wrote: > One thing I'd like to see happening regardless is support for > new-style formatting in the logging module. It's a little tricky to > think how that would work, alas -- should this be a property of the > logger or of the call? There already is some support. logging.Formatter objects can be initialized with a "style" parameter, making this a property of the logger. (New in 3.2.) http://docs.python.org/py3k/library/logging.html#formatter-objects Is that what you had in mind? 
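A minimal sketch of the Formatter ``style`` parameter in action (the format string, logger name and log message below are invented for illustration, not taken from the thread):

    import logging

    handler = logging.StreamHandler()
    # style="{" tells the Formatter its fmt string uses str.format fields
    # (new in Python 3.2); "%" is the default, "$" selects string.Template.
    handler.setFormatter(
        logging.Formatter("{asctime} {name} {levelname}: {message}", style="{"))

    logger = logging.getLogger("demo")
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)

    # The style only controls how the record is composed into output;
    # the event message itself is still %-merged lazily by logging.
    logger.info("user %s logged in", "alice")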
//arry/ From greg.ewing at canterbury.ac.nz Sun Feb 26 22:04:28 2012 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Mon, 27 Feb 2012 10:04:28 +1300 Subject: [Python-Dev] PEP 414 In-Reply-To: References: <4F49434B.6050604@active-4.com> <4F4A26F3.6080801@nedbatchelder.com> Message-ID: <4F4A9E5C.1080402@canterbury.ac.nz> Nick Coghlan wrote: > Armin's straw poll was actually about whether or not people used the > future import for division, rather than unicode literals. It is indeed > the same problem There are differences, though. Personally I'm very glad of the division import -- it's the only thing that keeps me sane when using floats. The alternative is not only butt-ugly but imposes an annoying performance penalty. I don't mind occasionally needing to glance at the top of a module in order to get the benefits. On the other hand, it's not much of a burden to put 'u' in front of string literals, and there is no performance difference. -- Greg From benjamin at python.org Mon Feb 27 02:30:10 2012 From: benjamin at python.org (Benjamin Peterson) Date: Sun, 26 Feb 2012 20:30:10 -0500 Subject: [Python-Dev] PEP 415: Implementing PEP 409 differently Message-ID:

PEP: 415
Title: Implementing PEP 409 differently
Version: $Revision$
Last-Modified: $Date$
Author: Benjamin Peterson
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 26-Feb-2012
Post-History: 26-Feb-2012

Abstract
========

PEP 409 allows PEP 3134 exception contexts and causes to be suppressed when the exception is printed. This is done using the ``raise exc from None`` syntax. This PEP proposes to implement context and cause suppression differently.

Rationale
=========

PEP 409 changes ``__cause__`` to be ``Ellipsis`` by default. Then if ``__cause__`` is set to ``None`` by ``raise exc from None``, no context or cause will be printed should the exception be uncaught.

The main problem with this scheme is that it complicates the role of ``__cause__``. ``__cause__`` should indicate the cause of the exception, not whether ``__context__`` should be printed or not. This use of ``__cause__`` is also not easily extended in the future. For example, we may someday want to allow the programmer to select which of ``__context__`` and ``__cause__`` will be printed. The PEP 409 implementation is not amenable to this.

The use of ``Ellipsis`` is a hack. Before PEP 409, ``Ellipsis`` was used exclusively in extended slicing. Extended slicing has nothing to do with exceptions, so it's not clear to someone inspecting an exception object why ``__cause__`` should be set to ``Ellipsis``. Using ``Ellipsis`` by default for ``__cause__`` makes it asymmetrical with ``__context__``.

Proposal
========

A new attribute on ``BaseException``, ``__suppress_context__``, will be introduced. The ``raise exc from None`` syntax will cause ``exc.__suppress_context__`` to be set to ``True``. Exception printing code will check for the attribute to determine whether context and cause will be printed. ``__cause__`` will return to its original purpose and values.

There is precedent for ``__suppress_context__`` with the ``print_line_and_file`` exception attribute.

Patches
=======

There is a patch on `Issue 14133`_.

References
==========

.. _issue 14133: http://bugs.python.org/issue14133

Copyright
=========

This document has been placed in the public domain.
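A short sketch of the behaviour proposed above, assuming an interpreter with the Issue 14133 patch applied (the exception types and message are invented for illustration):

    try:
        try:
            1 / 0
        except ZeroDivisionError:
            # "raise ... from None" sets __suppress_context__ to True, so
            # the ZeroDivisionError no longer appears in the traceback.
            raise KeyError('lookup failed') from None
    except KeyError as exc:
        assert exc.__cause__ is None       # __cause__ keeps its normal meaning
        assert exc.__suppress_context__    # only the *printing* is suppressed
        assert isinstance(exc.__context__, ZeroDivisionError)  # still recorded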
From tjreedy at udel.edu Mon Feb 27 02:55:24 2012 From: tjreedy at udel.edu (Terry Reedy) Date: Sun, 26 Feb 2012 20:55:24 -0500 Subject: [Python-Dev] PEP 414 In-Reply-To: <4F4A29BD.2090607@active-4.com> References: <4F49434B.6050604@active-4.com> <4F4A10C1.6040806@pearwood.info> <4F4A29BD.2090607@active-4.com> Message-ID: On 2/26/2012 7:46 AM, Armin Ronacher wrote: I am not enthusiastic about adding duplication that is useless for writing Python 3 code, but like others, I do want to encourage more porting of libraries to run with Python 3. I understand that the unicode transition seems the be the biggest barrier, especially for some applications. It is OK with me if ported code only runs on 3.3+, with its improved unicode. If u'' is added, I would like it to be added as deprecated in the doc with a note that it is only intended for multi-version Python 2/3 code. > In case this PEP gets approved I will refactor the tokenize module while > adding support for "u" prefixes and use that as the basis for a > installation hook for older Python 3 versions. I presume such a hook would simply remove 'u' prefixes and would run *much* faster than 2to3. If such a hook is satisfactory for 3.2, why would it not be satisfactory for 3.3? -- Terry Jan Reedy From ncoghlan at gmail.com Mon Feb 27 02:59:45 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 27 Feb 2012 11:59:45 +1000 Subject: [Python-Dev] New-style formatting in the logging module (was Re: cpython (3.2): Issue #14123: Explicitly mention that old style % string formatting has caveats) Message-ID: On Mon, Feb 27, 2012 at 10:44 AM, Larry Hastings wrote: > On 02/26/2012 03:33 PM, Guido van Rossum wrote: >> >> One thing I'd like to see happening regardless is support for >> new-style formatting in the logging module. It's a little tricky to >> think how that would work, alas -- should this be a property of the >> logger or of the call? > > > There already is some support. ?logging.Formatter objects can be initialized > with a "style" parameter, making this a property of the logger. ?(New in > 3.2.) > > ? http://docs.python.org/py3k/library/logging.html#formatter-objects > > Is that what you had in mind? It's half the puzzle (since composing the event fields into the actual log output is a logger action, you know the style when you supply the format string). The other half is that logging's lazy formatting currently only supporting printf-style format strings - to use brace formatting you currently have to preformat the messages, so you incur the formatting cost even if the message gets filtered out by the logging configuration. For that, a logger setting doesn't work, since one logger may be shared amongst multiple modules, some of which may use printf formatting, others brace formatting. It could possibly be a flag to getLogger() though - provide a facade on the existing logger type that sets an additional event property to specify the lazy formatting style. Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From ncoghlan at gmail.com Mon Feb 27 03:15:01 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 27 Feb 2012 12:15:01 +1000 Subject: [Python-Dev] PEP 415: Implementing PEP 409 differently In-Reply-To: References: Message-ID: Thanks for writing that up. 
I'd be amenable if the PEP was clearly updated to say that ``raise exc from cause`` would change from being syntactic sugar for ``_hidden = exc; _hidden.__cause__ = cause; raise exc`` (as it is now) to ``_hidden = exc; _hidden.__cause__ = cause; _hidden.__suppress_context__ = True; raise exc``. The patch should then be implemented accordingly (including appropriate updates to the language reference). I previously didn't like this solution because I thought it would require a special case for None in the syntax expansion, but I have now realised that isn't the case (since the new flag can be set unconditionally regardless of the value assigned to __cause__). Code that wants to display both __cause__ and __context__ can then just either set __cause__ directly, or else later switch off the context suppression. Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From ncoghlan at gmail.com Mon Feb 27 03:17:43 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 27 Feb 2012 12:17:43 +1000 Subject: [Python-Dev] PEP 414 In-Reply-To: References: <4F49434B.6050604@active-4.com> <4F4A10C1.6040806@pearwood.info> <4F4A29BD.2090607@active-4.com> Message-ID: On Mon, Feb 27, 2012 at 11:55 AM, Terry Reedy wrote: > I presume such a hook would simply remove 'u' prefixes and would run *much* > faster than 2to3. If such a hook is satisfactory for 3.2, why would it not > be satisfactory for 3.3? Because an import hook is still a lot more complicated than "Write modern code that runs on 2.6+ and follows certain guidelines and it will also just run on 3.3+". Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From ncoghlan at gmail.com Mon Feb 27 03:24:44 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 27 Feb 2012 12:24:44 +1000 Subject: [Python-Dev] cpython: Close issue #6210: Implement PEP 409 In-Reply-To: <20120226145421.6bff8bc7@pitrou.net> References: <20120226145421.6bff8bc7@pitrou.net> Message-ID: On Sun, Feb 26, 2012 at 11:54 PM, Antoine Pitrou wrote: >> + ? ?def prepare_subprocess(): >> + ? ? ? ?# don't create core file >> + ? ? ? ?try: >> + ? ? ? ? ? ?setrlimit(RLIMIT_CORE, (0, 0)) >> + ? ? ? ?except (ValueError, resource_error): >> + ? ? ? ? ? ?pass > > Really? This sounds quite wrong, but it should *at least* explain > why a test of the "raise" statement would produce a core file! > (but I think you should consider removing this part) I managed to convince myself before checking it in that a bunch of the weirdness in Ethan's subprocess test made sense, but I think I was just wrong about that (I certainly can't come up with a sane rationalisation now). Assigned a bug to myself to fix it: http://bugs.python.org/issue14136 Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From fredrik at haard.se Mon Feb 27 10:03:38 2012 From: fredrik at haard.se (=?UTF-8?B?RnJlZHJpayBIw6XDpXJk?=) Date: Mon, 27 Feb 2012 10:03:38 +0100 Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 In-Reply-To: <20120226180758.78e9a779@resist.wooz.org> References: <4F49434B.6050604@active-4.com> <20120226180758.78e9a779@resist.wooz.org> Message-ID: 2012/2/27 Barry Warsaw > On Feb 26, 2012, at 05:44 PM, Brett Cannon wrote: > > >On Sat, Feb 25, 2012 at 22:13, Guido van Rossum wrote: > > > >> If this can encourage more projects to support Python 3 (even if it's > >> only 3.3 and later) and hence improve adoption of Python 3, I'm all > >> for it. > >> > >> > >+1 from me for the same reasons. 
> > Just to be clear, I'm solidly +1 on anything we can do to increase the > pace of > Python 3 migration. > +1 I think this is a great proposal that has the potential to remove one of the (for me at least, _the_) main obstacles to writing code compatible with both 2.7 and 3.x. -- /f I reject your reality and substitute my own. http://blaag.haard.se -------------- next part -------------- An HTML attachment was scrubbed... URL: From vinay_sajip at yahoo.co.uk Mon Feb 27 10:24:46 2012 From: vinay_sajip at yahoo.co.uk (Vinay Sajip) Date: Mon, 27 Feb 2012 09:24:46 +0000 (UTC) Subject: [Python-Dev] New-style formatting in the logging module (was Re: cpython (3.2): Issue #14123: Explicitly mention that old style % string formatting has caveats) References: Message-ID: Nick Coghlan gmail.com> writes: > It's half the puzzle (since composing the event fields into the actual > log output is a logger action, you know the style when you supply the > format string). The other half is that logging's lazy formatting > currently only supporting printf-style format strings - to use brace > formatting you currently have to preformat the messages, so you incur > the formatting cost even if the message gets filtered out by the > logging configuration. That isn't necessarily true. Lazy formatting can work for {} and $ formatting types, not just %-formatting: see http://plumberjack.blogspot.com/2010/10/supporting-alternative-formatting.html Composing the event fields into the message is done by the LogRecord, which calls str() on the object passed as the format string to get the actual format string. This allows you to use any of the standard formatting schemes and still take advantage of lazy formatting, as outlined in the above post. Although style support for Formatters is new, that's really for merging the logging event message into the overall log output (with time, severity etc.) - the support for having your own way of formatting the event message has always been there, even before str.format :-) The Formatter style functionality is also available for 2.x through a separate logutils project which I maintain and which contains features which were added to logging in 3.2 such as QueueHandler/QueueListener: http://pypi.python.org/pypi/logutils/ I will add a section in the logging cookbook about support for alternative formatting styles. I thought I already had, but on inspection, it appears not to be the case. Regards, Vinay Sajip From martin at v.loewis.de Mon Feb 27 11:08:52 2012 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon, 27 Feb 2012 11:08:52 +0100 Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 In-Reply-To: References: <4F49434B.6050604@active-4.com> Message-ID: <4F4B5634.3020609@v.loewis.de> Am 26.02.2012 07:06, schrieb Nick Coghlan: > On Sun, Feb 26, 2012 at 1:13 PM, Guido van Rossum wrote: >> A small quibble: I'd like to see a benchmark of a 'u' function implemented in C. > > Even if it was quite fast, I don't think such a function would bring > the same benefits as restoring support for u'' literals. You claim that, but your argument doesn't actually support that claim (or I fail to see the argument). > > Using myself as an example, my work projects (such as PulpDist [1]) > are currently written to target Python 2.6, since that's the system > Python on RHEL 6. 
As a web application, PulpDist has unicode literals > *everywhere*, but (as Armin pointed out to me), turning on "from > __future__ import unicode_literals" in every file would be incorrect, Right. So you shouldn't use the __future__ import, but the u() function. > IIRC, I've previously opposed the restoration of unicode literals as a > retrograde step. Looking at the implications for the future migration > of PulpDist has changed my mind. Did you try to follow the path of the u() function? Regards, Martin From martin at v.loewis.de Mon Feb 27 11:10:25 2012 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon, 27 Feb 2012 11:10:25 +0100 Subject: [Python-Dev] PEP 414 In-Reply-To: <1330291766.12046.26.camel@thinko> References: <4F49434B.6050604@active-4.com> <4F4A10C1.6040806@pearwood.info> <20120226160657.5e5d0bff@resist.wooz.org> <1330291766.12046.26.camel@thinko> Message-ID: <4F4B5691.2090300@v.loewis.de> > Much of the software I work on is Python 3 compatible, but it's still > used primarily on Python 2. Because most people still care primarily > about Python 2, and most don't have a lot of Python 3 experience, it's > extremely common to see folks submitting patches with u'' literals in > them. These can be easily fixed, right? Regards, Martin From martin at v.loewis.de Mon Feb 27 11:17:43 2012 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon, 27 Feb 2012 11:17:43 +0100 Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 In-Reply-To: <4F4A28CD.5070903@active-4.com> References: <4F49434B.6050604@active-4.com> <4F4A28CD.5070903@active-4.com> Message-ID: <4F4B5847.1040107@v.loewis.de> >> There are no significant overhead to use converters. > That's because what you're benchmarking here more than anything is the > overhead of eval() :-) See the benchmark linked in the PEP for one that > measures the actual performance of the string literal / wrapper. There are a few other unproven performance claims in the PEP. Can you kindly provide the benchmarks you have been using? In particular, I'm interested in the claim " In many cases 2to3 runs one or two orders of magnitude slower than the testsuite for the library or application it's testing." Regards, Martin From martin at v.loewis.de Mon Feb 27 11:21:16 2012 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon, 27 Feb 2012 11:21:16 +0100 Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 In-Reply-To: <20120226180758.78e9a779@resist.wooz.org> References: <4F49434B.6050604@active-4.com> <20120226180758.78e9a779@resist.wooz.org> Message-ID: <4F4B591C.90200@v.loewis.de> Am 27.02.2012 00:07, schrieb Barry Warsaw: > On Feb 26, 2012, at 05:44 PM, Brett Cannon wrote: > >> On Sat, Feb 25, 2012 at 22:13, Guido van Rossum wrote: >> >>> If this can encourage more projects to support Python 3 (even if it's >>> only 3.3 and later) and hence improve adoption of Python 3, I'm all >>> for it. >>> >>> >> +1 from me for the same reasons. > > Just to be clear, I'm solidly +1 on anything we can do to increase the pace of > Python 3 migration. I find this rationale a bit sad: it's not that there is any (IMO) good technical reason for the feature - only that people "hate" the many available alternatives for some reason. But then, practicality beats purity, so be it. 
Regards, Martin From merwok at netwok.org Mon Feb 27 11:40:08 2012 From: merwok at netwok.org (=?UTF-8?B?w4lyaWMgQXJhdWpv?=) Date: Mon, 27 Feb 2012 11:40:08 +0100 Subject: [Python-Dev] Marking packaging-related PEPs as Finished after fixing some bugs in them Message-ID: <4F4B5D88.2010804@netwok.org> Hello, The three packaging-related PEPs that were written by the distutils SIG and approved two years ago are still marked as Accepted, not Finished: SA 345 Metadata for Python Software Packages 1.2 Jones SA 376 Database of Installed Python Distributions Ziad? SA 386 Changing the version comparison module in Distutils Ziad? They?re all implemented in packaging/distutils2. Sadly, all of them have rather serious issues, so I wanted to ask what the process should be to solve the problems and mark the PEPs final. In PEP 345, legal values for project names are not defined, so for example ?Foo (>=0.5)? could legally be a project named Foo without version predicate, or a project named ?Foo (>=0.5)?. I don?t have a solution to propose, as I did not run a poll or check existing projects. Second, the Provides-Dist allows values like ?Name (<= version)?, which does not make sense: it should allow only unversioned names and names with one version (without operators). Finally, some of the new metadata fields are also misnamed, namely the ones ending in -Dist like Requires-Dist, whose value is a release (i.e. a name + optional version specifier), not a distribution (i.e. a specific archive or installer for a release), and therefore should be Requires-Release or something prettier. (Remember that it cannot be just Requires, which is taken by PEP 314, contains module names instead of project names, and is not used by anyone TTBOMK.) packaging.database, which implements PEP 376, has a few known bugs; I don?t know if that should prevent the PEP from being marked Finished. It could be that finishing the implementation shows issues in the PEP, like for the other two. In PEP 386, the rule that versions using an 'rc' marker should sort after 'c' is buggy: I don?t think anyone will disagree that 1.0rc1 == 1.0c1 and 1.0rc1 < 1.0c2. The 'rc' marker was added by Tarek shortly before the PEP was accepted (see http://mail.python.org/pipermail/python-dev/2010-January/097041.html), and was not discussed. My preferred solution would be to accept 'rc' when parsing but then always use 'c' internally and in all output (file names, METADATA file, etc.). If it is judged important that projects be able to use 'rc' and see it in output, then I?ll have to add special cases in __eq__ and __hash__ methods, which is feasible if inelegant. (Issues found by Josip Djolonga, Alexis M?taireau and I.) Cheers From g.rodola at gmail.com Mon Feb 27 12:34:55 2012 From: g.rodola at gmail.com (=?ISO-8859-1?Q?Giampaolo_Rodol=E0?=) Date: Mon, 27 Feb 2012 12:34:55 +0100 Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 In-Reply-To: <4F49434B.6050604@active-4.com> References: <4F49434B.6050604@active-4.com> Message-ID: Il 25 febbraio 2012 21:23, Armin Ronacher ha scritto: > Hi, > > I just uploaded PEP 414 which proposes am optional 'u' prefix for string > literals for Python 3. > > You can read the PEP online: http://www.python.org/dev/peps/pep-0414/ > > This is a followup to the discussion about this topic here on the > mailinglist and on twitter/IRC over the last few weeks. > > > Regards, > Armin If the main point of this proposal is avoiding an explicit 2to3 run on account of 2to3 being too slow then I'm -1. 
That should be fixed at 2to3 level, not at python syntax level. A common strategy to distribute code able to run on both python 2 and python 3 is using the following hack in setup.py: http://docs.python.org/dev/howto/pyporting.html#during-installation That's what I used in psutil and it works just fine. Also, I believe it's the *right* strategy as it lets you freely write python 2 code and avoid using ugly hacks such as "sys.exc_info()[1]" and "if PY3: ..." all around the place. 2to3 might be slow but introducing workarounds encouraging not to use it is only going to cause a proliferation of ugly and hackish code in the python ecosystem. Now, psutil is a relatively small project and the 2to3 conversion doesn't take much time. Having users "unawarely" run 2to3 at installation time is an acceptable burden in terms of speed. That's going to be different on larger code bases such as Twisted's. One way to fix that might be making 2to3 generate and rely on a "2to3.diff" file containing all the differences. That would be generated the first time "python setup.py build/install" is run and then partially re-calculated every time a file is modified. Third-party library vendors can include 2to3.diff as part of the tarball they distribute so that the end user won't experience any slow down deriving from the 2to3 conversion. --- Giampaolo http://code.google.com/p/pyftpdlib/ http://code.google.com/p/psutil/ http://code.google.com/p/pysendfile/ From tshepang at gmail.com Mon Feb 27 12:44:32 2012 From: tshepang at gmail.com (Tshepang Lekhonkhobe) Date: Mon, 27 Feb 2012 13:44:32 +0200 Subject: [Python-Dev] Marking packaging-related PEPs as Finished after fixing some bugs in them In-Reply-To: <4F4B5D88.2010804@netwok.org> References: <4F4B5D88.2010804@netwok.org> Message-ID: On Mon, Feb 27, 2012 at 12:40, ?ric Araujo wrote: > ?In PEP 386, the rule that versions using an 'rc' marker should sort > after 'c' is buggy: I don?t think anyone will disagree that 1.0rc1 == > 1.0c1 and 1.0rc1 < 1.0c2. ?The 'rc' marker was added by Tarek shortly > before the PEP was accepted (see > http://mail.python.org/pipermail/python-dev/2010-January/097041.html), > and was not discussed. ?My preferred solution would be to accept 'rc' > when parsing but then always use 'c' internally and in all output (file > names, METADATA file, etc.). ?If it is judged important that projects be > able to use 'rc' and see it in output, then I?ll have to add special > cases in __eq__ and __hash__ methods, which is feasible if inelegant. I also didn't like the fact that 'rc' > 'c'. The change you are proposing will also make the code change far simpler. Note that the code (the one in the main VCS), sort of assumed that 'rc'=='c', but is missing the assignment, and therefore broken. From solipsis at pitrou.net Mon Feb 27 12:50:25 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 27 Feb 2012 12:50:25 +0100 Subject: [Python-Dev] Marking packaging-related PEPs as Finished after fixing some bugs in them References: <4F4B5D88.2010804@netwok.org> Message-ID: <20120227125025.7f5450ab@pitrou.net> On Mon, 27 Feb 2012 11:40:08 +0100 ?ric Araujo wrote: > > In PEP 386, the rule that versions using an 'rc' marker should sort > after 'c' is buggy: I don?t think anyone will disagree that 1.0rc1 == > 1.0c1 and 1.0rc1 < 1.0c2. The 'rc' marker was added by Tarek shortly > before the PEP was accepted (see > http://mail.python.org/pipermail/python-dev/2010-January/097041.html), > and was not discussed. 
My preferred solution would be to accept 'rc' > when parsing but then always use 'c' internally and in all output (file > names, METADATA file, etc.). If it is judged important that projects be > able to use 'rc' and see it in output, then I'll have to add special > cases in __eq__ and __hash__ methods, which is feasible if inelegant. 'rc' makes sense to most people while 'c' is generally unheard of. Regards Antoine. > > (Issues found by Josip Djolonga, Alexis Métaireau and I.) > > Cheers > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/python-python-dev%40m.gmane.org From ncoghlan at gmail.com Mon Feb 27 13:14:17 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 27 Feb 2012 22:14:17 +1000 Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 In-Reply-To: References: <4F49434B.6050604@active-4.com> Message-ID: On Mon, Feb 27, 2012 at 9:34 PM, Giampaolo Rodolà wrote: > If the main point of this proposal is avoiding an explicit 2to3 run on > account of 2to3 being too slow then I'm -1. No, the main point is that adding a compile step to the Python development process sucks. The slow speed of 2to3 is one factor, but single source is just far, far easier to maintain than continually running 2to3 to get a working Python 3 version. When we have the maintainers of major web frameworks and libraries telling us that this is a painful aspect for their ports (and, subsequently, the ports of their users), it would be irresponsible of us to ignore their feedback. Sure, some early adopters are happy with the 2to3 process, that's not in dispute. However, many developers are not, and (just as relevant) many folks that haven't started their ports yet have highlighted it as one of the aspects that bothers them. Is restoring support for unicode literals a small retrograde step that partially undoes the language cleanup that occurred in 3.0? Yes, it is. However, it really does significantly increase the amount of 2.x code that will *just run* on Python 3 (or will run with minor tweaks). I can live with that - as MvL said, this is a classic case of practicality beating purity. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Mon Feb 27 13:36:54 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 27 Feb 2012 22:36:54 +1000 Subject: [Python-Dev] [Python-checkins] cpython (3.2): Updated logging cookbook with info on alternative format styles. In-Reply-To: References: Message-ID: On Mon, Feb 27, 2012 at 9:04 PM, vinay.sajip wrote:

> +There is, however, a way that you can use {}- and $- formatting to construct
> +your individual log messages. Recall that for a message you can use an
> +arbitrary object as a message format string, and that the logging package will
> +call ``str()`` on that object to get the actual format string. Consider the
> +following two classes::
> +
> +    class BraceMessage(object):
> +        def __init__(self, fmt, *args, **kwargs):
> +            self.fmt = fmt
> +            self.args = args
> +            self.kwargs = kwargs
> +
> +        def __str__(self):
> +            return self.fmt.format(*self.args, **self.kwargs)
> +
> +    class DollarMessage(object):
> +        def __init__(self, fmt, **kwargs):
> +            self.fmt = fmt
> +            self.kwargs = kwargs
> +
> +        def __str__(self):
> +            from string import Template
> +            return Template(self.fmt).substitute(**self.kwargs)
> +
> +Either of these can be used in place of a format string, to allow {}- or
> +$-formatting to be used to build the actual "message" part which appears in the
> +formatted log output in place of "%(message)s" or "{message}" or "$message".
> +It's a little unwieldy to use the class names whenever you want to log
> +something, but it's quite palatable if you use an alias such as __ (double
> +underscore, not to be confused with _, the single underscore used as a
> +synonym/alias for :func:`gettext.gettext` or its brethren).

This is the part I was thinking might be simplified by allowing a "style" parameter to be passed to getLogger(). Consider StrFormatLogger and StringTemplateLogger classes that were just wrappers around an ordinary Logger instance, and made the relevant conversions to StrFormatMessage or StringTemplateMessage on the caller's behalf. Then (assuming the current getLogger() is available as _getLogger()), you could just do something like:

    _LOGGER_STYLES = {
        "%": lambda x: x,
        "{": StrFormatLogger,
        "$": StringTemplateLogger,
    }

    def getLogger(name, style='%'):
        if style not in _LOGGER_STYLES:
            raise ValueError('Style must be one of: %s' %
                             ','.join(_LOGGER_STYLES.keys()))
        return _LOGGER_STYLES[style](_getLogger(name))

Since each module should generally be doing its own getLogger() call (or else should be documenting that it accepts an ordinary logger instance as a parameter), it seems like this would allow fairly clean use of the alternate styles without complicating each individual logging operation. (The xyzStyle approach used by formatters here won't work, since we want different modules to be able to use different formatting styles. However, ordinary inheritance should allow StrFormatLogger and StringTemplateLogger to share most of their implementation) Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From barry at python.org Mon Feb 27 13:52:51 2012 From: barry at python.org (Barry Warsaw) Date: Mon, 27 Feb 2012 07:52:51 -0500 Subject: [Python-Dev] Marking packaging-related PEPs as Finished after fixing some bugs in them In-Reply-To: <4F4B5D88.2010804@netwok.org> References: <4F4B5D88.2010804@netwok.org> Message-ID: <20120227075251.1aa6a997@resist.wooz.org> On Feb 27, 2012, at 11:40 AM, Éric Araujo wrote: > They're all implemented in packaging/distutils2. Sadly, all of them > have rather serious issues, so I wanted to ask what the process should > be to solve the problems and mark the PEPs final. From a process point of view, I'd say you should fix the PEP issues you know about, and publish new versions, updating the Post-History field. These PEPs were written before the BDFOP/Czar policy came about, so why not see if you can find someone acceptable to Guido (or maybe suggested by him) to pronounce on the PEPs. Then, if the BDFOP agrees you can mark them Final, since I think they almost are, effectively. Marking them Final doesn't mean they can't be updated if you find some issues with them later. You and the BDFOP might decide to defer Final acceptance until the bugs in the implementations are fixed though.
Cheers, -Barry From barry at python.org Mon Feb 27 14:09:06 2012 From: barry at python.org (Barry Warsaw) Date: Mon, 27 Feb 2012 08:09:06 -0500 Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 In-Reply-To: References: <4F49434B.6050604@active-4.com> Message-ID: <20120227080906.768d6d93@resist.wooz.org> On Feb 27, 2012, at 12:34 PM, Giampaolo Rodol? wrote: >Il 25 febbraio 2012 21:23, Armin Ronacher >If the main point of this proposal is avoiding an explicit 2to3 run on >account of 2to3 being too slow then I'm -1. 2to3's speed isn't the only problem with the tool, although it's a big one. It also doesn't always work, and it makes packaging libraries dependent on it more difficult. As for the "working" part, I forget the details, but let's say you have a test suite in your package. If you run `python setup.py test` in a Python 2 world, then `python3 setup.py test` may fail to build properly. IIRC this was due to some confusion that 2to3 had. I've no doubt that these things can be fixed, but why? I'd much rather see the effort put into allowing us to write Python 3 code natively, with some accommodations for Python 2 from a single code base for the last couple of years that that will still be necessary . Cheers, -Barry From barry at python.org Mon Feb 27 14:16:00 2012 From: barry at python.org (Barry Warsaw) Date: Mon, 27 Feb 2012 08:16:00 -0500 Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 In-Reply-To: <4F4B591C.90200@v.loewis.de> References: <4F49434B.6050604@active-4.com> <20120226180758.78e9a779@resist.wooz.org> <4F4B591C.90200@v.loewis.de> Message-ID: <20120227081600.79ca8cb7@resist.wooz.org> On Feb 27, 2012, at 11:21 AM, Martin v. L?wis wrote: >I find this rationale a bit sad: it's not that there is any (IMO) good >technical reason for the feature - only that people "hate" the many >available alternatives for some reason. It makes me sad too, and as I've said, I personally have no problem with the existing solutions. They work just fine for me. But I also consistently hear from folks doing web frameworks that there's a big missing piece in the Python 3 story for them. Maybe restoring u-prefix solves their problem, or maybe there's another better solution out there. I don't do a lot of web development these days so I can't say. -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: not available URL: From rdmurray at bitdance.com Mon Feb 27 15:22:41 2012 From: rdmurray at bitdance.com (R. David Murray) Date: Mon, 27 Feb 2012 09:22:41 -0500 Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 In-Reply-To: <4F4B591C.90200@v.loewis.de> References: <4F49434B.6050604@active-4.com> <20120226180758.78e9a779@resist.wooz.org> <4F4B591C.90200@v.loewis.de> Message-ID: <20120227142243.812512500CF@webabinitio.net> On Mon, 27 Feb 2012 11:21:16 +0100, =?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?= wrote: > I find this rationale a bit sad: it's not that there is any (IMO) good > technical reason for the feature - only that people "hate" the many > available alternatives for some reason. > > But then, practicality beats purity, so be it. Agreed on both counts (but only reluctantly on the second :) The PEP does not currently contain a discussion of the unicode_literals + str() alternative and why that is not considered acceptable. That should be added (and I'm very curious why it isn't acceptable, it seems very elegant to me). 
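A minimal sketch of the alternative being referred to (the module contents are invented for illustration): with the future import every plain literal is unicode on both 2.x and 3.x, and wrapping an ASCII-only literal in ``str()`` yields the "native" string type on each line:

    from __future__ import unicode_literals

    greeting = 'hello'  # unicode on Python 2 and Python 3 alike

    # str() gives bytes on Python 2 but unicode on Python 3, i.e. the
    # "native" str type that APIs such as WSGI expect.
    header_name = str('Content-Type')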
In fact, I'd like to see the PEP contain a bullet list of alternatives with a discussion of why each is unacceptable or insufficient. The text as organized now is hard to follow for that purpose. Other comments: I disagree that "it is clear that 2to3 as a tool is insufficient" and that *therefore* people are attempting to use unified source. I think the truth is that people just prefer the unified source approach, because that is more Pythonic. I also strongly disagree with the statement that unicode_literals is doing more harm that good. Many people are using it very successfully. In *certain contexts* (WSGI) it may be problematic, but that doesn't mean it was a bad idea or that it shouldn't be used (given that a project uses it consistently, as noted previously in this thread). As noted above, the native string type *is* available with unicode_literals, it is spelled "str('somestring'). I don't understand the "Who Benefits?" section at all. For example, I think you'll agree I'm experienced working with email issues, and I don't understand how this proposal would help at all in dealing with email. The PEP would be strengthened by providing specific examples of the claims made in this section. I am -0 on this proposal. I will bow to the experience of those actually trying to port and support web code, which I am not doing myself. But I'd like to see the PEP improved so that the proposal is as strong as possible. --David From vinay_sajip at yahoo.co.uk Mon Feb 27 15:40:53 2012 From: vinay_sajip at yahoo.co.uk (Vinay Sajip) Date: Mon, 27 Feb 2012 14:40:53 +0000 (UTC) Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 References: <4F49434B.6050604@active-4.com> <20120227080906.768d6d93@resist.wooz.org> Message-ID: Barry Warsaw python.org> writes: > As for the "working" part, I forget the details, but let's say you have a test > suite in your package. If you run `python setup.py test` in a Python 2 world, > then `python3 setup.py test` may fail to build properly. IIRC this was due to > some confusion that 2to3 had. > There are other things, too, which make 2to3 a good advisory tool rather than a fully automated solution. 2to3 does a pretty good job of solving a difficult problem, but there are some things it just won't be able to do. For example, it assumes that certain method names belong to dictionaries and wraps their result with a list() because 3.x produces iterators where 2.x produces lists. This has caused problems in practice, e.g. with Django where IIRC calls to the values() method of querysets were wrapped with list(), when it was wrong to do so. Regards, Vinay Sajip From tseaver at palladion.com Mon Feb 27 15:59:10 2012 From: tseaver at palladion.com (Tres Seaver) Date: Mon, 27 Feb 2012 09:59:10 -0500 Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 In-Reply-To: References: <4F49434B.6050604@active-4.com> Message-ID: -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 02/27/2012 06:34 AM, Giampaolo Rodol? wrote: > Il 25 febbraio 2012 21:23, Armin Ronacher > ha scritto: >> Hi, >> >> I just uploaded PEP 414 which proposes am optional 'u' prefix for >> string literals for Python 3. >> >> You can read the PEP online: >> http://www.python.org/dev/peps/pep-0414/ >> >> This is a followup to the discussion about this topic here on the >> mailinglist and on twitter/IRC over the last few weeks. >> >> >> Regards, Armin > > If the main point of this proposal is avoiding an explicit 2to3 run > on account of 2to3 being too slow then I'm -1. 
The main point is that 2to3 as a strategy for "straddling" python2 and python3 is a showstopper for folks who actually need to straddle (as opposed to one-time conversion): - 2to3 performance on large projects sucks. - 2to3 introduces oddities in testing, coverage, etc. - 2to3 creates problems with stack traces / bug reports from Py3k users. There are a *lot* of folks who have abandoned 2to3 in favor of "single codebase": the PEP addresses one of the last remaining issues in making such codebases clean and easy to maintain (the sys.exc_info hack is not needed in Python >= 2.6). Tres. -- =================================================================== Tres Seaver +1 540-429-0999 tseaver at palladion.com Palladion Software "Excellence by Design" http://palladion.com From armin.ronacher at active-4.com Mon Feb 27 16:44:32 2012 From: armin.ronacher at active-4.com (Armin Ronacher) Date: Mon, 27 Feb 2012 15:44:32 +0000 Subject: [Python-Dev] PEP 414 In-Reply-To: References: <4F49434B.6050604@active-4.com> <4F4A10C1.6040806@pearwood.info> <4F4A29BD.2090607@active-4.com> Message-ID: <4F4BA4E0.80806@active-4.com> Hi, On 2/27/12 1:55 AM, Terry Reedy wrote: > I presume such a hook would simply remove 'u' prefixes and would run > *much* faster than 2to3. If such a hook is satisfactory for 3.2, why > would it not be satisfactory for 3.3? Agile development and unittests. An installation hook means that you need to install the package before running the tests. Which is fine for CI but horrible during development. "python3 run-tests.py" beats "make venv; install library; run testsuite" anytime in terms of development speed. Regards, Armin From armin.ronacher at active-4.com Mon Feb 27 16:45:14 2012 From: armin.ronacher at active-4.com (Armin Ronacher) Date: Mon, 27 Feb 2012 15:45:14 +0000 Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 In-Reply-To: <4F4B5847.1040107@v.loewis.de> References: <4F49434B.6050604@active-4.com> <4F4A28CD.5070903@active-4.com> <4F4B5847.1040107@v.loewis.de> Message-ID: <4F4BA50A.3020009@active-4.com> Hi, On 2/27/12 10:17 AM, "Martin v. Löwis" wrote: > There are a few other unproven performance claims in the PEP. Can you > kindly provide the benchmarks you have been using? In particular, I'm > interested in the claim "In many cases 2to3 runs one or two orders of > magnitude slower than the testsuite for the library or application it's > testing." The benchmarks used are linked in the PEP. Regards, Armin From benjamin at python.org Mon Feb 27 16:51:36 2012 From: benjamin at python.org (Benjamin Peterson) Date: Mon, 27 Feb 2012 10:51:36 -0500 Subject: [Python-Dev] PEP 415: Implementing PEP 409 differently In-Reply-To: References: Message-ID: 2012/2/26 Nick Coghlan : > Thanks for writing that up. I'd be amenable if the PEP was clearly > updated to say that ``raise exc from cause`` would change from being > syntactic sugar for ``_hidden = exc; _hidden.__cause__ = cause; raise > exc`` (as it is now) to ``_hidden = exc; _hidden.__cause__ = cause; > _hidden.__suppress_context__ = True; raise exc``. The patch should > then be implemented accordingly (including appropriate updates to the > language reference).
I add the following lines to the PEP: To summarize, ``raise exc from cause`` will be equivalent to:: exc.__cause__ = cause exc.__suppress_context__ = cause is None raise exc -- Regards, Benjamin From hodgestar+pythondev at gmail.com Mon Feb 27 17:11:48 2012 From: hodgestar+pythondev at gmail.com (Simon Cross) Date: Mon, 27 Feb 2012 18:11:48 +0200 Subject: [Python-Dev] PEP 414 In-Reply-To: <4F4BA4E0.80806@active-4.com> References: <4F49434B.6050604@active-4.com> <4F4A10C1.6040806@pearwood.info> <4F4A29BD.2090607@active-4.com> <4F4BA4E0.80806@active-4.com> Message-ID: On Mon, Feb 27, 2012 at 5:44 PM, Armin Ronacher wrote: > Agile development and unittests. ?An installation hook means that you > need to install the package before running the tests. ?Which is fine for > CI but horrible during development. ?"python3 run-tests.py" beats "make > venv; install library; run testsuite" anytime in terms of development speed. "python3 setup.py test" works already with 2to3 (it builds the code and runs the tests under build/). It is however slow and it can be a bit annoying to have to debug things by looking at the generated code under build/lib.linux-i686-3.2/ (or similar). From martin at v.loewis.de Mon Feb 27 17:44:34 2012 From: martin at v.loewis.de (martin at v.loewis.de) Date: Mon, 27 Feb 2012 17:44:34 +0100 Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 In-Reply-To: <4F4BA50A.3020009@active-4.com> References: <4F49434B.6050604@active-4.com> <4F4A28CD.5070903@active-4.com> <4F4B5847.1040107@v.loewis.de> <4F4BA50A.3020009@active-4.com> Message-ID: <20120227174434.Horde.PAV6fUlCcOxPS7LyDc6X4bA@webmail.df.eu> Zitat von Armin Ronacher : > Hi, > > On 2/27/12 10:17 AM, "Martin v. L?wis" wrote: >> There are a few other unproven performance claims in the PEP. Can you >> kindly provide the benchmarks you have been using? In particular, I'm >> interested in the claim " In many cases 2to3 runs one or two orders of >> magnitude slower than the testsuite for the library or application it's >> testing." > The benchmarks used are linked in the PEP. Maybe I'm missing something, but there doesn't seem to be a benchmark that measures the 2to3 performance, supporting the claim that it runs "two orders of magnitude" slower (which I'd interpret as a factor of 100). If the claim actually cannot be supported, please remove it from the PEP. Regards, Martin From ethan at stoneleaf.us Mon Feb 27 18:05:54 2012 From: ethan at stoneleaf.us (Ethan Furman) Date: Mon, 27 Feb 2012 09:05:54 -0800 Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 In-Reply-To: <4F4B5634.3020609@v.loewis.de> References: <4F49434B.6050604@active-4.com> <4F4B5634.3020609@v.loewis.de> Message-ID: <4F4BB7F2.4070804@stoneleaf.us> Martin v. L?wis wrote: > Am 26.02.2012 07:06, schrieb Nick Coghlan: >> On Sun, Feb 26, 2012 at 1:13 PM, Guido van Rossum wrote: >>> A small quibble: I'd like to see a benchmark of a 'u' function implemented in C. >> Even if it was quite fast, I don't think such a function would bring >> the same benefits as restoring support for u'' literals. > > You claim that, but your argument doesn't actually support that claim > (or I fail to see the argument). Python 2.6 code: this = u'that' Python 3.3 code: this = u('that') Not source compatible, not elegant. (Even though 2to3 could make this fix, it's still kinda ugly.) ~Ethan~ From rdmurray at bitdance.com Mon Feb 27 18:41:10 2012 From: rdmurray at bitdance.com (R. 
David Murray) Date: Mon, 27 Feb 2012 12:41:10 -0500 Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 In-Reply-To: <4F4BB7F2.4070804@stoneleaf.us> References: <4F49434B.6050604@active-4.com> <4F4B5634.3020609@v.loewis.de> <4F4BB7F2.4070804@stoneleaf.us> Message-ID: <20120227174110.96AFA2500E4@webabinitio.net> On Mon, 27 Feb 2012 09:05:54 -0800, Ethan Furman wrote: > Martin v. L??wis wrote: > > Am 26.02.2012 07:06, schrieb Nick Coghlan: > >> On Sun, Feb 26, 2012 at 1:13 PM, Guido van Rossum wrote: > >>> A small quibble: I'd like to see a benchmark of a 'u' function implemented in C. > >> Even if it was quite fast, I don't think such a function would bring > >> the same benefits as restoring support for u'' literals. > > > > You claim that, but your argument doesn't actually support that claim > > (or I fail to see the argument). > > Python 2.6 code: > this = u'that' > > Python 3.3 code: > this = u('that') > > Not source compatible, not elegant. (Even though 2to3 could make this > fix, it's still kinda ugly.) Eh? The 2.6 version would also be u('that'). That's the whole point of the idiom. You'll need a better counter argument than that. --David From solipsis at pitrou.net Mon Feb 27 18:35:25 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 27 Feb 2012 18:35:25 +0100 Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 References: <4F49434B.6050604@active-4.com> <4F4A28CD.5070903@active-4.com> Message-ID: <20120227183525.1591423d@pitrou.net> On Sun, 26 Feb 2012 12:42:53 +0000 Armin Ronacher wrote: > Hi, > > On 2/26/12 12:35 PM, Serhiy Storchaka wrote: > > Some microbenchmarks: > > > > $ python -m timeit -n 10000 -r 100 -s "x = 123" "'foobarbaz_%d' % x" > > 10000 loops, best of 100: 1.24 usec per loop > > $ python -m timeit -n 10000 -r 100 -s "x = 123" "str('foobarbaz_%d') % x" > > 10000 loops, best of 100: 1.59 usec per loop > > $ python -m timeit -n 10000 -r 100 -s "x = 123" "str(u'foobarbaz_%d') % x" > > 10000 loops, best of 100: 1.58 usec per loop > > $ python -m timeit -n 10000 -r 100 -s "x = 123; n = lambda s: s" > "n('foobarbaz_%d') % x" > > 10000 loops, best of 100: 1.41 usec per loop > > $ python -m timeit -n 10000 -r 100 -s "x = 123; s = 'foobarbaz_%d'" "s > % x" > > 10000 loops, best of 100: 1.22 usec per loop > > > > There are no significant overhead to use converters. > That's because what you're benchmarking here more than anything is the > overhead of eval() :-) See the benchmark linked in the PEP for one that > measures the actual performance of the string literal / wrapper. Could you update your benchmarks with the caching version of u()? Thanks Antoine. From chrism at plope.com Mon Feb 27 19:01:02 2012 From: chrism at plope.com (Chris McDonough) Date: Mon, 27 Feb 2012 13:01:02 -0500 Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 In-Reply-To: <20120227174110.96AFA2500E4@webabinitio.net> References: <4F49434B.6050604@active-4.com> <4F4B5634.3020609@v.loewis.de> <4F4BB7F2.4070804@stoneleaf.us> <20120227174110.96AFA2500E4@webabinitio.net> Message-ID: <1330365662.12046.72.camel@thinko> On Mon, 2012-02-27 at 12:41 -0500, R. David Murray wrote: > On Mon, 27 Feb 2012 09:05:54 -0800, Ethan Furman wrote: > > Martin v. L?wis wrote: > > > Am 26.02.2012 07:06, schrieb Nick Coghlan: > > >> On Sun, Feb 26, 2012 at 1:13 PM, Guido van Rossum wrote: > > >>> A small quibble: I'd like to see a benchmark of a 'u' function implemented in C. 
> > >> Even if it was quite fast, I don't think such a function would bring > > >> the same benefits as restoring support for u'' literals. > > > > > > You claim that, but your argument doesn't actually support that claim > > > (or I fail to see the argument). > > > > Python 2.6 code: > > this = u'that' > > > > Python 3.3 code: > > this = u('that') > > > > Not source compatible, not elegant. (Even though 2to3 could make this > > fix, it's still kinda ugly.) > > Eh? The 2.6 version would also be u('that'). That's the whole point > of the idiom. You'll need a better counter argument than that. The best argument is that there already exists tons and tons of Python 2 code that already does: u'that' Needing to change it to: u('that') 1) Requires effort on the part of a from-Python-2-porter to service the aesthetic and populist goal of not having an explicit but redundant-under-Py3 literal syntax that says "this is text". 2) Won't actually meet the aesthetic goal, as it's uglier and slower under *both* Python 2 and Python 3. So the populist argument remains.. "it's too confusing for people who learn Python 3 as a new language to have a redundant syntax". But we've had such a syntax in Python 2 for years with b'', and, as mentioned by Armin's PEP, single-quoted vs. triple-quoted strings forever. I just don't understand the pushback here at all. This is such a nobrainer. - C From guido at python.org Mon Feb 27 19:17:57 2012 From: guido at python.org (Guido van Rossum) Date: Mon, 27 Feb 2012 10:17:57 -0800 Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 In-Reply-To: <1330365662.12046.72.camel@thinko> References: <4F49434B.6050604@active-4.com> <4F4B5634.3020609@v.loewis.de> <4F4BB7F2.4070804@stoneleaf.us> <20120227174110.96AFA2500E4@webabinitio.net> <1330365662.12046.72.camel@thinko> Message-ID: On Mon, Feb 27, 2012 at 10:01 AM, Chris McDonough wrote: > The best argument is that there already exists tons and tons of Python 2 > code that already does: > > u'that' +1 > Needing to change it to: > > u('that') > > 1) Requires effort on the part of a from-Python-2-porter to service > the aesthetic and populist goal of not having an explicit > but redundant-under-Py3 literal syntax that says "this is text". > > 2) Won't actually meet the aesthetic goal, as > it's uglier and slower under *both* Python 2 and Python 3. > > So the populist argument remains.. "it's too confusing for people who > learn Python 3 as a new language to have a redundant syntax". But we've > had such a syntax in Python 2 for years with b'', and, as mentioned by > Armin's PEP, single-quoted vs. triple-quoted strings forever. > > I just don't understand the pushback here at all. This is such a > nobrainer. I agree. Just let's start deprecating it too, so that once Python 2.x compatibility is no longer relevant we can eventually stop supporting it (though that may have to wait until Python 4...). We need to send *some* sort of signal that this is a compatibility hack and that no new code should use it. Maybe a SilentDeprecationWarning? -- --Guido van Rossum (python.org/~guido) From ethan at stoneleaf.us Mon Feb 27 19:22:05 2012 From: ethan at stoneleaf.us (Ethan Furman) Date: Mon, 27 Feb 2012 10:22:05 -0800 Subject: [Python-Dev] PEP 415: Implementing PEP 409 differently In-Reply-To: References: Message-ID: <4F4BC9CD.8030908@stoneleaf.us> Benjamin Peterson wrote: > 2012/2/26 Nick Coghlan : >> Thanks for writing that up.
I'd be amenable if the PEP was clearly >> updated to say that ``raise exc from cause`` would change from being >> syntactic sugar for ``_hidden = exc; _hidden.__cause__ = cause; raise >> exc`` (as it is now) to ``_hidden = exc; _hidden.__cause__ = cause; >> _hidden.__suppress_context__ = True; raise exc``. The patch should >> then be implemented accordingly (including appropriate updates to the >> language reference). > > I add the following lines to the PEP: > > To summarize, ``raise exc from cause`` will be equivalent to:: > > exc.__cause__ = cause > exc.__suppress_context__ = cause is None > raise exc So exc.__cause__ will be None both before and after `raise Exception from None`? ~Ethan~ From tjreedy at udel.edu Mon Feb 27 19:44:43 2012 From: tjreedy at udel.edu (Terry Reedy) Date: Mon, 27 Feb 2012 13:44:43 -0500 Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 In-Reply-To: <1330365662.12046.72.camel@thinko> References: <4F49434B.6050604@active-4.com> <4F4B5634.3020609@v.loewis.de> <4F4BB7F2.4070804@stoneleaf.us> <20120227174110.96AFA2500E4@webabinitio.net> <1330365662.12046.72.camel@thinko> Message-ID: On 2/27/2012 1:01 PM, Chris McDonough wrote: > On Mon, 2012-02-27 at 12:41 -0500, R. David Murray wrote: >> Eh? The 2.6 version would also be u('that'). That's the whole point >> of the idiom. You'll need a better counter argument than that. > > The best argument is that there already exists tons and tons of Python 2 > code that already does: > > u'that' > > Needing to change it to: > > u('that') > > 1) Requires effort on the part of a from-Python-2-porter to service > the aesthetic and populist goal of not having an explicit > but redundant-under-Py3 literal syntax that says "this is text". This is a point, though this would be a one-time conversion by a 2to23 converter that would be part of other needed conversions, some by hand. I presume that most 2.6 code has problems other than u'' when attempting to run under 3.x. > 2) Won't atually meet the aesthetic goal, as > it's uglier and slower under *both* Python 2 and Python 3. Less relevant. The minor ugliness would be in dual-version code, but not Python 3 itself. > So the populist argument remains.. "it's too confusing for people who > learn Python 3 as a new language to have a redundant syntax". But we've > had such a syntax in Python 2 for years with b'', and, as mentioned by > Armin's PEP single-quoted vs. triple-quoted strings forever. > > I just don't understand the pushback here at all. For one thing, u'' does not solve the problem for 3.1 and 3.2, while u() does. 3.2 will be around for years. For one example, it will be in the April long-term-support release of Ubuntu. For another, PyPy is working on a 3.2 compatible version to come out and be put into use this year. > This is such a nobrainer. I could claim that a solution that also works for 3.1 and 3.2 is a nobrainer. It depends on how one weighs different factors. -- Terry Jan Reedy From ethan at stoneleaf.us Mon Feb 27 19:04:13 2012 From: ethan at stoneleaf.us (Ethan Furman) Date: Mon, 27 Feb 2012 10:04:13 -0800 Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 In-Reply-To: <20120227174110.96AFA2500E4@webabinitio.net> References: <4F49434B.6050604@active-4.com> <4F4B5634.3020609@v.loewis.de> <4F4BB7F2.4070804@stoneleaf.us> <20120227174110.96AFA2500E4@webabinitio.net> Message-ID: <4F4BC59D.1050208@stoneleaf.us> R. David Murray wrote: > On Mon, 27 Feb 2012 09:05:54 -0800, Ethan Furman wrote: >> Martin v. 
Löwis wrote: >>> On 26.02.2012 07:06, Nick Coghlan wrote: >>>> On Sun, Feb 26, 2012 at 1:13 PM, Guido van Rossum wrote: >>>>> A small quibble: I'd like to see a benchmark of a 'u' function implemented in C. >>>> Even if it was quite fast, I don't think such a function would bring >>>> the same benefits as restoring support for u'' literals. >>> You claim that, but your argument doesn't actually support that claim >>> (or I fail to see the argument). >> >> Python 2.6 code: >> this = u'that' >> >> Python 3.3 code: >> this = u('that') >> >> Not source compatible, not elegant. (Even though 2to3 could make this >> fix, it's still kinda ugly.) > > Eh? The 2.6 version would also be u('that'). That's the whole point > of the idiom. You'll need a better counter argument than that. So the idea is to convert the existing 2.6 code to use parentheses as well? (I obviously haven't read the PEP -- my apologies.) Then I primarily object for ergonomic reasons, but I still think it's kinda ugly. ;) ~Ethan~ From victor.stinner at haypocalc.com Mon Feb 27 19:53:27 2012 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Mon, 27 Feb 2012 19:53:27 +0100 Subject: [Python-Dev] Add a frozendict builtin type Message-ID: Rationale ========= A frozendict type is a common request from users and there are various implementations. There are two main Python implementations: * "blacklist": frozendict inheriting from dict and overriding methods to raise an exception when trying to modify the frozendict * "whitelist": frozendict not inheriting from dict and only implementing some dict methods, or implementing all dict methods but raising exceptions when trying to modify the frozendict The blacklist implementation has a major issue: it is still possible to call write methods of the dict class (e.g. dict.__setitem__(my_frozendict, key, value)). The whitelist implementation has an issue: frozendict and dict are not "compatible", dict is not a subclass of frozendict (and frozendict is not a subclass of dict). I propose to add a new frozendict builtin type and make the dict type inherit from it. frozendict would not have methods to modify its content, and its values must be immutable. Constraints =========== * frozendict values must be immutable, just as dict keys are * frozendict can be used with the C API of the dict object (e.g. PyDict_GetItem), but write methods (e.g. PyDict_SetItem) would fail with a TypeError ("expect dict, got frozendict") * frozendict.__hash__() has to be deterministic * frozendict does not have the following methods: clear, __delitem__, pop, popitem, setdefault, __setitem__ and update - just as tuple/frozenset have fewer methods than list/set * issubclass(dict, frozendict) is True, whereas issubclass(frozendict, dict) is False Implementation ============== * Add a hash field to the PyDictObject structure * Make dict inherit from frozendict * frozendict values are checked for immutability by calling their __hash__ method, with a fast path for known immutable types (int, float, bytes, str, tuple, frozenset) * frozendict.__hash__ computes hash(frozenset(self.items())) and caches the result in its private hash attribute Attached patch is a work-in-progress implementation. TODO ==== * Add a frozendict abstract base class to collections? * frozendict may not overallocate dictionary buckets?
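For a sense of what the "whitelist" approach looks like in pure Python, here is a minimal sketch (an illustration only, not the attached C patch; the class layout and attribute names are assumptions). It builds on the Mapping ABC so the read-only dict API comes for free:

    import collections

    class frozendict(collections.Mapping):
        """Read-only mapping sketch: only query methods are exposed."""

        def __init__(self, *args, **kwargs):
            self._data = dict(*args, **kwargs)
            self._cached_hash = None

        def __getitem__(self, key):
            return self._data[key]

        def __iter__(self):
            return iter(self._data)

        def __len__(self):
            return len(self._data)

        def __hash__(self):
            # The hashing scheme from the proposal: hash(frozenset(items())),
            # cached after the first call; an unhashable (mutable) value
            # makes this raise TypeError.
            if self._cached_hash is None:
                self._cached_hash = hash(frozenset(self._data.items()))
            return self._cached_hash

        def __repr__(self):
            return 'frozendict(%r)' % (self._data,)

Because Mapping defines no mutating methods, d['x'] = 1 on an instance fails with a TypeError, while get/keys/items/values and equality comparisons are inherited from the ABC.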
-- Examples of frozendict implementations: http://bob.pythonmac.org/archives/2005/03/04/frozendict/ http://code.activestate.com/recipes/498072-implementing-an-immutable-dictionary/ http://code.activestate.com/recipes/414283-frozen-dictionaries/ http://corebio.googlecode.com/svn/trunk/apidocs/corebio.utils.frozendict-class.html http://code.google.com/p/lingospot/source/browse/trunk/frozendict/frozendict.py http://cmssdt.cern.ch/SDT/doxygen/CMSSW_4_4_2/doc/html/d6/d2f/classfrozendict_1_1frozendict.html See also the recent discussion on python-list: http://mail.python.org/pipermail/python-list/2012-February/1287658.html -- See also PEP 351. Victor From tjreedy at udel.edu Mon Feb 27 20:10:57 2012 From: tjreedy at udel.edu (Terry Reedy) Date: Mon, 27 Feb 2012 14:10:57 -0500 Subject: [Python-Dev] PEP 414 In-Reply-To: <4F4BA4E0.80806@active-4.com> References: <4F49434B.6050604@active-4.com> <4F4A10C1.6040806@pearwood.info> <4F4A29BD.2090607@active-4.com> <4F4BA4E0.80806@active-4.com> Message-ID: On 2/27/2012 10:44 AM, Armin Ronacher wrote: > On 2/27/12 1:55 AM, Terry Reedy wrote: >> I presume such a hook would simply remove 'u' prefixes and would >> run *much* faster than 2to3. If such a hook is satisfactory for >> 3.2, why would it not be satisfactory for 3.3? > Agile development and unittests. Which I am all for. So the issue is not 3.3 versus 3.1/2, but development versus installation. I somehow did not get that reading the PEP but it seems a crucial point in its favor. > An installation hook means that you need to install the package > before running the tests. Which is fine for CI but horrible during > development. "python3 run-tests.py" beats "make venv; install > library; run testsuite" anytime in terms of development speed. That I can appreciate. It makes programming more fun. I presume you are saying that you would run the 'Python 3' tests quickly with 3.3 in your normal development cycle. Then, if you want your library to also run under 3.1/2, only occasionally (daily?) check that they also run under a 3.1/2 installation. That *does* make sense to me. -- Terry Jan Reedy From vinay_sajip at yahoo.co.uk Mon Feb 27 20:16:14 2012 From: vinay_sajip at yahoo.co.uk (Vinay Sajip) Date: Mon, 27 Feb 2012 19:16:14 +0000 (UTC) Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 References: <4F49434B.6050604@active-4.com> <4F4B5634.3020609@v.loewis.de> <4F4BB7F2.4070804@stoneleaf.us> <20120227174110.96AFA2500E4@webabinitio.net> <1330365662.12046.72.camel@thinko> Message-ID: Terry Reedy udel.edu> writes: > This is a point, though this would be a one-time conversion by a 2to23 > converter that would be part of other needed conversions, some by hand. > I presume that most 2.6 code has problems other than u'' when attempting > to run under 3.x. Right. In doing the Django port, the u() stuff took very little time - I wrote a lib2to3 fixer to do it. A lot more time was spent in areas where the bytes/text interfaces had not been thought through carefully, e.g. in the crypto/hashing stuff - this is stuff that an automatic tool couldn't do. After it was decided in the Django team to drop 2.5 support after Django 1.4 was released, the u('xxx') calls weren't needed any more. Another lib2to3 fixer converted them back to 'xxx' for use with "from __future__ import unicode_literals".
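Such a fixer can be quite small. A rough sketch, modeled on lib2to3's own fix_unicode (this is a guess at the shape, not the actual fixer used for the Django port):

    from lib2to3 import fixer_base

    class FixUPrefix(fixer_base.BaseFix):
        # Match every string literal in the parse tree.
        BM_compatible = True
        PATTERN = "STRING"

        def transform(self, node, results):
            # Rewrite u'xxx' / U'xxx' as u('xxx'); other literals pass
            # through. (Combined prefixes such as ur'' would need extra
            # handling in a real fixer.)
            if node.value.startswith(('u', 'U')):
                new = node.clone()
                new.value = 'u(%s)' % node.value[1:]
                return new

Run through lib2to3.refactor.RefactoringTool, this does the whole u'xxx' -> u('xxx') conversion mechanically, and the reverse fixer for the unicode_literals style is just as short.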
> > 2) Won't atually meet the aesthetic goal, as > > it's uglier and slower under *both* Python 2 and Python 3. > > Less relevant. The minor ugliness would be in dual-version code, but not > Python 3 itself. And it would be reasonably easy to transition from u('xxx') -> 'xxx' when support for 2.5 is dropped by a particular project, again using automation via a lib2to3 fixer. > I could claim that a solution that also works for 3.1 and 3.2 is a > nobrainer. It depends on how one weighs different factors. Yes. I feel the same way as Martin and Barry have expressed - it's a shame that people are talking up the potential difficulties of porting to a single code-base without the PEP change. Having been in the trenches with the Django port, I don't feel that the Unicode literal part was really a major problem. And I've now done *two* Django ports - one to a 2.5-compatible codebase with u('xxx'), and one to a 2.6+ compatible codebase with unicode_literals and plain 'xxx'. I'm only keeping the latter one up to date with changes in Django trunk, but both ports, though far from complete from a whole-project point of view, got to the point where they passed the very large test suite. On balance, though, I don't oppose the PEP. We can wish all we want for people to do the right thing (as we see it), but wishing don't make it so. Do I sense a certain amount of worry about the pace of the 2.x -> 3.x transition? It feels like we're blinking first ;-) Regards, Vinay Sajip From benjamin at python.org Mon Feb 27 20:27:39 2012 From: benjamin at python.org (Benjamin Peterson) Date: Mon, 27 Feb 2012 14:27:39 -0500 Subject: [Python-Dev] PEP 415: Implementing PEP 409 differently In-Reply-To: <4F4BC9CD.8030908@stoneleaf.us> References: <4F4BC9CD.8030908@stoneleaf.us> Message-ID: 2012/2/27 Ethan Furman : > Benjamin Peterson wrote: >> >> 2012/2/26 Nick Coghlan : >>> >>> Thanks for writing that up. I'd be amenable if the PEP was clearly >>> updated to say that ``raise exc from cause`` would change from being >>> syntactic sugar for ``_hidden = exc; _hidden.__cause__ = cause; raise >>> exc`` (as it is now) to ``_hidden = exc; _hidden.__cause__ = cause; >>> _hidden.__suppress_context__ = True; raise exc``. The patch should >>> then be implemented accordingly (including appropriate updates to the >>> language reference). >> >> >> I add the following lines to the PEP: >> >> To summarize, ``raise exc from cause`` will be equivalent to:: >> >> ? ?exc.__cause__ = cause >> ? ?exc.__suppress_context__ = cause is None >> ? ?raise exc > > > So exc.__cause__ will be None both before and after `raise Exception from > None`? Yes. -- Regards, Benjamin From ethan at stoneleaf.us Mon Feb 27 20:12:26 2012 From: ethan at stoneleaf.us (Ethan Furman) Date: Mon, 27 Feb 2012 11:12:26 -0800 Subject: [Python-Dev] cpython: Close issue #6210: Implement PEP 409 In-Reply-To: <20120226145421.6bff8bc7@pitrou.net> References: <20120226145421.6bff8bc7@pitrou.net> Message-ID: <4F4BD59A.7070900@stoneleaf.us> Antoine Pitrou wrote: > On Sun, 26 Feb 2012 09:02:59 +0100 > nick.coghlan wrote: >> + def get_output(self, code, filename=None): >> + """ >> + Run the specified code in Python (in a new child process) >> and read the >> + output from the standard error or from a file (if filename >> is set). >> + Return the output lines as a list. >> + """ > > We already have assert_python_ok and friends. It's not obvious what > this additional function achieves. Also, the "filename" argument is > never used. 
> >> + output = re.sub('Current thread 0x[0-9a-f]+', >> + 'Current thread XXX', >> + output) > > This looks like output from the faulthandler module. Why would > faulthandler kick in here? That's because I stole those two functions from the faulthandler module. Still learning where all the goodies are. Thanks for the tip about assert_python_ok, etc. ~Ethan~ From vinay_sajip at yahoo.co.uk Mon Feb 27 20:47:42 2012 From: vinay_sajip at yahoo.co.uk (Vinay Sajip) Date: Mon, 27 Feb 2012 19:47:42 +0000 (UTC) Subject: [Python-Dev] PEP 414 References: <4F49434B.6050604@active-4.com> <4F4A10C1.6040806@pearwood.info> <4F4A29BD.2090607@active-4.com> <4F4BA4E0.80806@active-4.com> Message-ID: Terry Reedy udel.edu> writes: > > An installation hook means that you need to install the package > > before running the tests. Which is fine for CI but horrible during > > development. "python3 run-tests.py" beats "make venv; install > > library; run testsuite" anytime in terms of development speed. > > That I can appreciate. It makes programming more fun. I presume you are > saying that you would run the 'Python 3' tests quickly with 3.3 in your > normal development cycle. Then, if you want your library to also run > under 3.1/2, only occasionally (daily?) check that they also run under a > 3.1/2 installation. That *does* make sense to me. Right, but Armin, while arguing against an installation hook for 2to3, is ISTM arguing for an analogous hook for use with 3.2 (and earlier 3.x), which does a smaller amount of work than 2to3 but is the same kind of beast. The "programming fun" part is really an argument about a single codebase, which I am completely in agreement with. But (summarising for my own benefit, but someone please tell me if I've missed or misunderstood something) there are (at least) three ways to get there: 1. Support only 2.6+ code, use from __future__ import unicode_literals, do away with u'xxx' in favour of 'xxx'. This has been opposed because of action-at-a-distance, but can be mitigated by strongly applied discipline on a given project (so that everyone knows that all string literals are Unicode, period). Of course, the same discipline needs to be applied to depended-upon projects, too. 2. Support 2.5 or earlier, where you would have to use u('xxx') in place of u'xxx', unless PEP 414 is accepted - but you would still have the exception syntax hacks to upset you. This has been opposed because of performance and productivity concerns, but I don't think these are yet proven in practice (for performance, microbenchmarks notwithstanding - there's no data on more representative workloads. For productivity I call shenanigans, since if we can trust 2to3 to work automatically, we should be able to trust a 2to3 fixer to do the work on u'xxx' -> u('xxx') or u('xxx') -> 'xxx' automatically). 3. Do one of the above, but approve this PEP and keep u'xxx' literals around for some yet-to-be-determined time, but perhaps the life of Python 3. This has been called a retrograde step, and one can see why; ISTM the main reason for accepting this path is that some fairly vocal and respected developers don't want to (as opposed to can't) take one of the other paths, and are basically saying they're not porting their work to 3.x unless this path is taken. They're saying between the lines that their views are representative of a fair number of other less vocal developers, who are also not porting their code for the same reason. 
(ISTM they're glossing over the other issues which come up in a 2.x -> 3.x port, which require more time to diagnose and fix than problems caused by string literals.) But never mind that - if we're worried about the pace of the 2.x -> 3.x transition, we can appease these views fairly easily, so why not do it? And while we're at it, we can perhaps also look at those pesky exception clauses and see if we can't get 3.x to support 2.x exception syntax, to make that porting job even easier ;-) Regards, Vinay Sajip From chrism at plope.com Mon Feb 27 20:50:21 2012 From: chrism at plope.com (Chris McDonough) Date: Mon, 27 Feb 2012 14:50:21 -0500 Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 In-Reply-To: References: <4F49434B.6050604@active-4.com> <4F4B5634.3020609@v.loewis.de> <4F4BB7F2.4070804@stoneleaf.us> <20120227174110.96AFA2500E4@webabinitio.net> <1330365662.12046.72.camel@thinko> Message-ID: <1330372221.12046.119.camel@thinko> On Mon, 2012-02-27 at 13:44 -0500, Terry Reedy wrote: > On 2/27/2012 1:01 PM, Chris McDonough wrote: > > On Mon, 2012-02-27 at 12:41 -0500, R. David Murray wrote: > >> Eh? The 2.6 version would also be u('that'). That's the whole point > >> of the idiom. You'll need a better counter argument than that. > > > > The best argument is that there already exists tons and tons of Python 2 > > code that already does: > > > > u'that' > > > > Needing to change it to: > > > > u('that') > > > > 1) Requires effort on the part of a from-Python-2-porter to service > > the aesthetic and populist goal of not having an explicit > > but redundant-under-Py3 literal syntax that says "this is text". > > This is a point, though this would be a one-time conversion by a 2to23 > converter that would be part of other needed conversions, some by hand. > I presume that most 2.6 code has problems other than u'' when attempting > to run under 3.x. > > > 2) Won't atually meet the aesthetic goal, as > > it's uglier and slower under *both* Python 2 and Python 3. > > Less relevant. The minor ugliness would be in dual-version code, but not > Python 3 itself. > > > So the populist argument remains.. "it's too confusing for people who > > learn Python 3 as a new language to have a redundant syntax". But we've > > had such a syntax in Python 2 for years with b'', and, as mentioned by > > Armin's PEP single-quoted vs. triple-quoted strings forever. > > > > I just don't understand the pushback here at all. > > For one thing, u'' does not solve the problem for 3.1 and 3.2, while u() > does. 3.2 will be around for years. For one example, it will be in the > April long-term-support release of Ubuntu. For another, PyPy is working > on a 3.2 compatible version to come out and be put into use this year. I suspect not everyone lives and dies by OS distribution release support policies. Many folks are both willing and capable to install a newer Python on an older OS. It's unfortunate that Python 3 < 3.3 does not have the syntax, and people like me who have a long-term need to "straddle" are to blame; we didn't provide useful feedback early enough to avoid the mistake. That said, it seems like preventing a reintroduction of u'' literal syntax would presume that two wrongs make a right. By our own schedule estimate of Python 3 takeup, many people won't be even thinking about porting any Python 2 code to 3 until years from now. > > This is such a nobrainer. > > I could claim that a solution that also works for 3.1 and 3.2 is a > nobrainer. It depends on how one weighs different factors. 
An argument for the reintroduction of u'' literal syntax in Python >= 3.3 is not necessarily an argument against the utility of some automated tool conversion support for porting a Python 2 app to a function-based u() syntax so it can run in Python 3 < 3.2. Tools like "2to23" or whatever can obviously be parameterized to emit slightly different 3.2-compatible and 3.3-compatible code. It's almost certain that it will need forward-version-aware modes like this anyway as newer idioms are added to 3.X that make code prettier or more efficient completely independent of u'' support. Currently we handle 3.2 compatibility in packages that "straddle" via six-like functions. We can continue doing this as necessary. If the stdlib tooling helps, great. In an emit-function-based-syntax mode, the conversion code would almost certainly need to rely on the import of an externally downloadable module like six, for compatibility under both Python 2 and 3 because there's no opportunity to go back in time and make "u()" available for older releases unless it was like inlined in every module during the conversion. But if somebody only wants to target 3.3+, and it means they don't have to rely on a six-like module to provide u(), great. - C From rdmurray at bitdance.com Mon Feb 27 21:11:34 2012 From: rdmurray at bitdance.com (R. David Murray) Date: Mon, 27 Feb 2012 15:11:34 -0500 Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 In-Reply-To: References: <4F49434B.6050604@active-4.com> <4F4B5634.3020609@v.loewis.de> <4F4BB7F2.4070804@stoneleaf.us> <20120227174110.96AFA2500E4@webabinitio.net> <1330365662.12046.72.camel@thinko> Message-ID: <20120227201135.1A15F25009E@webabinitio.net> On Mon, 27 Feb 2012 10:17:57 -0800, Guido van Rossum wrote: > On Mon, Feb 27, 2012 at 10:01 AM, Chris McDonough wrote: > > The best argument is that there already exists tons and tons of Python 2 > > code that already does: > > > > u'that' > > +1 > > > Needing to change it to: > > > > u('that') > > > > 1) Requires effort on the part of a from-Python-2-porter to service > > the aesthetic and populist goal of not having an explicit > > but redundant-under-Py3 literal syntax that says "this is text". > > > > 2) Won't actually meet the aesthetic goal, as > > it's uglier and slower under *both* Python 2 and Python 3. > > > > So the populist argument remains.. "it's too confusing for people who > > learn Python 3 as a new language to have a redundant syntax". But we've > > had such a syntax in Python 2 for years with b'', and, as mentioned by > > Armin's PEP, single-quoted vs. triple-quoted strings forever. > > > > I just don't understand the pushback here at all. This is such a > > nobrainer. It's obviously not a *no*-brainer or you wouldn't be getting pushback :) I view most of the pushback as people wanting to make sure all the options have been carefully considered. This should all be documented in the PEP. > I agree. Just let's start deprecating it too, so that once Python 2.x > compatibility is no longer relevant we can eventually stop supporting > it (though that may have to wait until Python 4...). We need to send > *some* sort of signal that this is a compatibility hack and that no > new code should use it. Maybe a SilentDeprecationWarning? Isn't that what PendingDeprecationWarning is? This seems like the kind of use case it was introduced for (though it is less used now that DeprecationWarnings are silent by default).
--David From martin at v.loewis.de Mon Feb 27 21:17:28 2012 From: martin at v.loewis.de (Martin v. Löwis) Date: Mon, 27 Feb 2012 21:17:28 +0100 Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 In-Reply-To: <4F4BB7F2.4070804@stoneleaf.us> References: <4F49434B.6050604@active-4.com> <4F4B5634.3020609@v.loewis.de> <4F4BB7F2.4070804@stoneleaf.us> Message-ID: <4F4BE4D8.4020409@v.loewis.de> On 27.02.2012 18:05, Ethan Furman wrote: > Martin v. Löwis wrote: >> On 26.02.2012 07:06, Nick Coghlan wrote: >>> On Sun, Feb 26, 2012 at 1:13 PM, Guido van Rossum wrote: >>>> A small quibble: I'd like to see a benchmark of a 'u' function >>>> implemented in C. >>> Even if it was quite fast, I don't think such a function would bring >>> the same benefits as restoring support for u'' literals. >> >> You claim that, but your argument doesn't actually support that claim >> (or I fail to see the argument). > > Python 2.6 code: > this = u'that' > > Python 3.3 code: > this = u('that') > > Not source compatible, not elegant. (Even though 2to3 could make this > fix, it's still kinda ugly.) No: Python 2.6 code this = u('that') Python 3.3 code this = u('that') It *is* source compatible, and 100% so. As for elegance: I find the u prefix fairly inelegant already; the function removes just a little more elegance. Regards, Martin From vinay_sajip at yahoo.co.uk Mon Feb 27 21:18:31 2012 From: vinay_sajip at yahoo.co.uk (Vinay Sajip) Date: Mon, 27 Feb 2012 20:18:31 +0000 (UTC) Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 References: <4F49434B.6050604@active-4.com> <4F4B5634.3020609@v.loewis.de> <4F4BB7F2.4070804@stoneleaf.us> <20120227174110.96AFA2500E4@webabinitio.net> <1330365662.12046.72.camel@thinko> <1330372221.12046.119.camel@thinko> Message-ID: Chris McDonough plope.com> writes: > I suspect not everyone lives and dies by OS distribution release support > policies. Many folks are both willing and capable to install a newer > Python on an older OS. But many folks aren't, and lament the slow pace of Python version adoption on e.g. Red Hat and CentOS. > It's unfortunate that Python 3 < 3.3 does not have the syntax, and > people like me who have a long-term need to "straddle" are to blame; we > didn't provide useful feedback early enough to avoid the mistake. That > said, it seems like preventing a reintroduction of u'' literal syntax > would presume that two wrongs make a right. By our own schedule > estimate of Python 3 takeup, many people won't be even thinking about > porting any Python 2 code to 3 until years from now. If the lack of u'' literal is what's holding them back, that's germane to the discussion of the PEP. If it's not, then why propose the PEP? > An argument for the reintroduction of u'' literal syntax in Python >= > 3.3 is not necessarily an argument against the utility of some automated > tool conversion support for porting a Python 2 app to a function-based > u() syntax so it can run in Python 3 < 3.2. I thought the argument was more about backtracking (or not) from Python 3's design decision to use 'xxx' for text and b'yyy' for bytes. That's the only "wrong" we're talking about for this PEP, right? > Currently we handle 3.2 compatibility in packages that "straddle" via > six-like functions. We can continue doing this as necessary. If the > stdlib tooling helps, great.
In an emit-function-based-syntax mode, the > conversion code would almost certainly need to rely on the import of an > externally downloadable module like six, for compatibility under both > Python 2 and 3 because there's no opportunity to go back in time and > make "u()" available for older releases unless it was like inlined in > every module during the conversion. > > But if somebody only wants to target 3.3+, and it means they don't have > to rely on a six-like module to provide u(), great. If you only need to straddle from 2.6 onwards, then u('') isn't an issue at all, right now, is it? If you need to straddle from 2.5 downwards, there are other issues to be addressed, like exception syntax, 'with' and so forth - so making u'' available doesn't make the port a no-brainer. And if you bite the bullet and decide to do the port anyway, converting u'' to u('') won't be a problem unless you (a) can't use a fixer to automate the conversion or (b) the function call overhead cannot be borne. I'm not sure either of those objections (can't use fixer, call overhead excessive) have been made with sufficient force (i.e., data) in the discussion so far. Regards, Vinay Sajip From tjreedy at udel.edu Mon Feb 27 21:19:36 2012 From: tjreedy at udel.edu (Terry Reedy) Date: Mon, 27 Feb 2012 15:19:36 -0500 Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 In-Reply-To: References: <4F49434B.6050604@active-4.com> <4F4B5634.3020609@v.loewis.de> <4F4BB7F2.4070804@stoneleaf.us> <20120227174110.96AFA2500E4@webabinitio.net> <1330365662.12046.72.camel@thinko> Message-ID: On 2/27/2012 1:17 PM, Guido van Rossum wrote: > On Mon, Feb 27, 2012 at 10:01 AM, Chris McDonough wrote: >> The best argument is that there already exists tons and tons of Python 2 >> code that already does: >> >> u'that' > > +1 >> I just don't understand the pushback here at all. This is such a >> nobrainer. > > I agree. Just let's start deprecating it too, so that once Python 2.x > compatibility is no longer relevant we can eventually stop supporting > it (though that may have to wait until Python 4...). We need to send > *some* sort of signal that this is a compatibility hack and that no > new code should use it. Maybe a SilentDeprecationWarning? One possibility: leave Ref Man 2.4.1. *String and Bytes literals* as is. Add ''' 2.4.1.1 Deprecated u prefix. To aid people who want to update Python 2 code to also run under Python 3, string literals may optionally be prefixed with "u" or "U". For this purpose, but only for this purpose, the grammar actually reads stringprefix ::= "r" | "R" | "u" | "U" | "ur" | "Ur" | "uR" | "UR" Since "u" and "U" will go away again some year, they should only be used for such multi-version code and not in code only intended for Python 3. See PEP 414. Version added: 3.3 ''' I think the PEP should have exaggerated statements removed, perhaps be shortened, explain how to patch code on installation for 3.1/2, and have something at the top pointing to that explanation. -- Terry Jan Reedy From martin at v.loewis.de Mon Feb 27 21:21:13 2012 From: martin at v.loewis.de (Martin v. Löwis) Date: Mon, 27 Feb 2012 21:21:13 +0100 Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 In-Reply-To: <4F4BC59D.1050208@stoneleaf.us> References: <4F49434B.6050604@active-4.com> <4F4B5634.3020609@v.loewis.de> <4F4BB7F2.4070804@stoneleaf.us> <20120227174110.96AFA2500E4@webabinitio.net> <4F4BC59D.1050208@stoneleaf.us> Message-ID: <4F4BE5B9.1030301@v.loewis.de> >> Eh? The 2.6 version would also be u('that').
That's the whole point >> of the idiom. You'll need a better counter argument than that. > > So the idea is to convert the existing 2.6 code to use parenthesis as > well? (I obviously haven't read the PEP -- my apologies.) Well, if you didn't, you wouldn't have the same sources on 2.x and 3.x. And if that was ok, you wouldn't need the u() function in 3.x at all, since plain string literals are *already* unicode strings there. Regards, Martin From rdmurray at bitdance.com Mon Feb 27 21:23:35 2012 From: rdmurray at bitdance.com (R. David Murray) Date: Mon, 27 Feb 2012 15:23:35 -0500 Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 In-Reply-To: <1330372221.12046.119.camel@thinko> References: <4F49434B.6050604@active-4.com> <4F4B5634.3020609@v.loewis.de> <4F4BB7F2.4070804@stoneleaf.us> <20120227174110.96AFA2500E4@webabinitio.net> <1330365662.12046.72.camel@thinko> <1330372221.12046.119.camel@thinko> Message-ID: <20120227202335.ACAFD25009E@webabinitio.net> On Mon, 27 Feb 2012 14:50:21 -0500, Chris McDonough wrote: > Currently we handle 3.2 compatibility in packages that "straddle" via > six-like functions. We can continue doing this as necessary. If the It seems to me that this undermines your argument in favor of u''. Why can't you just continue to do the above for 3.3 and beyond? Frankly, *I'm* not worried about the uptake pace of Python3. It feels to me like it is pretty much on schedule, if not ahead of it. But to repeat, I'm not voting -1 here, I'm playing devil's advocate. --David From tjreedy at udel.edu Mon Feb 27 21:32:29 2012 From: tjreedy at udel.edu (Terry Reedy) Date: Mon, 27 Feb 2012 15:32:29 -0500 Subject: [Python-Dev] Marking packaging-related PEPs as Finished after fixing some bugs in them In-Reply-To: <20120227125025.7f5450ab@pitrou.net> References: <4F4B5D88.2010804@netwok.org> <20120227125025.7f5450ab@pitrou.net> Message-ID: On 2/27/2012 6:50 AM, Antoine Pitrou wrote: > 'rc' makes sense to most people while 'c' is generally unheard of. 'rc' following 'a' and 'b' only makes sense to people who are used to it and know what it means. 'c' for 'candidate' makes more sense to me both a decade ago and now. 'rc' is inconsistent. Why not 'ra' for 'release alpha' or 'ar' for 'alpha release'? In other words, all releases are releases, so why not be consistent and either always or never include 'r'? (Never would be better since always is redundant.) I suspect many non-developer users find 'rc' as surprising as I did. -- Terry Jan Reedy From chrism at plope.com Mon Feb 27 21:39:29 2012 From: chrism at plope.com (Chris McDonough) Date: Mon, 27 Feb 2012 15:39:29 -0500 Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 In-Reply-To: <20120227202335.ACAFD25009E@webabinitio.net> References: <4F49434B.6050604@active-4.com> <4F4B5634.3020609@v.loewis.de> <4F4BB7F2.4070804@stoneleaf.us> <20120227174110.96AFA2500E4@webabinitio.net> <1330365662.12046.72.camel@thinko> <1330372221.12046.119.camel@thinko> <20120227202335.ACAFD25009E@webabinitio.net> Message-ID: <1330375169.12046.133.camel@thinko> On Mon, 2012-02-27 at 15:23 -0500, R. David Murray wrote: > On Mon, 27 Feb 2012 14:50:21 -0500, Chris McDonough wrote: > > Currently we handle 3.2 compatibility in packages that "straddle" via > > six-like functions. We can continue doing this as necessary. If the > > It seems to me that this undermines your argument in favor of u''. > Why can't you just continue to do the above for 3.3 and beyond? 
I really don't know how long I'll need to do future development in the subset language of Python 2 and Python 3 because I can't predict the future. It could be two years, it might be five. Who knows. But I do know that I'm going to be developing in the subset of Python that currently runs on Python 2 >= 2.6 and Python 3 >= 3.2 for at least a year. And that will suck, because that language is a much less fun language in which to develop than either Python 2 or Python 3. Frankly, it's a pretty bad language. If we make this change now, it means a year from now I'll be able to develop in a slightly less sucky subset language if I choose to drop support for 3.2. And people who don't try to support Python 3 at all til then will never have to program in the suckiest subset like I will have had to. Note that u'' literals are sort of the tip of the iceberg here; supporting them will obviously not make development under the subset an order of magnitude less sucky, just a tiny little bit less sucky. There are other extremely annoying things, like str(bytes) returning the repr of a bytestring on Python 3. That's almost as irritating as the absence of u'' literals, but we have to evaluate one thing at a time. - C From ethan at stoneleaf.us Mon Feb 27 21:29:23 2012 From: ethan at stoneleaf.us (Ethan Furman) Date: Mon, 27 Feb 2012 12:29:23 -0800 Subject: [Python-Dev] PEP 415: Implementing PEP 409 differently In-Reply-To: References: <4F4BC9CD.8030908@stoneleaf.us> Message-ID: <4F4BE7A3.7090303@stoneleaf.us> Benjamin Peterson wrote: > 2012/2/27 Ethan Furman : >> Benjamin Peterson wrote: >>> 2012/2/26 Nick Coghlan : >>>> Thanks for writing that up. I'd be amenable if the PEP was clearly >>>> updated to say that ``raise exc from cause`` would change from being >>>> syntactic sugar for ``_hidden = exc; _hidden.__cause__ = cause; raise >>>> exc`` (as it is now) to ``_hidden = exc; _hidden.__cause__ = cause; >>>> _hidden.__suppress_context__ = True; raise exc``. The patch should >>>> then be implemented accordingly (including appropriate updates to the >>>> language reference). >>> >>> I add the following lines to the PEP: >>> >>> To summarize, ``raise exc from cause`` will be equivalent to:: >>> >>> exc.__cause__ = cause >>> exc.__suppress_context__ = cause is None >>> raise exc >> >> So exc.__cause__ will be None both before and after `raise Exception from >> None`? > > Yes. And the primary advantage being that we don't lose an already set __cause__ (since most of the time __cause__ would be empty and we're just suppressing __context__)... seems like a good idea. +1 As far as Ellipsis goes -- I do think it works well in this case, but I am not opposed to changing it. I do think we do ourselves a disservice if we refuse to use it in other situations because "it's only for slices". ~Ethan~ From vinay_sajip at yahoo.co.uk Mon Feb 27 22:03:23 2012 From: vinay_sajip at yahoo.co.uk (Vinay Sajip) Date: Mon, 27 Feb 2012 21:03:23 +0000 (UTC) Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 References: <4F49434B.6050604@active-4.com> <4F4B5634.3020609@v.loewis.de> <4F4BB7F2.4070804@stoneleaf.us> <20120227174110.96AFA2500E4@webabinitio.net> <1330365662.12046.72.camel@thinko> <1330372221.12046.119.camel@thinko> <20120227202335.ACAFD25009E@webabinitio.net> <1330375169.12046.133.camel@thinko> Message-ID: Chris McDonough plope.com> writes: > I really don't know how long I'll need to do future development in the > subset language of Python 2 and Python 3 because I can't predict the > future. 
It could be two years, it might be five. Who knows. > > But I do know that I'm going to be developing in the subset of Python > that currently runs on Python 2 >= 2.6 and Python 3 >= 3.2 for at least > a year. And that will suck, because that language is a much less fun > language in which to develop than either Python 2 or Python 3. Frankly, > it's a pretty bad language. What exactly is it that makes it so bad? Since you're developing for >= 2.6, what stops you from using "from __future__ import unicode_literals" and 'xxx' for text and b'yyy' for bytes? Then you would be working in essentially Python 3.x, at least as far as string literals go. The conversion time will be very small compared to the year time-frame you're talking about. > If we make this change now, it means a year from now I'll be able to > develop in a slightly less sucky subset language if I choose to drop > support for 3.2. And people who don't try to support Python 3 at all > til then will never have to program in the suckiest subset like I will > have had to. And if we don't make the change now and you change your code to use unicode_literals, convert u'xxx' -> 'xxx' and then change the places where you really meant to use bytes, that'll be a one-off change after which you will be working on a common codebase which works on 2.6+ and 3.0+, and as far as string literals are concerned you'll be working in the hopefully non-sucky 3.x syntax. > Note that u'' literals are sort of the tip of the iceberg here; > supporting them will obviously not make development under the subset an > order of magnitude less sucky, just a tiny little bit less sucky. There > are other extremely annoying things, like str(bytes) returning the repr > of a bytestring on Python 3. That's almost as irritating as the absence > of u'' literals, but we have to evaluate one thing at a time. Yes, but making a backward step like reintroducing u'' just to make things a tiny little bit sucky doesn't seem to me to be worth it, because then >= 3.3 is different to 3.2 and earlier. Armin's suggestion of an install-time fixer is analogous to running 2to3 after every change, if you're trying to support 3.2 and 3.3+ at the same time, isn't it? You can't just edit-and-test, which to me is the main benefit of a single codebase. Regards, Vinay Sajip From barry at python.org Mon Feb 27 22:03:51 2012 From: barry at python.org (Barry Warsaw) Date: Mon, 27 Feb 2012 16:03:51 -0500 Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 In-Reply-To: <1330375169.12046.133.camel@thinko> References: <4F49434B.6050604@active-4.com> <4F4B5634.3020609@v.loewis.de> <4F4BB7F2.4070804@stoneleaf.us> <20120227174110.96AFA2500E4@webabinitio.net> <1330365662.12046.72.camel@thinko> <1330372221.12046.119.camel@thinko> <20120227202335.ACAFD25009E@webabinitio.net> <1330375169.12046.133.camel@thinko> Message-ID: <20120227160351.5ce20059@resist.wooz.org> On Feb 27, 2012, at 03:39 PM, Chris McDonough wrote: >Note that u'' literals are sort of the tip of the iceberg here; >supporting them will obviously not make development under the subset an >order of magnitude less sucky, just a tiny little bit less sucky. There >are other extremely annoying things, like str(bytes) returning the repr >of a bytestring on Python 3. That's almost as irritating as the absence >of u'' literals, but we have to evaluate one thing at a time. Yeah, that one has bitten me many times, and for me it *is* more irritating because it's harder to work around. 
-Barry From chrism at plope.com Mon Feb 27 22:04:06 2012 From: chrism at plope.com (Chris McDonough) Date: Mon, 27 Feb 2012 16:04:06 -0500 Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 In-Reply-To: References: <4F49434B.6050604@active-4.com> <4F4B5634.3020609@v.loewis.de> <4F4BB7F2.4070804@stoneleaf.us> <20120227174110.96AFA2500E4@webabinitio.net> <1330365662.12046.72.camel@thinko> <1330372221.12046.119.camel@thinko> Message-ID: <1330376646.12046.152.camel@thinko> On Mon, 2012-02-27 at 20:18 +0000, Vinay Sajip wrote: > Chris McDonough plope.com> writes: > > > I suspect not everyone lives and dies by OS distribution release support > > policies. Many folks are both willing and capable to install a newer > > Python on an older OS. > > But many folks aren't, and lament the slow pace of Python version adoption on > e.g. Red Hat and CentOS. It's great to have software that installs easily. That said, the versions of Python that my software supports is (and has to be) be my choice. As far as I can tell, there are maybe three or four people (besides me) using my software on Python 3 right now. They have it pretty rough: lackluster library support and they have to constantly mentally transliterate third-party example code to code that works under Python 3. They are troopers! None of them would so much as bat an eyelash if I told them today they had to use Python 3.3 (if it existed in a final released form anyway) to use my software. It's just a minor drop in the bucket of inconvenience they have to currently withstand. > > It's unfortunate that Python 3 < 3.3 does not have the syntax, and > > people like me who have a long-term need to "straddle" are to blame; we > > didn't provide useful feedback early enough to avoid the mistake. That > > said, it seems like preventing a reintroduction of u'' literal syntax > > would presume that two wrongs make a right. By our own schedule > > estimate of Python 3 takeup, many people won't be even thinking about > > porting any Python 2 code to 3 until years from now. > > If the lack of u'' literal is what's holding them back, that's germane to the > discussion of the PEP. If it's not, then why propose the PEP? Like I said in an earlier email, u'' literal support is by no means the only issue for people who want to straddle. But it *is* an issue, and it's incredibly low-hanging fruit with near-zero real-world impact if it is reintroduced. > > An argument for the reintroduction of u'' literal syntax in Python >= > > 3.3 is not necessarily an argument against the utility of some automated > > tool conversion support for porting a Python 2 app to a function-based > > u() syntax so it can run in Python 3 < 3.2. > > I thought the argument was more about backtracking (or not) from Python 3's > design decision to use 'xxx' for text and b'yyy' for bytes. That's the only > "wrong" we're talking about for this PEP, right? You cast it as "backtracking" to reintroduce the syntax, but things have changed from when the decision to omit it was first made. Its omission introduces pain in a world where it's expected that we don't use 2to3 to automatically translate code at installation time. > > Currently we handle 3.2 compatibility in packages that "straddle" via > > six-like functions. We can continue doing this as necessary. If the > > stdlib tooling helps, great. 
In an emit-function-based-syntax mode, the > > conversion code would almost certainly need to rely on the import of an > > externally downloadable module like six, for compatibility under both > > Python 2 and 3 because there's no opportunity to go back in time and > > make "u()" available for older releases unless it was like inlined in > > every module during the conversion. > > > > But if somebody only wants to target 3.3+, and it means they don't have > > to rely on a six-like module to provide u(), great. > > If you only need to straddle from 2.6 onwards, then u('') isn't an issue at all, > right now, is it? If you look at a piece of code as something that exists in one of the two states "ported" or "not-ported", sure. But code often needs to be changed, and people of varying buy-in levels need to understand and change such code. It's just much easier for them to assume that the same syntax works on some versions of Python 2 and Python 3 and be done with it rather than need to explain the introduction of a function that only exists to paper over a syntax omission. > If you need to straddle from 2.5 downwards, there are other issues to be > addressed, like exception syntax, 'with' and so forth - so making u'' available > doesn't make the port a no-brainer. And if you bite the bullet and decide to do > the port anyway, converting u'' to u('') won't be a problem unless you (a) can't > use a fixer to automate the conversion or (b) the function call overhead cannot > be borne. I'm not sure either of those objections (can't use fixer, call > overhead excessive) have been made with sufficient force (i.e., data) in the > discussion so far. > > Regards, > > Vinay Sajip From p.f.moore at gmail.com Mon Feb 27 22:07:03 2012 From: p.f.moore at gmail.com (Paul Moore) Date: Mon, 27 Feb 2012 21:07:03 +0000 Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 In-Reply-To: <1330375169.12046.133.camel@thinko> References: <4F49434B.6050604@active-4.com> <4F4B5634.3020609@v.loewis.de> <4F4BB7F2.4070804@stoneleaf.us> <20120227174110.96AFA2500E4@webabinitio.net> <1330365662.12046.72.camel@thinko> <1330372221.12046.119.camel@thinko> <20120227202335.ACAFD25009E@webabinitio.net> <1330375169.12046.133.camel@thinko> Message-ID: On 27 February 2012 20:39, Chris McDonough wrote: > Note that u'' literals are sort of the tip of the iceberg here; > supporting them will obviously not make development under the subset an > order of magnitude less sucky, just a tiny little bit less sucky. There > are other extremely annoying things, like str(bytes) returning the repr > of a bytestring on Python 3. That's almost as irritating as the absence > of u'' literals, but we have to evaluate one thing at a time. So. Am I misunderstanding here, or are you suggesting that this particular PEP doesn't help you much, but if it's accepted, it represents "the thin end of the wedge" for a series of subsequent PEPs suggesting fixes for a number of other "extremely annoying things"...? I'm sure that's not what you meant, but it's certainly what it sounded like to me! Paul.
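For reference, the six-style u() helper discussed above is only a few lines. The following is a hedged sketch, not six's actual implementation - a real helper must also deal with escape sequences and non-ASCII source encodings:

import sys

if sys.version_info[0] >= 3:
    def u(s):
        # Plain string literals are already text on Python 3.
        return s
else:
    def u(s):
        # Assumes an ASCII-only literal passed as a plain Python 2 str.
        return s.decode('ascii')

Code can then write u('xxx') uniformly across 2.x and 3.x, at the cost of one function call per literal - the overhead objection weighed elsewhere in this thread.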
From chrism at plope.com Mon Feb 27 22:10:25 2012 From: chrism at plope.com (Chris McDonough) Date: Mon, 27 Feb 2012 16:10:25 -0500 Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 In-Reply-To: References: <4F49434B.6050604@active-4.com> <4F4B5634.3020609@v.loewis.de> <4F4BB7F2.4070804@stoneleaf.us> <20120227174110.96AFA2500E4@webabinitio.net> <1330365662.12046.72.camel@thinko> <1330372221.12046.119.camel@thinko> <20120227202335.ACAFD25009E@webabinitio.net> <1330375169.12046.133.camel@thinko> Message-ID: <1330377025.12046.155.camel@thinko> On Mon, 2012-02-27 at 21:07 +0000, Paul Moore wrote: > On 27 February 2012 20:39, Chris McDonough wrote: > > Note that u'' literals are sort of the tip of the iceberg here; > > supporting them will obviously not make development under the subset an > > order of magnitude less sucky, just a tiny little bit less sucky. There > > are other extremely annoying things, like str(bytes) returning the repr > > of a bytestring on Python 3. That's almost as irritating as the absence > > of u'' literals, but we have to evaluate one thing at a time. > > So. Am I misunderstanding here, or are you suggesting that this > particular PEP doesn't help you much, but if it's accepted, it > represents "the thin end of the wedge" for a series of subsequent PEPs > suggesting fixes for a number of other "extremely annoying things"...? > > I'm sure that's not what you meant, but it's certainly what it sounded > like to me! I'm way too lazy. The political wrangling is just too draining (especially over something so trivial). But I will definitely support other proposals that make it easier to straddle, sure. - C From python-dev at masklinn.net Mon Feb 27 22:13:37 2012 From: python-dev at masklinn.net (Xavier Morel) Date: Mon, 27 Feb 2012 22:13:37 +0100 Subject: [Python-Dev] Add a frozendict builtin type In-Reply-To: References: Message-ID: <373DB42E-A78C-4149-912E-B2E86888B4C6@masklinn.net> On 2012-02-27, at 19:53 , Victor Stinner wrote: > Rationale > ========= > > A frozendict type is a common request from users and there are various > implementations. There are two main Python implementations: > > * "blacklist": frozendict inheriting from dict and overriding methods > to raise an exception when trying to modify the frozendict > * "whitelist": frozendict not inheriting from dict and only implement > some dict methods, or implement all dict methods but raise exceptions > when trying to modify the frozendict > > The blacklist implementation has a major issue: it is still possible > to call write methods of the dict class (e.g. dict.set(my_frozendict, > key, value)). > > The whitelist implementation has an issue: frozendict and dict are not > "compatible", dict is not a subclass of frozendict (and frozendict is > not a subclass of dict). This may be an issue at the C level (I'm not sure), but since this would be a Python 3-only collection, "user" code (in Python) should/would generally be using abstract base classes, so type-checking would not be an issue (as in Python code performing `isinstance(a, dict)` checks naturally failing on `frozendict`) Plus `frozenset` does not inherit from `set`, it's a whitelist reimplementation and I've never known anybody to care. So there's that precedent. And of course there's no inheritance relationship between lists and tuples. > * frozendict does not have the following methods: clear, __delitem__, pop, > popitem, setdefault, __setitem__ and update. As tuple/frozenset have > fewer methods than list/set.
It'd probably be simpler to define that frozendict is a Mapping (where dict is a MutableMapping). And that's clearer. > * Make dict inherit from frozendict Isn't that the other way around from the statement above? Not that I'd have an issue with it, it's much cleaner, but there's little gained by doing so since `isinstance(a, dict)` will still fail if `a` is a frozendict. > * Add a frozendict abstract base class to collections? Why? There's no `dict` ABC, and there are already a Mapping and a MutableMapping ABC which fit the bill, no? From chrism at plope.com Mon Feb 27 22:16:39 2012 From: chrism at plope.com (Chris McDonough) Date: Mon, 27 Feb 2012 16:16:39 -0500 Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 In-Reply-To: References: <4F49434B.6050604@active-4.com> <4F4B5634.3020609@v.loewis.de> <4F4BB7F2.4070804@stoneleaf.us> <20120227174110.96AFA2500E4@webabinitio.net> <1330365662.12046.72.camel@thinko> <1330372221.12046.119.camel@thinko> <20120227202335.ACAFD25009E@webabinitio.net> <1330375169.12046.133.camel@thinko> Message-ID: <1330377399.12046.158.camel@thinko> On Mon, 2012-02-27 at 21:03 +0000, Vinay Sajip wrote: > Chris McDonough plope.com> writes: > > > I really don't know how long I'll need to do future development in the > > subset language of Python 2 and Python 3 because I can't predict the > > future. It could be two years, it might be five. Who knows. > > > > But I do know that I'm going to be developing in the subset of Python > > that currently runs on Python 2 >= 2.6 and Python 3 >= 3.2 for at least > > a year. And that will suck, because that language is a much less fun > > language in which to develop than either Python 2 or Python 3. Frankly, > > it's a pretty bad language. > > What exactly is it that makes it so bad? Since you're developing for >= 2.6, > what stops you from using "from __future__ import unicode_literals" and 'xxx' > for text and b'yyy' for bytes? Then you would be working in essentially Python > 3.x, at least as far as string literals go. The conversion time will be very > small compared to the year time-frame you're talking about. > > > If we make this change now, it means a year from now I'll be able to > > develop in a slightly less sucky subset language if I choose to drop > > support for 3.2. And people who don't try to support Python 3 at all > > til then will never have to program in the suckiest subset like I will > > have had to. > > And if we don't make the change now and you change your code to use > unicode_literals, convert u'xxx' -> 'xxx' and then change the places where you > really meant to use bytes, that'll be a one-off change after which you will be > working on a common codebase which works on 2.6+ and 3.0+, and as far as string > literals are concerned you'll be working in the hopefully non-sucky 3.x syntax. > > > Note that u'' literals are sort of the tip of the iceberg here; > > supporting them will obviously not make development under the subset an > > order of magnitude less sucky, just a tiny little bit less sucky. There > > are other extremely annoying things, like str(bytes) returning the repr > > of a bytestring on Python 3. That's almost as irritating as the absence > > of u'' literals, but we have to evaluate one thing at a time. > > Yes, but making a backward step like reintroducing u'' just to make things a > tiny little bit less sucky doesn't seem to me to be worth it, because then >= 3.3 is > different to 3.2 and earlier.
Armin's suggestion of an install-time fixer is > analogous to running 2to3 after every change, if you're trying to support 3.2 > and 3.3+ at the same time, isn't it? You can't just edit-and-test, which to me > is the main benefit of a single codebase. The downsides of a unicode_literals future import are spelled out in the PEP: http://www.python.org/dev/peps/pep-0414/#rationale-and-goals - C From victor.stinner at gmail.com Mon Feb 27 22:28:22 2012 From: victor.stinner at gmail.com (Victor Stinner) Date: Mon, 27 Feb 2012 22:28:22 +0100 Subject: [Python-Dev] Add a frozendict builtin type In-Reply-To: <373DB42E-A78C-4149-912E-B2E86888B4C6@masklinn.net> References: <373DB42E-A78C-4149-912E-B2E86888B4C6@masklinn.net> Message-ID: > This may be an issue at the C level (I'm not sure), but since this would > be a Python 3-only collection, "user" code (in Python) should/would > generally be using abstract base classes, so type-checking would not > be an issue (as in Python code performing `isinstance(a, dict)` checks > naturally failing on `frozendict`) > > Plus `frozenset` does not inherit from `set`, it's a whitelist > reimplementation and I've never known anybody to care. So there's > that precedent. And of course there's no inheritance relationship > between lists and tuples. On second thought, I realized that it does not really matter. frozendict and dict can be "unrelated" (no inheritance relation). Victor From ethan at stoneleaf.us Mon Feb 27 22:09:24 2012 From: ethan at stoneleaf.us (Ethan Furman) Date: Mon, 27 Feb 2012 13:09:24 -0800 Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 In-Reply-To: <4F4BE5B9.1030301@v.loewis.de> References: <4F49434B.6050604@active-4.com> <4F4B5634.3020609@v.loewis.de> <4F4BB7F2.4070804@stoneleaf.us> <20120227174110.96AFA2500E4@webabinitio.net> <4F4BC59D.1050208@stoneleaf.us> <4F4BE5B9.1030301@v.loewis.de> Message-ID: <4F4BF104.4070102@stoneleaf.us> Martin v. Löwis wrote: >>> Eh? The 2.6 version would also be u('that'). That's the whole point >>> of the idiom. You'll need a better counter argument than that. >> So the idea is to convert the existing 2.6 code to use parentheses as >> well? (I obviously haven't read the PEP -- my apologies.) > > Well, if you didn't, you wouldn't have the same sources on 2.x and 3.x. > And if that was ok, you wouldn't need the u() function in 3.x at all, > since plain string literals are *already* unicode strings there. True -- but I would rather have u'' in 2.6 and 3.3 than u('') in 2.6 and 3.3. ~Ethan~ From armin.ronacher at active-4.com Mon Feb 27 22:35:43 2012 From: armin.ronacher at active-4.com (Armin Ronacher) Date: Mon, 27 Feb 2012 21:35:43 +0000 Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 In-Reply-To: <20120227174434.Horde.PAV6fUlCcOxPS7LyDc6X4bA@webmail.df.eu> References: <4F49434B.6050604@active-4.com> <4F4A28CD.5070903@active-4.com> <4F4B5847.1040107@v.loewis.de> <4F4BA50A.3020009@active-4.com> <20120227174434.Horde.PAV6fUlCcOxPS7LyDc6X4bA@webmail.df.eu> Message-ID: <4F4BF72F.9010704@active-4.com> Hi, On 2/27/12 4:44 PM, martin at v.loewis.de wrote: > Maybe I'm missing something, but there doesn't seem to be a benchmark > that measures the 2to3 performance, supporting the claim that it > runs "two orders of magnitude" slower (which I'd interpret as a > factor of 100). My Jinja2 and Werkzeug test suites combined take 2 seconds to run (Werkzeug actually takes 3 because it pauses for two seconds in a cache expiration test). 2to3 takes 45 seconds to run.
And those are small code bases (15K lines combined). It's not exactly two orders of magnitude so I will probably change the writing to "just" 20 times slower but it illustrates the point. Regards, Armin From solipsis at pitrou.net Mon Feb 27 22:36:11 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 27 Feb 2012 22:36:11 +0100 Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 References: <4F49434B.6050604@active-4.com> <4F4B5634.3020609@v.loewis.de> <4F4BB7F2.4070804@stoneleaf.us> <20120227174110.96AFA2500E4@webabinitio.net> <4F4BC59D.1050208@stoneleaf.us> <4F4BE5B9.1030301@v.loewis.de> <4F4BF104.4070102@stoneleaf.us> Message-ID: <20120227223611.2449b7e9@pitrou.net> On Mon, 27 Feb 2012 13:09:24 -0800 Ethan Furman wrote: > Martin v. Löwis wrote: > >>> Eh? The 2.6 version would also be u('that'). That's the whole point > >>> of the idiom. You'll need a better counter argument than that. > >> So the idea is to convert the existing 2.6 code to use parentheses as > >> well? (I obviously haven't read the PEP -- my apologies.) > > > > Well, if you didn't, you wouldn't have the same sources on 2.x and 3.x. > > And if that was ok, you wouldn't need the u() function in 3.x at all, > > since plain string literals are *already* unicode strings there. > > True -- but I would rather have u'' in 2.6 and 3.3 than u('') in 2.6 and > 3.3. You don't want to be 3.2-compatible? Antoine. From vinay_sajip at yahoo.co.uk Mon Feb 27 22:43:59 2012 From: vinay_sajip at yahoo.co.uk (Vinay Sajip) Date: Mon, 27 Feb 2012 21:43:59 +0000 (UTC) Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 References: <4F49434B.6050604@active-4.com> <4F4B5634.3020609@v.loewis.de> <4F4BB7F2.4070804@stoneleaf.us> <20120227174110.96AFA2500E4@webabinitio.net> <1330365662.12046.72.camel@thinko> <1330372221.12046.119.camel@thinko> <1330376646.12046.152.camel@thinko> Message-ID: Chris McDonough plope.com> writes: > It's great to have software that installs easily. That said, the > versions of Python that my software supports is (and has to be) my > choice. Of course. And if I understand correctly, that's 2.6, 2.7, 3.2 and later versions. I'll ignore 2.5 and earlier in this specific reply. > None of them would so much as bat an eyelash if I told them today they > had to use Python 3.3 (if it existed in a final released form anyway) to > use my software. It's just a minor drop in the bucket of inconvenience > they have to currently withstand. Their pain (lacklustre library support and transliterating examples from 2.x to 3.x) would be the same under 3.2 and 3.3 (unless for some perverse reason people only made libraries work under one of 3.2 and 3.3, but not both). Is it really that hard to transliterate 2.x examples to 3.x in the literal-string dimension? I can't believe it is, as the target audience is programmers. > > If the lack of u'' literal is what's holding them back, that's germane to the > > discussion of the PEP. If it's not, then why propose the PEP? > > Like I said in an earlier email, u'' literal support is by no means the > only issue for people who want to straddle. But it *is* an issue, and > it's incredibly low-hanging fruit with near-zero real-world impact if it > is reintroduced. But the implication of the PEP is that lack of u'' support is a major hindrance to porting, justifying the production of the PEP and this discussion.
And it's not low-hanging fruit with near-zero real-world impact if we're going to deprecate it at some point (which Guido was talking about) - you're just moving the pain to a later date, unless we don't ever deprecate. I feel, like some others, that 'xxx' is natural for text, u'xxx' is inelegant by comparison, and u('xxx') a little more inelegant still. However, allowing u'' syntax in 3.3 as per this PEP, but allowing it to be optional, allows any combination of u'xxx' and 'xxx' in code in a 3.x context, which doesn't seem to me to be an ideal situation especially if you have hit-and-run contributors who are not necessarily attuned to project conventions. > You cast it as "backtracking" to reintroduce the syntax, but things have > changed from when the decision to omit it was first made. Its omission > introduces pain in a world where it's expected that we don't use 2to3 to > automatically translate code at installation time. I'm calling it like it is. "reintroduce" in this case means undoing something already done, so it's appropriate to say "backtracking". I don't agree that things have changed. If I want to write code that works on 2.x and 3.x without the pain of running 2to3 after every change, and I'm only interested in supporting >= 2.6 (your situation, IIUC), then I use "from __future__ import unicode_literals" - that's what it was created for, wasn't it? - and use 'xxx' where I need text, b'xxx' where I need bytes, and a function to deliver native strings where they're needed. If I have a 2.x project full of u'' code which I need to bring into this approach, then I run 2to3, review what it tells me, make the changes necessary (as far as literals go, that's adding the unicode_literals import to all files, and converting u'xxx' -> 'xxx'). When I test the result, I will find numerous failures, some of which point to places where I should have used native strings (e.g. kwargs keys), which I then fix. Other areas will be where I needed to use bytes (e.g. encoding/decoding/hashing), which I will also fix. I use six or a similar approach to sort out any other issues which crop up, e.g. metaclass syntax, execfile, and so on. After a relatively modest amount of work, I have a codebase that works on 2.x and 3.x, and all I have to remember is that 'xxx' is Unicode, and if I create a new module, I need to add the future import (on the assumption that I might add literal strings later, if not now). After that, it seems to be plain sailing, and I don't have to switch mental gears re. string literals. > If you look at a piece of code as something that exists in one of the > two states "ported" or "not-ported", sure. But code often needs to be > changed, and people of varying buy-in levels need to understand and > change such code. It's just much easier for them to assume that the > same syntax works on some versions of Python 2 and Python 3 and be done > with it rather than need to explain the introduction of a function that > only exists to paper over a syntax omission. Well, according to the approach I described above, that one thing needs to be the present 3.x syntax - 'xxx' is text, b'xxx' is bytes, and f('xxx') is native string (or whatever name you want instead of f). With the unicode_literals import, that syntax works on 2.6+ and 3.2+, so ISTM it should work within the constraints you mentioned for your software.
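Spelled out, the approach described above fits in a tiny compatibility shim. A sketch, with an illustrative name (n for "native string" - not an established API):

from __future__ import unicode_literals
import sys

if sys.version_info[0] >= 3:
    def n(s):
        # The native string type on Python 3 is already text.
        return s
else:
    def n(s):
        # The native str on Python 2 is byte-oriented; assumes an
        # ASCII-only literal, which covers cases like kwargs keys.
        return s.encode('ascii')

With the unicode_literals import in effect in each module, 'xxx' is text on 2.6+ and 3.x alike, b'yyy' is bytes on both, and n('zzz') yields a native str where an API insists on one.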
Regards, Vinay Sajip From armin.ronacher at active-4.com Mon Feb 27 22:44:27 2012 From: armin.ronacher at active-4.com (Armin Ronacher) Date: Mon, 27 Feb 2012 21:44:27 +0000 Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 In-Reply-To: <20120227223611.2449b7e9@pitrou.net> References: <4F49434B.6050604@active-4.com> <4F4B5634.3020609@v.loewis.de> <4F4BB7F2.4070804@stoneleaf.us> <20120227174110.96AFA2500E4@webabinitio.net> <4F4BC59D.1050208@stoneleaf.us> <4F4BE5B9.1030301@v.loewis.de> <4F4BF104.4070102@stoneleaf.us> <20120227223611.2449b7e9@pitrou.net> Message-ID: <4F4BF93B.8010906@active-4.com> Hi, On 2/27/12 9:36 PM, Antoine Pitrou wrote: > You don't want to be 3.2-compatible? See the PEP. It shows how it would still be 3.2 compatible at installation time due to an installation hook that would be provided. Regards, Armin From tjreedy at udel.edu Mon Feb 27 22:45:30 2012 From: tjreedy at udel.edu (Terry Reedy) Date: Mon, 27 Feb 2012 16:45:30 -0500 Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 In-Reply-To: <1330365662.12046.72.camel@thinko> References: <4F49434B.6050604@active-4.com> <4F4B5634.3020609@v.loewis.de> <4F4BB7F2.4070804@stoneleaf.us> <20120227174110.96AFA2500E4@webabinitio.net> <1330365662.12046.72.camel@thinko> Message-ID: On 2/27/2012 1:01 PM, Chris McDonough wrote: > I just don't understand the pushback here at all. This is such a > nobrainer. Last December, Armin wrote in http://lucumr.pocoo.org/2011/12/7/thoughts-on-python3/ "And in my absolutely personal opinion Python 3.3/3.4 should be more like Python 2* and Python 2.8 should happen and be a bit more like Python 3." * he wrote '3' but obviously meant '2'. Today, you made it clear that you regard this PEP as one small step in reverting Python 3 toward Python 2 and that you support the above goal. *That* is what some are pushing back against. -- Terry Jan Reedy From storchaka at gmail.com Mon Feb 27 22:47:28 2012 From: storchaka at gmail.com (Serhiy Storchaka) Date: Mon, 27 Feb 2012 23:47:28 +0200 Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 In-Reply-To: References: <4F49434B.6050604@active-4.com> <4F4B5634.3020609@v.loewis.de> <4F4BB7F2.4070804@stoneleaf.us> <20120227174110.96AFA2500E4@webabinitio.net> <1330365662.12046.72.camel@thinko> Message-ID: 27.02.12 22:19, Terry Reedy wrote: > Since "u" and "U" will go away again some year, they should only be used > for such multi-version code and not in code only intended for Python 3. > See PEP 414. And not for code intended for both Python 2 and Python 3.0-3.2. From vinay_sajip at yahoo.co.uk Mon Feb 27 22:53:06 2012 From: vinay_sajip at yahoo.co.uk (Vinay Sajip) Date: Mon, 27 Feb 2012 21:53:06 +0000 (UTC) Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 References: <4F49434B.6050604@active-4.com> <4F4B5634.3020609@v.loewis.de> <4F4BB7F2.4070804@stoneleaf.us> <20120227174110.96AFA2500E4@webabinitio.net> <4F4BC59D.1050208@stoneleaf.us> <4F4BE5B9.1030301@v.loewis.de> <4F4BF104.4070102@stoneleaf.us> Message-ID: Ethan Furman stoneleaf.us> writes: > True -- but I would rather have u'' in 2.6 and 3.3 than u('') in 2.6 and > 3.3. You don't need u('') in 2.6 - why do you think you need it there? If you don't implement this PEP, you can have, *uniformly* across 2.6, 2.7 and all 3.x versions, 'xxx' for text and b'yyy' for bytes.
For 2.6 you would have to add "from __future__ import unicode_literals", and this might uncover places where you need to change things to use bytes or native strings - either because of bugs in the original code, or drawbacks in a Python version where you can't use Unicode as keys in a kwargs dictionary, or some API that wants you to use str explicitly. But at least some of those places will be things you would have to address anyway, when porting, whatever the state of Unicode literal support. Regards, Vinay Sajip From tjreedy at udel.edu Mon Feb 27 22:54:51 2012 From: tjreedy at udel.edu (Terry Reedy) Date: Mon, 27 Feb 2012 16:54:51 -0500 Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 In-Reply-To: References: <4F49434B.6050604@active-4.com> <4F4B5634.3020609@v.loewis.de> <4F4BB7F2.4070804@stoneleaf.us> <20120227174110.96AFA2500E4@webabinitio.net> <1330365662.12046.72.camel@thinko> Message-ID: On 2/27/2012 1:17 PM, Guido van Rossum wrote: >> I just don't understand the pushback here at all. This is such a >> nobrainer. > I agree. Just let's start deprecating it too, so that once Python 2.x > compatibility is no longer relevant we can eventually stop supporting > it (though that may have to wait until Python 4...). We need to send > *some* sort of signal that this is a compatibility hack and that no > new code should use it. Maybe a SilentDeprecationWarning? Before we make this change, I would like to know if this is Armin's last proposal to revert Python 3 toward Python 2 or merely the first in a series. I question this because last December Armin wrote "And in my absolutely personal opinion Python 3.3/3.4 should be more like Python 2* and Python 2.8 should happen and be a bit more like Python 3." * he wrote '3' but obviously means '2'. http://lucumr.pocoo.org/2011/12/7/thoughts-on-python3/ Chris has also made it clear that he (also?) would like more reversions. -- Terry Jan Reedy From jimjjewett at gmail.com Mon Feb 27 22:56:48 2012 From: jimjjewett at gmail.com (Jim J. Jewett) Date: Mon, 27 Feb 2012 13:56:48 -0800 (PST) Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 In-Reply-To: Message-ID: <4f4bfc20.68b8ec0a.020a.1ec2@mx.google.com> In http://mail.python.org/pipermail/python-dev/2012-February/116953.html Terry J. Reedy wrote: > I presume that most 2.6 code has problems other than u'' when > attempting to run under 3.x. Why? If you're talking about generic code that has seen minimal changes since 2.0, sure. But I think this request is specifically for projects that are thinking about python 3, but are trying to use a single source base regardless of version. Using an automatic translation step means that python (or at least python 3) would no longer be the actual source code. I've worked with enough generated "source" code in other languages that it is worth some pain to avoid even a slippery slope. By the time you drop 2.5, the "subset" language is already pretty good; if I have to write something version-specific, I prefer to treat that as a sign that I am using the wrong approach. -jJ -- If there are still threading problems with my replies, please email me with details, so that I can try to resolve them. 
-jJ From armin.ronacher at active-4.com Mon Feb 27 22:57:36 2012 From: armin.ronacher at active-4.com (Armin Ronacher) Date: Mon, 27 Feb 2012 21:57:36 +0000 Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 In-Reply-To: References: <4F49434B.6050604@active-4.com> <4F4B5634.3020609@v.loewis.de> <4F4BB7F2.4070804@stoneleaf.us> <20120227174110.96AFA2500E4@webabinitio.net> <1330365662.12046.72.camel@thinko> Message-ID: <4F4BFC50.5050108@active-4.com> Hi, On 2/27/12 9:47 PM, Serhiy Storchaka wrote: > And not for code intended for both Python 2 and Python 3.0-3.2. Even then, you can use the installation time hook to strip off the 'u' prefixes. Regards, Armin From rdmurray at bitdance.com Mon Feb 27 22:58:27 2012 From: rdmurray at bitdance.com (R. David Murray) Date: Mon, 27 Feb 2012 16:58:27 -0500 Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 In-Reply-To: <1330377399.12046.158.camel@thinko> References: <4F49434B.6050604@active-4.com> <4F4B5634.3020609@v.loewis.de> <4F4BB7F2.4070804@stoneleaf.us> <20120227174110.96AFA2500E4@webabinitio.net> <1330365662.12046.72.camel@thinko> <1330372221.12046.119.camel@thinko> <20120227202335.ACAFD25009E@webabinitio.net> <1330375169.12046.133.camel@thinko> <1330377399.12046.158.camel@thinko> Message-ID: <20120227215829.1DF3B2500E4@webabinitio.net> On Mon, 27 Feb 2012 16:16:39 -0500, Chris McDonough wrote: > On Mon, 2012-02-27 at 21:03 +0000, Vinay Sajip wrote: > > Yes, but making a backward step like reintroducing u'' just to make things a > > tiny little bit less sucky doesn't seem to me to be worth it, because then >= 3.3 is > > different to 3.2 and earlier. Armin's suggestion of an install-time fixer is > > analogous to running 2to3 after every change, if you're trying to support 3.2 > > and 3.3+ at the same time, isn't it? You can't just edit-and-test, which to me > > is the main benefit of a single codebase. > > The downsides of a unicode_literals future import are spelled out in the > PEP: > > http://www.python.org/dev/peps/pep-0414/#rationale-and-goals But the PEP doesn't address the unicode_literals plus str() approach. That is, the rationale currently makes a false claim. --David From chrism at plope.com Mon Feb 27 23:01:34 2012 From: chrism at plope.com (Chris McDonough) Date: Mon, 27 Feb 2012 17:01:34 -0500 Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 In-Reply-To: References: <4F49434B.6050604@active-4.com> <4F4B5634.3020609@v.loewis.de> <4F4BB7F2.4070804@stoneleaf.us> <20120227174110.96AFA2500E4@webabinitio.net> <1330365662.12046.72.camel@thinko> <1330372221.12046.119.camel@thinko> <1330376646.12046.152.camel@thinko> Message-ID: <1330380094.12046.172.camel@thinko> On Mon, 2012-02-27 at 21:43 +0000, Vinay Sajip wrote: > Chris McDonough plope.com> writes: > > > It's great to have software that installs easily. That said, the > > versions of Python that my software supports is (and has to be) my > > choice. > > Of course. And if I understand correctly, that's 2.6, 2.7, 3.2 and later > versions. I'll ignore 2.5 and earlier in this specific reply. > > > None of them would so much as bat an eyelash if I told them today they > > had to use Python 3.3 (if it existed in a final released form anyway) to > > use my software. It's just a minor drop in the bucket of inconvenience > > they have to currently withstand.
> Their pain (lacklustre library support and transliterating examples from 2.x to > 3.x) would be the same under 3.2 and 3.3 (unless for some perverse reason people > only made libraries work under one of 3.2 and 3.3, but not both). If I had it to do all over again and a Python 3.X with unicode literals had been available, I might not have targeted Python 3.2 at all. I don't consider that perverse, I just consider it "Python 3 water under the bridge". Python 3.0 and 3.1 were this for me; I paid almost no attention to them at all. Python 3.2 will be that thing for many other people. > > Like I said in an earlier email, u'' literal support is by no means the > > only issue for people who want to straddle. But it *is* an issue, and > > it's incredibly low-hanging fruit with near-zero real-world impact if it > > is reintroduced. > > But the implication of the PEP is that lack of u'' support is a major hindrance > to porting, justifying the production of the PEP and this discussion. And it's > not low-hanging fruit with near-zero real-world impact if we're going to > deprecate it at some point (which Guido was talking about) - you're just moving > the pain to a later date, unless we don't ever deprecate. I personally see no need to deprecate. I can't conceive of an actual downside to eternal backwards compatibility here. All the arguments for its omission presume that there's some enormous untapped market full of people yearning for its omission who would be either horrified to see u'' or who would not understand it on some fundamental level. I don't think such a market actually exists. However, there *is* a huge market for people who already understand it instinctively. > I feel, like some others, that 'xxx' is natural for text, u'xxx' is inelegant by > comparison, and u('xxx') a little more inelegant still. Yes, the aesthetics argument seems to be the remaining argument. I have no problem with the aesthetics of u'' myself. But I have no problem with the aesthetics of u('') for that matter either; if it had been used as the prevailing style to declare something being text in Python 2 and it had been omitted I'd be arguing for that instead. But it wasn't, of course. Anyway. I think I'm done doing the respond-point-for-point thing; it's becoming diminishing returns. - C From rdmurray at bitdance.com Mon Feb 27 23:02:00 2012 From: rdmurray at bitdance.com (R. David Murray) Date: Mon, 27 Feb 2012 17:02:00 -0500 Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 In-Reply-To: <1330377025.12046.155.camel@thinko> References: <4F49434B.6050604@active-4.com> <4F4B5634.3020609@v.loewis.de> <4F4BB7F2.4070804@stoneleaf.us> <20120227174110.96AFA2500E4@webabinitio.net> <1330365662.12046.72.camel@thinko> <1330372221.12046.119.camel@thinko> <20120227202335.ACAFD25009E@webabinitio.net> <1330375169.12046.133.camel@thinko> <1330377025.12046.155.camel@thinko> Message-ID: <20120227220201.956552500CF@webabinitio.net> On Mon, 27 Feb 2012 16:10:25 -0500, Chris McDonough wrote: > On Mon, 2012-02-27 at 21:07 +0000, Paul Moore wrote: > > On 27 February 2012 20:39, Chris McDonough wrote: > > > Note that u'' literals are sort of the tip of the iceberg here; > > > supporting them will obviously not make development under the subset an > > > order of magnitude less sucky, just a tiny little bit less sucky. There > > > are other extremely annoying things, like str(bytes) returning the repr > > > of a bytestring on Python 3.
That's almost as irritating as the absence > > > of u'' literals, but we have to evaluate one thing at a time. > > > > So. Am I misunderstanding here, or are you suggesting that this > > particular PEP doesn't help you much, but if it's accepted, it > > represents "the thin end of the wedge" for a series of subsequent PEPs > > suggesting fixes for a number of other "extremely annoying things"...? > > > > I'm sure that's not what you meant, but it's certainly what it sounded > > like to me! > > I'm way too lazy. The political wrangling is just too draining > (especially over something so trivial). But I will definitely support > other proposals that make it easier to straddle, sure. "tip of the iceberg", eh? Or the nose of the camel in the tent. This pushes me in the direction of a -1 vote. --David From solipsis at pitrou.net Mon Feb 27 22:58:16 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 27 Feb 2012 22:58:16 +0100 Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 References: <4F49434B.6050604@active-4.com> <4F4B5634.3020609@v.loewis.de> <4F4BB7F2.4070804@stoneleaf.us> <20120227174110.96AFA2500E4@webabinitio.net> <1330365662.12046.72.camel@thinko> Message-ID: <20120227225816.53f08fe3@pitrou.net> On Mon, 27 Feb 2012 16:54:51 -0500 Terry Reedy wrote: > On 2/27/2012 1:17 PM, Guido van Rossum wrote: > > >> I just don't understand the pushback here at all. This is such a > >> nobrainer. > > > I agree. Just let's start deprecating it too, so that once Python 2.x > > compatibility is no longer relevant we can eventually stop supporting > > it (though that may have to wait until Python 4...). We need to send > > *some* sort of signal that this is a compatibility hack and that no > > new code should use it. Maybe a SilentDeprecationWarning? > > Before we make this change, I would like to know if this is Armin's last > proposal to revert Python 3 toward Python 2 or merely the first in a > series. I question this because last December Armin wrote > > "And in my absolutely personal opinion Python 3.3/3.4 should be more > like Python 2* and Python 2.8 should happen and be a bit more like > Python 3." > * he wrote '3' but obviously means '2'. > http://lucumr.pocoo.org/2011/12/7/thoughts-on-python3/ > > Chris has also made it clear that he (also?) would like more reversions. Please. While I'm not strongly in favour of the PEP, this kind of argument is dishonest. Whatever Armin's secret wishes may be, his PEP should be judged on its own grounds. Thank you Antoine. From vinay_sajip at yahoo.co.uk Mon Feb 27 23:02:10 2012 From: vinay_sajip at yahoo.co.uk (Vinay Sajip) Date: Mon, 27 Feb 2012 22:02:10 +0000 (UTC) Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 References: <4F49434B.6050604@active-4.com> <4F4B5634.3020609@v.loewis.de> <4F4BB7F2.4070804@stoneleaf.us> <20120227174110.96AFA2500E4@webabinitio.net> <4F4BC59D.1050208@stoneleaf.us> <4F4BE5B9.1030301@v.loewis.de> <4F4BF104.4070102@stoneleaf.us> <20120227223611.2449b7e9@pitrou.net> <4F4BF93B.8010906@active-4.com> Message-ID: Armin Ronacher active-4.com> writes: > On 2/27/12 9:36 PM, Antoine Pitrou wrote: > > You don't want to be 3.2-compatible? > See the PEP. It shows how it would still be 3.2 compatible at > installation time due to an installation hook that would be provided. I thought Antoine was just responding to the fact that Ethan's comment didn't mention 3.2. Re. the installation hook, let me get this right. 
If I have to work with code that needs to run under 3.2 or earlier *and* 3.3, and say that because this PEP has been accepted, the code contains both u'xxx' and 'yyy' forms of Unicode literal, then I can't just edit-save-test, right? I have to run your hook every time I want to switch between testing with 3.3 and 3.2 (say). Isn't this exactly the same problem as with running 2to3, except that your hook might run faster? I'm not convinced you can guarantee a seamless testing experience ;-) Regards, Vinay Sajip From guido at python.org Mon Feb 27 23:06:14 2012 From: guido at python.org (Guido van Rossum) Date: Mon, 27 Feb 2012 14:06:14 -0800 Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 In-Reply-To: <1330377025.12046.155.camel@thinko> References: <4F49434B.6050604@active-4.com> <4F4B5634.3020609@v.loewis.de> <4F4BB7F2.4070804@stoneleaf.us> <20120227174110.96AFA2500E4@webabinitio.net> <1330365662.12046.72.camel@thinko> <1330372221.12046.119.camel@thinko> <20120227202335.ACAFD25009E@webabinitio.net> <1330375169.12046.133.camel@thinko> <1330377025.12046.155.camel@thinko> Message-ID: Indeed, the wrangling has gone too far already. I'm accepting the PEP. It's about as harmless as they come. Make it so. --Guido van Rossum (sent from Android phone) On Feb 27, 2012 1:12 PM, "Chris McDonough" wrote: > On Mon, 2012-02-27 at 21:07 +0000, Paul Moore wrote: > > On 27 February 2012 20:39, Chris McDonough wrote: > > > Note that u'' literals are sort of the tip of the iceberg here; > > > supporting them will obviously not make development under the subset an > > > order of magnitude less sucky, just a tiny little bit less sucky. > There > > > are other extremely annoying things, like str(bytes) returning the repr > > > of a bytestring on Python 3. That's almost as irritating as the > absence > > > of u'' literals, but we have to evaluate one thing at a time. > > > > So. Am I misunderstanding here, or are you suggesting that this > > particular PEP doesn't help you much, but if it's accepted, it > > represents "the thin end of the wedge" for a series of subsequent PEPs > > suggesting fixes for a number of other "extremely annoying things"...? > > > > I'm sure that's not what you meant, but it's certainly what it sounded > > like to me! > > I'm way too lazy. The political wrangling is just too draining > (especially over something so trivial). But I will definitely support > other proposals that make it easier to straddle, sure. > > - C From guido at python.org Mon Feb 27 23:08:46 2012 From: guido at python.org (Guido van Rossum) Date: Mon, 27 Feb 2012 14:08:46 -0800 Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 In-Reply-To: <20120227225816.53f08fe3@pitrou.net> References: <4F49434B.6050604@active-4.com> <4F4B5634.3020609@v.loewis.de> <4F4BB7F2.4070804@stoneleaf.us> <20120227174110.96AFA2500E4@webabinitio.net> <1330365662.12046.72.camel@thinko> <20120227225816.53f08fe3@pitrou.net> Message-ID: Well said Antoine.
--Guido van Rossum (sent from Android phone) On Feb 27, 2012 2:03 PM, "Antoine Pitrou" wrote: > On Mon, 27 Feb 2012 16:54:51 -0500 > Terry Reedy wrote: > > On 2/27/2012 1:17 PM, Guido van Rossum wrote: > > > > >> I just don't understand the pushback here at all. This is such a > > >> nobrainer. > > > > > I agree. Just let's start deprecating it too, so that once Python 2.x > > > compatibility is no longer relevant we can eventually stop supporting > > > it (though that may have to wait until Python 4...). We need to send > > > *some* sort of signal that this is a compatibility hack and that no > > > new code should use it. Maybe a SilentDeprecationWarning? > > > > Before we make this change, I would like to know if this is Armin's last > > proposal to revert Python 3 toward Python 2 or merely the first in a > > series. I question this because last December Armin wrote > > > > "And in my absolutely personal opinion Python 3.3/3.4 should be more > > like Python 2* and Python 2.8 should happen and be a bit more like > > Python 3." > > * he wrote '3' but obviously means '2'. > > http://lucumr.pocoo.org/2011/12/7/thoughts-on-python3/ > > > > Chris has also made it clear that he (also?) would like more reversions. > > Please. While I'm not strongly in favour of the PEP, this kind of > argument is dishonest. Whatever Armin's secret wishes may be, his PEP > should be judged on its own grounds. > > Thank you > > Antoine. From armin.ronacher at active-4.com Mon Feb 27 23:10:25 2012 From: armin.ronacher at active-4.com (Armin Ronacher) Date: Mon, 27 Feb 2012 22:10:25 +0000 Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 In-Reply-To: References: <4F49434B.6050604@active-4.com> <4F4B5634.3020609@v.loewis.de> <4F4BB7F2.4070804@stoneleaf.us> <20120227174110.96AFA2500E4@webabinitio.net> <1330365662.12046.72.camel@thinko> Message-ID: <4F4BFF51.2090109@active-4.com> Hi, On 2/27/12 9:54 PM, Terry Reedy wrote: > Before we make this change, I would like to know if this is Armin's last > proposal to revert Python 3 toward Python 2 or merely the first in a > series. I question this because last December Armin wrote You're saying that as if providing a sane upgrade path was a bad thing. That said, if I had other proposals I would have submitted them *now* since waiting for another Python version to go by would not be helpful. I only have myself to blame for providing that PEP now instead of earlier which would have been a lot more useful. Regards, Armin From armin.ronacher at active-4.com Mon Feb 27 23:11:36 2012 From: armin.ronacher at active-4.com (Armin Ronacher) Date: Mon, 27 Feb 2012 22:11:36 +0000 Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 In-Reply-To: <20120227215829.1DF3B2500E4@webabinitio.net> References: <4F49434B.6050604@active-4.com> <4F4B5634.3020609@v.loewis.de> <4F4BB7F2.4070804@stoneleaf.us> <20120227174110.96AFA2500E4@webabinitio.net> <1330365662.12046.72.camel@thinko> <1330372221.12046.119.camel@thinko> <20120227202335.ACAFD25009E@webabinitio.net> <1330375169.12046.133.camel@thinko> <1330377399.12046.158.camel@thinko> <20120227215829.1DF3B2500E4@webabinitio.net> Message-ID: <4F4BFF98.2080007@active-4.com> Hi, On 2/27/12 9:58 PM, R.
David Murray wrote: > But the PEP doesn't address the unicode_literals plus str() approach. > That is, the rationale currently makes a false claim. Which would be exactly what that u() does not do? Regards, Armin From tjreedy at udel.edu Mon Feb 27 23:18:13 2012 From: tjreedy at udel.edu (Terry Reedy) Date: Mon, 27 Feb 2012 17:18:13 -0500 Subject: [Python-Dev] PEP 414 In-Reply-To: <4F4BA4E0.80806@active-4.com> References: <4F49434B.6050604@active-4.com> <4F4A10C1.6040806@pearwood.info> <4F4A29BD.2090607@active-4.com> <4F4BA4E0.80806@active-4.com> Message-ID: On 2/27/2012 10:44 AM, Armin Ronacher wrote: > On 2/27/12 1:55 AM, Terry Reedy wrote: >> I presume such a hook would simply remove 'u' prefixes and would run >> *much* faster than 2to3. If such a hook is satisfactory for 3.2, why >> would it not be satisfactory for 3.3? > Agile development and unittests. Given that last December you wrote "And in my absolutely personal opinion Python 3.3/3.4 should be more like Python 2* and Python 2.8 should happen and be a bit more like Python 3." * you wrote '3' but obviously must have meant '2'. http://lucumr.pocoo.org/2011/12/7/thoughts-on-python3/ I would like to know if you think that this one change is enough to do agile development and testing, etc, or whether, as Chris McDonough hopes, this is just the first of a series of proposals you have planned. -- Terry Jan Reedy From tjreedy at udel.edu Mon Feb 27 23:19:06 2012 From: tjreedy at udel.edu (Terry Reedy) Date: Mon, 27 Feb 2012 17:19:06 -0500 Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 In-Reply-To: <1330377025.12046.155.camel@thinko> References: <4F49434B.6050604@active-4.com> <4F4B5634.3020609@v.loewis.de> <4F4BB7F2.4070804@stoneleaf.us> <20120227174110.96AFA2500E4@webabinitio.net> <1330365662.12046.72.camel@thinko> <1330372221.12046.119.camel@thinko> <20120227202335.ACAFD25009E@webabinitio.net> <1330375169.12046.133.camel@thinko> <1330377025.12046.155.camel@thinko> Message-ID: On 2/27/2012 4:10 PM, Chris McDonough wrote: > On Mon, 2012-02-27 at 21:07 +0000, Paul Moore wrote: >> On 27 February 2012 20:39, Chris McDonough wrote: >>> Note that u'' literals are sort of the tip of the iceberg here; >>> supporting them will obviously not make development under the subset an >>> order of magnitude less sucky, just a tiny little bit less sucky. There >>> are other extremely annoying things, like str(bytes) returning the repr >>> of a bytestring on Python 3. That's almost as irritating as the absence >>> of u'' literals, but we have to evaluate one thing at a time. >> >> So. Am I misunderstanding here, or are you suggesting that this >> particular PEP doesn't help you much, but if it's accepted, it >> represents "the thin end of the wedge" for a series of subsequent PEPs >> suggesting fixes for a number of other "extremely annoying things"...? Last December, Armin wrote "And in my absolutely personal opinion Python 3.3/3.4 should be more like Python 2* and Python 2.8 should happen and be a bit more like Python 3." * he wrote '3' but obviously means '2'. http://lucumr.pocoo.org/2011/12/7/thoughts-on-python3/ >> I'm sure that's not what you meant, but it's certainly what it sounded >> like to me! > > I'm way too lazy. The political wrangling is just too draining > (especially over something so trivial). Turning Python 3 back into Python 2, or even moving in that direction, is neither 'trivial' nor a 'no-brainer'. > But I will definitely support > other proposals that make it easier to straddle, sure. 
-- Terry Jan Reedy From barry at python.org Mon Feb 27 23:24:29 2012 From: barry at python.org (Barry Warsaw) Date: Mon, 27 Feb 2012 17:24:29 -0500 Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 In-Reply-To: References: <4F49434B.6050604@active-4.com> <4F4B5634.3020609@v.loewis.de> <4F4BB7F2.4070804@stoneleaf.us> <20120227174110.96AFA2500E4@webabinitio.net> <1330365662.12046.72.camel@thinko> <1330372221.12046.119.camel@thinko> <1330376646.12046.152.camel@thinko> Message-ID: <20120227172429.3d31e7e2@resist.wooz.org> On Feb 27, 2012, at 09:43 PM, Vinay Sajip wrote: >Well, according to the approach I described above, that one thing needs to be >the present 3.x syntax - 'xxx' is text, b'xxx' is bytes, and f('xxx') is >native string (or whatever name you want instead of f). With the >unicode_literals import, that syntax works on 2.6+ and 3.2+, so ISTM it >should work within the constraints you mentioned for your software. I agree, this works for me and it's what I do in all my code now. Strings adorned with u-prefixes just look unnatural, and there's no confusion that unadorned strings mean "unicode". And yes, I have had to use str('') occasionally to mean "native strings", but it's so rare and constant cost that I didn't even think twice about it after I discovered this trick. But it seems like this is just not an acceptable solution for proponents of the PEP. Given that the above is the most generally accepted way to spell these things in the Python versions we care about today (>= 2.6, 3.2), at the very least, the PEP needs to be rewritten to make it clear why the above is unacceptable. That's the only way IMO that the PEP can be judged on its own merits. (I'll concede for the sake of argument that 2to3 is unacceptable. I also think it's unnecessary though.) Cheers, -Barry From barry at python.org Mon Feb 27 23:29:23 2012 From: barry at python.org (Barry Warsaw) Date: Mon, 27 Feb 2012 17:29:23 -0500 Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 In-Reply-To: References: <4F49434B.6050604@active-4.com> <4F4B5634.3020609@v.loewis.de> <4F4BB7F2.4070804@stoneleaf.us> <20120227174110.96AFA2500E4@webabinitio.net> <1330365662.12046.72.camel@thinko> <1330372221.12046.119.camel@thinko> <20120227202335.ACAFD25009E@webabinitio.net> <1330375169.12046.133.camel@thinko> <1330377025.12046.155.camel@thinko> Message-ID: <20120227172923.4007c2e7@resist.wooz.org> On Feb 27, 2012, at 02:06 PM, Guido van Rossum wrote: >Indeed, the wrangling has gone too far already. I'm accepting the PEP. It's >about as harmless as they come. Make it so. I've learned that once a PEP is pronounced upon, it's usually to my personal (if not all of our mutual :) benefit to stop arguing. I still urge the PEP author to clean up the PEP and specifically address the issues brought up in this thread. That will be useful for the historical record. 
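The str('') trick mentioned above works because str names the native string type of whichever Python is running. A minimal illustration (assuming ASCII-only content):

from __future__ import unicode_literals

text = 'abc'         # text (unicode) on 2.6+ and 3.x alike
native = str('abc')  # byte-oriented str on Python 2, text str on Python 3
assert isinstance(native, str)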
-Barry From armin.ronacher at active-4.com Mon Feb 27 23:32:48 2012 From: armin.ronacher at active-4.com (Armin Ronacher) Date: Mon, 27 Feb 2012 22:32:48 +0000 Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 In-Reply-To: <20120227172923.4007c2e7@resist.wooz.org> References: <4F49434B.6050604@active-4.com> <4F4B5634.3020609@v.loewis.de> <4F4BB7F2.4070804@stoneleaf.us> <20120227174110.96AFA2500E4@webabinitio.net> <1330365662.12046.72.camel@thinko> <1330372221.12046.119.camel@thinko> <20120227202335.ACAFD25009E@webabinitio.net> <1330375169.12046.133.camel@thinko> <1330377025.12046.155.camel@thinko> <20120227172923.4007c2e7@resist.wooz.org> Message-ID: <4F4C0490.5080603@active-4.com> Hi, On 2/27/12 10:29 PM, Barry Warsaw wrote: > I still urge the PEP author to clean up the PEP and specifically address the > issues brought up in this thread. That will be useful for the historical > record. That is a given. Regards, Armin From storchaka at gmail.com Mon Feb 27 23:38:14 2012 From: storchaka at gmail.com (Serhiy Storchaka) Date: Tue, 28 Feb 2012 00:38:14 +0200 Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 In-Reply-To: <4F4BFF98.2080007@active-4.com> References: <4F49434B.6050604@active-4.com> <4F4B5634.3020609@v.loewis.de> <4F4BB7F2.4070804@stoneleaf.us> <20120227174110.96AFA2500E4@webabinitio.net> <1330365662.12046.72.camel@thinko> <1330372221.12046.119.camel@thinko> <20120227202335.ACAFD25009E@webabinitio.net> <1330375169.12046.133.camel@thinko> <1330377399.12046.158.camel@thinko> <20120227215829.1DF3B2500E4@webabinitio.net> <4F4BFF98.2080007@active-4.com> Message-ID: 28.02.12 00:11, Armin Ronacher wrote: > On 2/27/12 9:58 PM, R. David Murray wrote: >> But the PEP doesn't address the unicode_literals plus str() approach. >> That is, the rationale currently makes a false claim. > Which would be exactly what that u() does not do? No. 1. u() is trivial for Python 3 and relatively expensive (and doubtful for non-ascii literals) for Python 2, unicode_literals plus str() is trivial for Python 3 and cheap for Python 2. 2. Text strings are natural and prevalent, but "natural" strings are domain-specific and archaic. From armin.ronacher at active-4.com Mon Feb 27 23:38:56 2012 From: armin.ronacher at active-4.com (Armin Ronacher) Date: Mon, 27 Feb 2012 22:38:56 +0000 Subject: [Python-Dev] PEP 414 In-Reply-To: References: <4F49434B.6050604@active-4.com> <4F4A10C1.6040806@pearwood.info> <4F4A29BD.2090607@active-4.com> <4F4BA4E0.80806@active-4.com> Message-ID: <4F4C0600.5010903@active-4.com> Hi, On 2/27/12 10:18 PM, Terry Reedy wrote: > I would like to know if you think that this one change is enough to do > agile development and testing, etc, or whether, as Chris McDonough > hopes, this is just the first of a series of proposals you have planned. Indeed I have three other PEPs in the works. The reintroduction of "except (((ExceptionType),),)", the "<>" comparison operator and the removal of "nonlocal", the latter to make Python 2.x developers feel better about themselves. :-) Regards, Armin From jimjjewett at gmail.com Mon Feb 27 23:50:35 2012 From: jimjjewett at gmail.com (Jim J. Jewett) Date: Mon, 27 Feb 2012 14:50:35 -0800 (PST) Subject: [Python-Dev] Add a frozendict builtin type In-Reply-To: Message-ID: <4f4c08bb.e89dec0a.772f.1b19@mx.google.com> In http://mail.python.org/pipermail/python-dev/2012-February/116955.html Victor Stinner proposed: > The blacklist implementation has a major issue: it is still possible > to call write methods of the dict class (e.g. dict.set(my_frozendict,
dict.set(my_frozendict, > key, value)). It is also possible to use ctypes and violate even more invariants. For most purposes, this falls under "consenting adults". > The whitelist implementation has an issue: frozendict and dict are not > "compatible", dict is not a subclass of frozendict (and frozendict is > not a subclass of dict). And because of Liskov substitutability, they shouldn't be; they should be sibling children of a basedict that doesn't have the the mutating methods, but also doesn't *promise* not to mutate. > * frozendict values must be immutable, as dict keys Why? That may be useful, but an immutable dict whose values might mutate is also useful; by forcing that choice, it starts to feel too specialized for a builtin. > * Add an hash field to the PyDictObject structure That is another indication that it should really be a sibling class; most of the uses I have had for immutable dicts still didn't need hashing. It might be a worth adding anyhow, but only to immutable dicts -- not to every instance dict or keywords parameter. > * frozendict.__hash__ computes hash(frozenset(self.items())) and > caches the result is its private hash attribute Why? hash(frozenset(selk.keys())) would still meet the hash contract, but it would be approximately twice as fast, and I can think of only one case where it wouldn't work just as well. (That case is wanting to store a dict of alternative configuration dicts (with no defaulting of values), but ALSO wanting to use the configurations themselves (as opposed to their names) as the dict keys.) -jJ -- If there are still threading problems with my replies, please email me with details, so that I can try to resolve them. -jJ From barry at python.org Mon Feb 27 23:52:24 2012 From: barry at python.org (Barry Warsaw) Date: Mon, 27 Feb 2012 17:52:24 -0500 Subject: [Python-Dev] PEP 414 In-Reply-To: <4F4C0600.5010903@active-4.com> References: <4F49434B.6050604@active-4.com> <4F4A10C1.6040806@pearwood.info> <4F4A29BD.2090607@active-4.com> <4F4BA4E0.80806@active-4.com> <4F4C0600.5010903@active-4.com> Message-ID: <20120227175224.6ff5e73d@resist.wooz.org> On Feb 27, 2012, at 10:38 PM, Armin Ronacher wrote: >Indeed I have three other PEPs in the work. The reintroduction of >"except (((ExceptionType),),)", the "<>" comparision operator and the >removal of "nonlocal", the latter to make Python 2.x developers feel >better about themselves. :-) One of them's a winner in my book, but I'll let you guess which one. OTOH, the time machine can bring you back to the future again. -Barry From storchaka at gmail.com Tue Feb 28 00:04:11 2012 From: storchaka at gmail.com (Serhiy Storchaka) Date: Tue, 28 Feb 2012 01:04:11 +0200 Subject: [Python-Dev] PEP 414 In-Reply-To: <20120227175224.6ff5e73d@resist.wooz.org> References: <4F49434B.6050604@active-4.com> <4F4A10C1.6040806@pearwood.info> <4F4A29BD.2090607@active-4.com> <4F4BA4E0.80806@active-4.com> <4F4C0600.5010903@active-4.com> <20120227175224.6ff5e73d@resist.wooz.org> Message-ID: 28.02.12 00:52, Barry Warsaw ???????(??): > On Feb 27, 2012, at 10:38 PM, Armin Ronacher wrote: >> Indeed I have three other PEPs in the work. The reintroduction of >> "except (((ExceptionType),),)", the"<>" comparision operator and the >> removal of "nonlocal", the latter to make Python 2.x developers feel >> better about themselves. :-) > > One of them's a winner in my book, but I'll let you guess which one. OTOH, > the time machine can bring you back to the future again. 
http://www.artima.com/weblogs/viewpost.jsp?thread=173477

From tjreedy at udel.edu Tue Feb 28 00:19:25 2012 From: tjreedy at udel.edu (Terry Reedy) Date: Mon, 27 Feb 2012 18:19:25 -0500 Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 In-Reply-To: <4f4bfc20.68b8ec0a.020a.1ec2@mx.google.com> References: <4f4bfc20.68b8ec0a.020a.1ec2@mx.google.com> Message-ID: On 2/27/2012 4:56 PM, Jim J. Jewett wrote: > In http://mail.python.org/pipermail/python-dev/2012-February/116953.html > Terry J. Reedy wrote: > >> I presume that most 2.6 code has problems other than u'' when >> attempting to run under 3.x. > > Why? Since writing the above, I realized that the following is a realistic scenario. 2.6 or 2.7 code a) uses has/set/getattr, so unicode literals would require a change; b) uses non-ascii chars in unicode literals; c) uses (or could be converted to use) print as a function; and d) otherwise uses a common 2-3 subset. Such code would only need the u prefix addition to run under both Pythons. This works the other way, of course, for backporting code. So I am replacing 'most' with 'some unknown-to-me fraction' ;-). -- Terry Jan Reedy

From vinay_sajip at yahoo.co.uk Tue Feb 28 00:31:25 2012 From: vinay_sajip at yahoo.co.uk (Vinay Sajip) Date: Mon, 27 Feb 2012 23:31:25 +0000 (UTC) Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 References: <4F49434B.6050604@active-4.com> <4F4B5634.3020609@v.loewis.de> <4F4BB7F2.4070804@stoneleaf.us> <20120227174110.96AFA2500E4@webabinitio.net> <1330365662.12046.72.camel@thinko> <1330372221.12046.119.camel@thinko> <20120227202335.ACAFD25009E@webabinitio.net> <1330375169.12046.133.camel@thinko> <1330377025.12046.155.camel@thinko> <20120227172923.4007c2e7@resist.wooz.org> <4F4C0490.5080603@active-4.com> Message-ID: Armin Ronacher <armin.ronacher at active-4.com> writes: > > Hi, > > On 2/27/12 10:29 PM, Barry Warsaw wrote: > > I still urge the PEP author to clean up the PEP and specifically address the > > issues brought up in this thread. That will be useful for the historical > > record. > That is a given. Great. My particular interest is w.r.t. the installation hook for 3.2 and the workflow for testing code in 3.2 and 3.3 at the same time. Regards, Vinay Sajip

From victor.stinner at gmail.com Tue Feb 28 00:34:08 2012 From: victor.stinner at gmail.com (Victor Stinner) Date: Tue, 28 Feb 2012 00:34:08 +0100 Subject: [Python-Dev] Add a frozendict builtin type In-Reply-To: <4f4c08bb.e89dec0a.772f.1b19@mx.google.com> References: <4f4c08bb.e89dec0a.772f.1b19@mx.google.com> Message-ID: >> The blacklist implementation has a major issue: it is still possible >> to call write methods of the dict class (e.g. dict.__setitem__(my_frozendict, >> key, value)). > > It is also possible to use ctypes and violate even more invariants. > For most purposes, this falls under "consenting adults". My primary usage of frozendict would be pysandbox, a security module. Attackers are not consenting adults :-) A read-only dict would also help optimization, in the CPython peephole optimizer or the PyPy JIT. In pysandbox, I'm trying to replace __builtins__ (and maybe also type.__dict__) by a frozendict. These objects rely on the PyDict API and so expect a type "compatible" with dict. But PyDict_GetItem() and PyDict_SetItem() may use a test like isinstance(obj, (dict, frozendict)), especially if the C structure is "compatible".
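For reference, the whitelist approach under discussion can be sketched in pure Python (a toy model for illustration only, not the actual C patch; it caches the proposed hash, but it does not enforce immutable values):

import collections

class frozendict(collections.Mapping):  # collections.abc.Mapping on 3.3+
    """Toy whitelist-style frozendict: only the read-only mapping API is exposed."""

    def __init__(self, *args, **kwargs):
        self._data = dict(*args, **kwargs)
        self._hash = None

    def __getitem__(self, key):
        return self._data[key]

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)

    def __hash__(self):
        # Cache hash(frozenset(items())), as in the proposal being discussed.
        if self._hash is None:
            self._hash = hash(frozenset(self._data.items()))
        return self._hash

Subclassing the Mapping ABC supplies keys(), items(), values(), get(), __contains__() and __eq__() for free, while no mutating method ever exists on the type.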
But pysandbox should not drive the design of frozendict :-) >> The whitelist implementation has an issue: frozendict and dict are not >> "compatible", dict is not a subclass of frozendict (and frozendict is >> not a subclass of dict). > > And because of Liskov substitutability, they shouldn't be; they should > be sibling children of a basedict that doesn't have the mutating > methods, but also doesn't *promise* not to mutate. As I wrote, I realized that it doesn't matter if dict doesn't inherit from frozendict. >> * frozendict values must be immutable, as dict keys > > Why? That may be useful, but an immutable dict whose values > might mutate is also useful; by forcing that choice, it starts > to feel too specialized for a builtin. If values are mutable, the frozendict cannot be called "immutable". tuple and frozenset can only contain immutable values. All implementations of frozendict that I found expect frozendict to be hashable. >> * frozendict.__hash__ computes hash(frozenset(self.items())) and >> caches the result in its private hash attribute > > Why? hash(frozenset(self.keys())) would still meet the hash contract, > but it would be approximately twice as fast, and I can think of only > one case where it wouldn't work just as well. Yes, it would be faster, but the hash is usually the hash of the whole object content. E.g. the hash of a tuple is not the hash of just the odd-indexed items, even though such a hash function would also meet the "hash contract". All implementations of frozendict that I found use items, and not only values or only keys. Victor

From ethan at stoneleaf.us Tue Feb 28 00:15:59 2012 From: ethan at stoneleaf.us (Ethan Furman) Date: Mon, 27 Feb 2012 15:15:59 -0800 Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 In-Reply-To: <20120227223611.2449b7e9@pitrou.net> References: <4F49434B.6050604@active-4.com> <4F4B5634.3020609@v.loewis.de> <4F4BB7F2.4070804@stoneleaf.us> <20120227174110.96AFA2500E4@webabinitio.net> <4F4BC59D.1050208@stoneleaf.us> <4F4BE5B9.1030301@v.loewis.de> <4F4BF104.4070102@stoneleaf.us> <20120227223611.2449b7e9@pitrou.net> Message-ID: <4F4C0EAF.6090402@stoneleaf.us> Antoine Pitrou wrote: > On Mon, 27 Feb 2012 13:09:24 -0800 > Ethan Furman wrote: >> Martin v. Löwis wrote: >>>>> Eh? The 2.6 version would also be u('that'). That's the whole point >>>>> of the idiom. You'll need a better counter argument than that. >>>> So the idea is to convert the existing 2.6 code to use parentheses as >>>> well? (I obviously haven't read the PEP -- my apologies.) >>> Well, if you didn't, you wouldn't have the same sources on 2.x and 3.x. >>> And if that was ok, you wouldn't need the u() function in 3.x at all, >>> since plain string literals are *already* unicode strings there. >> True -- but I would rather have u'' in 2.6 and 3.3 than u('') in 2.6 and >> 3.3. > > You don't want to be 3.2-compatible? Unfortunately I do. However, at some point 3.2 will fall off the edge of the earth and then u'' will be just fine. This is probably a dumb question, but why can't we add u'' back to 3.2? It seems an incredibly minor change, and we are not in security-only fix stage, are we?
~Ethan~ From brian at python.org Tue Feb 28 00:38:16 2012 From: brian at python.org (Brian Curtin) Date: Mon, 27 Feb 2012 17:38:16 -0600 Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 In-Reply-To: <4F4C0EAF.6090402@stoneleaf.us> References: <4F49434B.6050604@active-4.com> <4F4B5634.3020609@v.loewis.de> <4F4BB7F2.4070804@stoneleaf.us> <20120227174110.96AFA2500E4@webabinitio.net> <4F4BC59D.1050208@stoneleaf.us> <4F4BE5B9.1030301@v.loewis.de> <4F4BF104.4070102@stoneleaf.us> <20120227223611.2449b7e9@pitrou.net> <4F4C0EAF.6090402@stoneleaf.us> Message-ID: On Mon, Feb 27, 2012 at 17:15, Ethan Furman wrote: > This is probably a dumb question, but why can't we add u'' back to 3.2? ?It > seems an incredibly minor change, and we are not in security-only fix stage, > are we? We don't add features to bug-fix releases. From tseaver at palladion.com Tue Feb 28 00:42:24 2012 From: tseaver at palladion.com (Tres Seaver) Date: Mon, 27 Feb 2012 18:42:24 -0500 Subject: [Python-Dev] Add a frozendict builtin type In-Reply-To: References: <4f4c08bb.e89dec0a.772f.1b19@mx.google.com> Message-ID: -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 02/27/2012 06:34 PM, Victor Stinner wrote: > tuple and frozenset can only contain immutables values. Tuples can contain mutables:: $ python Python 2.6.5 (r265:79063, Apr 16 2010, 13:09:56) [GCC 4.4.3] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> ({},) ({},) $ python3 Python 3.2 (r32:88445, Mar 10 2011, 10:08:58) [GCC 4.4.3] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> ({},) ({},) Tres. - -- =================================================================== Tres Seaver +1 540-429-0999 tseaver at palladion.com Palladion Software "Excellence by Design" http://palladion.com -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAk9MFOAACgkQ+gerLs4ltQ5mjQCgi1U7CloZUy0u0+c0mlLlIuko +IIAoLqKGcAb6ZAEY5wpkwvtgRa6S+LV =7Mh5 -----END PGP SIGNATURE----- From steve at pearwood.info Tue Feb 28 00:54:24 2012 From: steve at pearwood.info (Steven D'Aprano) Date: Tue, 28 Feb 2012 10:54:24 +1100 Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 In-Reply-To: <4F4BF72F.9010704@active-4.com> References: <4F49434B.6050604@active-4.com> <4F4A28CD.5070903@active-4.com> <4F4B5847.1040107@v.loewis.de> <4F4BA50A.3020009@active-4.com> <20120227174434.Horde.PAV6fUlCcOxPS7LyDc6X4bA@webmail.df.eu> <4F4BF72F.9010704@active-4.com> Message-ID: <4F4C17B0.4040306@pearwood.info> Armin Ronacher wrote: > Hi, > > On 2/27/12 4:44 PM, martin at v.loewis.de wrote: >> Maybe I'm missing something, but there doesn't seem to be a benchmark >> that measures the 2to3 performance, supporting the claim that it >> runs "two orders of magnitude" slower (which I'd interpret as a >> factor of 100). > My Jinja2+Werkzeug's testsuite combined takes 2 seconds to run (Werkzeug > actually takes 3 because it pauses for two seconds in a cache expiration > test). 2to3 takes 45 seconds to run. And those are small code bases > (15K lines combined). > > It's not exactly two orders of magnitude so I will probably change the > writing to "just" 20 times slower but it illustrates the point. That would be one order of magnitude. 
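(Worked out: 45 s against 2 s is a factor of about 22; one order of magnitude is a factor of 10, two orders would be a factor of 100.)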
-- Steven

From martin at v.loewis.de Tue Feb 28 01:16:01 2012 From: martin at v.loewis.de (martin at v.loewis.de) Date: Tue, 28 Feb 2012 01:16:01 +0100 Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 In-Reply-To: <4F4BFF98.2080007@active-4.com> References: <4F49434B.6050604@active-4.com> <4F4B5634.3020609@v.loewis.de> <4F4BB7F2.4070804@stoneleaf.us> <20120227174110.96AFA2500E4@webabinitio.net> <1330365662.12046.72.camel@thinko> <1330372221.12046.119.camel@thinko> <20120227202335.ACAFD25009E@webabinitio.net> <1330375169.12046.133.camel@thinko> <1330377399.12046.158.camel@thinko> <20120227215829.1DF3B2500E4@webabinitio.net> <4F4BFF98.2080007@active-4.com> Message-ID: <20120228011601.Horde.1Bj-UElCcOxPTBzBEJEUvGA@webmail.df.eu> > On 2/27/12 9:58 PM, R. David Murray wrote: >> But the PEP doesn't address the unicode_literals plus str() approach. >> That is, the rationale currently makes a false claim. > Which would be exactly what that u() does not do? Armin, I propose that you correct the *factual* deficits of the PEP (i.e. remove all claims that cannot be supported by facts, or are otherwise incorrect or misleading). Many readers here would be more open to accepting the PEP if it was factual rather than polemic. The PEP author is supposed to collect all arguments, even the ones he doesn't agree with, and refute them. In this specific issue, the PEP states "with the unicode_literals import the native string type is no longer available and has to be incorrectly labeled as bytestring". This is incorrect: even though the native string type indeed is no longer available, it is *not* consequential that it has to be labeled as a byte string. Instead, you can use the str() function. It may be that you don't like that solution for some reason. If so, please mention the approach in the PEP, along with your reason for not liking it. Regards, Martin

From steve at pearwood.info Tue Feb 28 01:30:59 2012 From: steve at pearwood.info (Steven D'Aprano) Date: Tue, 28 Feb 2012 11:30:59 +1100 Subject: [Python-Dev] Marking packaging-related PEPs as Finished after fixing some bugs in them In-Reply-To: References: <4F4B5D88.2010804@netwok.org> <20120227125025.7f5450ab@pitrou.net> Message-ID: <4F4C2043.4000507@pearwood.info> Terry Reedy wrote: > On 2/27/2012 6:50 AM, Antoine Pitrou wrote: > >> 'rc' makes sense to most people while 'c' is generally unheard of. > > 'rc' following 'a' and 'b' only makes sense to people who are used to it > and know what it means. 'c' for 'candidate' makes more sense to me both > a decade ago and now. 'rc' is inconsistent. Why not 'ra' for 'release > alpha' or 'ar' for 'alpha release'? In other words, all releases are > releases, so why not be consistent and either always or never include > 'r'? (Never would be better since always is redundant.) > > I suspect many non-developer users find 'rc' as surprising as I did. Yes, but you should only find it surprising *once*, the first time you learn about the standard release schedule:

pre-alpha
alpha
beta
release candidate
production release

http://en.wikipedia.org/wiki/Software_release_life_cycle Not all releases are equivalent. In English, we can not only verbify nouns, but we can also nounify verbs. So, yes, any software which is released is *a* release; but only the last, production-ready release is *the* release. The others are pre-release releases. Ain't English grand? If you prefer a more wordy but slightly less confusing way of saying it, they are pre-release versions which have been released.
This reply of mine on the python-list list may also be relevant: http://mail.python.org/pipermail/python-list/2012-February/1288569.html -- Steven

From ethan at stoneleaf.us Tue Feb 28 00:56:18 2012 From: ethan at stoneleaf.us (Ethan Furman) Date: Mon, 27 Feb 2012 15:56:18 -0800 Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 In-Reply-To: References: <4F49434B.6050604@active-4.com> <4F4B5634.3020609@v.loewis.de> <4F4BB7F2.4070804@stoneleaf.us> <20120227174110.96AFA2500E4@webabinitio.net> <4F4BC59D.1050208@stoneleaf.us> <4F4BE5B9.1030301@v.loewis.de> <4F4BF104.4070102@stoneleaf.us> <20120227223611.2449b7e9@pitrou.net> <4F4C0EAF.6090402@stoneleaf.us> Message-ID: <4F4C1822.4050708@stoneleaf.us> Brian Curtin wrote: > On Mon, Feb 27, 2012 at 17:15, Ethan Furman wrote: >> This is probably a dumb question, but why can't we add u'' back to 3.2? It >> seems an incredibly minor change, and we are not in security-only fix stage, >> are we? > > We don't add features to bug-fix releases. Ah. Well that's easy then! Call it a bug! ;) ~Ethan~

From ncoghlan at gmail.com Tue Feb 28 02:00:08 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 28 Feb 2012 11:00:08 +1000 Subject: [Python-Dev] Add a frozendict builtin type In-Reply-To: References: <4f4c08bb.e89dec0a.772f.1b19@mx.google.com> Message-ID: On Tue, Feb 28, 2012 at 9:34 AM, Victor Stinner wrote: >>> The blacklist implementation has a major issue: it is still possible >>> to call write methods of the dict class (e.g. dict.__setitem__(my_frozendict, >>> key, value)). >> >> It is also possible to use ctypes and violate even more invariants. >> For most purposes, this falls under "consenting adults". > > My primary usage of frozendict would be pysandbox, a security module. > Attackers are not consenting adults :-) > > Read-only dict would also help optimization, in the CPython peephole > or the PyPy JIT. I'm pretty sure the PyPy jit can already pick up and optimise cases where a dict goes "read-only" (i.e. stops being modified). I think you need to elaborate on your use cases further, and explain what *additional* changes would be needed, such as allowing frozendict instances as __dict__ attributes in order to create truly immutable objects in pure Python code. In fact, that may be a better way to pitch the entire PEP. In current Python, you *can't* create a truly immutable object without dropping down to a C extension:

>>> from decimal import Decimal
>>> x = Decimal(1)
>>> x
Decimal('1')
>>> hash(x)
1
>>> x._exp = 10
>>> x
Decimal('1E+10')
>>> hash(x)
10000000000

Contrast that with the behaviour of a float instance:

>>> 1.0.imag = 1
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: attribute 'imag' of 'float' objects is not writable

Yes, it's arguably covered by the "consenting adults" rule, but really, Decimal instances should be just as immutable as int and float instances. The only reason they aren't is that it's hard enough to set it up in Python code that the Decimal implementation settles for "near enough is good enough" and just uses __slots__ to prevent addition of new attributes, but doesn't introduce the overhead of custom __setattr__ and __delattr__ implementations to actively *prevent* modifications. We don't even need a new container type, we really just need an easy way to tell the __setattr__ and __delattr__ descriptors for "__slots__" that the instance initialisation is complete and further modifications should be disallowed.
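A rough sketch of spelling that enforcement by hand today, using an __setattr__/__delattr__ override (the class and attribute names here are made up for illustration, and this is not the actual Decimal code):

class Immutable(object):
    __slots__ = ('_exp', '_initialised')

    def __init__(self, exp):
        # Bypass our own __setattr__ while initialising.
        object.__setattr__(self, '_exp', exp)
        object.__setattr__(self, '_initialised', True)

    def __setattr__(self, name, value):
        # Once initialisation is complete, refuse all further modification.
        if getattr(self, '_initialised', False):
            raise AttributeError("attribute %r of %r objects is not writable"
                                 % (name, type(self).__name__))
        object.__setattr__(self, name, value)

    def __delattr__(self, name):
        raise AttributeError("attribute %r of %r objects cannot be deleted"
                             % (name, type(self).__name__))

It works on both 2.x and 3.x, but it replaces the fast slot descriptors with a Python-level override, which is exactly the speed trade-off described above.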
For example, if Decimal.__new__ could call "self.__lock_slots__()" at the end to set a flag on the instance object, then the slot descriptors could read that new flag and trigger an error:

>>> x._exp = 10
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: attribute '_exp' of 'Decimal' objects is not writable

To be clear, all of this is currently *possible* if you use custom descriptors (such as a property() implementation where setattr and delattr look for such a flag) or override __setattr__/__delattr__. However, for a micro-optimised type like Decimal, that's a hard choice to be asked to make (and the current implementation came down on the side of speed over enforcing correctness). Given that using __slots__ in the first place is, in and of itself, a micro-optimisation, I suspect Decimal is far from the only "immutable" type implemented in pure Python that finds itself having to make that trade-off. (An extra boolean check in C is a *good* trade-off of speed for correctness. Python level descriptor implementations or attribute access overrides, on the other hand... not so much). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From alex.gaynor at gmail.com Tue Feb 28 02:20:34 2012 From: alex.gaynor at gmail.com (Alex Gaynor) Date: Tue, 28 Feb 2012 01:20:34 +0000 (UTC) Subject: [Python-Dev] Add a frozendict builtin type References: <4f4c08bb.e89dec0a.772f.1b19@mx.google.com> Message-ID: Nick Coghlan <ncoghlan at gmail.com> writes: > I'm pretty sure the PyPy jit can already pick up and optimise cases > where a dict goes "read-only" (i.e. stops being modified). No, it doesn't. We handle cases like a type's dict, or a module's dict, by having them use a different internal implementation (while, of course, still being dicts at the Python level). We do *not* handle the case of trying to figure out whether a Python object is immutable in any way. Alex

From ncoghlan at gmail.com Tue Feb 28 02:45:48 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 28 Feb 2012 11:45:48 +1000 Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 In-Reply-To: References: <4f4bfc20.68b8ec0a.020a.1ec2@mx.google.com> Message-ID: On Tue, Feb 28, 2012 at 9:19 AM, Terry Reedy wrote: > Since writing the above, I realized that the following is a realistic > scenario. 2.6 or 2.7 code a) uses has/set/getattr, so unicode literals would > require a change; b) uses non-ascii chars in unicode literals; c) uses (or > could be converted to use) print as a function; and d) otherwise uses a > common 2-3 subset. Such would only need the u prefix addition to run under > both Pythons. This works the other way, of course, for backporting code. So > I am replacing 'most' with 'some unknown-to-me fraction' ;-). Yep, that's exactly the situation I'm in with PulpDist (a web app that primarily targets deployment on RHEL 6, which means Python 2.6). Since I preformat all my print output with either str.format or str.join (or use the logging module) and always use "except exc as var" to catch exceptions, the natural way to write Python 2 code for me is *almost* source compatible with Python 3. The only big discrepancy I'm currently aware of? Unicode literals.
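(To make the two candidate spellings concrete, a sketch; the variable names are made up:)

# Spelling A: the unicode_literals import (works on 2.6+ and all 3.x;
# the future import must be the first statement in the module):
from __future__ import unicode_literals
text_value = 'caf\xe9'        # a text (unicode) literal on both 2.x and 3.x
native_name = str('handler')  # explicitly the native str type on both

# Spelling B, which PEP 414 restores: explicit u prefixes, valid on
# 2.x and again on 3.3+ (but a SyntaxError on 3.0-3.2):
other_text = u'caf\xe9'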
Now, I could retrofit the entire code base with the unicode_literals import and str("") for native strings, but that has problems of its own: - it doesn't match the Pulp upstream, so it would make it harder for them to review my plugins and client API usage code (or integrate them into the default plugin set or client support API if they decide they like them). Given that I'm one of the guinea pigs for experimental Pulp APIs and have to dive into *their* code on occasion, it would also be a challenge for *me* to switch modes when debugging . - it doesn't match Django (at least, not in 1.3, which is the version I'm using) (another potential annoyance when debugging) - it doesn't match any of the other Django applications I use (once again, debugging may lead to me looking at this code) - it doesn't match the standard library (yep, you guessed it, I'd have to mode switch when looking at standard library code, too) - it doesn't match the intuitions of current Python 2 developers that aren't up to speed with the niceties of Python 3 porting Basically, using the unicode_literals import would significantly raise the barrier to entry for PulpDist *as a Python 2 project*, as well as forcing me to switch mental models for text processing whenever I have to look at the code in a dependency during a debugging session. Therefore, given that Python 2 will be my primary target for the immediate future (and any collaborators are likely to be RHEL 6 and hence Python 2 focused), I don't want to use that particular future import. The downside of that choice (currently) is that it kills any possibility of running any of it on Python 3, even the command line client or the web front end after Django gets ported. With explicit unicode literals being restored in Python 3.3, though, I'm a lot more optimistic about the feasibility of porting it without too much effort (as well as the prospect of other Django app dependencies gaining Python 3 support). In terms of third party upstreams, python 3 compatibility patches that affect *every single string literal in the entire project* (either directly or converting the entire project to the "unicode_literals" import) aren't likely to even get reviewed, let alone accepted. By contrast (for a project that already only supports 2.6+), cleaning up print statements and exception handling should be a much smaller patch that is easy to both review and accept. Making it as easy as possible for maintainers that don't really care about Python 3 to accept patches from people that *do* care is a very good thing. There are still other problems that are going to affect the folks playing at the wire protocol level, but the lack of unicode literals is a big one that affects the entire application stack. Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From rdmurray at bitdance.com Tue Feb 28 03:04:43 2012 From: rdmurray at bitdance.com (R. 
David Murray) Date: Mon, 27 Feb 2012 21:04:43 -0500 Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 In-Reply-To: <4F4BFF98.2080007@active-4.com> References: <4F49434B.6050604@active-4.com> <4F4B5634.3020609@v.loewis.de> <4F4BB7F2.4070804@stoneleaf.us> <20120227174110.96AFA2500E4@webabinitio.net> <1330365662.12046.72.camel@thinko> <1330372221.12046.119.camel@thinko> <20120227202335.ACAFD25009E@webabinitio.net> <1330375169.12046.133.camel@thinko> <1330377399.12046.158.camel@thinko> <20120227215829.1DF3B2500E4@webabinitio.net> <4F4BFF98.2080007@active-4.com> Message-ID: <20120228020445.6FB4C2500CF@webabinitio.net> On Mon, 27 Feb 2012 22:11:36 +0000, Armin Ronacher wrote: > On 2/27/12 9:58 PM, R. David Murray wrote: > > But the PEP doesn't address the unicode_literals plus str() approach. > > That is, the rationale currently makes a false claim. > Which would be exactly what that u() does not do? The rationale claims there's no way to spell "native string" if you use unicode_literals, which is not true. It would be different from u('') in that I would expect that there are far fewer instances where 'native string' is required than there are places where unicode strings work (and should therefore be preferred). This only matters now in order to make the PEP more accurate, but I think that is a good thing to do. --David

From vinay_sajip at yahoo.co.uk Tue Feb 28 07:56:31 2012 From: vinay_sajip at yahoo.co.uk (Vinay Sajip) Date: Tue, 28 Feb 2012 06:56:31 +0000 (UTC) Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 References: <4F49434B.6050604@active-4.com> <4F4B5634.3020609@v.loewis.de> <4F4BB7F2.4070804@stoneleaf.us> <20120227174110.96AFA2500E4@webabinitio.net> <1330365662.12046.72.camel@thinko> <1330372221.12046.119.camel@thinko> <20120227202335.ACAFD25009E@webabinitio.net> <1330375169.12046.133.camel@thinko> <1330377399.12046.158.camel@thinko> <20120227215829.1DF3B2500E4@webabinitio.net> <4F4BFF98.2080007@active-4.com> <20120228020445.6FB4C2500CF@webabinitio.net> Message-ID: R. David Murray <rdmurray at bitdance.com> writes: > The rationale claims there's no way to spell "native string" if you use > unicode_literals, which is not true. > > It would be different from u('') in that I would expect that there are > far fewer instances where 'native string' is required than there are > places where unicode strings work (and should therefore be preferred). A couple of people have said that 'native string' is spelt 'str', but I'm not sure that's the right answer. For example, 2.x's cStringIO.StringIO expects native strings, not Unicode:

>>> from cStringIO import StringIO
>>> s = StringIO(u'\xe9')
>>> s
<cStringIO.StringI object at 0x...>
>>> s.getvalue()
'\xe9\x00\x00\x00'

Of course, you can't call str() on that value to get a native string:

>>> str(u'\xe9')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in position 0: ordinal not in range(128)

So I think using str will not give the desired effect in some situations: on Django, I used a function that resolves differently depending on Python version: something like

def native(literal): return literal

on Python 3, and

def native(literal): return literal.encode('utf-8')

on Python 2. I'm not saying this is the right thing to do for all cases - just that str() may not be, either. This should be elaborated in the PEP.
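A combined, version-checked form of that idea might look like this (a sketch; native() is the hypothetical helper named above, not a stdlib function):

import sys

if sys.version_info[0] >= 3:
    def native(literal):
        # Python 3: text literals are already the native str type
        return literal
else:
    def native(literal):
        # Python 2: encode the unicode literal back to a native (byte) str
        return literal.encode('utf-8')

# native(u'caf\xe9') then yields the native string type on both versions.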
Regards, Vinay Sajip

From regebro at gmail.com Tue Feb 28 08:23:44 2012 From: regebro at gmail.com (Lennart Regebro) Date: Tue, 28 Feb 2012 08:23:44 +0100 Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 In-Reply-To: References: <4F49434B.6050604@active-4.com> <4F4B5634.3020609@v.loewis.de> <4F4BB7F2.4070804@stoneleaf.us> <20120227174110.96AFA2500E4@webabinitio.net> <1330365662.12046.72.camel@thinko> <1330372221.12046.119.camel@thinko> <20120227202335.ACAFD25009E@webabinitio.net> <1330375169.12046.133.camel@thinko> <1330377399.12046.158.camel@thinko> <20120227215829.1DF3B2500E4@webabinitio.net> <4F4BFF98.2080007@active-4.com> <20120228020445.6FB4C2500CF@webabinitio.net> Message-ID: I'm +1 on the PEP, for reasons already repeated here. We need three types of strings when supporting both Python 2 and Python 3. A binary string, a unicode string and a "native" string, i.e. one that is the old 8-bit str in Python 2 but a Unicode str in Python 3. Adding back the u'' prefix is the easiest, most obvious/intuitive/pythonic/whatever way of getting that support, that requires the least amount of code change, and the least ugly code. -- Lennart Regebro: http://regebro.wordpress.com/ Porting to Python 3: http://python3porting.com/

From vinay_sajip at yahoo.co.uk Tue Feb 28 08:51:22 2012 From: vinay_sajip at yahoo.co.uk (Vinay Sajip) Date: Tue, 28 Feb 2012 07:51:22 +0000 (UTC) Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 References: <4F49434B.6050604@active-4.com> <4F4B5634.3020609@v.loewis.de> <4F4BB7F2.4070804@stoneleaf.us> <20120227174110.96AFA2500E4@webabinitio.net> <1330365662.12046.72.camel@thinko> <1330372221.12046.119.camel@thinko> <20120227202335.ACAFD25009E@webabinitio.net> <1330375169.12046.133.camel@thinko> <1330377399.12046.158.camel@thinko> <20120227215829.1DF3B2500E4@webabinitio.net> <4F4BFF98.2080007@active-4.com> <20120228020445.6FB4C2500CF@webabinitio.net> Message-ID: Lennart Regebro <regebro at gmail.com> writes: > I'm +1 on the PEP, for reasons already repeated here. > We need three types of strings when supporting both Python 2 and > Python 3. A binary string, a unicode string and a "native" string, i.e. > one that is the old 8-bit str in python 2 but a Unicode str in Python > 3. Well it's a done deal, and as I said elsewhere on the thread, I wasn't opposing the PEP, but wanting some improvements in it. ISTM that given the PEP as it is, working across 3.2 and 3.3 on a single codebase may not always be the easiest process (IIUC you have to run a mini2to3 process, and it'll need to be cleverer than 2to3 about running over the entire codebase if it's to appear seamless), but I guess that's a smaller number of people you'd upset, and those people are committed to 3.x anyway. It's the 2.x porters we're trying to win over - I see that. It will be very nice if this leads to an increase in the rate at which libraries are ported to 3.x. > Adding back the u'' prefix is the easiest, most > obvious/intuitive/pythonic/whatever way of getting that support, that > requires the least amount of code change, and the least ugly code. "Least ugly" is subjective; I find u'xxx' less pretty than 'xxx' for text.
Regards, Vinay Sajip

From martin at v.loewis.de Tue Feb 28 09:01:23 2012 From: martin at v.loewis.de (martin at v.loewis.de) Date: Tue, 28 Feb 2012 09:01:23 +0100 Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 In-Reply-To: References: <4F49434B.6050604@active-4.com> <4F4B5634.3020609@v.loewis.de> <4F4BB7F2.4070804@stoneleaf.us> <20120227174110.96AFA2500E4@webabinitio.net> <1330365662.12046.72.camel@thinko> <1330372221.12046.119.camel@thinko> <20120227202335.ACAFD25009E@webabinitio.net> <1330375169.12046.133.camel@thinko> <1330377399.12046.158.camel@thinko> <20120227215829.1DF3B2500E4@webabinitio.net> <4F4BFF98.2080007@active-4.com> <20120228020445.6FB4C2500CF@webabinitio.net> Message-ID: <20120228090123.Horde.yp1BIqGZi1VPTInTgC4X8hA@webmail.df.eu> > A couple of people have said that 'native string' is spelt 'str', but I'm not > sure that's the right answer. For example, 2.x's cStringIO.StringIO > expects native strings, not Unicode: Your counter-example is non-ASCII characters/bytes. I doubt that this is a valid use case; in a "native" string, these shouldn't occur (i.e. native strings should always be ASCII), since the semantics of non-ASCII changes drastically between 2.x and 3.x. So whoever defines some API to take "native" strings can't have defined a valid use of non-ASCII in that interface. > I'm not saying this is the right thing to do for all cases - just > that str() may not be, either. This should be elaborated in the PEP. Indeed it should. If there is a known application of non-ASCII native strings, I surely would like to know what that is. Regards, Martin

From armin.ronacher at active-4.com Tue Feb 28 09:09:20 2012 From: armin.ronacher at active-4.com (Armin Ronacher) Date: Tue, 28 Feb 2012 08:09:20 +0000 Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 In-Reply-To: <20120228011601.Horde.1Bj-UElCcOxPTBzBEJEUvGA@webmail.df.eu> References: <4F49434B.6050604@active-4.com> <4F4B5634.3020609@v.loewis.de> <4F4BB7F2.4070804@stoneleaf.us> <20120227174110.96AFA2500E4@webabinitio.net> <1330365662.12046.72.camel@thinko> <1330372221.12046.119.camel@thinko> <20120227202335.ACAFD25009E@webabinitio.net> <1330375169.12046.133.camel@thinko> <1330377399.12046.158.camel@thinko> <20120227215829.1DF3B2500E4@webabinitio.net> <4F4BFF98.2080007@active-4.com> <20120228011601.Horde.1Bj-UElCcOxPTBzBEJEUvGA@webmail.df.eu> Message-ID: <4F4C8BB0.9020302@active-4.com> Hi, On 2/28/12 12:16 AM, martin at v.loewis.de wrote: > Armin, I propose that you correct the *factual* deficits of the PEP > (i.e. remove all claims that cannot be supported by facts, or are otherwise > incorrect or misleading). Many readers here would be more open to accepting > the PEP if it was factual rather than polemic. Please don't call this PEP polemic. > The PEP author is supposed to collect all arguments, even the ones he > doesn't agree with, and refute them. I brought up all the arguments that I knew about before I submitted this mailing list thread, and I have not updated it since. > In this specific issue, the PEP states "with the unicode_literals import the > native string type is no longer available and has to be incorrectly > labeled as bytestring" > > This is incorrect: even though the native string type indeed is no longer > available, it is *not* consequential that it has to be labeled as byte > string. Instead, you can use the str() function. Obviously it means not available by syntax. > It may be that you don't like that solution for some reason.
If so, please > mention the approach in the PEP, along with your reason for not liking it. If by str() you mean using "str('x')" as replacement for 'x' in both 2.x and 3.x with __future__ imports as a replacement for native string literals, please mention why this is better than u(), s(), n() etc. It would be equally slow than a custom wrapper function and it would not support non-ascii characters. Regards, Armin From armin.ronacher at active-4.com Tue Feb 28 09:10:25 2012 From: armin.ronacher at active-4.com (Armin Ronacher) Date: Tue, 28 Feb 2012 08:10:25 +0000 Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 In-Reply-To: <4F4C17B0.4040306@pearwood.info> References: <4F49434B.6050604@active-4.com> <4F4A28CD.5070903@active-4.com> <4F4B5847.1040107@v.loewis.de> <4F4BA50A.3020009@active-4.com> <20120227174434.Horde.PAV6fUlCcOxPS7LyDc6X4bA@webmail.df.eu> <4F4BF72F.9010704@active-4.com> <4F4C17B0.4040306@pearwood.info> Message-ID: <4F4C8BF1.6030905@active-4.com> Hi, On 2/27/12 11:54 PM, Steven D'Aprano wrote: > That would be one order of magnitude. I am aware of that :-) Regards, Armin From dirkjan at ochtman.nl Tue Feb 28 09:25:09 2012 From: dirkjan at ochtman.nl (Dirkjan Ochtman) Date: Tue, 28 Feb 2012 09:25:09 +0100 Subject: [Python-Dev] Add a frozendict builtin type In-Reply-To: References: Message-ID: On Mon, Feb 27, 2012 at 19:53, Victor Stinner wrote: > A frozendict type is a common request from users and there are various > implementations. There are two main Python implementations: Perhaps this should also detail why namedtuple is not a viable alternative. Cheers, Dirkjan From martin at v.loewis.de Tue Feb 28 09:44:07 2012 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 28 Feb 2012 09:44:07 +0100 Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 In-Reply-To: <4F4C8BB0.9020302@active-4.com> References: <4F49434B.6050604@active-4.com> <4F4B5634.3020609@v.loewis.de> <4F4BB7F2.4070804@stoneleaf.us> <20120227174110.96AFA2500E4@webabinitio.net> <1330365662.12046.72.camel@thinko> <1330372221.12046.119.camel@thinko> <20120227202335.ACAFD25009E@webabinitio.net> <1330375169.12046.133.camel@thinko> <1330377399.12046.158.camel@thinko> <20120227215829.1DF3B2500E4@webabinitio.net> <4F4BFF98.2080007@active-4.com> <20120228011601.Horde.1Bj-UElCcOxPTBzBEJEUvGA@webmail.df.eu> <4F4C8BB0.9020302@active-4.com> Message-ID: <4F4C93D7.2030302@v.loewis.de> >> The PEP author is supposed to collect all arguments, even the ones he >> doesn't agree with, and refute them. > I brought up all the arguments that were I knew about before I submitted > this mailinglist thread and I had since not updated it. This is fine, of course. I still hope you will update it now, even though it has been accepted. >> This is incorrect: even though the native string type indeed is no longer >> available, it is *not* consequential that it has to be labeled as byte >> string. Instead, you can use the str() function. > Obviously it means not available by syntax. I agree that the native string type is no longer supported by syntax in that approach. >> It may be that you don't like that solution for some reason. If so, please >> mention the approach in the PEP, along with your reason for not liking it. > If by str() you mean using "str('x')" as replacement for 'x' in both 2.x > and 3.x with __future__ imports as a replacement for native string > literals, please mention why this is better than u(), s(), n() etc. 
It > would be equally slow than a custom wrapper function and it would not > support non-ascii characters. That's not the point. The point is that the PEP ought to mention it as an alternative, instead of making the false claim that "it has to be labeled as byte string" (which I take as using a b"" prefix). Feel free to write something like "... it either has to be labelled as a byte string, or wrapped into a function call, e.g. using the str() function. This would be slow and would not support non-ascii characters" My whole point here is that I want the PEP to mention it, not this email thread. In addition, if you are using this very phrasing that I propose, I would then claim that a) it is not slow (certainly not as slow as a custom wrapper (*)), and b) it's not a problem that it is ASCII-only, since native strings are *practically* restricted to ASCII, anyway (even though not theoretically) In turn, I would ask that this counter-argument of mine is also reflected in the PEP. The whole point of the PEP process is that it settles disputes. Part of that settling is to avoid arguments which go in circles. To that effect, the PEP author ideally should *quickly* update the PEP, along with writing responses, so that anybody repeating an argument could be pointed to the PEP in order to shut up. HTH, Martin (*) This is also something that Guido requested at some point from the PEP: that it fairly analyses efficient implementations of potential wrapper functions, taking C implementations into account as well. From martin at v.loewis.de Tue Feb 28 10:02:46 2012 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 28 Feb 2012 10:02:46 +0100 Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 In-Reply-To: <4F4BF72F.9010704@active-4.com> References: <4F49434B.6050604@active-4.com> <4F4A28CD.5070903@active-4.com> <4F4B5847.1040107@v.loewis.de> <4F4BA50A.3020009@active-4.com> <20120227174434.Horde.PAV6fUlCcOxPS7LyDc6X4bA@webmail.df.eu> <4F4BF72F.9010704@active-4.com> Message-ID: <4F4C9836.9030008@v.loewis.de> Am 27.02.2012 22:35, schrieb Armin Ronacher: > Hi, > > On 2/27/12 4:44 PM, martin at v.loewis.de wrote: >> Maybe I'm missing something, but there doesn't seem to be a benchmark >> that measures the 2to3 performance, supporting the claim that it >> runs "two orders of magnitude" slower (which I'd interpret as a >> factor of 100). > My Jinja2+Werkzeug's testsuite combined takes 2 seconds to run (Werkzeug > actually takes 3 because it pauses for two seconds in a cache expiration > test). 2to3 takes 45 seconds to run. And those are small code bases > (15K lines combined). I'm not quite able to reproduce that. I don't know how to run the Jinja2 and Werkzeug test suites combined (Werkzeug's setup.py install gives SyntaxError on Python3). So taking Jinja2 alone, this is what I get: - test suite run: 0.86s (python setup.py test) - 2to3 run: 6.7s (python3 setup.py build, using default:3328e388cb28) So this is less than a factor of ten, but more importantly, much shorter than 45s. I also claim that the example is atypical, in that the test suite completes so quickly. Taking distribute 0.6.24 as a counter-example: - test suite run: 9s - 2to3 run: 7s So the test suite runs longer than the build process. Therefore, even a claim "In many cases 2to3 runs 20 times slower than the testsuite for the library or application it's testing" cannot be substantiated, as cannot the claim "This for instance is the case for the Jinja2 library". 
On the contrary, I'd expect that the build time using 2to3 is significantly shorter than the test suite run times, *in particular* for large projects. For example, for Django, 2to3 takes less than 3 minutes (IIRC), and the test suite runs an hour or so (depending on how many tests get skipped). Regards, Martin From regebro at gmail.com Tue Feb 28 10:21:53 2012 From: regebro at gmail.com (Lennart Regebro) Date: Tue, 28 Feb 2012 10:21:53 +0100 Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 In-Reply-To: References: <4F49434B.6050604@active-4.com> <4F4B5634.3020609@v.loewis.de> <4F4BB7F2.4070804@stoneleaf.us> <20120227174110.96AFA2500E4@webabinitio.net> <1330365662.12046.72.camel@thinko> <1330372221.12046.119.camel@thinko> <20120227202335.ACAFD25009E@webabinitio.net> <1330375169.12046.133.camel@thinko> <1330377399.12046.158.camel@thinko> <20120227215829.1DF3B2500E4@webabinitio.net> <4F4BFF98.2080007@active-4.com> <20120228020445.6FB4C2500CF@webabinitio.net> Message-ID: On Tue, Feb 28, 2012 at 08:51, Vinay Sajip wrote: > Lennart Regebro gmail.com> writes: > >> I'm +1 on the PEP, for reasons already repeated here. >> We need three types of strings when supporting both Python 2 and >> Python 3. A binary string, a unicode string and a "native" string, ie >> one that is the old 8-bit str in python 2 but a Unicode str in Python >> 3. > > Well it's a done deal, and as I said elsewhere on the thread, I wasn't opposing > the PEP, but wanting some improvements in it. ISTM that given the PEP as it is, > working across 3.2 and 3.3 on a single codebase may not always be the easiest > process (IIUC you have to run a mini2to3 process, and it'll need to be cleverer > than 2to3 about running over the entire codebase if it's to appear seamless), Distribute helps with this. I think we might have to add a support in distribute to easily exclude the fixer that removes u''-prefixes, I don't remember if there is an "exclude" feature. From mark at hotpy.org Tue Feb 28 10:47:59 2012 From: mark at hotpy.org (Mark Shannon) Date: Tue, 28 Feb 2012 09:47:59 +0000 Subject: [Python-Dev] Add a frozendict builtin type In-Reply-To: References: <4f4c08bb.e89dec0a.772f.1b19@mx.google.com> Message-ID: <4F4CA2CF.8020204@hotpy.org> Victor Stinner wrote: >>> The blacklist implementation has a major issue: it is still possible >>> to call write methods of the dict class (e.g. dict.set(my_frozendict, >>> key, value)). >> It is also possible to use ctypes and violate even more invariants. >> For most purposes, this falls under "consenting adults". > > My primary usage of frozendict would be pysandbox, a security module. > Attackers are not consenting adults :-) > > Read-only dict would also help optimization, in the CPython peephole > or the PyPy JIT. Not w.r.t. PyPy. It wouldn't do any harm though. One use of frozendict that you haven't mentioned so far is communication between concurrent processes/tasks. These need to be able to copy objects without changing reference semantics, which demands immutability. Cheers, Mark. From victor.stinner at haypocalc.com Tue Feb 28 11:12:02 2012 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Tue, 28 Feb 2012 11:12:02 +0100 Subject: [Python-Dev] Add a frozendict builtin type In-Reply-To: References: Message-ID: > A frozendict type is a common request from users and there are various >> implementations. There are two main Python implementations: > > Perhaps this should also detail why namedtuple is not a viable alternative. It doesn't have the same API. 
Example: frozendict[key] vs namedtuple.attr (namedtuple.key). namedtuple has no .keys() or .items() method. Victor

From mcepl at redhat.com Tue Feb 28 08:56:34 2012 From: mcepl at redhat.com (Matej Cepl) Date: Tue, 28 Feb 2012 08:56:34 +0100 Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 In-Reply-To: <20120228011601.Horde.1Bj-UElCcOxPTBzBEJEUvGA@webmail.df.eu> References: <4F49434B.6050604@active-4.com> <4F4B5634.3020609@v.loewis.de> <4F4BB7F2.4070804@stoneleaf.us> <20120227174110.96AFA2500E4@webabinitio.net> <1330365662.12046.72.camel@thinko> <1330372221.12046.119.camel@thinko> <20120227202335.ACAFD25009E@webabinitio.net> <1330375169.12046.133.camel@thinko> <1330377399.12046.158.camel@thinko> <20120227215829.1DF3B2500E4@webabinitio.net> <4F4BFF98.2080007@active-4.com> <20120228011601.Horde.1Bj-UElCcOxPTBzBEJEUvGA@webmail.df.eu> Message-ID: <4F4C88B2.40706@redhat.com> On 28.2.2012 01:16, martin at v.loewis.de wrote: > Armin, I propose that you correct the *factual* deficits of the PEP He cannot, because he would have to throw away the whole PEP ... it is all based on a nonsensical concept of "native string". There is no such animal (there are only strings and bytes, although they are incorrectly named Unicode strings and strings in Python 2), and the whole PEP is just "I don't like Python 3 and I want it to be reverted back to Python 2". It doesn't matter anymore now, but I just needed to get it off my chest. Matěj

From solipsis at pitrou.net Tue Feb 28 12:20:19 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 28 Feb 2012 12:20:19 +0100 Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 References: <4F49434B.6050604@active-4.com> <4F4A28CD.5070903@active-4.com> <4F4B5847.1040107@v.loewis.de> <4F4BA50A.3020009@active-4.com> <20120227174434.Horde.PAV6fUlCcOxPS7LyDc6X4bA@webmail.df.eu> <4F4BF72F.9010704@active-4.com> <4F4C9836.9030008@v.loewis.de> Message-ID: <20120228122019.18801ea5@pitrou.net> On Tue, 28 Feb 2012 10:02:46 +0100 "Martin v. L?wis" wrote: > > On the contrary, I'd expect that the build time using 2to3 is > significantly shorter than the test suite run times, *in particular* > for large projects. For example, for Django, 2to3 takes less than > 3 minutes (IIRC), and the test suite runs an hour or so (depending > on how many tests get skipped). In the end, that's not particularly relevant, because you don't have to run the test suite entirely; when working on small changes, you usually re-run the impacted parts of the test suite until everything goes fine; on the other hand, 2to3 *has* to run on the entire code base. So, really, it's a couple of seconds to run a single bunch of tests vs. several minutes to run 2to3 on the code base. And it's not just the test suite: every concrete experiment with the library you're porting has a serial dependency on running 2to3. Regards Antoine.
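(Returning to Victor's API point above: the difference is easy to see in the REPL. This is a sketch, with a plain dict standing in for the proposed frozendict, which is not an existing builtin:)

>>> from collections import namedtuple
>>> Point = namedtuple('Point', ['x', 'y'])
>>> p = Point(1, 2)
>>> p.x            # attribute access only
1
>>> p['x']
Traceback (most recent call last):
  ...
TypeError: tuple indices must be integers, not str
>>> d = {'x': 1, 'y': 2}   # a frozendict would keep this mapping API
>>> d['x'], sorted(d.keys()), sorted(d.items())
(1, ['x', 'y'], [('x', 1), ('y', 2)])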
From ncoghlan at gmail.com Tue Feb 28 12:42:54 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 28 Feb 2012 21:42:54 +1000 Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 In-Reply-To: <4F4C88B2.40706@redhat.com> References: <4F49434B.6050604@active-4.com> <4F4B5634.3020609@v.loewis.de> <4F4BB7F2.4070804@stoneleaf.us> <20120227174110.96AFA2500E4@webabinitio.net> <1330365662.12046.72.camel@thinko> <1330372221.12046.119.camel@thinko> <20120227202335.ACAFD25009E@webabinitio.net> <1330375169.12046.133.camel@thinko> <1330377399.12046.158.camel@thinko> <20120227215829.1DF3B2500E4@webabinitio.net> <4F4BFF98.2080007@active-4.com> <20120228011601.Horde.1Bj-UElCcOxPTBzBEJEUvGA@webmail.df.eu> <4F4C88B2.40706@redhat.com> Message-ID: On Tue, Feb 28, 2012 at 5:56 PM, Matej Cepl wrote: > He cannot, because he would have to throw away whole PEP ... it is all based > on non-sensical concept of "native string". There is no such animal (there > are only strings and bytes, although they are incorrectly named Unicode > strings and strings in Python 2), and whole PEP is just "I don't like Python > 3 and I want it to be reverted back to Python 2". > > It doesn't matter anymore now, but I just needed to put it off my chest. If you don't know what a native string is, then you need to study more to understand why Armin's PEP exists and why it is useful. I suggest starting with PEP 3333 (the WSGI update to v1.0.1 that first clearly defined the concept of a native string: http://www.python.org/dev/peps/pep-3333/#a-note-on-string-types). There are concrete, practical reasons why the lack of Unicode literals in Python 3 makes porting harder than it needs to be. Are they insurmountable? No, of course not - there are plenty of successful ports already that demonstrate that porting is quite feasible with existing tools. But the existing approaches require that, in order to be forward compatible with Python 3, a program must be made *worse* in Python 2 (i.e. harder to read and harder to write correctly for someone that hasn't learned Python 3 yet). Restoring unicode literal support in 3.3 is a pragmatic step that allows a lot of code to *just work* on Python 3. Most 2.6+ code that still doesn't work on Python 3 even after this change will be made *better* (or at least not made substantially worse) by the additional changes necessary for forward compatibility. Unicode literals are somewhat unique in their impact on porting efforts, as they show up *everywhere* in Unicode correct code in Python 2. The diffs that will be needed to correctly tag bytestrings in such code under Python 2 are tiny compared to those that would be needed to strip the u"" prefixes. Regards, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From victor.stinner at haypocalc.com Tue Feb 28 12:45:54 2012 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Tue, 28 Feb 2012 12:45:54 +0100 Subject: [Python-Dev] Add a frozendict builtin type In-Reply-To: References: <4f4c08bb.e89dec0a.772f.1b19@mx.google.com> Message-ID: > I think you need to elaborate on your use cases further, ... A frozendict can be used as a member of a set or as a key in a dictionary. For example, frozendict is indirectly needed when you want to use an object as a key of a dict, whereas one attribute of this object is a dict. Using a frozendict instead of a dict for this attribute answers this problem. frozendict helps also in threading and multiprocessing. -- > ...
... and explain > what *additional* changes would be needed, such as allowing frozendict > instances as __dict__ attributes in order to create truly immutable > objects in pure Python code. > In current Python, you *can't* create a truly immutable object without dropping > down to a C extension: Using frozendict for the type dictionary might be a use case, but please don't focus on this example. There is currently a discussion on python-ideas about this specific use case. I first proposed to use frozendict in type.__new__, but then I proposed something completely different: add a flag to a type to deny any modification of the type. The flag may be set using "__final__ = True" in the class body for example. Victor

From solipsis at pitrou.net Tue Feb 28 12:52:26 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 28 Feb 2012 12:52:26 +0100 Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 References: <4F49434B.6050604@active-4.com> <4F4B5634.3020609@v.loewis.de> <4F4BB7F2.4070804@stoneleaf.us> <20120227174110.96AFA2500E4@webabinitio.net> <1330365662.12046.72.camel@thinko> <1330372221.12046.119.camel@thinko> <20120227202335.ACAFD25009E@webabinitio.net> <1330375169.12046.133.camel@thinko> <1330377399.12046.158.camel@thinko> <20120227215829.1DF3B2500E4@webabinitio.net> <4F4BFF98.2080007@active-4.com> <20120228011601.Horde.1Bj-UElCcOxPTBzBEJEUvGA@webmail.df.eu> <4F4C88B2.40706@redhat.com> Message-ID: <20120228125226.1aa8004a@pitrou.net> On Tue, 28 Feb 2012 21:42:54 +1000 Nick Coghlan wrote: > But the existing approaches require that, in order to be > forward compatible with Python 3, a program must be made *worse* in > Python 2 (i.e. harder to read and harder to write correctly for > someone that hasn't learned Python 3 yet). Wrong. The separate branches approach allows you to have a clean Python 3 codebase without crippling the Python 2 codebase. Of course that approach was downplayed from the start in favour of using 2to3 on a single codebase, and now we discover that this approach is cumbersome. Note that 2to3 is actually helpful when you choose the dual branches approach, and it isn't a serial dependency in that case. (see https://bitbucket.org/pitrou/t3k/) Regards Antoine.

From solipsis at pitrou.net Tue Feb 28 12:53:27 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 28 Feb 2012 12:53:27 +0100 Subject: [Python-Dev] Add a frozendict builtin type References: <4f4c08bb.e89dec0a.772f.1b19@mx.google.com> Message-ID: <20120228125327.168c09d1@pitrou.net> On Tue, 28 Feb 2012 12:45:54 +0100 Victor Stinner wrote: > > I think you need to elaborate on your use cases further, ... > > A frozendict can be used as a member of a set or as a key in a dictionary. > > For example, frozendict is indirectly needed when you want to use an > object as a key of a dict, whereas one attribute of this object is a > dict. It isn't. You just have to define __hash__ correctly. > frozendict helps also in threading and multiprocessing. How so? Regards Antoine.

From mark at hotpy.org Tue Feb 28 13:07:32 2012 From: mark at hotpy.org (Mark Shannon) Date: Tue, 28 Feb 2012 12:07:32 +0000 Subject: [Python-Dev] Add a frozendict builtin type In-Reply-To: <20120228125327.168c09d1@pitrou.net> References: <4f4c08bb.e89dec0a.772f.1b19@mx.google.com> <20120228125327.168c09d1@pitrou.net> Message-ID: <4F4CC384.9080905@hotpy.org> Antoine Pitrou wrote: > On Tue, 28 Feb 2012 12:45:54 +0100 > Victor Stinner wrote: >>> I think you need to elaborate on your use cases further, ...
>> A frozendict can be used as a member of a set or as a key in a dictionary. >> >> For example, frozendict is indirectly needed when you want to use an >> object as a key of a dict, whereas one attribute of this object is a >> dict. > > It isn't. You just have to define __hash__ correctly. > >> frozendict helps also in threading and multiprocessing. > > How so? Inter process/task communication requires copying. Inter/intra thread communication uses reference semantics. To ensure these are the same, the objects used in communication must be immutable. Cheers, Mark.

From vinay_sajip at yahoo.co.uk Tue Feb 28 13:10:43 2012 From: vinay_sajip at yahoo.co.uk (Vinay Sajip) Date: Tue, 28 Feb 2012 12:10:43 +0000 (UTC) Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 References: <4F49434B.6050604@active-4.com> <4F4B5634.3020609@v.loewis.de> <4F4BB7F2.4070804@stoneleaf.us> <20120227174110.96AFA2500E4@webabinitio.net> <1330365662.12046.72.camel@thinko> <1330372221.12046.119.camel@thinko> <20120227202335.ACAFD25009E@webabinitio.net> <1330375169.12046.133.camel@thinko> <1330377399.12046.158.camel@thinko> <20120227215829.1DF3B2500E4@webabinitio.net> <4F4BFF98.2080007@active-4.com> <20120228020445.6FB4C2500CF@webabinitio.net> Message-ID: Lennart Regebro <regebro at gmail.com> writes: > Distribute helps with this. I think we might have to add a support in > distribute to easily exclude the fixer that removes u''-prefixes, I > don't remember if there is an "exclude" feature. We might be at cross purposes here. I don't see how Distribute helps, because the use case I'm talking about is not about distributing or installing stuff, but iteratively changing and testing code which needs to work on 2.6+, 3.2 and 3.3+. If the 2.x code depends on having u'xxx' literals, then 3.2 testing will potentially involve running a fixer on all files in the project every time a change is made, writing to a separate directory, or else a fixer which is integrated into the editing environment so it knows what changed. This is painful, and what motivated PEP 414 in the first place - which seems ironic. The PEP 414 approach seems to assume that if things work on 3.3, they will work on 3.2/3.1/3.0 without any changes other than replacing u'xxx' with 'xxx'. In other words, you aren't supposed to want to e.g. test 3.2 and 3.3 iteratively, using a workflow which intersperses edits with running tests using 3.2 and running tests with 3.3. In any case, a single code base seems not to be possible across 2.6+/3.0/3.1/3.2/3.3+ using the PEP 414 approach, though of course one will be possible for just 2.6+/3.3+. Early adopters of 3.x seem to be penalised by this approach: I for one will try to use the unicode_literals approach wherever I can. Regards, Vinay Sajip

From p.f.moore at gmail.com Tue Feb 28 13:13:52 2012 From: p.f.moore at gmail.com (Paul Moore) Date: Tue, 28 Feb 2012 12:13:52 +0000 Subject: [Python-Dev] Add a frozendict builtin type In-Reply-To: <4F4CC384.9080905@hotpy.org> References: <4f4c08bb.e89dec0a.772f.1b19@mx.google.com> <20120228125327.168c09d1@pitrou.net> <4F4CC384.9080905@hotpy.org> Message-ID: On 28 February 2012 12:07, Mark Shannon wrote: >>> frozendict helps also in threading and multiprocessing. >> >> How so? > > Inter process/task communication requires copying. Inter/intra thread > communication uses reference semantics. To ensure these are the same, > the objects used in communication must be immutable. Does that imply that in a frozendict, the *values* as well as the *keys* must be immutable?
Isn't that a pretty strong limitation (and hence, does it not make frozendicts a lot less useful than they might otherwise be)? From victor.stinner at haypocalc.com Tue Feb 28 13:14:15 2012 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Tue, 28 Feb 2012 13:14:15 +0100 Subject: [Python-Dev] Add a frozendict builtin type In-Reply-To: References: Message-ID: Updated patch and more justifications. New patch: - dict doesn't inherit from frozendict anymore - frozendict is a subclass of collections.abc.Mapping - more tests > * frozendict.__hash__ computes hash(frozenset(self.items())) and > caches the result in its private hash attribute hash(frozenset(self.items())) is preferred over hash(sorted(self.items())) because keys and values may be unorderable. frozenset() is faster than sorted(): O(n) vs O(n*log(n)). The frozendict hash doesn't depend on the item creation order:

>>> a = frozendict.fromkeys('ai')
>>> a
frozendict({'a': None, 'i': None})
>>> b = frozendict.fromkeys('ia')
>>> b
frozendict({'i': None, 'a': None})
>>> hash(a) == hash(b)
True
>>> a == b
True
>>> tuple(a.items()) == tuple(b.items())
False

frozendict supports unorderable keys and values:

>>> hash(frozendict({b'abc': 1, 'abc': 2}))
935669091
>>> hash(frozendict({1: b'abc', 2: 'abc'}))
1319859033

> * Add a frozendict abstract base class to collections? I realized that Mapping already exists and so the following patch is enough: +Mapping.register(frozendict) > See also the PEP 351. I read the PEP and the email explaining why it was rejected. Just to be clear: PEP 351 tries to freeze an object, i.e. to convert a mutable or immutable object to an immutable object. Whereas my frozendict proposition doesn't convert anything: it just raises a TypeError if you use a mutable key or value. For example, frozendict({'list': ['a', 'b', 'c']}) doesn't create frozendict({'list': ('a', 'b', 'c')}) but raises a TypeError. Victor -------------- next part -------------- A non-text attachment was scrubbed... Name: frozendict-2.patch Type: text/x-patch Size: 29099 bytes Desc: not available URL: From ncoghlan at gmail.com Tue Feb 28 13:14:54 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 28 Feb 2012 22:14:54 +1000 Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 In-Reply-To: <20120228125226.1aa8004a@pitrou.net> References: <4F49434B.6050604@active-4.com> <4F4B5634.3020609@v.loewis.de> <4F4BB7F2.4070804@stoneleaf.us> <20120227174110.96AFA2500E4@webabinitio.net> <1330365662.12046.72.camel@thinko> <1330372221.12046.119.camel@thinko> <20120227202335.ACAFD25009E@webabinitio.net> <1330375169.12046.133.camel@thinko> <1330377399.12046.158.camel@thinko> <20120227215829.1DF3B2500E4@webabinitio.net> <4F4BFF98.2080007@active-4.com> <20120228011601.Horde.1Bj-UElCcOxPTBzBEJEUvGA@webmail.df.eu> <4F4C88B2.40706@redhat.com> <20120228125226.1aa8004a@pitrou.net> Message-ID: On Tue, Feb 28, 2012 at 9:52 PM, Antoine Pitrou wrote: > On Tue, 28 Feb 2012 21:42:54 +1000 > Nick Coghlan wrote: >> But the existing approaches require that, in order to be >> forward compatible with Python 3, a program must be made *worse* in >> Python 2 (i.e. harder to read and harder to write correctly for >> someone that hasn't learned Python 3 yet). > > Wrong. The separate branches approach allows you to have a clean > Python 3 codebase without crippling the Python 2 codebase. > Of course that approach was downplayed from the start in favour of > using 2to3 on a single codebase, and now we discover that this approach > is cumbersome.
If you're using separate branches, then your Python 2 code isn't being made forward compatible with Python 3. Yes, it avoids making your Python 2 code uglier, but it means maintaining two branches in parallel until you drop Python 2 support. You've once again raised the barrier to entry: either people contribute two patches, or they accept that their patch may languish until someone else writes the patch for the other version. Again, as with 2to3, that approach obviously *works* (we've done it ourselves for years with the standard library), but it's hardly a low friction approach to porting. That's all PEP 414 is about - lowering the friction of porting to Python 3. Is it *necessary*? No, there are already enough successful ports to prove that, if sufficiently motivated, porting to Python 3 is feasible with the current toolset. However, that's the wrong question. The right question is "Does PEP 414 make porting substantially *easier*, by significantly reducing the volume of code that needs to change in order to attain Python 3 compatibility?". And the answer to *that* question is "Absolutely." Porting the web frameworks themselves to Python 3 is only the first step in migrating those ecosystems to Python 3, and because the web APIs exposed by those frameworks are so heavily Unicode based this is an issue that will hit pretty much every Python web app and library on the planet. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From solipsis at pitrou.net Tue Feb 28 13:11:08 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 28 Feb 2012 13:11:08 +0100 Subject: [Python-Dev] Add a frozendict builtin type In-Reply-To: <4F4CC384.9080905@hotpy.org> References: <4f4c08bb.e89dec0a.772f.1b19@mx.google.com> <20120228125327.168c09d1@pitrou.net> <4F4CC384.9080905@hotpy.org> Message-ID: <20120228131108.3538e5b0@pitrou.net> On Tue, 28 Feb 2012 12:07:32 +0000 Mark Shannon wrote: > Antoine Pitrou wrote: > > On Tue, 28 Feb 2012 12:45:54 +0100 > > Victor Stinner wrote: > >>> I think you need to elaborate on your use cases further, ... > >> A frozendict can be used as a member of a set or as a key in a dictionary. > >> > >> For example, frozendict is indirectly needed when you want to use an > >> object as a key of a dict, whereas one attribute of this object is a > >> dict. > > > > It isn't. You just have to define __hash__ correctly. > > > >> frozendict helps also in threading and multiprocessing. > > > > How so? > > Inter process/task communication requires copying. Inter/intra thread > communication uses reference semantics. To ensure these are the same, > the objects used in communication must be immutable. You just need them to be practically constant. No need for an immutable type in the first place. Regards Antoine. From victor.stinner at haypocalc.com Tue Feb 28 13:17:47 2012 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Tue, 28 Feb 2012 13:17:47 +0100 Subject: [Python-Dev] Add a frozendict builtin type In-Reply-To: <20120228125327.168c09d1@pitrou.net> References: <4f4c08bb.e89dec0a.772f.1b19@mx.google.com> <20120228125327.168c09d1@pitrou.net> Message-ID: >> A frozendict can be used as a member of a set or as a key in a dictionary. >> >> For example, frozendict is indirectly needed when you want to use an >> object as a key of a dict, whereas one attribute of this object is a >> dict. > > It isn't. You just have to define __hash__ correctly. Defining __hash__ on a mutable object can be surprising.
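Presumably something along these lines - a rough, purely illustrative sketch (the class and attribute names are made up), hashing a frozen snapshot of the dict attribute:

    class Task:
        # hypothetical example: an object used as a dict key even though
        # one of its attributes is a plain, mutable dict
        def __init__(self, name, options):
            self.name = name
            self.options = options

        def __eq__(self, other):
            return (isinstance(other, Task)
                    and self.name == other.name
                    and self.options == other.options)

        def __hash__(self):
            # hash a frozen snapshot of the items; this is only correct as
            # long as self.options is not mutated while the object is in
            # use as a key
            return hash((self.name, frozenset(self.options.items())))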
Or do you mean that you somehow deny modification of the dict attribute, and convert the dict to an immutable object before hashing it? >> frozendict helps also in threading and multiprocessing. > > How so? For example, you don't need a lock to read the frozendict content, because you cannot modify the content. Victor From ncoghlan at gmail.com Tue Feb 28 13:21:11 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 28 Feb 2012 22:21:11 +1000 Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 In-Reply-To: References: <4F49434B.6050604@active-4.com> <4F4B5634.3020609@v.loewis.de> <4F4BB7F2.4070804@stoneleaf.us> <20120227174110.96AFA2500E4@webabinitio.net> <1330365662.12046.72.camel@thinko> <1330372221.12046.119.camel@thinko> <20120227202335.ACAFD25009E@webabinitio.net> <1330375169.12046.133.camel@thinko> <1330377399.12046.158.camel@thinko> <20120227215829.1DF3B2500E4@webabinitio.net> <4F4BFF98.2080007@active-4.com> <20120228020445.6FB4C2500CF@webabinitio.net> Message-ID: On Tue, Feb 28, 2012 at 10:10 PM, Vinay Sajip wrote: > If the 2.x code depends on having u'xxx' literals, then 3.2 testing will > potentially involve running a fixer on all files in the project every time a > change is made, writing to a separate directory, or else a fixer which is > integrated into the editing environment so it knows what changed. This is > painful, and what motivated PEP 414 in the first place - which seems ironic. No, the real idea behind PEP 414 is that most ports that rely on it simply won't support 3.2 - they will only target 3.3+. The u"" fixer will just be one more tool in the arsenal of those that *do* want to support 3.2 (either because they want to target Ubuntu's LTS 3.2 stack, or for their own reasons). All of the other alternatives (such as separate branches or the unicode_literals future import) will also remain available to them. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From solipsis at pitrou.net Tue Feb 28 13:19:35 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 28 Feb 2012 13:19:35 +0100 Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 In-Reply-To: References: <4F49434B.6050604@active-4.com> <4F4B5634.3020609@v.loewis.de> <4F4BB7F2.4070804@stoneleaf.us> <20120227174110.96AFA2500E4@webabinitio.net> <1330365662.12046.72.camel@thinko> <1330372221.12046.119.camel@thinko> <20120227202335.ACAFD25009E@webabinitio.net> <1330375169.12046.133.camel@thinko> <1330377399.12046.158.camel@thinko> <20120227215829.1DF3B2500E4@webabinitio.net> <4F4BFF98.2080007@active-4.com> <20120228011601.Horde.1Bj-UElCcOxPTBzBEJEUvGA@webmail.df.eu> <4F4C88B2.40706@redhat.com> <20120228125226.1aa8004a@pitrou.net> Message-ID: <1330431576.3400.2.camel@localhost.localdomain> On Tuesday 28 February 2012 at 22:14 +1000, Nick Coghlan wrote: > If you're using separate branches, then your Python 2 code isn't being > made forward compatible with Python 3. Yes, it avoids making your > Python 2 code uglier, but it means maintaining two branches in > parallel until you drop Python 2 support. IMO, maintaining two branches shouldn't be much more work than maintaining hacks so that a single codebase works with two different programming languages. > You've once again raised the > barrier to entry: either people contribute two patches, or they accept > that their patch may languish until someone else writes the patch for > the other version. Again that's wrong.
If you cleverly use 2to3 to port between branches, patches only have to be written against the 2.x version. Regards Antoine. From solipsis at pitrou.net Tue Feb 28 13:22:16 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 28 Feb 2012 13:22:16 +0100 Subject: [Python-Dev] Add a frozendict builtin type References: Message-ID: <20120228132216.2828f31a@pitrou.net> On Tue, 28 Feb 2012 13:14:15 +0100 Victor Stinner wrote: > > > See also the PEP 351. > > I read the PEP and the email explaining why it was rejected. I think you should write a separate PEP and explain the use cases clearly. cheers Antoine. From mark at hotpy.org Tue Feb 28 13:32:10 2012 From: mark at hotpy.org (Mark Shannon) Date: Tue, 28 Feb 2012 12:32:10 +0000 Subject: [Python-Dev] Add a frozendict builtin type In-Reply-To: References: <4f4c08bb.e89dec0a.772f.1b19@mx.google.com> <20120228125327.168c09d1@pitrou.net> Message-ID: <4F4CC94A.9060809@hotpy.org> Hi, I don't know if an implementation of the frozendict actually exists, but if anyone is planning on writing one then can I suggest that they take a look at my new dict implementation: http://bugs.python.org/issue13903 https://bitbucket.org/markshannon/cpython_new_dict/ Making dicts immutable (at the C level) is quite easy with my new implementation. Cheers, Mark. From mal at egenix.com Tue Feb 28 13:44:20 2012 From: mal at egenix.com (M.-A. Lemburg) Date: Tue, 28 Feb 2012 13:44:20 +0100 Subject: [Python-Dev] Add a frozendict builtin type In-Reply-To: References: Message-ID: <4F4CCC24.5090105@egenix.com> Victor Stinner wrote: >> See also the PEP 351. > > I read the PEP and the email explaining why it was rejected. > > Just to be clear: PEP 351 tries to freeze an object, i.e. to > convert a mutable or immutable object to an immutable object. Whereas > my frozendict proposition doesn't convert anything: it just raises a > TypeError if you use a mutable key or value. > > For example, frozendict({'list': ['a', 'b', 'c']}) doesn't create > frozendict({'list': ('a', 'b', 'c')}) but raises a TypeError. I fail to see the use case you're trying to address with this kind of frozendict(). The purpose of frozenset() is to be able to use a set as dictionary key (and to some extent allow for optimizations and safe iteration). Your implementation can be used as dictionary key as well, but why would you want to do that in the first place? If you're thinking about disallowing changes to the dictionary structure, e.g. in order to safely iterate over its keys or items, "freezing" the keys is enough. Requiring the value objects not to change is too much of a restriction to make the type useful in practice, IMHO. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Feb 28 2012) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ 2012-02-13: Released eGenix pyOpenSSL 0.13 http://egenix.com/go26 2012-02-09: Released mxODBC.Zope.DA 2.0.2 http://egenix.com/go25 2012-02-06: Released eGenix mx Base 3.2.3 http://egenix.com/go24 ::: Try our new mxODBC.Connect Python Database Interface for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math.
Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From g.rodola at gmail.com Tue Feb 28 13:48:10 2012 From: g.rodola at gmail.com (Giampaolo Rodolà) Date: Tue, 28 Feb 2012 13:48:10 +0100 Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 In-Reply-To: <1330431576.3400.2.camel@localhost.localdomain> References: <4F49434B.6050604@active-4.com> <4F4B5634.3020609@v.loewis.de> <4F4BB7F2.4070804@stoneleaf.us> <20120227174110.96AFA2500E4@webabinitio.net> <1330365662.12046.72.camel@thinko> <1330372221.12046.119.camel@thinko> <20120227202335.ACAFD25009E@webabinitio.net> <1330375169.12046.133.camel@thinko> <1330377399.12046.158.camel@thinko> <20120227215829.1DF3B2500E4@webabinitio.net> <4F4BFF98.2080007@active-4.com> <20120228011601.Horde.1Bj-UElCcOxPTBzBEJEUvGA@webmail.df.eu> <4F4C88B2.40706@redhat.com> <20120228125226.1aa8004a@pitrou.net> <1330431576.3400.2.camel@localhost.localdomain> Message-ID: On 28 February 2012 13:19, Antoine Pitrou wrote: > > On Tuesday 28 February 2012 at 22:14 +1000, Nick Coghlan wrote: >> If you're using separate branches, then your Python 2 code isn't being >> made forward compatible with Python 3. Yes, it avoids making your >> Python 2 code uglier, but it means maintaining two branches in >> parallel until you drop Python 2 support. > > IMO, maintaining two branches shouldn't be much more work than > maintaining hacks so that a single codebase works with two different > programming languages. Would that mean distributing 2 separate tarballs? How would tools such as easy_install and pip work in respect of that? Is there a naming convention they can rely on? --- Giampaolo http://code.google.com/p/pyftpdlib/ http://code.google.com/p/psutil/ http://code.google.com/p/pysendfile/ From ncoghlan at gmail.com Tue Feb 28 13:49:06 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 28 Feb 2012 22:49:06 +1000 Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 In-Reply-To: <1330431576.3400.2.camel@localhost.localdomain> References: <4F49434B.6050604@active-4.com> <4F4B5634.3020609@v.loewis.de> <4F4BB7F2.4070804@stoneleaf.us> <20120227174110.96AFA2500E4@webabinitio.net> <1330365662.12046.72.camel@thinko> <1330372221.12046.119.camel@thinko> <20120227202335.ACAFD25009E@webabinitio.net> <1330375169.12046.133.camel@thinko> <1330377399.12046.158.camel@thinko> <20120227215829.1DF3B2500E4@webabinitio.net> <4F4BFF98.2080007@active-4.com> <20120228011601.Horde.1Bj-UElCcOxPTBzBEJEUvGA@webmail.df.eu> <4F4C88B2.40706@redhat.com> <20120228125226.1aa8004a@pitrou.net> <1330431576.3400.2.camel@localhost.localdomain> Message-ID: On Tue, Feb 28, 2012 at 10:19 PM, Antoine Pitrou wrote: > > On Tuesday 28 February 2012 at 22:14 +1000, Nick Coghlan wrote: >> If you're using separate branches, then your Python 2 code isn't being >> made forward compatible with Python 3. Yes, it avoids making your >> Python 2 code uglier, but it means maintaining two branches in >> parallel until you drop Python 2 support. > > IMO, maintaining two branches shouldn't be much more work than > maintaining hacks so that a single codebase works with two different > programming languages. Aside from the unicode literal problem, I find that the Python 2.6+/3.2+ subset is still a fairly nice language for an application-level web program.
Most of the rest of the bytes/text ugliness is hidden away below the framework layer where folks like Chris, Armin and Jacob have to deal with it, but it doesn't affect me as a framework user. >> You've once again raised the >> barrier to entry: either people contribute two patches, or they accept >> that their patch may languish until someone else writes the patch for >> the other version. > > Again that's wrong. If you cleverly use 2to3 to port between branches, > patches only have to be written against the 2.x version. Apparently *you* know how to do that, but I don't. If I, as a CPython core developer, don't know how to do that, how is it reasonable to expect J. Random Hacker to become a Python 3 porting expert? PEP 414 is all about lowering the barrier to entry for successful Python 3 ports. OK, fine, some very clever people have invested a lot of time in finding ways to deal with the status quo that make it less painful. That doesn't mean it isn't painful - it just means the early adopters have steeled themselves against the pain and learned to suck it up and cope. Now that we've discovered some of the key sources of pain, we can live with a few pragmatic concessions in the purity of Python 3's language definition to ease the transition for the vast number of Python 3 ports which have yet to begin. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From vinay_sajip at yahoo.co.uk Tue Feb 28 14:02:18 2012 From: vinay_sajip at yahoo.co.uk (Vinay Sajip) Date: Tue, 28 Feb 2012 13:02:18 +0000 (UTC) Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 References: <4F49434B.6050604@active-4.com> <4F4B5634.3020609@v.loewis.de> <4F4BB7F2.4070804@stoneleaf.us> <20120227174110.96AFA2500E4@webabinitio.net> <1330365662.12046.72.camel@thinko> <1330372221.12046.119.camel@thinko> <20120227202335.ACAFD25009E@webabinitio.net> <1330375169.12046.133.camel@thinko> <1330377399.12046.158.camel@thinko> <20120227215829.1DF3B2500E4@webabinitio.net> <4F4BFF98.2080007@active-4.com> <20120228020445.6FB4C2500CF@webabinitio.net> Message-ID: Nick Coghlan gmail.com> writes: > > On Tue, Feb 28, 2012 at 10:10 PM, Vinay Sajip yahoo.co.uk> wrote: > > If the 2.x code depends on having u'xxx' literals, then 3.2 testing will > > potentially involve running a fixer on all files in the project every time a > > change is made, writing to a separate directory, or else a fixer which is > > integrated into the editing environment so it knows what changed. This is > > painful, and what motivated PEP 414 in the first place - which seems ironic. > > No, the real idea behind PEP 414 is that most ports that rely on it > simply won't support 3.2 - they will only target 3.3+. Well, yes, in that the PEP will only be implemented in 3+, but the motivation was to make a single codebase easier to achieve. It does that if you take the narrow view of 2.6+/3.3+, but not if you factor 3.2 into the mix. Maybe 3.2 adoption is too low for us to worry about here, but I for one certainly wish it hadn't been relegated to being a 2nd-class citizen. > The u"" fixer will just be one more tool in the arsenal of those that > *do* want to support 3.2 (either because they want to target Ubuntu's > LTS 3.2 stack, or for their own reasons). All of the other > alternatives (such as separate branches or the unicode_literals future > import) will also remain available to them. Right, I get that - as I said, unicode_literals is my preferred path of the options available.
It's a shame to see this sort of Balkanisation, though. For example, if Django retains u'xxx' literals (even though I've ported it using unicode_literals, they may choose a different path officially), users wanting to work with it using 2.6/2.7/3.2/3.3 (as I do now) are SOL as far as a single codebase is concerned. Of course, when you're working on your own project, you can call the shots. But problems can arise if you have to work with an external project, as many of us frequently do. Regards, Vinay Sajip From vinay_sajip at yahoo.co.uk Tue Feb 28 14:30:48 2012 From: vinay_sajip at yahoo.co.uk (Vinay Sajip) Date: Tue, 28 Feb 2012 13:30:48 +0000 (UTC) Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 References: <4F49434B.6050604@active-4.com> <4F4B5634.3020609@v.loewis.de> <4F4BB7F2.4070804@stoneleaf.us> <20120227174110.96AFA2500E4@webabinitio.net> <1330365662.12046.72.camel@thinko> <1330372221.12046.119.camel@thinko> <20120227202335.ACAFD25009E@webabinitio.net> <1330375169.12046.133.camel@thinko> <1330377399.12046.158.camel@thinko> <20120227215829.1DF3B2500E4@webabinitio.net> <4F4BFF98.2080007@active-4.com> <20120228020445.6FB4C2500CF@webabinitio.net> <20120228090123.Horde.yp1BIqGZi1VPTInTgC4X8hA@webmail.df.eu> Message-ID: v.loewis.de> writes: > > > A couple of people have said that 'native string' is spelt 'str', but I'm not > > sure that's the right answer. For example, 2.x's cStringIO.StringIO > > expects native strings, not Unicode: > > Your counter-example is non-ASCII characters/bytes. I doubt that this > is a valid > use case; in a "native" string, these shouldn't occur (i.e. native > strings should > always be ASCII), since the semantics of non-ASCII change drastically between > 2.x and 3.x. So whoever defines some API to take "native" strings > can't have defined > a valid use of non-ASCII in that interface. It might not be a valid usage, but the 2.x ecosystem has numerous occurrences of invalid usages, which tend to crop up when porting because of 3.x's increased strictness. In the example I gave, cStringIO.StringIO should be able to cope with text strings, but doesn't. Of course there are StringIO.StringIO and io.StringIO in 2.6, but when porting a project, you can't be sure which of these you might run into. > Indeed it should. If there is a known application of non-ASCII native strings, > I surely would like to know what that is. I can't think of a specific instance off-hand, but I seem to recall having problems with some of the cookie APIs insisting on native strings (rather than text, which is validated against ASCII where appropriate). Regards, Vinay Sajip From storchaka at gmail.com Tue Feb 28 14:32:23 2012 From: storchaka at gmail.com (Serhiy Storchaka) Date: Tue, 28 Feb 2012 15:32:23 +0200 Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 In-Reply-To: References: <4F49434B.6050604@active-4.com> <4F4B5634.3020609@v.loewis.de> <4F4BB7F2.4070804@stoneleaf.us> <20120227174110.96AFA2500E4@webabinitio.net> <1330365662.12046.72.camel@thinko> <1330372221.12046.119.camel@thinko> <20120227202335.ACAFD25009E@webabinitio.net> <1330375169.12046.133.camel@thinko> <1330377399.12046.158.camel@thinko> <20120227215829.1DF3B2500E4@webabinitio.net> <4F4BFF98.2080007@active-4.com> <20120228011601.Horde.1Bj-UElCcOxPTBzBEJEUvGA@webmail.df.eu> <4F4C88B2.40706@redhat.com> <20120228125226.1aa8004a@pitrou.net> Message-ID: On 28.02.12 14:14, Nick Coghlan wrote: > However, that's the wrong question.
> The right question is "Does PEP 414 make porting substantially > *easier*, by significantly reducing the volume of code that needs to > change in order to attain Python 3 compatibility?". Another pertinent question: "What are the disadvantages if PEP 414 is adopted?" From rdmurray at bitdance.com Tue Feb 28 14:41:13 2012 From: rdmurray at bitdance.com (R. David Murray) Date: Tue, 28 Feb 2012 08:41:13 -0500 Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 In-Reply-To: References: <4F49434B.6050604@active-4.com> <4F4B5634.3020609@v.loewis.de> <4F4BB7F2.4070804@stoneleaf.us> <20120227174110.96AFA2500E4@webabinitio.net> <1330365662.12046.72.camel@thinko> <1330372221.12046.119.camel@thinko> <20120227202335.ACAFD25009E@webabinitio.net> <1330375169.12046.133.camel@thinko> <1330377399.12046.158.camel@thinko> <20120227215829.1DF3B2500E4@webabinitio.net> <4F4BFF98.2080007@active-4.com> <20120228020445.6FB4C2500CF@webabinitio.net> Message-ID: <20120228134113.B0E9F2500E4@webabinitio.net> On Tue, 28 Feb 2012 22:21:11 +1000, Nick Coghlan wrote: > On Tue, Feb 28, 2012 at 10:10 PM, Vinay Sajip wrote: > > If the 2.x code depends on having u'xxx' literals, then 3.2 testing will > > potentially involve running a fixer on all files in the project every time a > > change is made, writing to a separate directory, or else a fixer which is > > integrated into the editing environment so it knows what changed. This is > > painful, and what motivated PEP 414 in the first place - which seems ironic. > > No, the real idea behind PEP 414 is that most ports that rely on it > simply won't support 3.2 - they will only target 3.3+. Hmm. It seems to me that this argument implies that PEP 414 is just as likely to *slow down* adoption of Python 3 as it is to speed it up, since if this issue is as big a barrier as indicated, many potential porters may choose to wait until OS vendors are supporting 3.3 widely before starting their ports. We are clearly expecting that the reality is that the impact will be at worst neutral, and hopefully positive. --David From ezio.melotti at gmail.com Tue Feb 28 15:20:46 2012 From: ezio.melotti at gmail.com (Ezio Melotti) Date: Tue, 28 Feb 2012 16:20:46 +0200 Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 In-Reply-To: <1330431576.3400.2.camel@localhost.localdomain> References: <4F49434B.6050604@active-4.com> <4F4B5634.3020609@v.loewis.de> <4F4BB7F2.4070804@stoneleaf.us> <20120227174110.96AFA2500E4@webabinitio.net> <1330365662.12046.72.camel@thinko> <1330372221.12046.119.camel@thinko> <20120227202335.ACAFD25009E@webabinitio.net> <1330375169.12046.133.camel@thinko> <1330377399.12046.158.camel@thinko> <20120227215829.1DF3B2500E4@webabinitio.net> <4F4BFF98.2080007@active-4.com> <20120228011601.Horde.1Bj-UElCcOxPTBzBEJEUvGA@webmail.df.eu> <4F4C88B2.40706@redhat.com> <20120228125226.1aa8004a@pitrou.net> <1330431576.3400.2.camel@localhost.localdomain> Message-ID: <4F4CE2BE.3060408@gmail.com> On 28/02/2012 14.19, Antoine Pitrou wrote: > On Tuesday 28 February 2012 at 22:14 +1000, Nick Coghlan wrote: >> If you're using separate branches, then your Python 2 code isn't being >> made forward compatible with Python 3. Yes, it avoids making your >> Python 2 code uglier, but it means maintaining two branches in >> parallel until you drop Python 2 support. > IMO, maintaining two branches shouldn't be much more work than > maintaining hacks so that a single codebase works with two different > programming languages.
+10 For every CPython bug that I fix I first apply the patch on 2.7, then on 3.2 and then on 3.3. Most of the time I don't even need to change anything while applying the patch to 3.2, sometimes I have to do some trivial fixes. This is also true for another personal 12kloc project* where I'm using the two-branches approach. For me, the costs of having two branches are: 1) a one-time conversion when the Python3-compatible branch is created (can be done easily with 2to3); 2) merging the fix I apply to the Python2 branch (and with modern DVCS this is not really an issue). With the shared code base approach, the costs are: 1) a one-time conversion to "fix" the code base and make it run on both 2.x and 3.x; 2) keep using and having to deal with hacks in order to keep it running. With the first approach, you also have two clean and separate code bases, with no hacks; when you stop using Python 2, you end up with a clean Python 3 branch. The one-time conversion also seems easier in the first case. (Note: there are also other costs -- e.g. releasing -- that I haven't considered because they don't affect me personally, but I'm not sure they are big enough to make the two-branches approach worse.) > >> You've once again raised the >> barrier to entry: either people contribute two patches, or they accept >> that their patch may languish until someone else writes the patch for >> the other version. > Again that's wrong. If you cleverly use 2to3 to port between branches, > patches only have to be written against the 2.x version. After the initial conversion of the code base, the fixes are mostly trivial, so people don't need to write two patches (most of the patches we get for CPython are either against 2.7 or 3.2, and sometimes they even apply clearly to both). Using 2to3 to generate the 3.x code automatically for every change applied to the 2.x branch (or to convert everything when a new package is installed) sounds wrong to me. I wouldn't trust generated code even if 2to3 was a better tool. That said, I successfully used the shared code base approach with print_function, unicode_literals, and a couple of try/except for the imports for a few one-file scripts (for 2.7/3.2) that I wrote recently. TL;DR the two-branches approach usually works better (at least IME) than the shared code base approach, doesn't necessarily require more work, and doesn't need ugly hacks to work. * in this case all the string literals I had were already text (rather than bytes) and even without using unicode_literals they worked out of the box when I moved the code to 3.x. There was however a place where it didn't work, and that turned out to be a bug even in Python 2 because I was mixing bytes and text. Best Regards, Ezio Melotti > Regards > > Antoine. 
From barry at python.org Tue Feb 28 15:53:57 2012 From: barry at python.org (Barry Warsaw) Date: Tue, 28 Feb 2012 09:53:57 -0500 Subject: [Python-Dev] Spreading the Python 3 religion (was Re: PEP 414 - Unicode Literals for Python 3) In-Reply-To: <20120228134113.B0E9F2500E4@webabinitio.net> References: <4F49434B.6050604@active-4.com> <4F4B5634.3020609@v.loewis.de> <4F4BB7F2.4070804@stoneleaf.us> <20120227174110.96AFA2500E4@webabinitio.net> <1330365662.12046.72.camel@thinko> <1330372221.12046.119.camel@thinko> <20120227202335.ACAFD25009E@webabinitio.net> <1330375169.12046.133.camel@thinko> <1330377399.12046.158.camel@thinko> <20120227215829.1DF3B2500E4@webabinitio.net> <4F4BFF98.2080007@active-4.com> <20120228020445.6FB4C2500CF@webabinitio.net> <20120228134113.B0E9F2500E4@webabinitio.net> Message-ID: <20120228095357.2b9fde87@resist.wooz.org> On Feb 28, 2012, at 08:41 AM, R. David Murray wrote: >Hmm. It seems to me that this argument implies that PEP 414 is just >as likely to *slow down* adoption of Python3 as it is to speed it up, >since if this issue is as big a barrier as indicated, many potential >porters may choose to wait until OS vendors are supporting 3.3 widely >before starting their ports. We are clearly expecting that the reality >is that the impact will be at worse neutral, and hopefully positive. If PEP 414 helps some projects migrate to Python 3, great. But I really hope we as a community don't perpetuate the myth that you cannot port to Python 3 without this, and I hope that we spend as much effort on educating other Python developers on how to port to Python 3 *right now* supporting Python 2.6, 2.7, and 3.2. That's the message we should be spreading and we should be helping developers understand exactly how to do this effectively, among the many great options that exist today. Only in the most extreme cases or the most inertially challenged projects should we say "wait for Python 3.3". Cheers, -Barry From steve at pearwood.info Tue Feb 28 15:56:52 2012 From: steve at pearwood.info (Steven D'Aprano) Date: Wed, 29 Feb 2012 01:56:52 +1100 Subject: [Python-Dev] Add a frozendict builtin type In-Reply-To: <4F4CCC24.5090105@egenix.com> References: <4F4CCC24.5090105@egenix.com> Message-ID: <4F4CEB34.2070206@pearwood.info> M.-A. Lemburg wrote: > Victor Stinner wrote: >>> See also the PEP 351. >> I read the PEP and the email explaining why it was rejected. >> >> Just to be clear: the PEP 351 tries to freeze an object, try to >> convert a mutable or immutable object to an immutable object. Whereas >> my frozendict proposition doesn't convert anything: it just raises a >> TypeError if you use a mutable key or value. >> >> For example, frozendict({'list': ['a', 'b', 'c']}) doesn't create >> frozendict({'list': ('a', 'b', 'c')}) but raises a TypeError. > > I fail to see the use case you're trying to address with this > kind of frozendict(). > > The purpose of frozenset() is to be able to use a set as dictionary > key (and to some extent allow for optimizations and safe > iteration). Your implementation can be used as dictionary key as well, > but why would you want to do that in the first place ? Because you have a mapping, and want to use a dict for speedy, convenient lookups. Sometimes your mapping involves the key being a string, or an int, or a tuple, or a set, and Python makes it easy to use that in a dict. Sometimes the key is itself a mapping, and Python makes it very difficult. 
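For instance, the naive attempt fails immediately (a short interpreter session for illustration):

>>> options = {'feature': True}
>>> cache = {options: 'result'}
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unhashable type: 'dict'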
Just google on "python frozendict" or "python immutabledict" and you will find that this keeps coming up time and time again, e.g.: http://www.cs.toronto.edu/~tijmen/programming/immutableDictionaries.html http://code.activestate.com/recipes/498072-implementing-an-immutable-dictionary/ http://code.activestate.com/recipes/414283-frozen-dictionaries/ http://bob.pythonmac.org/archives/2005/03/04/frozendict/ http://python.6.n6.nabble.com/frozendict-td4377791.html http://www.velocityreviews.com/forums/t648910-does-python3-offer-a-frozendict.html http://stackoverflow.com/questions/2703599/what-would-be-a-frozen-dict > If you're thinking about disallowing changes to the dictionary > structure, e.g. in order to safely iterate over its keys or items, > "freezing" the keys is enough. > > Requiring the value objects not to change is too much of a restriction > to make the type useful in practice, IMHO. It's no more of a limitation than the limitation that strings can't change. Frozendicts must freeze the value as well as the key. Consider the toy example, mapping food combinations to calories:

d = {
    {appetizer => fried fish, main => double burger, drink => cola}: 5000,
    {appetizer => None, main => green salad, drink => tea}: 200,
}

(syntax is only for illustration purposes) Clearly the hash has to take the keys and values into account, which means that both the keys and values have to be frozen. (Values may be mutable objects, but then the frozendict can't be hashed -- just like tuples can't be hashed if any item in them is mutable.) -- Steven From vinay_sajip at yahoo.co.uk Tue Feb 28 16:02:37 2012 From: vinay_sajip at yahoo.co.uk (Vinay Sajip) Date: Tue, 28 Feb 2012 15:02:37 +0000 (UTC) Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 References: <4F49434B.6050604@active-4.com> <4F4B5634.3020609@v.loewis.de> <4F4BB7F2.4070804@stoneleaf.us> <20120227174110.96AFA2500E4@webabinitio.net> <1330365662.12046.72.camel@thinko> <1330372221.12046.119.camel@thinko> <20120227202335.ACAFD25009E@webabinitio.net> <1330375169.12046.133.camel@thinko> <1330377399.12046.158.camel@thinko> <20120227215829.1DF3B2500E4@webabinitio.net> <4F4BFF98.2080007@active-4.com> <20120228011601.Horde.1Bj-UElCcOxPTBzBEJEUvGA@webmail.df.eu> <4F4C8BB0.9020302@active-4.com> Message-ID: Armin Ronacher active-4.com> writes: > If by str() you mean using "str('x')" as replacement for 'x' in both 2.x > and 3.x with __future__ imports as a replacement for native string > literals, please mention why this is better than u(), s(), n() etc. It > would be equally slow as a custom wrapper function and it would not > support non-ascii characters. Well, you can give it any name you like, but

if PY3:
    def n(literal): return literal
else:
    # used along with "from __future__ import unicode_literals" in client code
    def n(literal): return literal.encode('utf-8')

will support non-ASCII characters. You have not provided anything other than a microbenchmark regarding performance - as you are well aware, this does not illustrate what the performance might be on a representative workload. While there might be the odd percent in it, I didn't see any major degradation when running the Django test suite - which I would think is a more balanced workload than just benchmarking the wrapper. Of course, I don't claim to have studied the performance characteristics closely - I haven't.
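(For what it's worth, here is a rough way to bound the wrapper's own cost - a made-up micro-benchmark, not a representative workload:

    import timeit
    setup = "def n(literal): return literal.encode('utf-8')"
    # wrapped native-string literal vs. a bare literal, under 2.x
    print(timeit.timeit("n(u'native string')", setup=setup))
    print(timeit.timeit("u'native string'"))

The absolute numbers are machine-dependent; only the relative difference is of interest.)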
AFAICT, the incidence of native strings in an application is not that great (of course there can be pathological cases), so the number of calls to n() or whatever it's called is unlikely to have any significant impact. Even when I was using u() calls with the 2.5 port - which are of course much more common - the performance impact was unremarkable. Regards, Vinay Sajip From barry at python.org Tue Feb 28 16:04:34 2012 From: barry at python.org (Barry Warsaw) Date: Tue, 28 Feb 2012 10:04:34 -0500 Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 In-Reply-To: References: <4F49434B.6050604@active-4.com> <4F4B5634.3020609@v.loewis.de> <4F4BB7F2.4070804@stoneleaf.us> <20120227174110.96AFA2500E4@webabinitio.net> <1330365662.12046.72.camel@thinko> <1330372221.12046.119.camel@thinko> <20120227202335.ACAFD25009E@webabinitio.net> <1330375169.12046.133.camel@thinko> <1330377399.12046.158.camel@thinko> <20120227215829.1DF3B2500E4@webabinitio.net> <4F4BFF98.2080007@active-4.com> <20120228011601.Horde.1Bj-UElCcOxPTBzBEJEUvGA@webmail.df.eu> <4F4C88B2.40706@redhat.com> <20120228125226.1aa8004a@pitrou.net> <1330431576.3400.2.camel@localhost.localdomain> Message-ID: <20120228100434.2b5c7bcd@resist.wooz.org> On Feb 28, 2012, at 10:49 PM, Nick Coghlan wrote: >On Tue, Feb 28, 2012 at 10:19 PM, Antoine Pitrou wrote: >> Again that's wrong. If you cleverly use 2to3 to port between branches, >> patches only have to be written against the 2.x version. > >Apparently *you* know how to do that, but I don't. If I, as a CPython >core developer, don't know how to do that, how is it reasonable to >expect J. Random Hacker to become a Python 3 porting export? They don't need to, but *we* do, and it's incumbent on us to educate our users. I strongly believe that *now* is the time to be porting to Python 3. It's critical to the long-term health of Python. It's up to us to learn the strategies for accomplishing this, spread the message that it is not only possible, but usually easy (and yes even, from my own experience, fun!). Oh and here's how in three easy steps, 1, 2, 3. I've blogged about my own porting experiences extensively. My strategies may not work for everyone, but they will work for a great many projects. If they work for yours, spread the word. If they don't, figure out something better, write about it, and spread the word. We really need to stop saying that porting to Python 3 is hard, or should be delayed. It's not in the vast majority of cases. Yes, there are warts, and we should continue to improve Python 3 so it gets easier, but by no means is it impossible for most code to be working very nicely on Python 3 today. -Barry From vinay_sajip at yahoo.co.uk Tue Feb 28 16:18:29 2012 From: vinay_sajip at yahoo.co.uk (Vinay Sajip) Date: Tue, 28 Feb 2012 15:18:29 +0000 (UTC) Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 References: <4F49434B.6050604@active-4.com> <4F4B5634.3020609@v.loewis.de> <4F4BB7F2.4070804@stoneleaf.us> <20120227174110.96AFA2500E4@webabinitio.net> <1330365662.12046.72.camel@thinko> <1330372221.12046.119.camel@thinko> <20120227202335.ACAFD25009E@webabinitio.net> <1330375169.12046.133.camel@thinko> <1330377399.12046.158.camel@thinko> <20120227215829.1DF3B2500E4@webabinitio.net> <4F4BFF98.2080007@active-4.com> <20120228011601.Horde.1Bj-UElCcOxPTBzBEJEUvGA@webmail.df.eu> <4F4C88B2.40706@redhat.com> Message-ID: Nick Coghlan gmail.com> writes: > tools. 
But the existing approaches require that, in order to be > forward compatible with Python 3, a program must be made *worse* in > Python 2 (i.e. harder to read and harder to write correctly for > someone that hasn't learned Python 3 yet). Restoring unicode literal How so? In the case of string literals, are you saying that it's worse in that you use 'xxx' instead of u'xxx' for text, and have to add a unicode_literals import? I don't feel that either of those make a 2.x program worse. > support in 3.3 is a pragmatic step that allows a lot of code to *just > work* on Python 3. Most 2.6+ code that still doesn't work on Python 3 > even after this change will be made *better* (or at least not made > substantially worse) by the additional changes necessary for forward > compatibility. Remember, the PEP advocates what it does in the name of a single codebase. If you want to (or have to) support 3.2 in addition to 3.3, 2.6, 2.7, the PEP does not work for you. It only works for you if you're interested in 2.6+ and 3.3+. > Unicode literals are somewhat unique in their impact on porting > efforts, as they show up *everywhere* in Unicode correct code in > Python 2. The diffs that will be needed to correctly tag bytestrings > in such code under Python 2 are tiny compared to those that would be > needed to strip the u"" prefixes. But that's a one-time operation using a lib2to3 fixer, and even for a big project like Django, we're not talking about a lot of time spent on this (at least, in my experience). Having a good test suite helps catch those byte-string cases more easily, of course. Regards, Vinay Sajip From brett at python.org Tue Feb 28 16:23:41 2012 From: brett at python.org (Brett Cannon) Date: Tue, 28 Feb 2012 10:23:41 -0500 Subject: [Python-Dev] Spreading the Python 3 religion (was Re: PEP 414 - Unicode Literals for Python 3) In-Reply-To: <20120228095357.2b9fde87@resist.wooz.org> References: <4F49434B.6050604@active-4.com> <4F4B5634.3020609@v.loewis.de> <4F4BB7F2.4070804@stoneleaf.us> <20120227174110.96AFA2500E4@webabinitio.net> <1330365662.12046.72.camel@thinko> <1330372221.12046.119.camel@thinko> <20120227202335.ACAFD25009E@webabinitio.net> <1330375169.12046.133.camel@thinko> <1330377399.12046.158.camel@thinko> <20120227215829.1DF3B2500E4@webabinitio.net> <4F4BFF98.2080007@active-4.com> <20120228020445.6FB4C2500CF@webabinitio.net> <20120228134113.B0E9F2500E4@webabinitio.net> <20120228095357.2b9fde87@resist.wooz.org> Message-ID: On Tue, Feb 28, 2012 at 09:53, Barry Warsaw wrote: > On Feb 28, 2012, at 08:41 AM, R. David Murray wrote: > > >Hmm. It seems to me that this argument implies that PEP 414 is just > >as likely to *slow down* adoption of Python3 as it is to speed it up, > >since if this issue is as big a barrier as indicated, many potential > >porters may choose to wait until OS vendors are supporting 3.3 widely > >before starting their ports. We are clearly expecting that the reality > >is that the impact will be at worse neutral, and hopefully positive. > > If PEP 414 helps some projects migrate to Python 3, great. > > But I really hope we as a community don't perpetuate the myth that you > cannot > port to Python 3 without this, and I hope that we spend as much effort on > educating other Python developers on how to port to Python 3 *right now* > supporting Python 2.6, 2.7, and 3.2. That's the message we should be > spreading and we should be helping developers understand exactly how to do > this effectively, among the many great options that exist today. 
Only in > the > most extreme cases or the most inertially challenged projects should we say > "wait for Python 3.3". > Well, when the code is committed I will update the porting HOWTO and push the __future__ imports first since they cover more versions of Python (i.e. Python 3.2). But I will mention the options that skip the __future__ imports for those that choose not to use them (or have already done the work of using the u prefix in their code). Plus that doc probably will need an update of caveats that seem to bite everyone (e.g. the str(bytes) thing which I didn't know about) when trying to do source-compatible versions. -------------- next part -------------- An HTML attachment was scrubbed... URL: From vinay_sajip at yahoo.co.uk Tue Feb 28 16:29:02 2012 From: vinay_sajip at yahoo.co.uk (Vinay Sajip) Date: Tue, 28 Feb 2012 15:29:02 +0000 (UTC) Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 References: <4F49434B.6050604@active-4.com> <4F4B5634.3020609@v.loewis.de> <4F4BB7F2.4070804@stoneleaf.us> <20120227174110.96AFA2500E4@webabinitio.net> <1330365662.12046.72.camel@thinko> <1330372221.12046.119.camel@thinko> <20120227202335.ACAFD25009E@webabinitio.net> <1330375169.12046.133.camel@thinko> <1330377399.12046.158.camel@thinko> <20120227215829.1DF3B2500E4@webabinitio.net> <4F4BFF98.2080007@active-4.com> <20120228011601.Horde.1Bj-UElCcOxPTBzBEJEUvGA@webmail.df.eu> <4F4C88B2.40706@redhat.com> <20120228125226.1aa8004a@pitrou.net> Message-ID: Antoine Pitrou pitrou.net> writes: > Wrong. The separate branches approach allows you to have a clean > Python 3 codebase without crippling the Python 2 codebase. There may be warts in a single codebase (you usually can't have something for nothing), but it's not necessarily *crippled* when running in 2.x. Of course two branches allow you to have a no-compromise approach for the code style, but you might pay for that in time spent doing merges etc. > Note that 2to3 is actually helpful when you choose the dual branches > approach, and it isn't a serial dependency in that case. > (see https://bitbucket.org/pitrou/t3k/) Yes, 2to3 is very useful when doing an initial porting exercise. I've used it just once in each port I've done. It also works well for a single codebase approach, only I just follow its advice rather than letting it do the conversion automatically. Regards, Vinay Sajip From brian at python.org Tue Feb 28 16:29:20 2012 From: brian at python.org (Brian Curtin) Date: Tue, 28 Feb 2012 09:29:20 -0600 Subject: [Python-Dev] Porting Stories (was PEP 414 - Unicode Literals for Python 3) Message-ID: On Tue, Feb 28, 2012 at 09:04, Barry Warsaw wrote: > We really need to stop saying that porting to Python 3 is hard, or should be > delayed. It's not in the vast majority of cases. Yes, there are warts, and > we should continue to improve Python 3 so it gets easier, but by no means is > it impossible for most code to be working very nicely on Python 3 today. I've been singing and dancing about the ease of porting for a while now, but it's mostly thanks to the fact that I never had to do any Unicode tomfoolery. Now with this PEP, the game gets easier for a lot more people. Does anyone have a good porting experience they'd like to share, which I could maybe use as a PR effort for us? Barry, I know you wrote some pretty solid coverage of your DBus port. Anyone else? Personal projects or work stuff (assuming it's OK to share).
blog.python.org has been asleep for a while and a good porting testimonial might be a way to jumpstart it, or I can get it on the python.org front page. If you have anything to share on that front, please contact me directly. From g.rodola at gmail.com Tue Feb 28 16:30:58 2012 From: g.rodola at gmail.com (Giampaolo Rodolà) Date: Tue, 28 Feb 2012 16:30:58 +0100 Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 In-Reply-To: <4F4CE2BE.3060408@gmail.com> References: <4F49434B.6050604@active-4.com> <4F4B5634.3020609@v.loewis.de> <4F4BB7F2.4070804@stoneleaf.us> <20120227174110.96AFA2500E4@webabinitio.net> <1330365662.12046.72.camel@thinko> <1330372221.12046.119.camel@thinko> <20120227202335.ACAFD25009E@webabinitio.net> <1330375169.12046.133.camel@thinko> <1330377399.12046.158.camel@thinko> <20120227215829.1DF3B2500E4@webabinitio.net> <4F4BFF98.2080007@active-4.com> <20120228011601.Horde.1Bj-UElCcOxPTBzBEJEUvGA@webmail.df.eu> <4F4C88B2.40706@redhat.com> <20120228125226.1aa8004a@pitrou.net> <1330431576.3400.2.camel@localhost.localdomain> <4F4CE2BE.3060408@gmail.com> Message-ID: On 28 February 2012 15:20, Ezio Melotti wrote: > On 28/02/2012 14.19, Antoine Pitrou wrote: >> >> On Tuesday 28 February 2012 at 22:14 +1000, Nick Coghlan wrote: >>> >>> If you're using separate branches, then your Python 2 code isn't being >>> made forward compatible with Python 3. Yes, it avoids making your >>> Python 2 code uglier, but it means maintaining two branches in >>> parallel until you drop Python 2 support. >> >> IMO, maintaining two branches shouldn't be much more work than >> maintaining hacks so that a single codebase works with two different >> programming languages. > > +10 > > For every CPython bug that I fix I first apply the patch on 2.7, then on 3.2 > and then on 3.3. > Most of the time I don't even need to change anything while applying the > patch to 3.2, sometimes I have to do some trivial fixes. This is also true > for another personal 12kloc project* where I'm using the two-branches > approach. > > For me, the costs of having two branches are: > 1) a one-time conversion when the Python3-compatible branch is created (can > be done easily with 2to3); > 2) merging the fix I apply to the Python2 branch (and with modern DVCS this > is not really an issue). > > With the shared code base approach, the costs are: > 1) a one-time conversion to "fix" the code base and make it run on both 2.x > and 3.x; > 2) keep using and having to deal with hacks in order to keep it running. > > With the first approach, you also have two clean and separate code bases, > with no hacks; when you stop using Python 2, you end up with a clean Python > 3 branch. > The one-time conversion also seems easier in the first case. > > (Note: there are also other costs -- e.g. releasing -- that I haven't > considered because they don't affect me personally, but I'm not sure they > are big enough to make the two-branches approach worse.) They are. With that kind of approach you're basically forced to include the Python version number as part of the tarball name (e.g. foo-0.3.1-py2.tar.gz and foo-0.3.1-py3.tar.gz). Just to name one, that means "foo" can't be installed via pip/easy_install.
Regards, --- Giampaolo http://code.google.com/p/pyftpdlib/ http://code.google.com/p/psutil/ http://code.google.com/p/pysendfile/ From barry at python.org Tue Feb 28 16:33:46 2012 From: barry at python.org (Barry Warsaw) Date: Tue, 28 Feb 2012 10:33:46 -0500 Subject: [Python-Dev] Spreading the Python 3 religion (was Re: PEP 414 - Unicode Literals for Python 3) In-Reply-To: References: <4F49434B.6050604@active-4.com> <4F4B5634.3020609@v.loewis.de> <4F4BB7F2.4070804@stoneleaf.us> <20120227174110.96AFA2500E4@webabinitio.net> <1330365662.12046.72.camel@thinko> <1330372221.12046.119.camel@thinko> <20120227202335.ACAFD25009E@webabinitio.net> <1330375169.12046.133.camel@thinko> <1330377399.12046.158.camel@thinko> <20120227215829.1DF3B2500E4@webabinitio.net> <4F4BFF98.2080007@active-4.com> <20120228020445.6FB4C2500CF@webabinitio.net> <20120228134113.B0E9F2500E4@webabinitio.net> <20120228095357.2b9fde87@resist.wooz.org> Message-ID: <20120228103346.2fba69bf@resist.wooz.org> On Feb 28, 2012, at 10:23 AM, Brett Cannon wrote: >Well, when the code is committed I will update the porting HOWTO and push >the __future__ imports first since they cover more versions of Python (i.e. >Python 3.2). But I will mention the options that skip the __future__ >imports for those that choose not to use them (or have already done the >work of using the u prefix in their code). Plus that doc probably will need >an update of caveats that seem to bite everyone (e.g. the str(bytes) thing >which I didn't know about) when trying to do source-compatible versions. See, I think the emphasis should be on using the future imports and unadorning your unicode literals. Forget about this PEP except as a footnote. This strategy works today for most packages. You might think that this is ugly, but really, I think that doesn't matter (or maybe better: get over it! :). Definitely don't let that stop you from porting *now*. In the small minority of cases where this strategy cannot work for you (and I admit to not really understanding what those cases are), then the footnote about the reintroduction of the u-prefix should be enough. And yes, the str(bytes) thing is a pain, but it too can be worked around, and is such a minor wart that it should not delay your porting efforts. -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: not available URL: From vinay_sajip at yahoo.co.uk Tue Feb 28 16:39:47 2012 From: vinay_sajip at yahoo.co.uk (Vinay Sajip) Date: Tue, 28 Feb 2012 15:39:47 +0000 (UTC) Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 References: <4F49434B.6050604@active-4.com> <4F4B5634.3020609@v.loewis.de> <4F4BB7F2.4070804@stoneleaf.us> <20120227174110.96AFA2500E4@webabinitio.net> <1330365662.12046.72.camel@thinko> <1330372221.12046.119.camel@thinko> <20120227202335.ACAFD25009E@webabinitio.net> <1330375169.12046.133.camel@thinko> <1330377399.12046.158.camel@thinko> <20120227215829.1DF3B2500E4@webabinitio.net> <4F4BFF98.2080007@active-4.com> <20120228011601.Horde.1Bj-UElCcOxPTBzBEJEUvGA@webmail.df.eu> <4F4C88B2.40706@redhat.com> <20120228125226.1aa8004a@pitrou.net> Message-ID: Serhiy Storchaka gmail.com> writes: > Another pertinent question: "What are the disadvantages if PEP 414 is adopted?" It's moot, but as I see it: the purpose of PEP 414 is to facilitate a single codebase across 2.x and 3.x. However, it only does this if your 3.x interest is 3.3+.
If you also want to or need to support 3.0 - 3.2, it makes your workflow more painful, because you can't run tests on 2.x or 3.3 and then run them on 3.2 without an intermediate source conversion step - just like the 2to3 step that people find painful when it's part of maintenance workflow, and which in part prompted the PEP in the first place. Regards, Vinay Sajip From steve at pearwood.info Tue Feb 28 17:02:30 2012 From: steve at pearwood.info (Steven D'Aprano) Date: Wed, 29 Feb 2012 03:02:30 +1100 Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 In-Reply-To: References: <4F49434B.6050604@active-4.com> <4F4B5634.3020609@v.loewis.de> <4F4BB7F2.4070804@stoneleaf.us> <20120227174110.96AFA2500E4@webabinitio.net> <1330365662.12046.72.camel@thinko> <1330372221.12046.119.camel@thinko> <20120227202335.ACAFD25009E@webabinitio.net> <1330375169.12046.133.camel@thinko> <1330377399.12046.158.camel@thinko> <20120227215829.1DF3B2500E4@webabinitio.net> <4F4BFF98.2080007@active-4.com> <20120228011601.Horde.1Bj-UElCcOxPTBzBEJEUvGA@webmail.df.eu> <4F4C88B2.40706@redhat.com> <20120228125226.1aa8004a@pitrou.net> Message-ID: <4F4CFA96.5050907@pearwood.info> Vinay Sajip wrote: > Serhiy Storchaka gmail.com> writes: > >> Another pertinent question: "What are disadvantages of PEP 414 is adopted?" > > It's moot, but as I see it: the purpose of PEP 414 is to facilitate a single > codebase across 2.x and 3.x. However, it only does this if your 3.x interest is > 3.3+. If you also want to or need to support 3.0 - 3.2, it makes your workflow > more painful, because you can't run tests on 2.x or 3.3 and then run them on 3.2 > without an intermediate source conversion step - just like the 2to3 step that > people find painful when it's part of maintenance workflow, and which in part > prompted the PEP in the first place. I don't think it's fair to say it makes it *more* painful. Fair to say it doesn't make it less painful, but adding u'' to 3.3+ doesn't make it harder to port from 2.x to 3.1+. You're merely no better off with it than without it. Aside: in my opinion, people shouldn't actively support 3.0, or at least not advertise support for it, as it was end-of-lifed on the release of 3.1. As I see it, it is best to pretend that 3.0 never existed :) -- Steven From vinay_sajip at yahoo.co.uk Tue Feb 28 17:08:06 2012 From: vinay_sajip at yahoo.co.uk (Vinay Sajip) Date: Tue, 28 Feb 2012 16:08:06 +0000 (UTC) Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 References: <4F49434B.6050604@active-4.com> <4F4B5634.3020609@v.loewis.de> <4F4BB7F2.4070804@stoneleaf.us> <20120227174110.96AFA2500E4@webabinitio.net> <1330365662.12046.72.camel@thinko> <1330372221.12046.119.camel@thinko> <20120227202335.ACAFD25009E@webabinitio.net> <1330375169.12046.133.camel@thinko> <1330377399.12046.158.camel@thinko> <20120227215829.1DF3B2500E4@webabinitio.net> <4F4BFF98.2080007@active-4.com> <20120228011601.Horde.1Bj-UElCcOxPTBzBEJEUvGA@webmail.df.eu> <4F4C88B2.40706@redhat.com> <20120228125226.1aa8004a@pitrou.net> <1330431576.3400.2.camel@localhost.localdomain> <4F4CE2BE.3060408@gmail.com> Message-ID: Ezio Melotti gmail.com> writes: > For every CPython bug that I fix I first apply the patch on 2.7, then on > 3.2 and then on 3.3. > Most of the time I don't even need to change anything while applying the > patch to 3.2, sometimes I have to do some trivial fixes. This is also > true for another personal 12kloc project* where I'm using the > two-branches approach. 
I hear what you say about the personal project, but IMO CPython is atypical (as far as this discussion is concerned), not least because it's not a pure-Python project. > For me, the costs of having two branches are: > 1) a one-time conversion when the Python3-compatible branch is created > (can be done easily with 2to3); Yes, but the amount of ease is project-dependent. For example, 2to3 wraps values() method calls with list(), which is a reasonable thing to do for dicts; when presented with Django's querysets, which have a values() method which should not be wrapped, you have to go through and sort things out. I'm not knocking 2to3, which I think is great. Just that things go well sometimes, and less well at other times. > With the shared code base approach, the costs are: > 1) a one-time conversion to "fix" the code base and make it run on > both 2.x and 3.x; > 2) keep using and having to deal with hacks in order to keep it running. Which hacks do you mean, if you're only interested in 2.6+? > With the first approach, you also have two clean and separate code > bases, with no hacks; when you stop using Python 2, you end up with a > clean Python 3 branch. > The one-time conversion also seems easier in the first case. > > (Note: there are also other costs -- e.g. releasing -- that I haven't > considered because they don't affect me personally, but I'm not sure > they are big enough to make the two-branches approach worse.) I don't believe there's a one-size-fits-all. The two branches approach is appealing, and I have no quarrel with it: but I contend that big projects like Django would be reluctant to switch, or take much longer to switch to 3.x, if they had to maintain separate branches. Given the size of their user community, they have to follow strict release procedures, which (even with just running on 2.x) smaller projects can be more relaxed about. You forgot to mention the part which is most time-consuming day-to-day: making changes and testing. For the two-branch approach, it's 1. Change on 2.x 2. Test on 2.x 3. Commit on 2.x 4. Merge to 3.x 5. Possibly change on 3.x 6. Test on 3.x 7. Commit on 3.x where each "test" step, if failures occur, might take you back to a previous "change" step. For the single codebase, that's 1. Change 2. Test on 2.x 3. Test on 3.x 4. Commit This, to me, is the single big advantage of the single codebase approach, and the productivity improvements outweigh code purity issues which are, in the grand scheme of things, not all that large. Another advantage is DRY: you don't have to worry about forgetting to merge some changes from 2.x to 3.x. Haven't we all been there one time or another? I know I have, though I try not to make a habit of it ;-) > After the initial conversion of the code base, the fixes are mostly > trivial, so people don't need to write two patches (most of the patches > we get for CPython are either against 2.7 or 3.2, and sometimes they > even apply clearly to both). Fixes may be trivial, but new features might not always be so.
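For concreteness, a minimal sketch of the kind of single-codebase module under discussion, assuming only 2.6+ and 3.x need to be supported (the module contents are hypothetical):

    # runs unchanged on Python 2.6+ and 3.x -- no 2to3 step, no u'' prefixes
    from __future__ import unicode_literals, print_function

    def greet(name):
        # a unicode literal on 2.x (thanks to the __future__ import),
        # and an ordinary str literal on 3.x
        message = 'Hello, {0}!'.format(name)
        print(message)
        return message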
Regards, Vinay Sajip From yselivanov.ml at gmail.com Tue Feb 28 17:29:42 2012 From: yselivanov.ml at gmail.com (Yury Selivanov) Date: Tue, 28 Feb 2012 11:29:42 -0500 Subject: [Python-Dev] PEP 414 In-Reply-To: <4F4C0600.5010903@active-4.com> References: <4F49434B.6050604@active-4.com> <4F4A10C1.6040806@pearwood.info> <4F4A29BD.2090607@active-4.com> <4F4BA4E0.80806@active-4.com> <4F4C0600.5010903@active-4.com> Message-ID: <87A20E5B-D624-4F32-BEE5-57A5C6D83339@gmail.com> Hi Armin, Could you please remove from the PEP the following statement: """As it stands, Python 3 is currently a bad choice for long-term investments, since the ecosystem is not yet properly developed, and libraries are still fighting with their API decisions for Python 3.""" While it may be as such for you, I think it is incorrect for the rest. Moreover, it is harmful for python 3 adoption to put such documents on python.org. The python ecosystem is not just limited to WSGI apps, Django and Flask. Yes, not all the packages on pypi support python 3, but many of those are portable within 10 minutes to a couple of hours of work (and I did many such ports for our internal systems.) And many of the essential packages do exist for python 3, like numpy, zeromq etc. I know several start-ups, including mine, that develop huge commercial applications entirely on python 3. Thanks, -Yury On 2012-02-27, at 5:38 PM, Armin Ronacher wrote: > Hi, > > On 2/27/12 10:18 PM, Terry Reedy wrote: >> I would like to know if you think that this one change is enough to do >> agile development and testing, etc, or whether, as Chris McDonough >> hopes, this is just the first of a series of proposals you have planned. > Indeed I have three other PEPs in the works. The reintroduction of > "except (((ExceptionType),),)", the "<>" comparison operator and the > removal of "nonlocal", the latter to make Python 2.x developers feel > better about themselves. :-) > > > Regards, > Armin > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/yselivanov.ml%40gmail.com From vinay_sajip at yahoo.co.uk Tue Feb 28 17:29:44 2012 From: vinay_sajip at yahoo.co.uk (Vinay Sajip) Date: Tue, 28 Feb 2012 16:29:44 +0000 (UTC) Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 References: <4F49434B.6050604@active-4.com> <4F4B5634.3020609@v.loewis.de> <4F4BB7F2.4070804@stoneleaf.us> <20120227174110.96AFA2500E4@webabinitio.net> <1330365662.12046.72.camel@thinko> <1330372221.12046.119.camel@thinko> <20120227202335.ACAFD25009E@webabinitio.net> <1330375169.12046.133.camel@thinko> <1330377399.12046.158.camel@thinko> <20120227215829.1DF3B2500E4@webabinitio.net> <4F4BFF98.2080007@active-4.com> <20120228011601.Horde.1Bj-UElCcOxPTBzBEJEUvGA@webmail.df.eu> <4F4C88B2.40706@redhat.com> <20120228125226.1aa8004a@pitrou.net> <4F4CFA96.5050907@pearwood.info> Message-ID: Steven D'Aprano pearwood.info> writes: > I don't think it's fair to say it makes it *more* painful. Fair to say it > doesn't make it less painful, but adding u'' to 3.3+ doesn't make it harder to > port from 2.x to 3.1+. You're merely no better off with it than without it. No, it actually does make it *more* painful in some scenarios. Let's say Django decides to move to 3.x using a single codebase starting with 3.3, using PEP 414 to avoid changing u'xxx' in their source code.
This is dandy for 3.3, and say I have to work with Django on 2.6, 2.7 and 3.3. Great - I make some changes, I run tests on 2.x, 3.3 - make changes as needed to fix failures, then commit. And on to the next set of changes. Now, suppose I also need to support 3.2, in addition to the other versions. I don't get the same easy workflow I had before: for 3.2, I have to run Armin's hook to remove the u'' prefixes between making changes and running tests, *every time*, but the output will be written to a separate directory, and I may have to maintain a separate test environment there in terms of test data files etc. It's exactly the complaint the PEP makes about having to have 2to3 in the workflow, and how that hurts your productivity! Though the experience may differ in degree because Armin's tool is faster, it's not going to make for a seamless workflow. Especially not if it has to run over all the files in the Django codebase. And if it's going to know only which files have changed and run only on those, how does it propose to do that, independently of my editing tools? Regards, Vinay Sajip From yselivanov.ml at gmail.com Tue Feb 28 17:42:47 2012 From: yselivanov.ml at gmail.com (Yury Selivanov) Date: Tue, 28 Feb 2012 11:42:47 -0500 Subject: [Python-Dev] PEP 415: Implementing PEP 409 differently In-Reply-To: References: Message-ID: <85D42D35-C500-4DBB-8632-7954B5C170D0@gmail.com> Big +1. Indeed, this whole Ellipsis approach is just an awful hack. - Yury On 2012-02-26, at 8:30 PM, Benjamin Peterson wrote: > PEP: 415 > Title: Implementing PEP 409 differently > Version: $Revision$ > Last-Modified: $Date$ > Author: Benjamin Peterson > Status: Draft > Type: Standards Track > Content-Type: text/x-rst > Created: 26-Feb-2012 > Post-History: 26-Feb-2012 > > > Abstract > ======== > > PEP 409 allows PEP 3134 exception contexts and causes to be suppressed when the > exception is printed. This is done using the ``raise exc from None`` > syntax. This PEP proposes to implement context and cause suppression > differently. > > Rationale > ========= > > PEP 409 changes ``__cause__`` to be ``Ellipsis`` by default. Then if > ``__cause__`` is set to ``None`` by ``raise exc from None``, no context or cause > will be printed should the exception be uncaught. > > The main problem with this scheme is it complicates the role of > ``__cause__``. ``__cause__`` should indicate the cause of the exception, not > whether ``__context__`` should be printed or not. This use of ``__cause__`` is > also not easily extended in the future. For example, we may someday want to > allow the programmer to select which of ``__context__`` and ``__cause__`` will > be printed. The PEP 409 implementation is not amenable to this. > > The use of ``Ellipsis`` is a hack. Before PEP 409, ``Ellipsis`` was used > exclusively in extended slicing. Extended slicing has nothing to do with > exceptions, so it's not clear to someone inspecting an exception object why > ``__cause__`` should be set to ``Ellipsis``. Using ``Ellipsis`` by default for > ``__cause__`` makes it asymmetrical with ``__context__``. > > Proposal > ======== > > A new attribute on ``BaseException``, ``__suppress_context__``, will be > introduced. The ``raise exc from None`` syntax will cause > ``exc.__suppress_context__`` to be set to ``True``. Exception printing code will > check for the attribute to determine whether context and cause will be > printed. ``__cause__`` will return to its original purpose and values.
> > There is precedent for ``__suppress_context__`` with the > ``print_line_and_file`` exception attribute. > > Patches > ======= > > There is a patch on `Issue 14133`_. > > > References > ========== > > .. _issue 14133: > http://bugs.python.org/issue14133 > > Copyright > ========= > > This document has been placed in the public domain. > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/yselivanov.ml%40gmail.com From barry at python.org Tue Feb 28 17:44:29 2012 From: barry at python.org (Barry Warsaw) Date: Tue, 28 Feb 2012 11:44:29 -0500 Subject: [Python-Dev] PEP 414 In-Reply-To: <87A20E5B-D624-4F32-BEE5-57A5C6D83339@gmail.com> References: <4F49434B.6050604@active-4.com> <4F4A10C1.6040806@pearwood.info> <4F4A29BD.2090607@active-4.com> <4F4BA4E0.80806@active-4.com> <4F4C0600.5010903@active-4.com> <87A20E5B-D624-4F32-BEE5-57A5C6D83339@gmail.com> Message-ID: <20120228114429.3f773581@limelight.wooz.org> On Feb 28, 2012, at 11:29 AM, Yury Selivanov wrote: >Could you please remove from the PEP the following statement: > >"""As it stands, Python 3 is currently a bad choice for long-term >investments, since the ecosystem is not yet properly developed, and >libraries are still fighting with their API decisions for Python 3.""" > >While it may be as such for you, I think it is incorrect for the rest. >Moreover, it is harmful for the python 3 adoption, to put such documents >on python.org. +? -Barry From martin at v.loewis.de Tue Feb 28 17:47:23 2012 From: martin at v.loewis.de (martin at v.loewis.de) Date: Tue, 28 Feb 2012 17:47:23 +0100 Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 In-Reply-To: <20120228122019.18801ea5@pitrou.net> References: <4F49434B.6050604@active-4.com> <4F4A28CD.5070903@active-4.com> <4F4B5847.1040107@v.loewis.de> <4F4BA50A.3020009@active-4.com> <20120227174434.Horde.PAV6fUlCcOxPS7LyDc6X4bA@webmail.df.eu> <4F4BF72F.9010704@active-4.com> <4F4C9836.9030008@v.loewis.de> <20120228122019.18801ea5@pitrou.net> Message-ID: <20120228174723.Horde.sPNDEUlCcOxPTQUbLXyW2qA@webmail.df.eu> > In the end, that's not particularly relevant, because you don't have to > run the test suite entirely; when working on small changes, you usually > re-run the impacted parts of the test suite until everything goes fine; > on the other hand, 2to3 *has* to run on the entire code base. Not at all. If you are working on the code, 2to3 only needs to run on the parts of the code that you changed, since the unmodified parts will not need to be re-transformed using 2to3. > So, really, it's a couple of seconds to run a single bunch of tests vs. > several minutes to run 2to3 on the code base. Not in my experience. The incremental run-time of 2to3 after a single change is in the order of fractions of a second. > And it's not just the test suite: every concrete experiment with the > library you're porting has a serial dependency on running 2to3. Therefore, your build process should support incremental changes. Fortunately, distribute does support this approach.
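For example, a minimal setup.py along these lines (the project name and layout are illustrative):

    # distribute converts the sources with 2to3 at build time, and on a
    # rebuild it only re-converts the files whose sources have changed
    from setuptools import setup  # the distribute entry point

    setup(
        name='example',              # hypothetical project
        version='0.1',
        packages=['example'],
        use_2to3=True,               # run 2to3 when building under Python 3
        test_suite='example.tests',  # 'setup.py test' builds, then tests
    )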
Regards, Martin From martin at v.loewis.de Tue Feb 28 17:50:56 2012 From: martin at v.loewis.de (martin at v.loewis.de) Date: Tue, 28 Feb 2012 17:50:56 +0100 Subject: [Python-Dev] Spreading the Python 3 religion (was Re: PEP 414 - Unicode Literals for Python 3) In-Reply-To: <20120228095357.2b9fde87@resist.wooz.org> References: <4F49434B.6050604@active-4.com> <4F4B5634.3020609@v.loewis.de> <4F4BB7F2.4070804@stoneleaf.us> <20120227174110.96AFA2500E4@webabinitio.net> <1330365662.12046.72.camel@thinko> <1330372221.12046.119.camel@thinko> <20120227202335.ACAFD25009E@webabinitio.net> <1330375169.12046.133.camel@thinko> <1330377399.12046.158.camel@thinko> <20120227215829.1DF3B2500E4@webabinitio.net> <4F4BFF98.2080007@active-4.com> <20120228020445.6FB4C2500CF@webabinitio.net> <20120228134113.B0E9F2500E4@webabinitio.net> <20120228095357.2b9fde87@resist.wooz.org> Message-ID: <20120228175056.Horde.KfPofklCcOxPTQXw0KqW1nA@webmail.df.eu> > If PEP 414 helps some projects migrate to Python 3, great. > > But I really hope we as a community don't perpetuate the myth that you cannot > port to Python 3 without this, and I hope that we spend as much effort on > educating other Python developers on how to port to Python 3 *right now* > supporting Python 2.6, 2.7, and 3.2. One thing that the PEP will certainly achieve is to spread the myth that you cannot port to Python 3 if you also want to support Python 2.5. That's because people will accept the "single source" approach as the one right way, and will accept that this only works well with Python 2.6. Regards, Martin From vinay_sajip at yahoo.co.uk Tue Feb 28 18:07:14 2012 From: vinay_sajip at yahoo.co.uk (Vinay Sajip) Date: Tue, 28 Feb 2012 17:07:14 +0000 (UTC) Subject: [Python-Dev] Spreading the Python 3 religion (was Re: PEP 414 - Unicode Literals for Python 3) References: <4F49434B.6050604@active-4.com> <4F4B5634.3020609@v.loewis.de> <4F4BB7F2.4070804@stoneleaf.us> <20120227174110.96AFA2500E4@webabinitio.net> <1330365662.12046.72.camel@thinko> <1330372221.12046.119.camel@thinko> <20120227202335.ACAFD25009E@webabinitio.net> <1330375169.12046.133.camel@thinko> <1330377399.12046.158.camel@thinko> <20120227215829.1DF3B2500E4@webabinitio.net> <4F4BFF98.2080007@active-4.com> <20120228020445.6FB4C2500CF@webabinitio.net> <20120228134113.B0E9F2500E4@webabinitio.net> <20120228095357.2b9fde87@resist.wooz.org> <20120228175056.Horde.KfPofklCcOxPTQXw0KqW1nA@webmail.df.eu> Message-ID: v.loewis.de> writes: > One thing that the PEP will certainly achieve is to spread the myth that > you cannot port to Python 3 if you also want to support Python 2.5. That's > because people will accept the "single source" approach as the one right > way, and will accept that this only works well with Python 2.6. Let's hope not. We can mitigate that by spelling out in the docs that there's no one right way, how to choose which approach is best for a given project, and so on. 
Regards, Vinay Sajip From brett at python.org Tue Feb 28 18:34:19 2012 From: brett at python.org (Brett Cannon) Date: Tue, 28 Feb 2012 12:34:19 -0500 Subject: [Python-Dev] Spreading the Python 3 religion (was Re: PEP 414 - Unicode Literals for Python 3) In-Reply-To: References: <4F49434B.6050604@active-4.com> <4F4B5634.3020609@v.loewis.de> <4F4BB7F2.4070804@stoneleaf.us> <20120227174110.96AFA2500E4@webabinitio.net> <1330365662.12046.72.camel@thinko> <1330372221.12046.119.camel@thinko> <20120227202335.ACAFD25009E@webabinitio.net> <1330375169.12046.133.camel@thinko> <1330377399.12046.158.camel@thinko> <20120227215829.1DF3B2500E4@webabinitio.net> <4F4BFF98.2080007@active-4.com> <20120228020445.6FB4C2500CF@webabinitio.net> <20120228134113.B0E9F2500E4@webabinitio.net> <20120228095357.2b9fde87@resist.wooz.org> <20120228175056.Horde.KfPofklCcOxPTQXw0KqW1nA@webmail.df.eu> Message-ID: On Tue, Feb 28, 2012 at 12:07, Vinay Sajip wrote: > v.loewis.de> writes: > > > One thing that the PEP will certainly achieve is to spread the myth that > > you cannot port to Python 3 if you also want to support Python 2.5. > That's > > because people will accept the "single source" approach as the one right > > way, and will accept that this only works well with Python 2.6. > > Let's hope not. We can mitigate that by spelling out in the docs that > there's > no one right way, how to choose which approach is best for a given > project, and > so on. > Changes to http://docs.python.org/howto/pyporting.html are welcome. I tried to make sure it exposed all possibilities with tips on how to support as far back as Python 2.5. -------------- next part -------------- An HTML attachment was scrubbed... URL: From victor.stinner at gmail.com Tue Feb 28 18:41:37 2012 From: victor.stinner at gmail.com (Victor Stinner) Date: Tue, 28 Feb 2012 18:41:37 +0100 Subject: [Python-Dev] Add a frozendict builtin type In-Reply-To: <4f4c08bb.e89dec0a.772f.1b19@mx.google.com> References: <4f4c08bb.e89dec0a.772f.1b19@mx.google.com> Message-ID: >> * frozendict values must be immutable, as dict keys > > Why? That may be useful, but an immutable dict whose values > might mutate is also useful; by forcing that choice, it starts > to feel too specialized for a builtin. Hum, I realized that calling hash(my_frozendict) on a frozendict instance is enough to check if a frozendict only contains immutable objects. And it is also possible to check manually that values are immutable *before* creating the frozendict. I also prefer to not check for immutability because it does simplify the code :-) $ diffstat frozendict-3.patch Include/dictobject.h | 9 + Lib/collections/abc.py | 1 Lib/test/test_dict.py | 59 +++++++++++ Objects/dictobject.c | 256 ++++++++++++++++++++++++++++++++++++++++++------- Objects/object.c | 3 Python/bltinmodule.c | 1 6 files changed, 295 insertions(+), 34 deletions(-) The patch is quite small to add a new builtin type. That's because most of the code is shared with the builtin dict type. (But the patch doesn't include the documentation, I didn't write it yet.) Victor -------------- next part -------------- A non-text attachment was scrubbed...
Name: frozendict-3.patch Type: text/x-patch Size: 20979 bytes Desc: not available URL: From ezio.melotti at gmail.com Tue Feb 28 18:41:24 2012 From: ezio.melotti at gmail.com (Ezio Melotti) Date: Tue, 28 Feb 2012 19:41:24 +0200 Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 In-Reply-To: References: <4F49434B.6050604@active-4.com> <4F4B5634.3020609@v.loewis.de> <4F4BB7F2.4070804@stoneleaf.us> <20120227174110.96AFA2500E4@webabinitio.net> <1330365662.12046.72.camel@thinko> <1330372221.12046.119.camel@thinko> <20120227202335.ACAFD25009E@webabinitio.net> <1330375169.12046.133.camel@thinko> <1330377399.12046.158.camel@thinko> <20120227215829.1DF3B2500E4@webabinitio.net> <4F4BFF98.2080007@active-4.com> <20120228011601.Horde.1Bj-UElCcOxPTBzBEJEUvGA@webmail.df.eu> <4F4C88B2.40706@redhat.com> <20120228125226.1aa8004a@pitrou.net> <1330431576.3400.2.camel@localhost.localdomain> <4F4CE2BE.3060408@gmail.com> Message-ID: <4F4D11C4.7080305@gmail.com> On 28/02/2012 18.08, Vinay Sajip wrote: > Ezio Melotti gmail.com> writes: >> For every CPython bug that I fix I first apply the patch on 2.7, then on >> 3.2 and then on 3.3. >> Most of the time I don't even need to change anything while applying the >> patch to 3.2, sometimes I have to do some trivial fixes. This is also >> true for another personal 12kloc project* where I'm using the >> two-branches approach. > I hear what you say about the personal project, but IMO CPython is atypical (as > far as this discussion is concerned), not least because it's not a pure-Python > project. Most of the things I fix are pure Python, I wasn't considering the C patches and doc fixes here. >> For me, the costs of having two branches are: >> 1) a one-time conversion when the Python3-compatible branch is created >> (can be done easily with 2to3); > Yes, but the amount of ease is project-dependent. For example, 2to3 wraps > values() method calls with list(), which is a reasonable thing to do for dicts; > when presented with Django's querysets, which have a values() method which should not > be wrapped, you have to go through and sort things out. I'm not knocking > 2to3, which I think is great. Just that things go well sometimes, and less well > at other times. With the personal project this is what I did: 1) make a separate branch; 2) run 2to3 and let it overwrite the file; 3) review the changes as I would do with any other patch before committing; 4) fix things that 2to3 missed and other minor glitches; 5) fix a few bugs that surfaced after the port (and were in the original code too); The fixes made by 2to3 were mostly: * removing u'' from strings; * renaming imports, methods (like .iteritems()); * adding 'as' in the "except"s; * adding () for a few "print"s; These changes affected about 500 lines of code (out of 12kloc). The changes I did manually after running 2to3 were (some were not strictly necessary): * removing 'object' from classes; * removing ord() in a few places; * removing the content of super(...); * removing codecs.open() and using open() instead; * removing a few .decode('utf-8'); * adding a couple of b''; After a couple of days almost everything was working fine. > >> With the shared code base approach, the costs are: >> 1) a one-time conversion to "fix" the code base and make it run on >> both 2.x and 3.x; >> 2) keep using and having to deal with hacks in order to keep it running. > Which hacks do you mean, if you're only interested in 2.6+? Things like try/except for names that changed and wrappers for bytes/strings.
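For example, minimal sketches of both kinds of hack (the helper names are hypothetical):

    # a renamed module: try the 3.x name first, fall back to the 2.x one
    try:
        from urllib.parse import urlparse   # Python 3
    except ImportError:
        from urlparse import urlparse       # Python 2

    # bytes/str wrappers
    import sys

    if sys.version_info[0] >= 3:
        text_type = str
        def b(s):
            return s.encode('latin-1')      # build a bytes value portably
    else:
        text_type = unicode
        def b(s):
            return s                        # a 2.x str is already bytes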
Of course the situation is worse for projects that have to support earlier versions. > >> With the first approach, you also have two clean and separate code >> bases, with no hacks; when you stop using Python 2, you end up with a >> clean Python 3 branch. >> The one-time conversion also seems easier in the first case. >> >> (Note: there are also other costs -- e.g. releasing -- that I haven't >> considered because they don't affect me personally, but I'm not sure >> they are big enough to make the two-branches approach worse.) > I don't believe there's a one-size-fits-all. The two branches approach is > appealing, and I have no quarrel with it: but I contend that big projects like > Django would be reluctant to switch, or take much longer to switch to 3.x, if > they had to maintain separate branches. I would actually feel safer doing the port in a separate branch and keeping it there. Changing all the code in the main branch just to make it work for 3.x too doesn't strike me as a really good idea. > Given the size of their user community, > they have to follow strict release procedures, which (even with just running on > 2.x) smaller projects can be more relaxed about. I don't have much experience regarding releases, but developing on a separate branch shouldn't affect the release of the 2.x version. The developers will have to merge the changes to the py3 branch too, and eventually they will be able to ship an additional release for py3 too. Sure, there's more work for the developers, but that's no news. > You forgot to mention the part which is most time-consuming day-to-day: making > changes and testing. For the two-branch approach, it's > > 1. Change on 2.x > 2. Test on 2.x > 3. Commit on 2.x > 4. Merge to 3.x > 5. Possibly change on 3.x > 6. Test on 3.x > 7. Commit on 3.x > > where each "test" step, if failures occur, might take you back to a previous > "change" step. > > For the single codebase, that's > > 1. Change > 2. Test on 2.x > 3. Test on 3.x > 4. Commit And if something fails here, you will have to repeat both steps 2 and 3, until you get it right for both at the same time. Step 1 of the single codebase is in the end more or less equivalent to steps 1+4+5, just in a different way. The remaining extra commit takes no time, and since the branches are independent, if you find a problem with py3 you don't have to run the test suite for 2.x again. In my experience with CPython, the most time-consuming part is making the patch work on one of the branches in the first place. Once it works, porting it to the other branches is just a mechanical step that doesn't really take much time. The problems during the porting arise when the two codebases have diverged. (Also keep in mind that we are not actually merging from 2.x to 3.x in CPython, otherwise it would be even easier.) > This, to me, is the single big advantage of the single codebase approach, and > the productivity improvements outweigh code purity issues which are, in the > grand scheme of things, not all that large. ISTM that the amount of time is pretty much the same, so I don't see this as a point in favor of the single codebase approach. I might be wrong (I don't have much experience with the single codebase approach), but having to deal with 2+ branches never bothered me (I might be biased though, since I was already used to maintaining 3-4 branches with Python). > Another advantage is DRY: you don't have to worry about forgetting to merge some > changes from 2.x to 3.x. Haven't we all been there one time or another?
I know I > have, though I try not to make a habit of it ;-) I don't think it never happened to me, but I see how this could happen, especially in the first period after the second branch is introduced. Your DVCS should warn you about this though, so, at worst, you'll end up having to merge someone else's commit. > >> After the initial conversion of the code base, the fixes are mostly >> trivial, so people don't need to write two patches (most of the patches >> we get for CPython are either against 2.7 or 3.2, and sometimes they >> even apply clearly to both). > Fixes may be trivial, but new features might not always be so. True, but especially if the feature is complicated, I would rather spend a bit more time and have to clean, separate versions than a single version that tries to work on both. Best Regards, Ezio Melotti > Regards, > > Vinay Sajip > From vinay_sajip at yahoo.co.uk Tue Feb 28 18:51:24 2012 From: vinay_sajip at yahoo.co.uk (Vinay Sajip) Date: Tue, 28 Feb 2012 17:51:24 +0000 (UTC) Subject: [Python-Dev] Spreading the Python 3 religion (was Re: PEP 414 - Unicode Literals for Python 3) References: <4F49434B.6050604@active-4.com> <4F4B5634.3020609@v.loewis.de> <4F4BB7F2.4070804@stoneleaf.us> <20120227174110.96AFA2500E4@webabinitio.net> <1330365662.12046.72.camel@thinko> <1330372221.12046.119.camel@thinko> <20120227202335.ACAFD25009E@webabinitio.net> <1330375169.12046.133.camel@thinko> <1330377399.12046.158.camel@thinko> <20120227215829.1DF3B2500E4@webabinitio.net> <4F4BFF98.2080007@active-4.com> <20120228020445.6FB4C2500CF@webabinitio.net> <20120228134113.B0E9F2500E4@webabinitio.net> <20120228095357.2b9fde87@resist.wooz.org> <20120228175056.Horde.KfPofklCcOxPTQXw0KqW1nA@webmail.df.eu> Message-ID: Brett Cannon python.org> writes: > Changes to http://docs.python.org/howto/pyporting.html are welcome. I tried to > make sure it exposed all possibilities with tips on how to support as far back > as Python 2.5.? Right, will take a look. FYI a Google search for "python 3 porting guide" shows the Wiki PortingToPy3K page, then Brian Curtin's Python 3 Porting Guide, then Lennart Regebro's porting book website, and then the howto referred to above. Possibly the Wiki page and Brian's guide need to link to the howto, as I presume that's the canonical go-to guide - they don't seem to do so currently. Regards, Vinay Sajip From brian at python.org Tue Feb 28 19:03:02 2012 From: brian at python.org (Brian Curtin) Date: Tue, 28 Feb 2012 12:03:02 -0600 Subject: [Python-Dev] Spreading the Python 3 religion (was Re: PEP 414 - Unicode Literals for Python 3) In-Reply-To: References: <4F49434B.6050604@active-4.com> <4F4B5634.3020609@v.loewis.de> <4F4BB7F2.4070804@stoneleaf.us> <20120227174110.96AFA2500E4@webabinitio.net> <1330365662.12046.72.camel@thinko> <1330372221.12046.119.camel@thinko> <20120227202335.ACAFD25009E@webabinitio.net> <1330375169.12046.133.camel@thinko> <1330377399.12046.158.camel@thinko> <20120227215829.1DF3B2500E4@webabinitio.net> <4F4BFF98.2080007@active-4.com> <20120228020445.6FB4C2500CF@webabinitio.net> <20120228134113.B0E9F2500E4@webabinitio.net> <20120228095357.2b9fde87@resist.wooz.org> <20120228175056.Horde.KfPofklCcOxPTQXw0KqW1nA@webmail.df.eu> Message-ID: On Tue, Feb 28, 2012 at 11:51, Vinay Sajip wrote: > Brett Cannon python.org> writes: > >> Changes to http://docs.python.org/howto/pyporting.html are welcome. I tried to >> make sure it exposed all possibilities with tips on how to support as far back >> as Python 2.5. > > Right, will take a look. 
FYI a Google search for "python 3 porting guide" shows > the Wiki PortingToPy3K page, then Brian Curtin's Python 3 Porting Guide, then > Lennart Regebro's porting book website, and then the howto referred to above. > Possibly the Wiki page and Brian's guide need to link to the howto, as I presume > that's the canonical go-to guide - they don't seem to do so currently. Funny that you mention this: just a few minutes ago someone mentioned on twitter that they found and liked the guide I wrote, then I mentioned the howto/porting page since Brett's last message reminded me of it, and I mentioned that I should update and link to howto/porting. In the words of Guido, I will "make it so". From mark at hotpy.org Tue Feb 28 19:13:01 2012 From: mark at hotpy.org (Mark Shannon) Date: Tue, 28 Feb 2012 18:13:01 +0000 Subject: [Python-Dev] Add a frozendict builtin type In-Reply-To: References: <4f4c08bb.e89dec0a.772f.1b19@mx.google.com> Message-ID: <4F4D192D.90408@hotpy.org> Victor Stinner wrote: >>> * frozendict values must be immutable, as dict keys >> Why? That may be useful, but an immutable dict whose values >> might mutate is also useful; by forcing that choice, it starts >> to feel too specialized for a builtin. > > Hum, I realized that calling hash(my_frozendict) on a frozendict > instance is enough to check if a frozendict only contains immutable > objects. And it is also possible to check manually that values are > immutable *before* creating the frozendict. > > I also prefer to not check for immutability because it does simplify > the code :-) > > $ diffstat frozendict-3.patch > Include/dictobject.h | 9 + > Lib/collections/abc.py | 1 > Lib/test/test_dict.py | 59 +++++++++++ > Objects/dictobject.c | 256 ++++++++++++++++++++++++++++++++++++++++++------- > Objects/object.c | 3 > Python/bltinmodule.c | 1 > 6 files changed, 295 insertions(+), 34 deletions(-) > > The patch is quite small to add a new builtin type. That's because > most of the code is shared with the builtin dict type. (But the patch > doesn't include the documentation, it didn't write it yet.) > Could you create an issue for this on the tracker, maybe write a PEP. I don't think sending patches to this mailing list is the way to do this. Would you mind taking a look at how your code interacts with PEP 412. Cheers, Mark. From tjreedy at udel.edu Tue Feb 28 19:27:23 2012 From: tjreedy at udel.edu (Terry Reedy) Date: Tue, 28 Feb 2012 13:27:23 -0500 Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 In-Reply-To: References: <4F49434B.6050604@active-4.com> <4F4B5634.3020609@v.loewis.de> <4F4BB7F2.4070804@stoneleaf.us> <20120227174110.96AFA2500E4@webabinitio.net> <1330365662.12046.72.camel@thinko> <1330372221.12046.119.camel@thinko> <20120227202335.ACAFD25009E@webabinitio.net> <1330375169.12046.133.camel@thinko> <1330377399.12046.158.camel@thinko> <20120227215829.1DF3B2500E4@webabinitio.net> <4F4BFF98.2080007@active-4.com> <20120228020445.6FB4C2500CF@webabinitio.net> Message-ID: On 2/28/2012 7:10 AM, Vinay Sajip wrote: > The PEP 314 approach seems to assume that that if things work on 3.3, > they will work on 3.2/3.1/3.0 without any changes other than > replacing u'xxx' with 'xxx'. (Delete 3.0. 3.1 is also less of a concern.) It actually assumes that if things work on 3.3 *and* 2.7 (or .6), then ... . At first glance, this seems reasonable. If the code works on 2.7, then it does not use any new 3.3 features. Nor does it depend on any 3.3-only bug fixes that were part of a feature patch. 
2.6, of course, is essentially not getting any bugfixes. > In other words, you aren't supposed to want to e.g. test 3.2 and 3.3 > iteratively, using a workflow which intersperses edits with running > tests using 3.2 and running tests with 3.3. Anyone who is also targeting 3.2 could run a test32 script whenever they need to take a break. Or it could be run in the background (perhaps on a different core) while editing continues. People will work this out on a project by project basis, or use one of the other solutions. > In any case, a single code base seems not to be possible across > 2.6+/3.0/3.1/3.2/3.3+ using the PEP 414 approach, though of course > one will be possible for just 2.6+/3.3+. Early adopters of 3.x seem > to be penalised by this approach: I for one will try to use the > unicode_literals approach wherever I can. Early adoption of new tech typically has costs as well as benefits ;-). -- Terry Jan Reedy From jimjjewett at gmail.com Tue Feb 28 19:59:20 2012 From: jimjjewett at gmail.com (Jim J. Jewett) Date: Tue, 28 Feb 2012 10:59:20 -0800 (PST) Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 In-Reply-To: Message-ID: <4f4d2408.0b56650a.79ed.ffffd28a@mx.google.com> In http://mail.python.org/pipermail/python-dev/2012-February/117070.html Vinay Sajip wrote: > It's moot, but as I see it: the purpose of PEP 414 is to facilitate a > single codebase across 2.x and 3.x. However, it only does this if your > 3.x interest is 3.3+ For many people -- particularly those who haven't ported yet -- 3.x will mean 3.3+. There are some who will support 3.2 because it is an LTS release on some distribution, just as there were some who supported Python 1.5 (but not 1.6) long into the 2.x cycle, but I expect them to be the minority. I certainly don't expect 3.2 to remain a primary development target, the way that 2.7 is. IIRC, the only ways to use 3.2 even today are: (a) Make an explicit choice to use something other than the default (b) Download directly and choose 3.x without OS support (c) Use Arch Linux These are the sort of people who can be expected to upgrade. Now also remember that we're talking specifically about projects that have *not* been ported to 3.x (==> no existing users to support), and that won't be ported until 3.2 is already in maintenance mode. > If you also want to or need to support 3.0 - 3.2, it makes your > workflow more painful, Compared to dropping 3.2, yes. Compared to supporting 3.2 today? I don't see how. > because you can't run tests on 2.x or 3.3 and then run them on 3.2 > without an intermediate source conversion step - just like the 2to3 > step that people find painful when it's part of maintenance workflow, > and which in part prompted the PEP in the first place. So the only differences compared to today are that: (a) Fewer branches are after the auto-conversion. (b) No "current" branches are after the auto-conversion. (c) The auto-conversion is much more limited in scope. -jJ -- If there are still threading problems with my replies, please email me with details, so that I can try to resolve them. -jJ From barry at python.org Tue Feb 28 20:15:03 2012 From: barry at python.org (Barry Warsaw) Date: Tue, 28 Feb 2012 14:15:03 -0500 Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 In-Reply-To: <4f4d2408.0b56650a.79ed.ffffd28a@mx.google.com> References: <4f4d2408.0b56650a.79ed.ffffd28a@mx.google.com> Message-ID: <20120228141503.6de426aa@limelight.wooz.org> On Feb 28, 2012, at 10:59 AM, Jim J.
Jewett wrote: >For many people -- particularly those who haven't ported yet -- 3.x >will mean 3.3+. There are some who will support 3.2 because it is a >LTS release on some distribution, just as there were some who supported >Python 1.5 (but not 1.6) long into the 2.x cycle, but I expect them to >be the minority. > >I certainly don't expect 3.2 to remain a primary development target, >the way that 2.7 is. IIRC, the only ways to use 3.2 even today are: > > (a) Make an explicit choice to use something other than the default > (b) Download directly and choose 3.x without OS support > (c) Use Arch Linux On Debian and Ubuntu, installing Python 3.2 is easy, even if it isn't the default. However, once installed, 'python3' is Python 3.2. I personally think Python 3.2 makes for a fine platform for new code, and just as good for porting most existing libraries and applications to. You can get many Python 3.2 compatible packages from the Debian and Ubuntu archives by using the normal installation procedures, and generally, if there is a 'python-foo' package, the Python 3.2 compatible version will be called 'python3-foo'. I would expect other Linux distros to be in generally the same boat. There's a lot already available, and this will definitely increase over time. Although on Ubuntu we'll be planning future developments at UDS in May, I would expect Ubuntu 12.10 to have Python 3.3 (probably in addition to Python 3.2 since we can do that easily), and looking ahead at the expected Python release schedule, I'm expecting our next LTS in 2014 (Ubuntu 14.04) will probably ship with Python 3.4, either with or without the earlier Python 3 versions. So I think if you're starting a new project, write it in Python 3 and target Python 3.2. The only reason not to do that is if some critical part of your dependency stack hasn't yet been ported, and in that case, help them get there! IME, most are grateful for a patch or branch that added Python 3 support. >These are the sort of people who can be expected to upgrade. > >Now also remember that we're talking specifically about projects that >have *not* been ported to 3.x (==> no existing users to support), and >that won't be ported until 3.2 is already in maintenance mode. I really hope most people won't wait. Sure, the big frameworks by their nature are going to have more inertia, but if you are the author of a Python library, you can and should port *now* and target Python 3.2. Only this way will we as a community be able to build up the dependency stack so that when the large frameworks are ready, your library which they may depend on, will have a long and stable history on Python 3. -Barry From mal at egenix.com Tue Feb 28 21:34:43 2012 From: mal at egenix.com (M.-A. Lemburg) Date: Tue, 28 Feb 2012 21:34:43 +0100 Subject: [Python-Dev] Add a frozendict builtin type In-Reply-To: <4F4CEB34.2070206@pearwood.info> References: <4F4CCC24.5090105@egenix.com> <4F4CEB34.2070206@pearwood.info> Message-ID: <4F4D3A63.6070006@egenix.com> Steven D'Aprano wrote: > M.-A. Lemburg wrote: >> Victor Stinner wrote: >>>> See also the PEP 351. >>> I read the PEP and the email explaining why it was rejected. >>> >>> Just to be clear: the PEP 351 tries to freeze an object, try to >>> convert a mutable or immutable object to an immutable object. Whereas >>> my frozendict proposition doesn't convert anything: it just raises a >>> TypeError if you use a mutable key or value. 
>>> For example, frozendict({'list': ['a', 'b', 'c']}) doesn't create >>> frozendict({'list': ('a', 'b', 'c')}) but raises a TypeError. >> >> I fail to see the use case you're trying to address with this >> kind of frozendict(). >> >> The purpose of frozenset() is to be able to use a set as dictionary >> key (and to some extent allow for optimizations and safe >> iteration). Your implementation can be used as dictionary key as well, >> but why would you want to do that in the first place ? > > Because you have a mapping, and want to use a dict for speedy, convenient lookups. Sometimes your > mapping involves the key being a string, or an int, or a tuple, or a set, and Python makes it easy > to use that in a dict. Sometimes the key is itself a mapping, and Python makes it very difficult. > > Just google on "python frozendict" or "python immutabledict" and you will find that this keeps > coming up time and time again, e.g.: > > http://www.cs.toronto.edu/~tijmen/programming/immutableDictionaries.html > http://code.activestate.com/recipes/498072-implementing-an-immutable-dictionary/ > http://code.activestate.com/recipes/414283-frozen-dictionaries/ > http://bob.pythonmac.org/archives/2005/03/04/frozendict/ > http://python.6.n6.nabble.com/frozendict-td4377791.html > http://www.velocityreviews.com/forums/t648910-does-python3-offer-a-frozendict.html > http://stackoverflow.com/questions/2703599/what-would-be-a-frozen-dict Only the first of those links appears to actually discuss reasons for adding a frozendict, but it fails to provide real world use cases and only gives theoretical reasons for why this would be nice to have. From a practical view, a frozendict would allow thread-safe iteration over a dict and enable more optimizations (e.g. using an optimized lookup function, optimized hash parameters, etc.) to make lookup in static tables more efficient. OTOH, using a frozendict as key in some other dictionary is, well, not a very realistic use case - programmers should think twice before using such a design :-) >> If you're thinking about disallowing changes to the dictionary >> structure, e.g. in order to safely iterate over its keys or items, >> "freezing" the keys is enough. >> >> Requiring the value objects not to change is too much of a restriction >> to make the type useful in practice, IMHO. > > It's no more of a limitation than the limitation that strings can't change. > > Frozendicts must freeze the value as well as the key. Consider the toy example, mapping food > combinations to calories: > > > d = { {appetizer => fried fish, main => double burger, drink => cola}: 5000, > {appetizer => None, main => green salad, drink => tea}: 200, > } > > (syntax is only for illustration purposes) > > Clearly the hash has to take the keys and values into account, which means that both the keys and > values have to be frozen. > > (Values may be mutable objects, but then the frozendict can't be hashed -- just like tuples can't be > hashed if any item in them is mutable.) Right, but that doesn't mean you have to require that values are hashable. A frozendict could (and probably should) use the same logic as tuples: if the values are hashable, the frozendict is hashable, otherwise not. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Feb 28 2012) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ...
http://python.egenix.com/ ________________________________________________________________________ 2012-02-13: Released eGenix pyOpenSSL 0.13 http://egenix.com/go26 2012-02-09: Released mxODBC.Zope.DA 2.0.2 http://egenix.com/go25 2012-02-06: Released eGenix mx Base 3.2.3 http://egenix.com/go24 ::: Try our new mxODBC.Connect Python Database Interface for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From ethan at stoneleaf.us Tue Feb 28 21:27:34 2012 From: ethan at stoneleaf.us (Ethan Furman) Date: Tue, 28 Feb 2012 12:27:34 -0800 Subject: [Python-Dev] Backporting PEP 414 Message-ID: <4F4D38B6.4020103@stoneleaf.us> Here's what I know: We don't add features to bug-fix releases. u'' is considered a feature. By not backporting to 3.1 and 3.2 we are not easing the migration pains from 2.x. Here's what I don't know: Why is readding u'' a feature and not a bug? (Just had a thought about this -- because the removal of u'' is documented.) To take a different example: callable() had been removed from 3.0, and was added back in 3.2. callable() is not a big deal as you can roll your own quite easily -- and that is the huge difference: a user *cannot* add u'' back to 3.0/3.1 (at least, not without modifying and rebuilding the Python interpreter source). If there is already a FAQ entry feel free to point me to it, but I would still be curious why, in this instance, practicality does not beat purity? My apologies if this type of question has been rehashed before. ~Ethan~ From benjamin at python.org Tue Feb 28 21:54:05 2012 From: benjamin at python.org (Benjamin Peterson) Date: Tue, 28 Feb 2012 15:54:05 -0500 Subject: [Python-Dev] Backporting PEP 414 In-Reply-To: <4F4D38B6.4020103@stoneleaf.us> References: <4F4D38B6.4020103@stoneleaf.us> Message-ID: 2012/2/28 Ethan Furman : > Here's what I know: > > We don't add features to bug-fix releases. > u'' is considered a feature. > By not backporting to 3.1 and 3.2 we are not easing the migration pains from > 2.x. > > > Here's what I don't know: > > Why is readding u'' a feature and not a bug? ?(Just had a thought about this > -- because the removal of u'' is documented.) Because it's a new "thing" which doesn't fix obviously broken behavior. > > > If there is already a FAQ entry feel free to point me to it, but I would > still be curious why, in this instance, practicality does not beat purity? Because it's practical not to break bugfix releases with new features. -- Regards, Benjamin From brian at python.org Tue Feb 28 21:59:33 2012 From: brian at python.org (Brian Curtin) Date: Tue, 28 Feb 2012 14:59:33 -0600 Subject: [Python-Dev] Backporting PEP 414 In-Reply-To: <4F4D38B6.4020103@stoneleaf.us> References: <4F4D38B6.4020103@stoneleaf.us> Message-ID: On Tue, Feb 28, 2012 at 14:27, Ethan Furman wrote: > Here's what I know: > > We don't add features to bug-fix releases. > u'' is considered a feature. > By not backporting to 3.1 and 3.2 we are not easing the migration pains from > 2.x. Let's say it's 2013 and 3.3 has been out for a few months and you want to port your library to Python 3. Why would you worry about 3.1 or 3.2? You certainly see why we're not worried about 3.0. 
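Returning to the frozendict thread above, a minimal pure-Python sketch of the tuple-style behaviour Marc-Andre describes -- hashable only if all the values are hashable -- could look like this (an illustration only, not Victor's actual C implementation):

    import collections

    class frozendict(collections.Mapping):
        # immutable mapping; hashable iff all of its values are hashable

        def __init__(self, *args, **kwargs):
            self._data = dict(*args, **kwargs)
            self._hash = None

        def __getitem__(self, key):
            return self._data[key]

        def __iter__(self):
            return iter(self._data)

        def __len__(self):
            return len(self._data)

        def __hash__(self):
            # like hashing a tuple, this raises TypeError lazily if any
            # value is unhashable
            if self._hash is None:
                self._hash = hash(frozenset(self._data.items()))
            return self._hash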
From chrism at plope.com Tue Feb 28 22:23:40 2012 From: chrism at plope.com (Chris McDonough) Date: Tue, 28 Feb 2012 16:23:40 -0500 Subject: [Python-Dev] Backporting PEP 414 In-Reply-To: References: <4F4D38B6.4020103@stoneleaf.us> Message-ID: <1330464220.8772.2.camel@thinko> On Tue, 2012-02-28 at 15:54 -0500, Benjamin Peterson wrote: > 2012/2/28 Ethan Furman : > > Here's what I know: > > > > We don't add features to bug-fix releases. > > u'' is considered a feature. > > By not backporting to 3.1 and 3.2 we are not easing the migration pains from > > 2.x. > > > > > > Here's what I don't know: > > > > Why is readding u'' a feature and not a bug? (Just had a thought about this > > -- because the removal of u'' is documented.) > > Because it's a new "thing" which doesn't fix obviously broken behavior. > > > > > > > If there is already a FAQ entry feel free to point me to it, but I would > > still be curious why, in this instance, practicality does not beat purity? > > Because it's practical not to break bugfix releases with new features. This change, by its nature, cannot break old programs. - C From solipsis at pitrou.net Tue Feb 28 22:33:37 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 28 Feb 2012 22:33:37 +0100 Subject: [Python-Dev] Backporting PEP 414 References: <4F4D38B6.4020103@stoneleaf.us> <1330464220.8772.2.camel@thinko> Message-ID: <20120228223337.4b0ba684@pitrou.net> On Tue, 28 Feb 2012 16:23:40 -0500 Chris McDonough wrote: > On Tue, 2012-02-28 at 15:54 -0500, Benjamin Peterson wrote: > > 2012/2/28 Ethan Furman : > > > Here's what I know: > > > > > > We don't add features to bug-fix releases. > > > u'' is considered a feature. > > > By not backporting to 3.1 and 3.2 we are not easing the migration pains from > > > 2.x. > > > > > > > > > Here's what I don't know: > > > > > > Why is readding u'' a feature and not a bug? (Just had a thought about this > > > -- because the removal of u'' is documented.) > > > > Because it's a new "thing" which doesn't fix obviously broken behavior. > > > > > > > > > > > If there is already a FAQ entry feel free to point me to it, but I would > > > still be curious why, in this instance, practicality does not beat purity? > > > > Because it's practical not to break bugfix releases with new features. > > This change, by its nature, cannot break old programs. Unless the implementation is buggy, or has unintended side-effects. In theory, *most* changes done in feature releases cannot break old programs. Reality is often a bit more surprising :) Regards Antoine. From barry at python.org Tue Feb 28 22:48:27 2012 From: barry at python.org (Barry Warsaw) Date: Tue, 28 Feb 2012 16:48:27 -0500 Subject: [Python-Dev] Backporting PEP 414 In-Reply-To: References: <4F4D38B6.4020103@stoneleaf.us> Message-ID: <20120228164827.1bf4ab51@limelight.wooz.org> On Feb 28, 2012, at 03:54 PM, Benjamin Peterson wrote: >> If there is already a FAQ entry feel free to point me to it, but I would >> still be curious why, in this instance, practicality does not beat purity? > >Because it's practical not to break bugfix releases with new features. And because now your code is incompatible with three micro-release versions (3.2.0, 3.2.1, and 3.2.2), two of which are bug fix releases. Which means for example, you can't be sure which version of which distro your code will work on. Doesn't anybody else remember the True/False debacle in 2.2.1? 
Cheers, -Barry From chrism at plope.com Tue Feb 28 23:17:24 2012 From: chrism at plope.com (Chris McDonough) Date: Tue, 28 Feb 2012 17:17:24 -0500 Subject: [Python-Dev] Backporting PEP 414 In-Reply-To: <20120228164827.1bf4ab51@limelight.wooz.org> References: <4F4D38B6.4020103@stoneleaf.us> <20120228164827.1bf4ab51@limelight.wooz.org> Message-ID: <1330467444.8772.11.camel@thinko> On Tue, 2012-02-28 at 16:48 -0500, Barry Warsaw wrote: > On Feb 28, 2012, at 03:54 PM, Benjamin Peterson wrote: > > >> If there is already a FAQ entry feel free to point me to it, but I would > >> still be curious why, in this instance, practicality does not beat purity? > > > >Because it's practical not to break bugfix releases with new features. > > And because now your code is incompatible with three micro-release versions > (3.2.0, 3.2.1, and 3.2.2), two of which are bug fix releases. Which means for > example, you can't be sure which version of which distro your code will work > on. That I do sympathize with. > Doesn't anybody else remember the True/False debacle in 2.2.1? I do. It was slightly different than this because the feature was added twice, once in 2.2.1 with direct aliases to 0 and 1, which was found to be lacking, and then later again in 2.3 with explicit types, so it was sort of an extended-timeframe unpleasantness, and the feature's minor-dot-introduction was only a contributing factor, IIRC. But yeah. A year from now I wouldn't remember which version of 3.2 got a new feature, and neither would anybody else. The no-new-features guidelines are useful in the real world this way, because they represent a coherent policy, as tempting as it is to just jam it in. - C From ncoghlan at gmail.com Wed Feb 29 00:25:52 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 29 Feb 2012 09:25:52 +1000 Subject: [Python-Dev] Backporting PEP 414 In-Reply-To: <1330467444.8772.11.camel@thinko> References: <4F4D38B6.4020103@stoneleaf.us> <20120228164827.1bf4ab51@limelight.wooz.org> <1330467444.8772.11.camel@thinko> Message-ID: On Wed, Feb 29, 2012 at 8:17 AM, Chris McDonough wrote: > But yeah. ?A year from now I wouldn't remember which version of 3.2 got > a new feature, and neither would anybody else. ?The no-new-features > guidelines are useful in the real world this way, because they represent > a coherent policy, as tempting as it is to just jam it in. Also, I think there may be some confusion about Armin's plan to handle 3.2 - he aims to write an *import hook* that accepts the u/U prefixes during tokenisation, not a source-to-source transform like 2to3. It's clumsier than the plan for native syntactic support in 3.3 (since you'll need to make sure the import hook is installed, the presence of the hook will add overhead during application startup, and any attempts to compile affected modules that don't go through the import machinery will fail with a syntax error), but the presence of importlib in 3.2 makes it quite feasible. When loading from a cached .pyc, the hook won't even have to do anything special (since the tokenisation change only affects the compilation step). Assuming Armin can get the hook working as intended, then long running applications where startup overhead isn't a big deal will just need to ensure the hook is in place before they import any modules that use the old-style string literals. For cases where the startup overhead isn't acceptable (such as command line applications), then approaches that change the source in advance (i.e. 
separate branches or single source with the unicode_literals future import) will continue to be the preferred mechanism for providing 3.2 support. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From martin at v.loewis.de Wed Feb 29 01:13:13 2012 From: martin at v.loewis.de (martin at v.loewis.de) Date: Wed, 29 Feb 2012 01:13:13 +0100 Subject: [Python-Dev] Backporting PEP 414 In-Reply-To: <4F4D38B6.4020103@stoneleaf.us> References: <4F4D38B6.4020103@stoneleaf.us> Message-ID: <20120229011313.Horde.zLfORVNNcXdPTW2ZumqDWGA@webmail.df.eu> > Why is readding u'' a feature and not a bug? There is a really simple litmus test for whether something is a bug: does it deviate from the specification? In this case, the specification is the grammar, and the implementation certainly doesn't deviate from it. So it can't be a bug. Regards, Martin P.S. Before anybody over-interprets this criterion: there is certain "implicit behavior" assumed in Python that may not actually be documented, such as "the interpreter will not core dump", and "the source code will compile with any standard C compiler". Deviation from these implicit assumptions is also a bug. However, they don't apply here. From vinay_sajip at yahoo.co.uk Wed Feb 29 01:22:02 2012 From: vinay_sajip at yahoo.co.uk (Vinay Sajip) Date: Wed, 29 Feb 2012 00:22:02 +0000 (UTC) Subject: [Python-Dev] Backporting PEP 414 References: <4F4D38B6.4020103@stoneleaf.us> <20120228164827.1bf4ab51@limelight.wooz.org> <1330467444.8772.11.camel@thinko> Message-ID: Nick Coghlan gmail.com> writes: > Also, I think there may be some confusion about Armin's plan to handle > 3.2 - he aims to write an *import hook* that accepts the u/U prefixes > during tokenisation, not a source-to-source transform like 2to3. It's I must confess, I thought it was a source-to-source transform, because he called it an installation-time hook (which of course makes you think of 2to3) and not an import hook. That will have a much better chance of acceptable performance, since it'll only touch changed stuff automatically. I feel better about the prospects for 3.2 support :-) Regards, Vinay Sajip From merwok at netwok.org Wed Feb 29 06:56:45 2012 From: merwok at netwok.org (Éric Araujo) Date: Wed, 29 Feb 2012 06:56:45 +0100 Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 In-Reply-To: References: <4F49434B.6050604@active-4.com> <4F4B5634.3020609@v.loewis.de> <4F4BB7F2.4070804@stoneleaf.us> <20120227174110.96AFA2500E4@webabinitio.net> <1330365662.12046.72.camel@thinko> <1330372221.12046.119.camel@thinko> <20120227202335.ACAFD25009E@webabinitio.net> <1330375169.12046.133.camel@thinko> <1330377399.12046.158.camel@thinko> <20120227215829.1DF3B2500E4@webabinitio.net> <4F4BFF98.2080007@active-4.com> <20120228011601.Horde.1Bj-UElCcOxPTBzBEJEUvGA@webmail.df.eu> <4F4C88B2.40706@redhat.com> <20120228125226.1aa8004a@pitrou.net> <1330431576.3400.2.camel@localhost.localdomain> Message-ID: <4F4DBE1D.2000009@netwok.org> On 28/02/2012 13:48, Giampaolo Rodolà wrote: > On 28 February 2012 13:19, Antoine Pitrou wrote: >> IMO, maintaining two branches shouldn't be much more work than >> maintaining hacks so that a single codebase works with two different >> programming languages. > > Would that mean distributing 2 separate tarballs? > How would tools such as easy_install and pip work in respect of that? > Is there a naming convention they can rely on?
Sadly, PyPI and the packaging tools don't play nice with non-single-codebase projects, so you have to use a different name for your 3.x-compatible release, like 'unittestpy3k'. Some bdists include the Python version in the file name, but sdists don't. Regards

From regebro at gmail.com Wed Feb 29 07:08:25 2012 From: regebro at gmail.com (Lennart Regebro) Date: Wed, 29 Feb 2012 07:08:25 +0100 Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 In-Reply-To: References: <4F49434B.6050604@active-4.com> <4F4B5634.3020609@v.loewis.de> <4F4BB7F2.4070804@stoneleaf.us> <20120227174110.96AFA2500E4@webabinitio.net> <1330365662.12046.72.camel@thinko> <1330372221.12046.119.camel@thinko> <20120227202335.ACAFD25009E@webabinitio.net> <1330375169.12046.133.camel@thinko> <1330377399.12046.158.camel@thinko> <20120227215829.1DF3B2500E4@webabinitio.net> <4F4BFF98.2080007@active-4.com> <20120228020445.6FB4C2500CF@webabinitio.net> Message-ID:

On Tue, Feb 28, 2012 at 13:10, Vinay Sajip wrote: > We might be at cross purposes here. I don't see how Distribute helps, because > the use case I'm talking about is not about distributing or installing stuff, > but iteratively changing and testing code which needs to work on 2.6+, 3.2 and > 3.3+. Make sure you can run the tests with python setup.py test, and you're "in the butter", as we say in Sweden. :-) > If the 2.x code depends on having u'xxx' literals, then 3.2 testing will > potentially involve running a fixer on all files in the project every time a > change is made, writing to a separate directory, or else a fixer which is > integrated into the editing environment so it knows what changed. This is > painful Sure, and distribute does this for you.
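For instance, a minimal setup.py along these lines (a sketch -- the project name and test suite here are placeholders) lets "python setup.py test" convert the changed files with 2to3 on the fly under Python 3:

    from setuptools import setup  # provided by distribute on Python 3

    setup(
        name='example',              # placeholder
        version='0.1',
        packages=['example'],
        use_2to3=True,               # run 2to3 at build/test time on 3.x
        test_suite='example.tests',  # placeholder
    )

Only files whose sources have changed since the last run get reconverted, which is what keeps the edit-test cycle tolerable. More detail on this workflow: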
http://python3porting.com/2to3.html //Lennart

From regebro at gmail.com Wed Feb 29 07:10:52 2012 From: regebro at gmail.com (Lennart Regebro) Date: Wed, 29 Feb 2012 07:10:52 +0100 Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 In-Reply-To: References: <4F49434B.6050604@active-4.com> <4F4B5634.3020609@v.loewis.de> <4F4BB7F2.4070804@stoneleaf.us> <20120227174110.96AFA2500E4@webabinitio.net> <1330365662.12046.72.camel@thinko> <1330372221.12046.119.camel@thinko> <20120227202335.ACAFD25009E@webabinitio.net> <1330375169.12046.133.camel@thinko> <1330377399.12046.158.camel@thinko> <20120227215829.1DF3B2500E4@webabinitio.net> <4F4BFF98.2080007@active-4.com> <20120228011601.Horde.1Bj-UElCcOxPTBzBEJEUvGA@webmail.df.eu> <4F4C88B2.40706@redhat.com> <20120228125226.1aa8004a@pitrou.net> Message-ID:

All the various strategies for supporting Python 2 and Python 3, as well as their various drawbacks and ways around them, are covered in my book, chapter 2. :-) http://python3porting.com/strategies.html I may be too late to point this out, but it feels like this discussion could have been shorter if everyone read this first. :-) //Lennart

From regebro at gmail.com Wed Feb 29 07:34:42 2012 From: regebro at gmail.com (Lennart Regebro) Date: Wed, 29 Feb 2012 07:34:42 +0100 Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 In-Reply-To: References: <4F49434B.6050604@active-4.com> <4F4B5634.3020609@v.loewis.de> <4F4BB7F2.4070804@stoneleaf.us> <20120227174110.96AFA2500E4@webabinitio.net> <1330365662.12046.72.camel@thinko> <1330372221.12046.119.camel@thinko> <20120227202335.ACAFD25009E@webabinitio.net> <1330375169.12046.133.camel@thinko> <1330377399.12046.158.camel@thinko> <20120227215829.1DF3B2500E4@webabinitio.net> <4F4BFF98.2080007@active-4.com> <20120228011601.Horde.1Bj-UElCcOxPTBzBEJEUvGA@webmail.df.eu> <4F4C88B2.40706@redhat.com> <20120228125226.1aa8004a@pitrou.net> <1330431576.3400.2.camel@localhost.localdomain> <4F4CE2BE.3060408@gmail.com> Message-ID:

On Tue, Feb 28, 2012 at 16:30, Giampaolo Rodolà wrote: > On 28 February 2012 15:20, Ezio Melotti wrote: >> (Note: there are also other costs -- e.g. releasing -- that I haven't >> considered because they don't affect me personally, but I'm not sure they >> are big enough to make the two-branches approach worse.) > > They are. > With that kind of approach you're basically forced to include the > python version number as part of the tarball name (e.g. > foo-0.3.1-py2.tar.gz and foo-0.3.1-py3.tar.gz). Not at all. You can include both code bases in one package. http://python3porting.com/2to3.html#distributing-packages //Lennart

From regebro at gmail.com Wed Feb 29 07:35:55 2012 From: regebro at gmail.com (Lennart Regebro) Date: Wed, 29 Feb 2012 07:35:55 +0100 Subject: [Python-Dev] PEP 414 - Unicode Literals for Python 3 In-Reply-To: References: <4F49434B.6050604@active-4.com> <4F4B5634.3020609@v.loewis.de> <4F4BB7F2.4070804@stoneleaf.us> <20120227174110.96AFA2500E4@webabinitio.net> <1330365662.12046.72.camel@thinko> <1330372221.12046.119.camel@thinko> <20120227202335.ACAFD25009E@webabinitio.net> <1330375169.12046.133.camel@thinko> <1330377399.12046.158.camel@thinko> <20120227215829.1DF3B2500E4@webabinitio.net> <4F4BFF98.2080007@active-4.com> <20120228011601.Horde.1Bj-UElCcOxPTBzBEJEUvGA@webmail.df.eu> <4F4C88B2.40706@redhat.com> <20120228125226.1aa8004a@pitrou.net> Message-ID:

On Tue, Feb 28, 2012 at 16:39, Vinay Sajip wrote: > Serhiy Storchaka gmail.com> writes: > >> Another pertinent question: "What are the disadvantages if PEP 414 is adopted?" > > It's moot, but as I see it: the purpose of PEP 414 is to facilitate a single > codebase across 2.x and 3.x. The bytes/native/unicode issue is an issue even if you use 2to3. But of course that *is* a form of "single codebase" so maybe that's what you meant. :-) //Lennart

From regebro at gmail.com Wed Feb 29 07:52:39 2012 From: regebro at gmail.com (Lennart Regebro) Date: Wed, 29 Feb 2012 07:52:39 +0100 Subject: [Python-Dev] Backporting PEP 414 In-Reply-To: <4F4D38B6.4020103@stoneleaf.us> References: <4F4D38B6.4020103@stoneleaf.us> Message-ID:

On Tue, Feb 28, 2012 at 21:27, Ethan Furman wrote: > Here's what I know: > > We don't add features to bug-fix releases. > u'' is considered a feature. > By not backporting to 3.1 and 3.2 we are not easing the migration pains from > 2.x. If this is added to 3.2.3, then some programs will work with 3.2.3, but not 3.2.2. I'm pretty sure this will confuse people no end. :-) //Lennart

From stephen at xemacs.org Wed Feb 29 08:23:53 2012 From: stephen at xemacs.org (Stephen J.
Turnbull) Date: Wed, 29 Feb 2012 16:23:53 +0900 Subject: [Python-Dev] Spreading the Python 3 religion (was Re: PEP 414 - Unicode Literals for Python 3) In-Reply-To: <20120228175056.Horde.KfPofklCcOxPTQXw0KqW1nA@webmail.df.eu> References: <4F49434B.6050604@active-4.com> <4F4B5634.3020609@v.loewis.de> <4F4BB7F2.4070804@stoneleaf.us> <20120227174110.96AFA2500E4@webabinitio.net> <1330365662.12046.72.camel@thinko> <1330372221.12046.119.camel@thinko> <20120227202335.ACAFD25009E@webabinitio.net> <1330375169.12046.133.camel@thinko> <1330377399.12046.158.camel@thinko> <20120227215829.1DF3B2500E4@webabinitio.net> <4F4BFF98.2080007@active-4.com> <20120228020445.6FB4C2500CF@webabinitio.net> <20120228134113.B0E9F2500E4@webabinitio.net> <20120228095357.2b9fde87@resist.wooz.org> <20120228175056.Horde.KfPofklCcOxPTQXw0KqW1nA@webmail.df.eu> Message-ID: <878vjmksqe.fsf@uwakimon.sk.tsukuba.ac.jp>

martin at v.loewis.de writes: > One thing that the PEP will certainly achieve is to spread the myth that > you cannot port to Python 3 if you also want to support Python 2.5. That's > because people will accept the "single source" approach as the one right way, > and will accept that this only works well with Python 2.6. Please, Martin, I dislike this idea as much as you do. (There was no -1 from me, though, because I don't work in the context of the claimed use cases at all, but lots of people obviously find them persuasive.) But in respect of myth-spreading, the problem with the PEP is the polemic tone. (Yeah, I've seen Armin's claim that it's not polemic. I disagree.) The unqualified claims that "2to3 is insufficient" and the PEP will "enable side-by-side support" of Python 2 and Python 3 by libraries are too extreme, and really unnecessary in light of Guido's logic for acceptance. As far as I can see, like 2to3, like u()/b(), this PEP introduces a device that will be the most *convenient* approach for *some* use cases. If it were presented that way, with recommendation for its use restricted to the particular intended use case, I don't think it would have a huge effect on people's perception of the difficulty of porting in general, including multiversion support including 2.5. If others want to use it, even though you and I think that's a bad idea, well, we can blog, and "consenting adults" covers those users. On the other hand, implementation of the PEP itself should have a positive effect on the community's perception of python-dev's responsiveness to its pain. Ie, a lot of us feel strongly that this is the wrong thing to do in principle -- but we're gonna do it anyway, because part of the community wants it. So, let's work on integrating this PEP into the more general framework of recommendations for porting Python 2 code to Python 3 and/or developing libraries targeting both.
From ncoghlan at gmail.com Wed Feb 29 08:55:30 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 29 Feb 2012 17:55:30 +1000 Subject: [Python-Dev] Spreading the Python 3 religion (was Re: PEP 414 - Unicode Literals for Python 3) In-Reply-To: <878vjmksqe.fsf@uwakimon.sk.tsukuba.ac.jp> References: <4F49434B.6050604@active-4.com> <4F4B5634.3020609@v.loewis.de> <4F4BB7F2.4070804@stoneleaf.us> <20120227174110.96AFA2500E4@webabinitio.net> <1330365662.12046.72.camel@thinko> <1330372221.12046.119.camel@thinko> <20120227202335.ACAFD25009E@webabinitio.net> <1330375169.12046.133.camel@thinko> <1330377399.12046.158.camel@thinko> <20120227215829.1DF3B2500E4@webabinitio.net> <4F4BFF98.2080007@active-4.com> <20120228020445.6FB4C2500CF@webabinitio.net> <20120228134113.B0E9F2500E4@webabinitio.net> <20120228095357.2b9fde87@resist.wooz.org> <20120228175056.Horde.KfPofklCcOxPTQXw0KqW1nA@webmail.df.eu> <878vjmksqe.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID:

On Wed, Feb 29, 2012 at 5:23 PM, Stephen J. Turnbull wrote: > martin at v.loewis.de writes: > > > One thing that the PEP will certainly achieve is to spread the myth that > > you cannot port to Python 3 if you also want to support Python 2.5. That's > > because people will accept the "single source" approach as the one right way, > > and will accept that this only works well with Python 2.6. > > Please, Martin, I dislike this idea as much as you do. (There was no > -1 from me, though, because I don't work in the context of the claimed > use cases at all, but lots of people obviously find them persuasive.) > > But in respect of myth-spreading, the problem with the PEP is the > polemic tone. (Yeah, I've seen Armin's claim that it's not polemic. > I disagree.) The unqualified claims that "2to3 is insufficient" and > the PEP will "enable side-by-side support" of Python 2 and Python 3 by > libraries are too extreme, and really unnecessary in light of Guido's > logic for acceptance. FWIW, I agree that much of the rhetoric in the current version of PEP 414 is excessive. Armin has given me permission to create an updated version of PEP 414 and toning down the hyperbole (or removing it entirely in cases where it's irrelevant to the final decision) is one of the things that I will be changing. I also plan to add a link to Lennart's guide to the various porting strategies that are currently available, more clearly articulate the cases where the new approach can most help (i.e. when there are project specific reasons to avoid the unicode_literals import), as well as name drop Pyramid (Chris McDonough), Flask (Armin), Django (Jacob Kaplan-Moss) and requests (Kenneth Reitz) as cases where key developers of web-related third party frameworks or libraries have indicated that PEP 414 will help greatly with bringing the sections of the Python ecosystem they're involved with into the Python 3 fold over the next few years. My aim is for the end result to better reflect the reasons why Guido *accepted* the PEP, more so than Armin's own reasons for *wanting* it. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com |
Brisbane, Australia

From devel at baptiste-carvello.net Wed Feb 29 09:59:51 2012 From: devel at baptiste-carvello.net (Baptiste Carvello) Date: Wed, 29 Feb 2012 09:59:51 +0100 Subject: [Python-Dev] Backporting PEP 414 In-Reply-To: References: <4F4D38B6.4020103@stoneleaf.us> <20120228164827.1bf4ab51@limelight.wooz.org> <1330467444.8772.11.camel@thinko> Message-ID:

On 29/02/2012 00:25, Nick Coghlan wrote: > Also, I think there may be some confusion about Armin's plan to handle > 3.2 - he aims to write an *import hook* that accepts the u/U prefixes > during tokenisation, not a source-to-source transform like 2to3. > This needs to be emphasized. Read from the outside, the whole PEP 414 discussion could give the idea that 3.2 is a second-class citizen for porting, like 3.0 and 3.1 have been. If such an opinion were to spread, it would be bad PR for Python 3 as a whole. Baptiste

From jnoller at gmail.com Wed Feb 29 11:51:36 2012 From: jnoller at gmail.com (Jesse Noller) Date: Wed, 29 Feb 2012 05:51:36 -0500 Subject: [Python-Dev] Spreading the Python 3 religion (was Re: PEP 414 - Unicode Literals for Python 3) In-Reply-To: References: <4F49434B.6050604@active-4.com> <4F4B5634.3020609@v.loewis.de> <4F4BB7F2.4070804@stoneleaf.us> <20120227174110.96AFA2500E4@webabinitio.net> <1330365662.12046.72.camel@thinko> <1330372221.12046.119.camel@thinko> <20120227202335.ACAFD25009E@webabinitio.net> <1330375169.12046.133.camel@thinko> <1330377399.12046.158.camel@thinko> <20120227215829.1DF3B2500E4@webabinitio.net> <4F4BFF98.2080007@active-4.com> <20120228020445.6FB4C2500CF@webabinitio.net> <20120228134113.B0E9F2500E4@webabinitio.net> <20120228095357.2b9fde87@resist.wooz.org> <20120228175056.Horde.KfPofklCcOxPTQXw0KqW1nA@webmail.df.eu> <878vjmksqe.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <82FF509173E84348A83196CFCB49E8A1@gmail.com>

> > FWIW, I agree that much of the rhetoric in the current version of PEP > 414 is excessive. > > Armin has given me permission to create an updated version of PEP 414 > and toning down the hyperbole (or removing it entirely in cases where > it's irrelevant to the final decision) is one of the things that I > will be changing. I also plan to add a link to Lennart's guide to the > various porting strategies that are currently available, more clearly > articulate the cases where the new approach can most help (i.e. when > there are project specific reasons to avoid the unicode_literals > import), as well as name drop Pyramid (Chris McDonough), Flask > (Armin), Django (Jacob Kaplan-Moss) and requests (Kenneth Reitz) as > cases where key developers of web-related third party frameworks or > libraries have indicated that PEP 414 will help greatly with bringing > the sections of the Python ecosystem they're involved with into the > Python 3 fold over the next few years. > > My aim is for the end result to better reflect the reasons why Guido > *accepted* the PEP, more so than Armin's own reasons for *wanting* it. > Thank you Nick and Armin. I think toning down the rhetoric is a very amicable solution.
Let me know if I need to add anything to http://getpython3.com/ (have linked porting guides there too if you want) jesse
From yselivanov at gmail.com Wed Feb 29 13:30:05 2012 From: yselivanov at gmail.com (Yury Selivanov) Date: Wed, 29 Feb 2012 07:30:05 -0500 Subject: [Python-Dev] PEP 414 In-Reply-To: <87A20E5B-D624-4F32-BEE5-57A5C6D83339@gmail.com> References: <4F49434B.6050604@active-4.com> <4F4A10C1.6040806@pearwood.info> <4F4A29BD.2090607@active-4.com> <4F4BA4E0.80806@active-4.com> <4F4C0600.5010903@active-4.com> <87A20E5B-D624-4F32-BEE5-57A5C6D83339@gmail.com> Message-ID: <918CA06F-DFAE-4696-A824-D1559DD58010@gmail.com>

Armin, I see you've (or somebody) changed: """As it stands, Python 3 is currently a bad choice for long-term investments, since the ecosystem is not yet properly developed, and libraries are still fighting with their API decisions for Python 3.""" to: """As it stands, when choosing between 2.7 and Python 3.2, Python 3 is currently not the best choice for certain long-term investments, since the ecosystem is not yet properly developed, and libraries are still fighting with their API decisions for Python 3.""" Could you just remove the statement completely? Again, my understanding of what is the best choice for certain *long-term* investments is drastically different from yours. In my opinion, python 3 is much more suitable for anything *long-term* than python 2. I don't think that PEPs are the right place to put such polemic and biased statements. Nobody asked you to express your *personal* feelings and thoughts about applicability or state of python3 in the PEP. There are blogs for that. Thank you. - Yury

On 2012-02-28, at 11:29 AM, Yury Selivanov wrote: > Hi Armin, > > Could you please remove from the PEP the following statement: > > """As it stands, Python 3 is currently a bad choice for long-term > investments, since the ecosystem is not yet properly developed, and > libraries are still fighting with their API decisions for Python 3.""" > > While it may be as such for you, I think it is incorrect for the rest. > Moreover, it is harmful for python 3 adoption to put such documents > on python.org. > > The python ecosystem is not just limited to WSGI apps, Django and Flask. > Yes, we don't have all the packages on pypi support python 3, but many > of those are portable within 10 minutes to a couple of hours of work (and > I did many of such ports for our internal systems.) And many of the > essential packages do exist for python 3, like numpy, zeromq etc. > > I know several start-ups, including mine, that develop huge commercial > applications entirely on python 3. > > Thanks, > -Yury > > On 2012-02-27, at 5:38 PM, Armin Ronacher wrote: > >> Hi, >> >> On 2/27/12 10:18 PM, Terry Reedy wrote: >>> I would like to know if you think that this one change is enough to do >>> agile development and testing, etc, or whether, as Chris McDonough >>> hopes, this is just the first of a series of proposals you have planned. >> Indeed I have three other PEPs in the work. The reintroduction of >> "except (((ExceptionType),),)", the "<>" comparison operator and the >> removal of "nonlocal", the latter to make Python 2.x developers feel >> better about themselves.
:-) >> >> >> Regards, >> Armin >> _______________________________________________ >> Python-Dev mailing list >> Python-Dev at python.org >> http://mail.python.org/mailman/listinfo/python-dev >> Unsubscribe: http://mail.python.org/mailman/options/python-dev/yselivanov.ml%40gmail.com >

From barry at python.org Wed Feb 29 15:28:56 2012 From: barry at python.org (Barry Warsaw) Date: Wed, 29 Feb 2012 09:28:56 -0500 Subject: [Python-Dev] PEP 414 In-Reply-To: <918CA06F-DFAE-4696-A824-D1559DD58010@gmail.com> References: <4F49434B.6050604@active-4.com> <4F4A10C1.6040806@pearwood.info> <4F4A29BD.2090607@active-4.com> <4F4BA4E0.80806@active-4.com> <4F4C0600.5010903@active-4.com> <87A20E5B-D624-4F32-BEE5-57A5C6D83339@gmail.com> <918CA06F-DFAE-4696-A824-D1559DD58010@gmail.com> Message-ID: <20120229092856.2aeb9256@limelight.wooz.org>

On Feb 29, 2012, at 07:30 AM, Yury Selivanov wrote: >"""As it stands, Python 3 is currently a bad choice for long-term >investments, since the ecosystem is not yet properly developed, and >libraries are still fighting with their API decisions for Python 3.""" > >to: > >"""As it stands, when choosing between 2.7 and Python 3.2, Python 3 >is currently not the best choice for certain long-term investments, >since the ecosystem is not yet properly developed, and libraries are >still fighting with their API decisions for Python 3.""" > >Could you just remove the statement completely? If I read correctly, Nick is undertaking a rewrite of PEP 414, which should help a lot. -Barry

From victor.stinner at haypocalc.com Wed Feb 29 19:21:37 2012 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Wed, 29 Feb 2012 19:21:37 +0100 Subject: [Python-Dev] PEP 416: Add a frozendict builtin type Message-ID:

As requested, I created a PEP and a related issue: http://www.python.org/dev/peps/pep-0416/ http://bugs.python.org/issue14162 PEP 416 is different from my previous proposals: frozendict values can be mutable and dict doesn't inherit from frozendict anymore. But it is still possible to use the PyDict C API on frozendict (which is more an implementation detail). TODO:

- write the documentation
- decide if new functions should be added to the C API, maybe a new PyFrozenDict_New() function? (but would it take a mapping or a list of tuples?)

--

PEP: 416
Title: Add a frozendict builtin type
Version: $Revision$
Last-Modified: $Date$
Author: Victor Stinner
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 29-February-2012
Python-Version: 3.3

Abstract
========

Add a new frozendict builtin type.

Rationale
=========

A frozendict mapping cannot be changed, but its values can be mutable (not hashable). A frozendict is hashable and so immutable if all values are hashable (immutable).
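For illustration, the intended behaviour would be along these lines (a sketch of the proposal, not output from an existing implementation; the exact error messages are invented)::

    >>> d = frozendict({'a': 1, 'b': [2, 3]})
    >>> d['a']
    1
    >>> d['a'] = 10
    Traceback (most recent call last):
      ...
    TypeError: 'frozendict' object does not support item assignment
    >>> hash(d)                       # the list value is not hashable
    Traceback (most recent call last):
      ...
    TypeError: unhashable type: 'list'
    >>> {frozendict({'a': 1}): 'ok'}  # hashable values: usable as a dict key
    {frozendict({'a': 1}): 'ok'}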
Use cases of frozendict:

* hashable frozendict can be used as a key of a mapping or as a member of a set
* frozendict helps optimization because the mapping is constant
* frozendict avoids the need for a lock when the frozendict is shared by multiple threads or processes, especially hashable frozendict

Constraints
===========

* frozendict has to implement the Mapping abstract base class
* frozendict keys and values can be unorderable
* a frozendict is hashable if all keys and values are hashable
* frozendict hash does not depend on the items creation order

Implementation
==============

* Add a PyFrozenDictObject structure based on PyDictObject with an extra "Py_hash_t hash;" field
* frozendict.__hash__() is implemented using hash(frozenset(self.items())) and caches the result in its private hash attribute
* Register frozendict has a collections.abc.Mapping
* frozendict can be used with PyDict_GetItem(), but PyDict_SetItem() and PyDict_DelItem() raise a TypeError

Recipe: immutable dict
======================

An immutable mapping can be implemented using frozendict::

    import itertools

    class immutabledict(frozendict):
        def __new__(cls, *args, **kw):
            # ensure that all values are immutable
            for key, value in itertools.chain(args, kw.items()):
                if not isinstance(value, (int, float, complex, str, bytes)):
                    hash(value)
            # frozendict ensures that all keys are immutable
            return frozendict.__new__(cls, *args, **kw)

        def __repr__(self):
            return 'immutabledict' + frozendict.__repr__(self)[10:]
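For comparison, the semantics described above can be approximated in pure Python, which is roughly what the existing recipes do (an illustrative sketch only -- not the proposed C implementation, and the class name is invented)::

    import collections.abc

    class pyfrozendict(collections.abc.Mapping):
        def __init__(self, *args, **kw):
            self._data = dict(*args, **kw)   # private snapshot of the items
            self._cached_hash = None

        def __getitem__(self, key):
            return self._data[key]

        def __iter__(self):
            return iter(self._data)

        def __len__(self):
            return len(self._data)

        def __hash__(self):
            # same algorithm as proposed for the C type: fails with
            # TypeError if any value is not hashable
            if self._cached_hash is None:
                self._cached_hash = hash(frozenset(self._data.items()))
            return self._cached_hash

        def __repr__(self):
            return 'pyfrozendict(%r)' % (self._data,)

Unlike the proposed C type, nothing stops code from reaching into _data; that fragility is part of the argument for a C implementation below.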
Objections
==========

*namedtuple may fit the requirements of a frozendict.*

A namedtuple is not a mapping, it does not implement the Mapping abstract base class.

*"frozendict can be implemented in Python using descriptors" and "frozendict just needs to be practically constant."*

If frozendict is used to harden Python (security purpose), it must be implemented in C. A type implemented in C is also faster.

*The PEP 351 was rejected.*

The PEP 351 tries to freeze an object and so may convert a mutable object to an immutable object (using a different type). frozendict doesn't convert anything: hash(frozendict) raises a TypeError if a value is not hashable. Freezing an object is not the purpose of this PEP.

Links
=====

* PEP 412: Key-Sharing Dictionary (`issue #13903 `_)
* PEP 351: The freeze protocol
* `The case for immutable dictionaries; and the central misunderstanding of PEP 351 `_
* `Frozen dictionaries (Python recipe 414283) `_ by Oren Tirosh

Copyright
=========

This document has been placed in the public domain.

From dmalcolm at redhat.com Wed Feb 29 19:52:28 2012 From: dmalcolm at redhat.com (David Malcolm) Date: Wed, 29 Feb 2012 13:52:28 -0500 Subject: [Python-Dev] PEP 416: Add a frozendict builtin type In-Reply-To: References: Message-ID: <1330541549.7844.69.camel@surprise>

On Wed, 2012-02-29 at 19:21 +0100, Victor Stinner wrote: > As requested, I created a PEP and a related issue: > > http://www.python.org/dev/peps/pep-0416/ [...snip...] > > Rationale > > ========= > > > > A frozendict mapping cannot be changed, but its values can be mutable > > (not hashable). A frozendict is hashable and so immutable if all > > values are hashable (immutable). The wording of the above seems very unclear to me. Do you mean "A frozendict has a constant set of keys, and for every key, d[key] has a specific value for the lifetime of the frozendict. However, these values *may* be mutable. The frozendict is hashable iff all of the values are hashable." ? (or somesuch) [...snip...] > * Register frozendict has a collections.abc.Mapping s/has/as/ ? [...snip...] > If frozendict is used to harden Python (security purpose), it must be > implemented in C. A type implemented in C is also faster. You mention security purposes here, but this isn't mentioned in the Rationale or Use Cases. Hope this is helpful, Dave

From eliben at gmail.com Wed Feb 29 20:10:55 2012 From: eliben at gmail.com (Eli Bendersky) Date: Wed, 29 Feb 2012 21:10:55 +0200 Subject: [Python-Dev] PEP 416: Add a frozendict builtin type In-Reply-To: <1330541549.7844.69.camel@surprise> References: <1330541549.7844.69.camel@surprise> Message-ID:

> > Rationale > > ========= > > > > A frozendict mapping cannot be changed, but its values can be mutable > > (not hashable). A frozendict is hashable and so immutable if all > > values are hashable (immutable). > The wording of the above seems very unclear to me. > > Do you mean "A frozendict has a constant set of keys, and for every key, > d[key] has a specific value for the lifetime of the frozendict. > However, these values *may* be mutable. The frozendict is hashable iff > all of the values are hashable." ? (or somesuch) > > [...snip...] > I agree that this sentence needs some clarification. David's formulation is also what I would guess it to mean, but it should be stated more explicitly. Eli -------------- next part -------------- An HTML attachment was scrubbed... URL:

From raymond.hettinger at gmail.com Wed Feb 29 20:17:05 2012 From: raymond.hettinger at gmail.com (Raymond Hettinger) Date: Wed, 29 Feb 2012 11:17:05 -0800 Subject: [Python-Dev] Add a frozendict builtin type In-Reply-To: References: Message-ID: <61817B63-B0D8-46FA-8284-45F88266516D@gmail.com>

On Feb 27, 2012, at 10:53 AM, Victor Stinner wrote: > A frozendict type is a common request from users and there are various > implementations. ISTM, this request is never from someone who has a use case. Instead, it almost always comes from "completers", people who see that we have a frozenset type and think the core devs missed the ObviousThingToDo(tm). Frozendicts are trivial to implement, so that is why there are various implementations (i.e. the implementations are more fun to write than they are to use). The frozenset type covers a niche case that is nice-to-have but *rarely* used. Many experienced Python users simply forget that we have a frozenset type. We don't get bug reports or feature requests about the type. When I do Python consulting work, I never see it in a client's codebase. It does occasionally get discussed in questions on StackOverflow but rarely gets offered as an answer (typically on variants of the "how do you make a set-of-sets" question). If Google's codesearch were still alive, we could add another datapoint showing how infrequently this type is used. I wrote the C implementation for frozensets and the tests that demonstrate their use in problems involving sets-of-sets, yet I have *needed* the frozenset once in my career (for an NFA/DFA conversion algorithm). From this experience, I conclude that adding a frozendict type would be a total waste (except that it would inspire more people to request frozen variants of other containers). Raymond P.S. The one advantage I can see for frozensets and frozendicts is that we have an opportunity to optimize them once they are built (optimizing insertion order to minimize collisions, increasing or decreasing density, eliminating dummy entries, etc). That being said, the same could be accomplished for regular sets and dicts by the addition of an optimize() method.
I'm not really enamoured of that idea though because it breaks the abstraction and because people don't seem to need it (i.e. it has never been requested). -------------- next part -------------- An HTML attachment was scrubbed... URL:

From eliben at gmail.com Wed Feb 29 20:33:43 2012 From: eliben at gmail.com (Eli Bendersky) Date: Wed, 29 Feb 2012 21:33:43 +0200 Subject: [Python-Dev] Add a frozendict builtin type In-Reply-To: <61817B63-B0D8-46FA-8284-45F88266516D@gmail.com> References: <61817B63-B0D8-46FA-8284-45F88266516D@gmail.com> Message-ID:

> The frozenset type covers a niche case that is nice-to-have but > *rarely* used. Many experienced Python users simply forget > that we have a frozenset type. We don't get bug reports or > feature requests about the type. When I do Python consulting > work, I never see it in a client's codebase. It does occasionally > get discussed in questions on StackOverflow but rarely gets > offered as an answer (typically on variants of the "how do you > make a set-of-sets" question). If Google's codesearch were still > alive, we could add another datapoint showing how infrequently > this type is used. > There are some alternatives to code.google.com, though. For example: http://www.koders.com/default.aspx?s=frozenset&submit=Search&la=Python&li=* From a cursory look: quite a few of the results are from the various Python implementations, and there is some duplication of projects, but it would be unfair to conclude that frozenset is not being used, since many of the results do look legitimate. This is not to argue in favor of or against frozendict, just stating that there's still a way to search code online :) Eli -------------- next part -------------- An HTML attachment was scrubbed... URL:

From stefan at bytereef.org Wed Feb 29 20:34:49 2012 From: stefan at bytereef.org (Stefan Krah) Date: Wed, 29 Feb 2012 20:34:49 +0100 Subject: [Python-Dev] State of PEP-3118 (memoryview part) In-Reply-To: <4F4AA2E9.1070901@canterbury.ac.nz> References: <20120226132721.GA1422@sleipnir.bytereef.org> <4F4AA2E9.1070901@canterbury.ac.nz> Message-ID: <20120229193449.GA32607@sleipnir.bytereef.org>

Greg Ewing wrote: >> Options 2) and 3) would ideally entail one backwards incompatible >> bugfix: In 2.7 and 3.2 assignment to a memoryview with format 'B' >> rejects integers but accepts byte objects, but according to the >> struct syntax mandated by the PEP it should be the other way round. > > Maybe a compromise could be made to accept both in the > backport? That would avoid breaking old code while allowing > code that does the right thing to work. This could definitely be done. But backporting is beginning to look unlikely, since we currently have three +1 for "too complex to backport". I'm not strongly in favor of backporting myself. The main reason for me would be to prevent having additional 2->3 or 3->2 porting obstacles. Stefan Krah

From stefan at bytereef.org Wed Feb 29 20:35:46 2012 From: stefan at bytereef.org (Stefan Krah) Date: Wed, 29 Feb 2012 20:35:46 +0100 Subject: [Python-Dev] State of PEP-3118 (memoryview part) In-Reply-To: <20120226225615.7100e78e@pitrou.net> References: <20120226132721.GA1422@sleipnir.bytereef.org> <20120226225615.7100e78e@pitrou.net> Message-ID: <20120229193546.GB32607@sleipnir.bytereef.org>

Antoine Pitrou wrote: > Stefan Krah wrote: > > In Python 3.3 most issues with the memoryview object have been fixed > > in a recent commit (3f9b3b6f7ff0). > > Oh and congrats for doing this, of course. Thanks! Stefan Krah

From eliben at gmail.com Wed Feb 29 20:48:21 2012 From: eliben at gmail.com (Eli Bendersky) Date: Wed, 29 Feb 2012 21:48:21 +0200 Subject: [Python-Dev] PEP 411: Provisional packages in the Python standard library In-Reply-To: References: Message-ID:

I have updated PEP 411, following the input from this discussion. The updated PEP is at: http://hg.python.org/peps/file/default/pep-0411.txt Changes:

- Specified that a package may remain provisional for longer than a single minor release
- Shortened the suggested documentation notice, linking to the glossary which will contain the full text
- Added a notice to the package's docstring, which may be programmatically inspected (see the sketch below)
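For instance, client code could detect the notice along these lines (a sketch only -- the final wording of the docstring notice is whatever the PEP ends up mandating, and 'regex' is used purely as a hypothetical example of a provisional package):

    import importlib

    def is_provisional(module_name):
        """Best-effort check for the PEP 411 docstring notice (assumed wording)."""
        mod = importlib.import_module(module_name)
        return 'provisional' in (mod.__doc__ or '').lower()

    # e.g. guard a dependency in an application that wants stable APIs only
    if is_provisional('regex'):
        print('warning: regex is provisional; its API may change')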
Eli -------------- next part -------------- An HTML attachment was scrubbed... URL:

From p.f.moore at gmail.com Wed Feb 29 22:08:20 2012 From: p.f.moore at gmail.com (Paul Moore) Date: Wed, 29 Feb 2012 21:08:20 +0000 Subject: [Python-Dev] Add a frozendict builtin type In-Reply-To: <61817B63-B0D8-46FA-8284-45F88266516D@gmail.com> References: <61817B63-B0D8-46FA-8284-45F88266516D@gmail.com> Message-ID:

On 29 February 2012 19:17, Raymond Hettinger wrote: > From this experience, I conclude that adding a frozendict type > would be a total waste (except that it would inspire more people > to request frozen variants of other containers). It would (apparently) help Victor to fix issues in his pysandbox project. I don't know if a secure Python sandbox is an important enough concept to warrant core changes to make it possible. However, if Victor was saying that implementing this PEP was all that is needed to implement a secure sandbox, then that would be a very different claim, and likely much more compelling (to some, at least - I have no personal need for a secure sandbox). Victor quotes 6 implementations. I don't see any rationale (either in the email that started this thread, or in the PEP) to explain why these aren't good enough, and in particular why the implementation has to be in the core. There's the hint in the PEP "If frozendict is used to harden Python (security purpose), it must be implemented in C". But why in the core (as opposed to an extension)? And why and how would frozendict help in hardening Python? As it stands, I don't find the PEP compelling. The hardening use case might be significant but Victor needs to spell it out if it's to make a difference. Paul.

From ncoghlan at gmail.com Wed Feb 29 22:13:15 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 1 Mar 2012 07:13:15 +1000 Subject: [Python-Dev] Add a frozendict builtin type In-Reply-To: References: <61817B63-B0D8-46FA-8284-45F88266516D@gmail.com> Message-ID:

On Thu, Mar 1, 2012 at 7:08 AM, Paul Moore wrote: > As it stands, I don't find the PEP compelling. The hardening use case > might be significant but Victor needs to spell it out if it's to make > a difference. +1 Avoiding-usenet-nod-syndrome'ly, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From ironfroggy at gmail.com Wed Feb 29 23:06:21 2012 From: ironfroggy at gmail.com (Calvin Spealman) Date: Wed, 29 Feb 2012 17:06:21 -0500 Subject: [Python-Dev] Backporting PEP 414 In-Reply-To: <20120229011313.Horde.zLfORVNNcXdPTW2ZumqDWGA@webmail.df.eu> References: <4F4D38B6.4020103@stoneleaf.us> <20120229011313.Horde.zLfORVNNcXdPTW2ZumqDWGA@webmail.df.eu> Message-ID:

On Feb 28, 2012 7:14 PM, wrote: >> >> Why is readding u'' a feature and not a bug? > > > There is a really simple litmus test for whether something is a bug: > does it deviate from the specification?
> In this case, the specification is the grammar, and the implementation > certainly doesn't deviate from it. So it can't be a bug. I don't think anyone can assert that the specification itself is immune to having "bugs". > > Regards, > Martin > > P.S. Before anybody over-interprets this criterion: there is certain > "implicit behavior" assumed in Python that may not actually be documented, > such as "the interpreter will not core dump", and "the source code will > compile with any standard C compiler". Deviation from these implicit > assumptions is also a bug. However, they don't apply here. > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/ironfroggy%40gmail.com -------------- next part -------------- An HTML attachment was scrubbed... URL:

From rdmurray at bitdance.com Wed Feb 29 23:43:45 2012 From: rdmurray at bitdance.com (R. David Murray) Date: Wed, 29 Feb 2012 17:43:45 -0500 Subject: [Python-Dev] Backporting PEP 414 In-Reply-To: References: <4F4D38B6.4020103@stoneleaf.us> <20120229011313.Horde.zLfORVNNcXdPTW2ZumqDWGA@webmail.df.eu> Message-ID: <20120229224346.D0FE82500CF@webabinitio.net>

On Wed, 29 Feb 2012 17:06:21 -0500, Calvin Spealman wrote: > On Feb 28, 2012 7:14 PM, wrote: > >> > >> Why is readding u'' a feature and not a bug? > > > > > > There is a really simple litmus test for whether something is a bug: > > does it deviate from the specification? > > > > In this case, the specification is the grammar, and the implementation > > certainly doesn't deviate from it. So it can't be a bug. > > I don't think anyone can assert that the specification itself is immune to > having "bugs". Yes, but specification bug fixes are new features :) --David