From martin at v.loewis.de Sat Oct 1 14:52:03 2011 From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=) Date: Sat, 01 Oct 2011 14:52:03 +0200 Subject: [Python-Dev] [Python-checkins] cpython: Enhance Py_ARRAY_LENGTH(): fail at build time if the argument is not an array In-Reply-To: <201109290345.59665.victor.stinner@haypocalc.com> References: <201109290345.59665.victor.stinner@haypocalc.com> Message-ID: <4E870CF3.60102@v.loewis.de> >> Do we really need a new file? Why not pyport.h where other compiler stuff >> goes? > > I'm not sure that pyport.h is the right place to add Py_MIN, Py_MAX, > Py_ARRAY_LENGTH. pyport.h looks to be related to all things specific to the > platform like INT_MAX, Py_VA_COPY, ... pymacro.h contains platform independent > macros. I'm -1 on additional header files as well. If no other reasonable place is found, Python.h is still available. Regards, Martin From stefan at bytereef.org Sat Oct 1 15:06:03 2011 From: stefan at bytereef.org (Stefan Krah) Date: Sat, 1 Oct 2011 15:06:03 +0200 Subject: [Python-Dev] PEP-393: request for keeping PyUnicode_EncodeDecimal() Message-ID: <20111001130603.GA16027@sleipnir.bytereef.org> Hello, the subject says it all. PyUnicode_EncodeDecimal() is listed among the deprecated functions. In cdecimal, I'm relying on this function for a number of reasons: * It is not trivial to implement. * With the Unicode implementation constantly changing, it is nearly impossible to know what input is currently regarded as a decimal digit. See also: http://bugs.python.org/issue10557 http://bugs.python.org/issue10557#msg123123 "The API won't go away (it does have its use and is being used in 3rd party extensions) [...]" Stefan Krah From martin at v.loewis.de Sat Oct 1 15:26:01 2011 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat, 01 Oct 2011 15:26:01 +0200 Subject: [Python-Dev] [Python-checkins] cpython: Implement PEP 393.
In-Reply-To: <4E83AC0C.2010006@trueblade.com> References: <4E83AC0C.2010006@trueblade.com> Message-ID: <4E8714E9.7020504@v.loewis.de> On 29.09.2011 01:21, Eric V. Smith wrote: > Is there some reason str.format had such major surgery done to it? Yes: I couldn't figure out how to do it any other way. The formatting code had a few basic assumptions which now break (unless you keep using the legacy API). Primarily, the assumption is that there is a notion of a "STRINGLIB_CHAR" which is the element of a string representation. With PEP 393, no such type exists anymore - it depends on the individual object what the element type for the representation is. In other cases, I worked around that by compiling the stringlib three times, for Py_UCS1, Py_UCS2, and Py_UCS4. For one, this gives considerable code bloat, which I didn't like for the formatting code (as that is already a considerable amount of code). More importantly, this approach wouldn't have worked well, anyway, since the formatting combines multiple Unicode objects (especially with the OutputString buffer), and different inputs may have different representations. On top of that, OutputString needs widening support, starting out with a narrow string, and widening step-by-step as input strings are wider than the current output (or not, if the input strings are all ASCII). It would have been possible to keep the basic structure by doing all formatting in Py_UCS4. This would cost a significant memory and runtime overhead. > In addition, there are outstanding patches that are now broken. I'm sorry about that. Try applying them to the new files, though - patch may still be able to figure out how to integrate them, as the algorithms and function structure haven't changed. > I'd prefer it return to how it used to be, and just the minimum changes > required for PEP 393 be made to it. Please try for yourself.
On string_format.h, I think there is zero chance, unless you want to compromise on efficiency (in addition to the already-present compromise on code cleanliness, due to the fact that the code is more general than it needs to be). On formatter.h, it may actually be possible to restore what it was - in particular if you can make a guarantee that all number formatting always outputs ASCII-strings only (which I'm not so sure about, as the thousands separator could be any character, in principle). Without that guarantee, it may indeed be reasonable to compile formatter.h in Py_UCS4, since the resulting strings will be small, so the overhead is probably negligible. Regards, Martin From martin at v.loewis.de Sat Oct 1 16:14:51 2011 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat, 01 Oct 2011 16:14:51 +0200 Subject: [Python-Dev] PEP-393: request for keeping PyUnicode_EncodeDecimal() In-Reply-To: <20111001130603.GA16027@sleipnir.bytereef.org> References: <20111001130603.GA16027@sleipnir.bytereef.org> Message-ID: <4E87205B.905@v.loewis.de> > the subject says it all. PyUnicode_EncodeDecimal() is listed among > the deprecated functions. Please see the section on deprecation. None of the deprecated functions will be removed for a period of five years, and afterwards, they will be kept until usage outside of the core is low. Most likely, this means they will be kept until Python 4. > * It is not trivial to implement. > > * With the Unicode implementation constantly changing, it is nearly > impossible to know what input is currently regarded as a decimal > digit. See also: I still recommend that you come up with your own implementation of that algorithm. You probably don't need any of the error handler support, which makes up the largest portion of the code. Then, use Py_UNICODE_TODECIMAL to process individual characters. It's a simple loop over every character. In addition, you could also take the same approach as decimal.py, i.e.
do self._int = str(int(intpart+fracpart)) This would improve compatibility with the decimal.py implementation, which doesn't use PyUnicode_EncodeDecimal either (but instead goes through _PyUnicode_TransformDecimalAndSpaceToASCII). Regards, Martin From stefan at bytereef.org Sat Oct 1 16:58:59 2011 From: stefan at bytereef.org (Stefan Krah) Date: Sat, 1 Oct 2011 16:58:59 +0200 Subject: [Python-Dev] PEP-393: request for keeping PyUnicode_EncodeDecimal() In-Reply-To: <4E87205B.905@v.loewis.de> References: <20111001130603.GA16027@sleipnir.bytereef.org> <4E87205B.905@v.loewis.de> Message-ID: <20111001145859.GA16431@sleipnir.bytereef.org> "Martin v. Löwis" wrote: > > the subject says it all. PyUnicode_EncodeDecimal() is listed among > > the deprecated functions. > > Please see the section on deprecation. None of the deprecated functions > will be removed for a period of five years, and afterwards, they will > be kept until usage outside of the core is low. Most likely, this means > they will be kept until Python 4. I have to confess that I missed that; sounds good. > In addition, you could also take the same approach as decimal.py, > i.e. do > > self._int = str(int(intpart+fracpart)) > > This would improve compatibility with the decimal.py implementation, > which doesn't use PyUnicode_EncodeDecimal either (but instead goes > through _PyUnicode_TransformDecimalAndSpaceToASCII). longobject.c still used PyUnicode_EncodeDecimal() until 10 months ago (8304bd765bcf). I missed the PyUnicode_TransformDecimalToASCII() commit, probably because #10557 is still open. That's why I wouldn't like to implement the function myself at least until the API is settled. I see this in the new code: #if 0 static PyObject * unicode__decimal2ascii(PyObject *self) { return PyUnicode_TransformDecimalAndSpaceToASCII(self); } #endif Will PyUnicode_TransformDecimalAndSpaceToASCII() be public?
Stefan Krah From solipsis at pitrou.net Sat Oct 1 17:18:42 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sat, 1 Oct 2011 17:18:42 +0200 Subject: [Python-Dev] cpython: Add _PyUnicode_UTF8() and _PyUnicode_UTF8_LENGTH() macros References: Message-ID: <20111001171842.63d48736@pitrou.net> On Sat, 01 Oct 2011 16:53:44 +0200 victor.stinner wrote: > http://hg.python.org/cpython/rev/4afab01f5374 > changeset: 72565:4afab01f5374 > user: Victor Stinner > date: Sat Oct 01 16:48:13 2011 +0200 > summary: > Add _PyUnicode_UTF8() and _PyUnicode_UTF8_LENGTH() macros > > * Rename existing _PyUnicode_UTF8() macro to PyUnicode_UTF8() Wouldn't this be better called PyUnicode_AS_UTF8()? From martin at v.loewis.de Sat Oct 1 17:40:28 2011 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat, 01 Oct 2011 17:40:28 +0200 Subject: [Python-Dev] PEP-393: request for keeping PyUnicode_EncodeDecimal() In-Reply-To: <20111001145859.GA16431@sleipnir.bytereef.org> References: <20111001130603.GA16027@sleipnir.bytereef.org> <4E87205B.905@v.loewis.de> <20111001145859.GA16431@sleipnir.bytereef.org> Message-ID: <4E87346C.3060109@v.loewis.de> > longobject.c still used PyUnicode_EncodeDecimal() until 10 months > ago (8304bd765bcf). I missed the PyUnicode_TransformDecimalToASCII() > commit, probably because #10557 is still open. > > That's why I wouldn't like to implement the function myself at least > until the API is settled. I don't understand. If you implement it yourself, you don't have to worry at all about what the API is. Py_UNICODE_TODECIMAL has been around for a long time, and will stay, no matter how number parsing is implemented. That's all you need.
out = malloc(PyUnicode_GET_LENGTH(in)+1); for (i = 0; i < PyUnicode_GET_LENGTH(in); i++) { Py_UCS4 ch = PyUnicode_READ_CHAR(in, i); int d = Py_UNICODE_TODECIMAL(ch); if (d != -1) { out[i] = '0'+d; continue; } if (ch < 128) out[i] = ch; else { error(); return; } } out[i] = '\0'; OTOH, *if* number parsing is ever updated (e.g. to consider alternative decimal points), PyUnicode_EncodeDecimal still won't be changed - it will continue to do exactly what it does today. > Will PyUnicode_TransformDecimalAndSpaceToASCII() be public? It's already included in 3.2, so it can't be removed that easily. I wish it had been private, though - we have way too many API functions dealing with Unicode. Regards, Martin From martin at v.loewis.de Sat Oct 1 17:47:26 2011 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat, 01 Oct 2011 17:47:26 +0200 Subject: [Python-Dev] cpython: Add _PyUnicode_UTF8() and _PyUnicode_UTF8_LENGTH() macros In-Reply-To: <20111001171842.63d48736@pitrou.net> References: <20111001171842.63d48736@pitrou.net> Message-ID: <4E87360E.3000307@v.loewis.de> On 01.10.2011 17:18, Antoine Pitrou wrote: > On Sat, 01 Oct 2011 16:53:44 +0200 > victor.stinner wrote: >> http://hg.python.org/cpython/rev/4afab01f5374 >> changeset: 72565:4afab01f5374 >> user: Victor Stinner >> date: Sat Oct 01 16:48:13 2011 +0200 >> summary: >> Add _PyUnicode_UTF8() and _PyUnicode_UTF8_LENGTH() macros >> >> * Rename existing _PyUnicode_UTF8() macro to PyUnicode_UTF8() > > Wouldn't this be better called PyUnicode_AS_UTF8()? No. _AS_UTF8 would imply that some conversion function is called. In this case, it's a pure structure accessor macro, that may give NULL if the pointer is not yet filled out. It's not called Py_AS_TYPE, but Py_TYPE; likewise not PyWeakref_AS_OBJECT, but PyWeakref_GET_OBJECT. In this case, PyUnicode_GET_UTF8 might have been an alternative.
Regards, Martin From solipsis at pitrou.net Sat Oct 1 17:48:35 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sat, 1 Oct 2011 17:48:35 +0200 Subject: [Python-Dev] cpython: Add _PyUnicode_UTF8() and _PyUnicode_UTF8_LENGTH() macros In-Reply-To: <4E87360E.3000307@v.loewis.de> References: <20111001171842.63d48736@pitrou.net> <4E87360E.3000307@v.loewis.de> Message-ID: <20111001174835.26c1155f@pitrou.net> On Sat, 01 Oct 2011 17:47:26 +0200 "Martin v. Löwis" wrote: > On 01.10.2011 17:18, Antoine Pitrou wrote: > > On Sat, 01 Oct 2011 16:53:44 +0200 > > victor.stinner wrote: > >> http://hg.python.org/cpython/rev/4afab01f5374 > >> changeset: 72565:4afab01f5374 > >> user: Victor Stinner > >> date: Sat Oct 01 16:48:13 2011 +0200 > >> summary: > >> Add _PyUnicode_UTF8() and _PyUnicode_UTF8_LENGTH() macros > >> > >> * Rename existing _PyUnicode_UTF8() macro to PyUnicode_UTF8() > > > > Wouldn't this be better called PyUnicode_AS_UTF8()? > > No. _AS_UTF8 would imply that some conversion function is called. PyBytes_AS_STRING doesn't call any conversion function, and neither did PyUnicode_AS_UNICODE. From martin at v.loewis.de Sat Oct 1 17:52:29 2011 From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=) Date: Sat, 01 Oct 2011 17:52:29 +0200 Subject: [Python-Dev] What it takes to change a single keyword. In-Reply-To: References: Message-ID: <4E87373D.2030503@v.loewis.de> > First of all, I am sincerely sorry if this is the wrong mailing list to ask > this question. I checked out the descriptions of a couple of other mailing lists, > and this one seemed most suitable. Here is my question: In principle, python-list would be more appropriate, but this really is a border case. So welcome! > Let's say I want to change a single keyword, let's say the import keyword, > to be spelled as something else, like its translation to my language. I > guess it would be more complicated than modifying Grammar/Grammar, but > I can't be sure which files should get edited. Hmm.
I also think editing Grammar/Grammar should be sufficient. Try restricting yourself to ASCII keywords first; this just worked fine for me. Of course, if you change a single keyword, none of the existing Python code will work anymore. See for yourself by changing 'def' to 'fed' (say). Regards, Martin From victor.stinner at haypocalc.com Sat Oct 1 19:17:56 2011 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Sat, 1 Oct 2011 19:17:56 +0200 Subject: [Python-Dev] RFC: Add a new builtin strarray type to Python? Message-ID: <201110011917.56330.victor.stinner@haypocalc.com> Hi, Since the integration of PEP 393, str += str is no longer super-fast (but just fast). For example, adding a single character to a string has to copy all characters to a new string. I suppose that the performance of a lot of applications manipulating text may be affected by this issue, especially text templating libraries. io.StringIO has also been changed to store characters as Py_UCS4 (4 bytes) instead of Py_UNICODE (2 or 4 bytes). This class doesn't benefit from the new PEP 393. I propose to add a new builtin type to Python to improve both issues (cpu and memory): *strarray*. This type would have the same API as str, except: * it has append() and extend() methods * method results are strarray instead of str I'm writing this email to ask you if this type solves a real issue, or if we can just promote the super-fast str.join(list of str) pattern. -- strarray is similar to bytearray, but different: strarray('abc')[0] is 'a', not 97, and strarray can store any Unicode character (not only integers in range 0-255). I wrote a quick and dirty implementation in Python just to be able to play with the API, and to have an idea of the quantity of work required to implement it: https://bitbucket.org/haypo/misc/src/tip/python/strarray.py (Some methods are untested: see the included TODO list.)
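[Editor's note: to make the proposed API concrete for readers, here is a minimal pure-Python sketch of such a type. This is illustrative only — it is not Victor's linked implementation, and the list-of-fragments internal representation is an assumption.]

```python
class StrArray:
    """Sketch of a mutable string builder with a str-like flavor.

    Fragments are kept in a list, so append() is amortized O(1);
    the O(total length) join happens only when the value is needed.
    """

    def __init__(self, initial=""):
        self._parts = [initial] if initial else []

    def append(self, s):
        self._parts.append(s)

    def extend(self, strings):
        self._parts.extend(strings)

    def _collapse(self):
        # Fuse accumulated fragments into a single str.
        if len(self._parts) != 1:
            self._parts = ["".join(self._parts)]
        return self._parts[0]

    def __getitem__(self, index):
        # strarray('abc')[0] is 'a', not 97 (unlike bytearray).
        return self._collapse()[index]

    def __len__(self):
        return sum(len(p) for p in self._parts)

    def __str__(self):
        return self._collapse()


sa = StrArray("abc")
sa.append("d")
sa.extend(["e", "f"])
print(sa[0], len(sa), str(sa))  # prints: a 6 abcdef
```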
-- Implementing strarray in C is not trivial and it would be easier to implement it in 3 steps: (a) Use a Py_UCS4 array (b) The array type depends on the content: best memory footprint, as in PEP 393 (c) Use strarray to implement a new io.StringIO Or we can just stop after step (a). -- The strarray API has to be discussed. Most bytearray methods return a new object in most cases. I don't understand why, it's not efficient. I don't know if we can do in-place operations for strarray methods having the same name as bytearray methods (which are not in-place methods). str has some more methods that bytes and bytearray don't have, like format. We may do in-place operations for these methods. Victor From victor.stinner at haypocalc.com Sat Oct 1 19:21:52 2011 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Sat, 1 Oct 2011 19:21:52 +0200 Subject: [Python-Dev] cpython: Add _PyUnicode_UTF8() and _PyUnicode_UTF8_LENGTH() macros In-Reply-To: <20111001171842.63d48736@pitrou.net> References: <20111001171842.63d48736@pitrou.net> Message-ID: <201110011921.52537.victor.stinner@haypocalc.com> On Saturday 1 October 2011 17:18:42, Antoine Pitrou wrote: > On Sat, 01 Oct 2011 16:53:44 +0200 > > victor.stinner wrote: > > http://hg.python.org/cpython/rev/4afab01f5374 > > changeset: 72565:4afab01f5374 > > user: Victor Stinner > > date: Sat Oct 01 16:48:13 2011 +0200 > > > > summary: > > Add _PyUnicode_UTF8() and _PyUnicode_UTF8_LENGTH() macros > > > > * Rename existing _PyUnicode_UTF8() macro to PyUnicode_UTF8() > > Wouldn't this be better called PyUnicode_AS_UTF8()? All these macros are private and are just used to make the C code more readable. For example, _PyUnicode_UTF8() just gives access to a field of a structure after casting the object to the right type. We may drop "PyUnicode_" and "_PyUnicode_" prefixes if these names are confusing.
Victor From victor.stinner at haypocalc.com Sat Oct 1 20:02:23 2011 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Sat, 1 Oct 2011 20:02:23 +0200 Subject: [Python-Dev] =?utf-8?q?=5BPython-checkins=5D_cpython=3A_=3D=3Futf?= =?utf-8?b?LTg/cT9FbmhhbmNlPTA5UHk9NUZBUlJBWT01RkxFTkdUSD89KCk6IGZhaWwg?= =?utf-8?q?at_build_time_if_the_argument_is_not=09an_array?= In-Reply-To: <4E870CF3.60102@v.loewis.de> References: <201109290345.59665.victor.stinner@haypocalc.com> <4E870CF3.60102@v.loewis.de> Message-ID: <201110012002.23140.victor.stinner@haypocalc.com> On Saturday 1 October 2011 14:52:03, you wrote: > >> Do we really need a new file? Why not pyport.h where other compiler > >> stuff goes? > > > > I'm not sure that pyport.h is the right place to add Py_MIN, Py_MAX, > > Py_ARRAY_LENGTH. pyport.h looks to be related to all things specific to > > the platform like INT_MAX, Py_VA_COPY, ... pymacro.h contains platform > > independent macros. > > I'm -1 on additional header files as well. If no other reasonable place > is found, Python.h is still available. I moved them to pymacro.h because I don't consider Python.h as a reasonable place for them. Victor From victor.stinner at haypocalc.com Sat Oct 1 22:06:11 2011 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Sat, 1 Oct 2011 22:06:11 +0200 Subject: [Python-Dev] RFC: Add a new builtin strarray type to Python? In-Reply-To: <201110011917.56330.victor.stinner@haypocalc.com> References: <201110011917.56330.victor.stinner@haypocalc.com> Message-ID: <201110012206.11806.victor.stinner@haypocalc.com> > Since the integration of PEP 393, str += str is no longer super-fast > (but just fast). Oh oh. str += str is now *1450x* slower than the ''.join() pattern.
Here is a benchmark (see attached script, bench_build_str.py): Python 3.3 str += str : 14548 ms ''.join() : 10 ms StringIO.write: 12 ms StringBuilder : 30 ms array('u') : 67 ms Python 3.2 str += str : 9 ms ''.join() : 9 ms StringIO.write: 9 ms StringBuilder : 30 ms array('u') : 77 ms (FYI results are very different in Python 2) I expect performance similar to StringIO.write if strarray is implemented using a Py_UCS4 buffer, as in io.StringIO. PyPy has a UnicodeBuilder class (in __pypy__.builders): it has append(), append_slice() and build() methods. In PyPy, it is the fastest method to build a string: PyPy 1.6 ''.join() : 16 ms StringIO.join : 24 ms StringBuilder : 9 ms array('u') : 66 ms It is even faster if you specify the size to the constructor: 3 ms. > I'm writing this email to ask you if this type solves a real issue, or if > we can just promote the super-fast str.join(list of str) pattern. Hum, it looks like "What is the most efficient string concatenation method in python?" is a frequently asked question. There is a recent thread on the python-ideas mailing list: "Create a StringBuilder class and use it everywhere" http://code.activestate.com/lists/python-ideas/11147/ (I just subscribed to this list.) Another alternative is a "string-join" object. It is discussed (and implemented) in the following issue, and PyPy also has an optional implementation: http://bugs.python.org/issue1569040 http://codespeak.net/pypy/dist/pypy/doc/interpreter-optimizations.html#string-join-objects Note: Python 2 has UserString.MutableString (and Python 3 has collections.UserString). Victor -------------- next part -------------- A non-text attachment was scrubbed... Name: bench_build_str.py Type: text/x-python Size: 1566 bytes Desc: not available URL: From eric at trueblade.com Sat Oct 1 22:07:47 2011 From: eric at trueblade.com (Eric V. Smith) Date: Sat, 01 Oct 2011 16:07:47 -0400 Subject: [Python-Dev] [Python-checkins] cpython: Implement PEP 393.
In-Reply-To: <4E8714E9.7020504@v.loewis.de> References: <4E83AC0C.2010006@trueblade.com> <4E8714E9.7020504@v.loewis.de> Message-ID: <4E877313.1000909@trueblade.com> On 10/1/2011 9:26 AM, "Martin v. L?wis" wrote: > Am 29.09.2011 01:21, schrieb Eric V. Smith: >> Is there some reason str.format had such major surgery done to it? > > Yes: I couldn't figure out how to do it any other way. The formatting > code had a few basic assumptions which now break (unless you keep using > the legacy API). Primarily, the assumption is that there is a notion of > a "STRINGLIB_CHAR" which is the element of a string representation. With > PEP 393, no such type exists anymore - it depends on the individual > object what the element type for the representation is. Martin: Thanks so much for your thoughtful answer. You've obviously given this more thought than I have. From your answer, it does indeed sound like string_format.h needs to be removed from stringlib. I'll have to think more about formatter.h. On the other hand, not having this code in stringlib would certainly be liberating! Maybe I'll take this opportunity to clean it up and simplify it now that it's free of the stringlib constraints. Eric. From solipsis at pitrou.net Sat Oct 1 22:21:01 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sat, 1 Oct 2011 22:21:01 +0200 Subject: [Python-Dev] RFC: Add a new builtin strarray type to Python? References: <201110011917.56330.victor.stinner@haypocalc.com> <201110012206.11806.victor.stinner@haypocalc.com> Message-ID: <20111001222101.2aa6aaa7@pitrou.net> On Sat, 1 Oct 2011 22:06:11 +0200 Victor Stinner wrote: > > > I'm writing this email to ask you if this type solves a real issue, or if > > we can just prove the super-fast str.join(list of str). > > Hum, it looks like "What is the most efficient string concatenation method in > python?" in a frequently asked question. 
There is a recent thread on python- > ideas mailing list: So, since people are confused at the number of possible options, you propose to add a new option and therefore increase the confusion? I don't understand why StringIO couldn't simply be optimized a little more, if it needs to. Or, if straightforward string concatenation really needs to be fast, then str + str should be optimized (like it used to be). Regards Antoine. From larry at hastings.org Sat Oct 1 23:36:01 2011 From: larry at hastings.org (Larry Hastings) Date: Sat, 01 Oct 2011 22:36:01 +0100 Subject: [Python-Dev] RFC: Add a new builtin strarray type to Python? In-Reply-To: <201110012206.11806.victor.stinner@haypocalc.com> References: <201110011917.56330.victor.stinner@haypocalc.com> <201110012206.11806.victor.stinner@haypocalc.com> Message-ID: <4E8787C1.3010106@hastings.org> On 10/01/2011 09:06 PM, Victor Stinner wrote: > Another alternative is a "string-join" object. It is discussed (and > implemented) in the following issue, and PyPy has also an optional > implementation: > > http://bugs.python.org/issue1569040 > http://codespeak.net/pypy/dist/pypy/doc/interpreter-optimizations.html#string- > join-objects > Yes, actually I was planning on trying to revive my "lazy string concatenation" patch once PEP 393 landed. As I recall it, the major roadblock to the patch's acceptance was that it changed the semantics of PyString_AS_STRING(). With the patch applied, PyString_AS_STRING() could now fail and return NULL under low-memory conditions. This meant a major change to the C API and would have required an audit of 400+ call sites inside CPython alone. I haven't studied PEP 393 yet, but Martin tells me PyUnicode_READY would be a good place to render the lazy string. Give me a week or two and I should be able to get it together, /larry/ -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From fijall at gmail.com Sun Oct 2 02:33:33 2011 From: fijall at gmail.com (Maciej Fijalkowski) Date: Sat, 1 Oct 2011 21:33:33 -0300 Subject: [Python-Dev] RFC: Add a new builtin strarray type to Python? In-Reply-To: <20111001222101.2aa6aaa7@pitrou.net> References: <201110011917.56330.victor.stinner@haypocalc.com> <201110012206.11806.victor.stinner@haypocalc.com> <20111001222101.2aa6aaa7@pitrou.net> Message-ID: On Sat, Oct 1, 2011 at 5:21 PM, Antoine Pitrou wrote: > On Sat, 1 Oct 2011 22:06:11 +0200 > Victor Stinner wrote: >> >> > I'm writing this email to ask you if this type solves a real issue, or if >> > we can just promote the super-fast str.join(list of str) pattern. >> >> Hum, it looks like "What is the most efficient string concatenation method in >> python?" is a frequently asked question. There is a recent thread on the python-ideas >> mailing list: Victor, you can't say it's x times slower. It has different complexity, so it can be arbitrarily slower. > > So, since people are confused at the number of possible options, you > propose to add a new option and therefore increase the confusion? > > I don't understand why StringIO couldn't simply be optimized a little > more, if it needs to. > Or, if straightforward string concatenation really needs to be fast, > then str + str should be optimized (like it used to be). As far as I remember, str + str is discouraged as a way of concatenating strings. We in pypy should make it fast if it's *really* the official way. StringIO is bytes only I think, which might be a bit of an issue if you want a unicode at the end. PyPy's Unicode/String builders are a bit hackish until we come up with something that can make ''.join faster, I think. Cheers, fijal
> > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/fijall%40gmail.com > From ncoghlan at gmail.com Sun Oct 2 04:42:54 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 1 Oct 2011 22:42:54 -0400 Subject: [Python-Dev] What it takes to change a single keyword. In-Reply-To: <4E87373D.2030503@v.loewis.de> References: <4E87373D.2030503@v.loewis.de> Message-ID: 2011/10/1 "Martin v. L?wis" : >> First of all, I am sincerely sorry if this is wrong mailing list to ask >> this question. I checked out definitions of couple other mailing list, >> and this one seemed most suitable. Here is my question: > > In principle, python-list would be more appropriate, but this really > is a border case. So welcome! > >> Let's say I want to change a single keyword, let's say import keyword, >> to be spelled as something else, like it's translation to my language. I >> guess it would be more complicated than modifiying Grammar/Grammar, but >> I can't be sure which files should get edited. > > Hmm. I also think editing Grammar/Grammar should be sufficient. Try > restricting yourself to ASCII keywords first; this just worked fine for > me. For any changes where that isn't sufficient, then http://docs.python.org/devguide/grammar.html provides a helpful list of additional places to check (and http://docs.python.org/devguide/compiler.html provides info on how it all hangs together). However, rather than *changing* the keywords, it would likely be better to allow *alternate* keywords to avoid the problem Martin mentioned with existing Python code failing to run (including the entire standard library). Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? 
Brisbane, Australia From ncoghlan at gmail.com Sun Oct 2 04:48:59 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 1 Oct 2011 22:48:59 -0400 Subject: [Python-Dev] [Python-checkins] cpython: Implement PEP 393. In-Reply-To: <4E877313.1000909@trueblade.com> References: <4E83AC0C.2010006@trueblade.com> <4E8714E9.7020504@v.loewis.de> <4E877313.1000909@trueblade.com> Message-ID: On Sat, Oct 1, 2011 at 4:07 PM, Eric V. Smith wrote: > On the other hand, not having this code in stringlib would certainly be > liberating! Maybe I'll take this opportunity to clean it up and simplify > it now that it's free of the stringlib constraints. Yeah, don't sacrifice speed in str.format for a still-hypothetical-and-potentially-never-going-to-happen bytes formatting variant. If the latter does happen, the use cases would be different enough that I'm not even sure the mini-language should remain entirely the same (e.g. you'd likely want direct access to some of the struct module formatting more so than str-style formats). Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From ncoghlan at gmail.com Sun Oct 2 04:54:44 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 1 Oct 2011 22:54:44 -0400 Subject: [Python-Dev] RFC: Add a new builtin strarray type to Python? In-Reply-To: References: <201110011917.56330.victor.stinner@haypocalc.com> <201110012206.11806.victor.stinner@haypocalc.com> <20111001222101.2aa6aaa7@pitrou.net> Message-ID: On Sat, Oct 1, 2011 at 8:33 PM, Maciej Fijalkowski wrote: > StringIO is bytes only I think, which might be a bit of an issue if > you want a unicode at the end. I'm not sure why you would think that (aside from a 2.x holdover). StringIO handles Unicode text, BytesIO handles bytes. Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? 
Brisbane, Australia From ncoghlan at gmail.com Sun Oct 2 05:13:53 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 1 Oct 2011 23:13:53 -0400 Subject: [Python-Dev] RFC: Add a new builtin strarray type to Python? In-Reply-To: <201110011917.56330.victor.stinner@haypocalc.com> References: <201110011917.56330.victor.stinner@haypocalc.com> Message-ID: On Sat, Oct 1, 2011 at 1:17 PM, Victor Stinner wrote: > Most bytearray methods return a new object in most cases. I don't understand > why, it's not efficient. I don't know if we can do in-place operations for > strarray methods having the same name as bytearray methods (which are not > in-place methods). No, we can't. The whole point of having separate in-place operators is to distinguish between operations that can modify the original object, and those that leave the original object alone (even when it's an instance of a mutable type like list or bytearray). Efficiency takes a distant second place to correctness when determining API behaviour. > str has some more methods that bytes and bytearray don't have, like format. We > may do in-place operations for these methods. No we can't, since they're not mutating methods, so they shouldn't affect the state of the current object. I'm only -0 on the idea (since bytearray and io.BytesIO seem to coexist happily enough), but any such strarray object would need to behave itself with respect to which operations affected the internal state of the object. With strings defined as immutable objects, concatenating them in a loop is formally an O(N*N) operation. Those are always going to scale poorly. The 'resize if only one reference' trick was fragile, masked a real algorithmic flaw in user code, but also sped up a lot of naive software. It was definitely a case of practicality beating purity.
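[Editor's note: the algorithmic point above can be seen in miniature — the naive loop may re-copy the accumulated string on each step, while the usual linear alternatives build the result once. This is a generic illustration, not tied to any of the proposed types.]

```python
import io

parts = ["x"] * 10000

# Quadratic in the worst case: without the old resize trick, each +=
# may copy everything accumulated so far before appending one fragment.
concat = ""
for p in parts:
    concat += p

# Linear: collect fragments and join once at the end.
joined = "".join(parts)

# Linear: io.StringIO appends into an internal buffer; the final str
# is materialized once by getvalue().
buf = io.StringIO()
for p in parts:
    buf.write(p)
streamed = buf.getvalue()

assert concat == joined == streamed
```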
Any change that depends on the user changing their code would be rather missing the point of the original optimisation - if the user is sufficiently aware of the problem to know they need to change their code, then explicitly joining a list of substrings or using a StringIO object instead of an ordinary string is well within their grasp. Adding a "disjoint" string representation to the existing PEP 393 suite of representations would solve the same problem in a more systematic way and, as Martin pointed out, could likely use the same machinery as is provided for backwards compatibility with code expecting the legacy string representation. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From greg at krypto.org Sun Oct 2 09:33:11 2011 From: greg at krypto.org (Gregory P. Smith) Date: Sun, 2 Oct 2011 00:33:11 -0700 Subject: [Python-Dev] PEP 393 merged In-Reply-To: References: <4E82D150.7050204@v.loewis.de> Message-ID: On Wed, Sep 28, 2011 at 8:41 AM, Guido van Rossum wrote: > Congrats! Python 3.3 will be better because of this. > > On Wed, Sep 28, 2011 at 12:48 AM, "Martin v. Löwis" > wrote: > > I have now merged the PEP 393 implementation into default. > > The main missing piece is the documentation; contributions are > > welcome. > +10 This is great! Thank you Martin! -Greg -------------- next part -------------- An HTML attachment was scrubbed... URL: From techtonik at gmail.com Sun Oct 2 10:02:33 2011 From: techtonik at gmail.com (anatoly techtonik) Date: Sun, 2 Oct 2011 11:02:33 +0300 Subject: [Python-Dev] Python Core Tools Message-ID: Hello, I've stumbled upon Dave Beazley's article [1] about trying an ancient GIL removal patch at http://dabeaz.blogspot.com/2011/08/inside-look-at-gil-removal-patch-of.html and, looking at the output of the Python dis module, thought that it would be cool if there were tools to inspect, explain and play with Python bytecode.
Little visual assembler that shows bytecode and disassembly side by side and annotates the listing with useful hints (like interpreter code optimization decisions). That will greatly help many new people understand how Python works and explain complicated stuff like GIL and stackless by copy/pasting pictures from there. PyPy has a tool named 'jitviewer' [2] that may be what I am looking for, but the demo is offline. But even without this tool I know that speakers at conferences create various useful scripts to gather interesting stats and visualize Python internals. I can name at least the 'dev in a box' [3] project to get people started quickly with development, but I am sure many others exist. Can you remember any tools that can be useful in Python core development? Maybe you use one every day? I'd like to compile a list of such tools and put it into the Wiki to allow people to have some fun with Python development without the knowledge of C.

1. http://dabeaz.blogspot.com/2011/08/inside-look-at-gil-removal-patch-of.html
2. http://morepypy.blogspot.com/2011/08/visualization-of-jitted-code.html
3. http://hg.python.org/devinabox/

-- anatoly t. From fijall at gmail.com Sun Oct 2 13:05:08 2011 From: fijall at gmail.com (Maciej Fijalkowski) Date: Sun, 2 Oct 2011 08:05:08 -0300 Subject: [Python-Dev] Python Core Tools In-Reply-To: References: Message-ID: On Sun, Oct 2, 2011 at 5:02 AM, anatoly techtonik wrote: > Hello, > > I've stumbled upon Dave Beazley's article [1] about trying ancient GIL > removal patch at > http://dabeaz.blogspot.com/2011/08/inside-look-at-gil-removal-patch-of.html > and looking at the output of Python dis module thought that it would > be cool if there were tools to inspect, explain and play with Python > bytecode. Little visual assembler, that shows bytecode and disassembly > side by side and annotates the listing with useful hints (like > interpreter code optimization decisions). 
That will greatly help many > new people understand how Python works and explain complicated stuff > like GIL and stackless by copy/pasting pictures from there. PyPy has a > tool named 'jitviewer' [2] that may be that I am looking for, but the > demo is offline. I put demo back online. https://bitbucket.org/pypy/pypy/src/59460302c713/lib_pypy/disassembler.py this might be of interest. It's like dis module except it creates objects instead of printing them > > But even without this tool I know that speakers at conferences create > various useful scripts to gather interesting stats and visualize > Python internals. I can name at least 'dev in a box' [3] project to > get people started quickly with development, but I am sure there many > others exist. Can you remember any tools that can be useful in Python > core development? Maybe you use one every day? I'd like to compile a > list of such tools and put in to Wiki to allow people have some fun > with Python development without the knowledge of C. > > > 1. http://dabeaz.blogspot.com/2011/08/inside-look-at-gil-removal-patch-of.html > 2. http://morepypy.blogspot.com/2011/08/visualization-of-jitted-code.html > 3. http://hg.python.org/devinabox/ > -- > anatoly t. > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/fijall%40gmail.com > From stefan at bytereef.org Sun Oct 2 13:04:16 2011 From: stefan at bytereef.org (Stefan Krah) Date: Sun, 2 Oct 2011 13:04:16 +0200 Subject: [Python-Dev] PEP-393: request for keeping PyUnicode_EncodeDecimal() In-Reply-To: <4E87346C.3060109@v.loewis.de> References: <20111001130603.GA16027@sleipnir.bytereef.org> <4E87205B.905@v.loewis.de> <20111001145859.GA16431@sleipnir.bytereef.org> <4E87346C.3060109@v.loewis.de> Message-ID: <20111002110416.GA18205@sleipnir.bytereef.org> "Martin v. 
Löwis" wrote: > > longobject.c still used PyUnicode_EncodeDecimal() until 10 months > > ago (8304bd765bcf). I missed the PyUnicode_TransformDecimalToASCII() > > commit, probably because #10557 is still open. > > > > That's why I wouldn't like to implement the function myself at least > > until the API is settled. > > I don't understand. If you implement it yourself, you don't have to > worry at all what the API is. What I'm looking for is a public function that is silently updated if python-dev decides to accept other numerical input. As I understand from your comments below, PyUnicode_EncodeDecimal() is frozen, so that function indeed does not help. I would consider it reasonable for PyUnicode_TransformDecimalAndSpaceToASCII() to be documented as: "This function might accept different numerical input in the future." The reason is that some people would like to accept additional input (see #6632), while others would like to restrict input. If there is a function that will always track whatever is decided, extension authors don't have to worry about being up-to-date.

> out = malloc(PyUnicode_GET_LENGTH(in)+1);
> for (i = 0; i < PyUnicode_GET_LENGTH(in); i++) {
>     Py_UCS4 ch = PyUnicode_READ_CHAR(in, i);
>     int d = Py_UNICODE_TODIGIT(ch);
>     if (d != -1) {
>         out[i] = '0'+d;
>         continue;
>     }
>     if (ch < 128)
>         out[i] = ch;
>     else {
>         error();
>         return;
>     }
> }
> out[i] = '\0';

Thanks for that. I think alternative leading and trailing whitespace would need to be handled as well: Decimal("\u180E1.233"). > > Will PyUnicode_TransformDecimalAndSpaceToASCII() be public? > > It's already included in 3.2, so it can't be removed that easily. > I wish it had been private, though - we have way too many API functions > dealing with Unicode. I can find PyUnicode_TransformDecimalToASCII() in 3.2, but not PyUnicode_TransformDecimalAndSpaceToASCII(). 
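For reference, the semantics of that C sketch can be expressed in a few lines of pure Python with unicodedata (transform_decimal_to_ascii is a hypothetical helper written for illustration, not the C API itself, and it does not handle the whitespace cases discussed above):

```python
import unicodedata

def transform_decimal_to_ascii(s):
    """Map any Unicode digit to its ASCII equivalent.

    Other characters below U+0080 pass through unchanged;
    anything else is rejected, mirroring the C loop above.
    """
    out = []
    for ch in s:
        d = unicodedata.digit(ch, -1)   # e.g. ARABIC-INDIC digits
        if d != -1:
            out.append(chr(ord('0') + d))
        elif ord(ch) < 128:
            out.append(ch)
        else:
            raise ValueError('invalid character %r' % ch)
    return ''.join(out)

# U+0661 and U+0662 are ARABIC-INDIC DIGIT ONE and TWO.
assert transform_decimal_to_ascii('\u0661\u0662.5') == '12.5'
```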
Stefan Krah From fijall at gmail.com Sun Oct 2 14:17:14 2011 From: fijall at gmail.com (Maciej Fijalkowski) Date: Sun, 2 Oct 2011 09:17:14 -0300 Subject: [Python-Dev] Python Core Tools In-Reply-To: References: Message-ID: On Sun, Oct 2, 2011 at 8:05 AM, Maciej Fijalkowski wrote: > On Sun, Oct 2, 2011 at 5:02 AM, anatoly techtonik wrote: >> Hello, >> >> I've stumbled upon Dave Beazley's article [1] about trying ancient GIL >> removal patch at >> http://dabeaz.blogspot.com/2011/08/inside-look-at-gil-removal-patch-of.html >> and looking at the output of Python dis module thought that it would >> be cool if there were tools to inspect, explain and play with Python >> bytecode. Little visual assembler, that shows bytecode and disassembly >> side by side and annotates the listing with useful hints (like >> interpreter code optimization decisions). That will greatly help many >> new people understand how Python works and explain complicated stuff >> like GIL and stackless by copy/pasting pictures from there. PyPy has a >> tool named 'jitviewer' [2] that may be that I am looking for, but the >> demo is offline. > > I put demo back online. > It's just that SimpleHTTPServer doesn't quite survive slashdot effect. Where do I file a bug report :) Cheers, fijal From victor.stinner at haypocalc.com Sun Oct 2 15:00:01 2011 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Sun, 2 Oct 2011 15:00:01 +0200 Subject: [Python-Dev] RFC: Add a new builtin strarray type to Python? In-Reply-To: <20111001222101.2aa6aaa7@pitrou.net> References: <201110011917.56330.victor.stinner@haypocalc.com> <201110012206.11806.victor.stinner@haypocalc.com> <20111001222101.2aa6aaa7@pitrou.net> Message-ID: <201110021500.01139.victor.stinner@haypocalc.com> Le samedi 1 octobre 2011 22:21:01, Antoine Pitrou a écrit : > So, since people are confused at the number of possible options, you > propose to add a new option and therefore increase the confusion? 
The idea is to provide an API very close to the str type. So if your program becomes slow in some functions and these functions are manipulating strings: just try to replace str() with strarray() at the beginning of your loop, and redo your benchmark. I don't know if we really need all str methods: ljust(), endswith(), isspace(), lower(), strip(), ... or if a UnicodeBuilder supporting in-place a+=b would be enough. I suppose it would just be more practical to have the same methods. Another useful use case is being able to replace a substring: using strarray, you can use the standard array[a:b] = newsubstring to insert, replace or delete. Extract of the strarray unit tests:

abc = strarray('abc')
abc[:1] = '123'   # replace
self.assertEqual(abc, '123bc')
abc[3:3] = '45'   # insert
self.assertEqual(abc, '12345bc')
abc[5:] = ''      # delete
self.assertEqual(abc, '12345')

But only "replace" would be O(1). ("insert" requires less work than a replace in a classic str if the replaced string is near the end.) You cannot insert/delete using StringIO, str.join, or StringBuilder/UnicodeBuilder, but you can with array('u'). Of course, you can replace a single character: strarray[i] = 'x'. (Using array[a:b]=newstr and array.index(), you can implement your in-place .replace() function.) > I don't understand why StringIO couldn't simply be optimized a little > more, if it needs to. Honestly, I didn't know that StringIO.write() is more efficient than str+=str, and it is surprising to use the io module (which is supposed to be related to files) to manipulate strings. But we can maybe document some "trick" (is it a trick or not?) in the str documentation (and in the FAQ, and on stackoverflow.com, and ...). > Or, if straightforward string concatenation really needs to be fast, > then str + str should be optimized (like it used to be). We cannot have best performance and lowest memory usage at the same time with the new str implementation (PEP 393). 
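The slice-assignment semantics in the strarray unit tests above can be tried today with bytearray, the closest existing analog of the proposed type (a sketch for comparison only — strarray itself does not exist):

```python
abc = bytearray(b'abc')
abc[:1] = b'123'   # replace one byte with three
assert abc == b'123bc'
abc[3:3] = b'45'   # pure insert
assert abc == b'12345bc'
abc[5:] = b''      # delete the tail
assert abc == b'12345'
```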
The new implementation is even more focused on read-only (constant) strings than the previous one (a Py_UNICODE array using two memory blocks). PEP 393 uses one memory block, you cannot resize a str object anymore. The old str type, StringIO, array (and strarray) use two memory blocks, so it is possible to resize them (objects keep their identity after the resize). It *might* be possible to implement a strarray that is fast on concatenation and has a small memory footprint, but we cannot use it for the str type because str is immutable in Python. -- On second thought, it may be easy to implement strarray if it reuses unicodeobject.c. For example, strarray can be a special case (mutable) of PyUnicodeObject (which uses two memory blocks): the string would always be ready, but never compact. By the way, bytesobject.c and bytearrayobject.c are a fiasco: most functions are duplicated whereas the code is very close. A big refactor is required to remove the duplicate code there. Victor From solipsis at pitrou.net Sun Oct 2 15:25:21 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sun, 2 Oct 2011 15:25:21 +0200 Subject: [Python-Dev] RFC: Add a new builtin strarray type to Python? References: <201110011917.56330.victor.stinner@haypocalc.com> <201110012206.11806.victor.stinner@haypocalc.com> <20111001222101.2aa6aaa7@pitrou.net> <201110021500.01139.victor.stinner@haypocalc.com> Message-ID: <20111002152521.63cbaf75@pitrou.net> On Sun, 2 Oct 2011 15:00:01 +0200 Victor Stinner wrote: > > > I don't understand why StringIO couldn't simply be optimized a little > > more, if it needs to. > > Honestly, I didn't know that StringIO.write() is more efficient than str+=str, > and it is surprising to use the io module (which is supposed to be related to > files) to manipulate strings. StringIO is an in-memory file-like object, like in 2.x (where it lived in the "cStringIO" module). I don't think it's a novel thing. 
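The string-building pattern in question is a one-liner with io.StringIO (a minimal sketch of the idiom being discussed):

```python
import io

buf = io.StringIO()
for i in range(5):
    buf.write('chunk%d;' % i)   # amortized O(1) append into the buffer
result = buf.getvalue()         # one final copy out of the buffer
assert result == 'chunk0;chunk1;chunk2;chunk3;chunk4;'
```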
> The PEP 393 uses one memory block, you cannot resize a str object anymore. I don't know why you're saying that. The concatenation optimization worked in 2.x where the "str" type also used only one memory block. You just have to check that the refcount is about to drop to zero. Of course, resizing only works if the two unicode objects are of the same "kind". Regards Antoine. From g.brandl at gmx.net Sun Oct 2 16:21:49 2011 From: g.brandl at gmx.net (Georg Brandl) Date: Sun, 02 Oct 2011 16:21:49 +0200 Subject: [Python-Dev] cpython: PyUnicode_FromKindAndData() raises a ValueError if the kind is unknown In-Reply-To: References: Message-ID: On 10/02/11 01:14, victor.stinner wrote: > http://hg.python.org/cpython/rev/9124a00df142 > changeset: 72573:9124a00df142 > parent: 72571:fa0b1e50270f > user: Victor Stinner > date: Sat Oct 01 23:48:37 2011 +0200 > summary: > PyUnicode_FromKindAndData() raises a ValueError if the kind is unknown > > files: > Objects/unicodeobject.c | 2 +- > 1 files changed, 1 insertions(+), 1 deletions(-) > > > diff --git a/Objects/unicodeobject.c b/Objects/unicodeobject.c > --- a/Objects/unicodeobject.c > +++ b/Objects/unicodeobject.c > @@ -1211,7 +1211,7 @@ > case PyUnicode_4BYTE_KIND: > return _PyUnicode_FromUCS4(buffer, size); > } > - assert(0); > + PyErr_SetString(PyExc_ValueError, "invalid kind"); > return NULL; > } Is that really a ValueError? It should only be a ValueError if the user could trigger that error. Otherwise it should be a SystemError. Georg From stephen at xemacs.org Sun Oct 2 16:39:20 2011 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Sun, 02 Oct 2011 23:39:20 +0900 Subject: [Python-Dev] RFC: Add a new builtin strarray type to Python? 
In-Reply-To: <20111002152521.63cbaf75@pitrou.net> References: <201110011917.56330.victor.stinner@haypocalc.com> <201110012206.11806.victor.stinner@haypocalc.com> <20111001222101.2aa6aaa7@pitrou.net> <201110021500.01139.victor.stinner@haypocalc.com> <20111002152521.63cbaf75@pitrou.net> Message-ID: <871uuvh2zr.fsf@uwakimon.sk.tsukuba.ac.jp> Antoine Pitrou writes: > StringIO is an in-memory file-like object, like in 2.x (where it lived > in the "cStringIO" module). I don't think it's a novel thing. The problem is the name "StringIO". Something like "StringStream" or "StringBuffer" might be more discoverable. I personally didn't have trouble deducing that "StringIO" means "treat a string like a file", but it's not immediately obvious what the module is for (unless you already know). From solipsis at pitrou.net Sun Oct 2 16:41:16 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sun, 02 Oct 2011 16:41:16 +0200 Subject: [Python-Dev] RFC: Add a new builtin strarray type to Python? In-Reply-To: <871uuvh2zr.fsf@uwakimon.sk.tsukuba.ac.jp> References: <201110011917.56330.victor.stinner@haypocalc.com> <201110012206.11806.victor.stinner@haypocalc.com> <20111001222101.2aa6aaa7@pitrou.net> <201110021500.01139.victor.stinner@haypocalc.com> <20111002152521.63cbaf75@pitrou.net> <871uuvh2zr.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <1317566476.3562.2.camel@localhost.localdomain> Le dimanche 02 octobre 2011 ? 23:39 +0900, Stephen J. Turnbull a ?crit : > Antoine Pitrou writes: > > > StringIO is an in-memory file-like object, like in 2.x (where it lived > > in the "cStringIO" module). I don't think it's a novel thing. > > The problem is the name "StringIO". Something like "StringStream" or > "StringBuffer" might be more discoverable. I personally didn't have > trouble deducing that "StringIO" means "treat a string like a file", > but it's not immediately obvious what the module is for (unless you > already know). 
I'm not sure why "StringStream" or "StringBuffer" would be more discoverable, unless you're coming from a language where these names are well-known. A "stream" is usually related to I/O, anyway; while a "buffer" is more like an implementation detail. I personally like the relative tersity of "StringIO". From g.brandl at gmx.net Sun Oct 2 16:52:13 2011 From: g.brandl at gmx.net (Georg Brandl) Date: Sun, 02 Oct 2011 16:52:13 +0200 Subject: [Python-Dev] cpython: PyUnicode_FromKindAndData() raises a ValueError if the kind is unknown In-Reply-To: References: Message-ID: On 10/02/11 16:21, Georg Brandl wrote: > On 10/02/11 01:14, victor.stinner wrote: >> http://hg.python.org/cpython/rev/9124a00df142 >> changeset: 72573:9124a00df142 >> parent: 72571:fa0b1e50270f >> user: Victor Stinner >> date: Sat Oct 01 23:48:37 2011 +0200 >> summary: >> PyUnicode_FromKindAndData() raises a ValueError if the kind is unknown >> >> files: >> Objects/unicodeobject.c | 2 +- >> 1 files changed, 1 insertions(+), 1 deletions(-) >> >> >> diff --git a/Objects/unicodeobject.c b/Objects/unicodeobject.c >> --- a/Objects/unicodeobject.c >> +++ b/Objects/unicodeobject.c >> @@ -1211,7 +1211,7 @@ >> case PyUnicode_4BYTE_KIND: >> return _PyUnicode_FromUCS4(buffer, size); >> } >> - assert(0); >> + PyErr_SetString(PyExc_ValueError, "invalid kind"); >> return NULL; >> } > > Is that really a ValueError? It should only be a ValueError if the user > could trigger that error. Otherwise it should be a SystemError. (And by "user", I mean "Python programmer".) Georg From benjamin at python.org Sun Oct 2 17:46:16 2011 From: benjamin at python.org (Benjamin Peterson) Date: Sun, 2 Oct 2011 11:46:16 -0400 Subject: [Python-Dev] cpython: PyUnicode_FromKindAndData() raises a ValueError if the kind is unknown In-Reply-To: References: Message-ID: 2011/10/2 Georg Brandl : > On 10/02/11 01:14, victor.stinner wrote: >> http://hg.python.org/cpython/rev/9124a00df142 >> changeset: ? 72573:9124a00df142 >> parent: ? ? 
72571:fa0b1e50270f >> user: Victor Stinner >> date: Sat Oct 01 23:48:37 2011 +0200 >> summary: >> PyUnicode_FromKindAndData() raises a ValueError if the kind is unknown Also, could I remind you that a better commit message is probably "make PyUnicode_FromKindAndData raise a ValueError if the kind is unknown". Moreover, I wonder if the kind should be an enumeration, then people would get a warning at least. -- Regards, Benjamin From solipsis at pitrou.net Sun Oct 2 17:52:14 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sun, 2 Oct 2011 17:52:14 +0200 Subject: [Python-Dev] RFC: Add a new builtin strarray type to Python? References: <201110011917.56330.victor.stinner@haypocalc.com> <201110012206.11806.victor.stinner@haypocalc.com> <20111001222101.2aa6aaa7@pitrou.net> <201110021500.01139.victor.stinner@haypocalc.com> <20111002152521.63cbaf75@pitrou.net> <871uuvh2zr.fsf@uwakimon.sk.tsukuba.ac.jp> <1317566476.3562.2.camel@localhost.localdomain> Message-ID: <20111002175214.42fbe504@pitrou.net> On Sun, 02 Oct 2011 16:41:16 +0200 Antoine Pitrou wrote: > Le dimanche 02 octobre 2011 à 23:39 +0900, Stephen J. Turnbull a écrit : > > Antoine Pitrou writes: > > > > > StringIO is an in-memory file-like object, like in 2.x (where it lived > > > in the "cStringIO" module). I don't think it's a novel thing. > > > > The problem is the name "StringIO". Something like "StringStream" or > > "StringBuffer" might be more discoverable. I personally didn't have > > trouble deducing that "StringIO" means "treat a string like a file", > > but it's not immediately obvious what the module is for (unless you > > already know). > > I'm not sure why "StringStream" or "StringBuffer" would be more > discoverable, unless you're coming from a language where these names are > well-known. A "stream" is usually related to I/O, anyway; while a > "buffer" is more like an implementation detail. > I personally like the relative tersity of "StringIO". 
Apparently the real word is "terseness". My bad. Antoine. From alex.gaynor at gmail.com Sun Oct 2 18:34:03 2011 From: alex.gaynor at gmail.com (Alex Gaynor) Date: Sun, 2 Oct 2011 16:34:03 +0000 (UTC) Subject: [Python-Dev] RFC: Add a new builtin strarray type to Python? References: <201110011917.56330.victor.stinner@haypocalc.com> <201110012206.11806.victor.stinner@haypocalc.com> <20111001222101.2aa6aaa7@pitrou.net> <201110021500.01139.victor.stinner@haypocalc.com> <20111002152521.63cbaf75@pitrou.net> <871uuvh2zr.fsf@uwakimon.sk.tsukuba.ac.jp> <1317566476.3562.2.camel@localhost.localdomain> <20111002175214.42fbe504@pitrou.net> Message-ID: There are a number of issues that are being conflated by this thread. 1) Should str += str be fast. In my opinion, the answer is an obvious and resounding no. Strings are immutable, thus repeated string addition is O(n**2). This is a natural and obvious conclusion. Attempts to change this are only truly possible on CPython, and thus create a worse environment for other Pythons, as well as being quite misleading, as they'll be extremely brittle. It's worth noting that, to my knowledge, JVMs haven't attempted hacks like this. 2) Should we have a mutable string. Personally I think this question just misses the point. No one actually wants a mutable string; the closest thing anyone asks for is faster string building, which can be solved by a far more specialized thing (see (3)) without all the API hangups of "What methods mutate?", "Should it have every str method?", or "Is it a drop-in replacement?". 3) And, finally, the question that prompted this entire thing. Can we have a better way of incremental string building than the current list + str.join method. Personally I think unless your interest is purely in getting the most possible speed out of Python, the current idiom is probably acceptable. That said, if you want to get the most possible speed, a StringBuilder in the vein PyPy offers is the only sane way. 
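In pure Python, a builder in that vein is little more than a thin wrapper around the list + join idiom (a sketch of the concept, not PyPy's actual interface — PyPy's real advantage, turning the internal buffer into the final string without a copy, cannot be expressed at the Python level):

```python
class StringBuilder:
    """Append-only string accumulator with a deliberately tiny API."""
    def __init__(self):
        self._parts = []

    def append(self, s):
        self._parts.append(s)

    def build(self):
        # One O(total length) pass over the collected pieces.
        return ''.join(self._parts)

b = StringBuilder()
for word in ('spam', 'and', 'eggs'):
    b.append(word)
assert b.build() == 'spamandeggs'
```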
It's able to be faster because it has very little ways to interact with it, and once you're done it reuses it's buffer to create the Python level string object, which is to say there's no need to copy it at the end. As I said, unless your interest is maximum performance, there's nothing wrong with the current idiom, and we'd do well to educate our users, rather than have more hacks. Alex From stephen at xemacs.org Sun Oct 2 18:44:50 2011 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Mon, 03 Oct 2011 01:44:50 +0900 Subject: [Python-Dev] RFC: Add a new builtin strarray type to Python? In-Reply-To: <1317566476.3562.2.camel@localhost.localdomain> References: <201110011917.56330.victor.stinner@haypocalc.com> <201110012206.11806.victor.stinner@haypocalc.com> <20111001222101.2aa6aaa7@pitrou.net> <201110021500.01139.victor.stinner@haypocalc.com> <20111002152521.63cbaf75@pitrou.net> <871uuvh2zr.fsf@uwakimon.sk.tsukuba.ac.jp> <1317566476.3562.2.camel@localhost.localdomain> Message-ID: <87zkhjfim5.fsf@uwakimon.sk.tsukuba.ac.jp> Antoine Pitrou writes: > I'm not sure why "StringStream" or "StringBuffer" would be more > discoverable, unless you're coming from a language where these names are > well-known. I think they are, but it doesn't really matter, since both are a bit lame, and I doubt either is sufficiently suggestive to be worth changing the name of the module, or even providing an alias. I wish I had a better name to offer, that's all. > I personally like the relative tersity of "StringIO". The issue is not that I *dislike* the name; I *personally* like the name fine. It's that it's definitely not doing anything to reduce the frequency of the "efficient string concatenation" FAQ. From riscutiavlad at gmail.com Sun Oct 2 18:47:47 2011 From: riscutiavlad at gmail.com (Vlad Riscutia) Date: Sun, 2 Oct 2011 09:47:47 -0700 Subject: [Python-Dev] Hg tips (was Re: [Python-checkins] cpython (merge default -> default): Merge heads.) 
In-Reply-To: References: Message-ID: Great tips. Can we add them to the developer guide somewhere? Thank you, Vlad On Thu, Sep 29, 2011 at 12:54 AM, Ezio Melotti wrote: > Tip 1 -- merging heads: > > A while ago ?ric suggested a nice tip to make merges easier and since I > haven't seen many people using it and now I got a chance to use it again, I > think it might be worth showing it once more: > > # so assume you just committed some changes: > $ hg ci Doc/whatsnew/3.3.rst -m 'Update and reorganize the whatsnew entry > for PEP 393.' > # you push them, but someone else pushed something in the meanwhile, so the > push fails > $ hg push > pushing to ssh://hg at hg.python.org/cpython > searching for changes > abort: push creates new remote heads on branch 'default'! > (you should pull and merge or use push -f to force) > # so you pull the other changes > $ hg pull -u > pulling from ssh://hg at hg.python.org/cpython > searching for changes > adding changesets > adding manifests > adding file changes > added 4 changesets with 5 changes to 5 files (+1 heads) > not updating, since new heads added > (run 'hg heads' to see heads, 'hg merge' to merge) > # and use "hg heads ." to see the two heads (yours and the one you pulled) > in the current branch > $ hg heads . > changeset: 72521:e6a2b54c1d16 > tag: tip > user: Victor Stinner > date: Thu Sep 29 04:02:13 2011 +0200 > summary: Fix hex_digit_to_int() prototype: expect Py_UCS4, not > Py_UNICODE > > changeset: 72517:ba6ee5cc9ed6 > user: Ezio Melotti > date: Thu Sep 29 08:34:36 2011 +0300 > summary: Update and reorganize the whatsnew entry for PEP 393. > # here comes the tip: before merging you switch to the other head (i.e. the > one pushed by Victor), > # if you don't switch, you'll be merging Victor changeset and in case of > conflicts you will have to review > # and modify his code (e.g. 
put a Misc/NEWS entry in the right section or > something more complicated) > $ hg up e6a2b54c1d16 > 6 files updated, 0 files merged, 0 files removed, 0 files unresolved > # after the switch you will merge the changeset you just committed, so in > case of conflicts > # reviewing and merging is much easier because you know the changes already > $ hg merge > 1 files updated, 0 files merged, 0 files removed, 0 files unresolved > (branch merge, don't forget to commit) > # here everything went fine and there were no conflicts, and in the diff I > can see my last changeset > $ hg di > diff --git a/Doc/whatsnew/3.3.rst b/Doc/whatsnew/3.3.rst > [...] > # everything looks fine, so I can commit the merge and push > $ hg ci -m 'Merge heads.' > $ hg push > pushing to ssh://hg at hg.python.org/cpython > searching for changes > remote: adding > changesets > > remote: adding manifests > remote: adding file changes > remote: added 2 changesets with 1 changes to 1 files > remote: buildbot: 2 changes sent successfully > remote: notified python-checkins at python.org of incoming changeset > ba6ee5cc9ed6 > remote: notified python-checkins at python.org of incoming changeset > e7672fe3cd35 > > This tip is not only useful while merging, but it's also useful for > python-checkins reviews, because the "merge" mail has the same diff of the > previous mail rather than having 15 unrelated changesets from the last week > because the committer didn't pull in a while. > > > Tip 2 -- extended diffs: > > If you haven't already, enable git diffs, adding to your ~/.hgrc the > following two lines: > >> [diff] >> git = True >> > (this is already in the devguide, even if 'git = on' is used there. The > mercurial website uses git = True too.) > More info: > http://hgtip.com/tips/beginner/2009-10-22-always-use-git-diffs/ > > > Tip 3 -- extensions: > > I personally like the 'color' extension, it makes the output of commands > like 'hg diff' and 'hg stat' more readable (e.g. 
it shows removed lines in > red and added ones in green). > If you want to give it a try, add to your ~/.hgrc the following two lines: > >> [extensions] >> color = >> > > If you find operations like pulling, updating or cloning too slow, you > might also want to look at the 'progress' extension, which displays a > progress bar during these operations: > >> [extensions] >> progress = >> > > > Tip 4 -- porting from 2.7 to 3.2: > > The devguide suggests: >> >> hg export a7df1a869e4a | hg import --no-commit - >> > but it's not always necessary to copy the changeset number manually. > If you are porting your last commit you can just use 'hg export 2.7' (or > any other branch name): > * using the one-dir-per-branch setup: > wolf at hp:~/dev/py/2.7$ hg ci -m 'Fix some bug.' > wolf at hp:~/dev/py/2.7$ cd ../3.2 > wolf at hp:~/dev/py/3.2$ hg pull -u ../2.7 > wolf at hp:~/dev/py/3.2$ hg export 2.7 | hg import --no-commit - > * using the single-dir setup: > wolf at hp:~/dev/python$ hg branch > 2.7 > wolf at hp:~/dev/python$ hg ci -m 'Fix some bug.' > wolf at hp:~/dev/python$ hg up 3.2 # here you might enjoy the progress > extension > wolf at hp:~/dev/python$ hg export 2.7 | hg import --no-commit - > And then you can check that everything is fine, and commit on 3.2 too. > Of course it works the other way around (from 3.2 to 2.7) too. > > > I hope you'll find these tips useful. > > Best Regards, > Ezio Melotti > > > On Thu, Sep 29, 2011 at 8:36 AM, ezio.melotti wrote: > >> http://hg.python.org/cpython/rev/e7672fe3cd35 >> changeset: 72522:e7672fe3cd35 >> parent: 72520:e6a2b54c1d16 >> parent: 72521:ba6ee5cc9ed6 >> user: Ezio Melotti >> date: Thu Sep 29 08:36:23 2011 +0300 >> summary: >> Merge heads. 
>> >> files: >> Doc/whatsnew/3.3.rst | 63 +++++++++++++++++++++---------- >> 1 files changed, 42 insertions(+), 21 deletions(-) >> >> > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > http://mail.python.org/mailman/options/python-dev/riscutiavlad%40gmail.com > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From yasar11732 at gmail.com Sun Oct 2 18:59:48 2011 From: yasar11732 at gmail.com (=?ISO-8859-9?Q?Ya=FEar_Arabac=FD?=) Date: Sun, 2 Oct 2011 19:59:48 +0300 Subject: [Python-Dev] What it takes to change a single keyword. In-Reply-To: References: <4E87373D.2030503@v.loewis.de> Message-ID: Thanks to you both, I have made some progress on introducing my own keywords to python interpreter. I think it is very kind of you to answer my question. I think I can take it from here. Thanks again :) 02 Ekim 2011 05:42 tarihinde Nick Coghlan yazd?: > 2011/10/1 "Martin v. L?wis" : > >> First of all, I am sincerely sorry if this is wrong mailing list to ask > >> this question. I checked out definitions of couple other mailing list, > >> and this one seemed most suitable. Here is my question: > > > > In principle, python-list would be more appropriate, but this really > > is a border case. So welcome! > > > >> Let's say I want to change a single keyword, let's say import keyword, > >> to be spelled as something else, like it's translation to my language. I > >> guess it would be more complicated than modifiying Grammar/Grammar, but > >> I can't be sure which files should get edited. > > > > Hmm. I also think editing Grammar/Grammar should be sufficient. Try > > restricting yourself to ASCII keywords first; this just worked fine for > > me. 
> > For any changes where that isn't sufficient, then > http://docs.python.org/devguide/grammar.html provides a helpful list > of additional places to check (and > http://docs.python.org/devguide/compiler.html provides info on how it > all hangs together). > > However, rather than *changing* the keywords, it would likely be > better to allow *alternate* keywords to avoid the problem Martin > mentioned with existing Python code failing to run (including the > entire standard library). > > Cheers, > Nick. > > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > -- http://yasar.serveblog.net/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From hodgestar+pythondev at gmail.com Sun Oct 2 19:23:19 2011 From: hodgestar+pythondev at gmail.com (Simon Cross) Date: Sun, 2 Oct 2011 19:23:19 +0200 Subject: [Python-Dev] RFC: Add a new builtin strarray type to Python? In-Reply-To: <201110011917.56330.victor.stinner@haypocalc.com> References: <201110011917.56330.victor.stinner@haypocalc.com> Message-ID: On Sat, Oct 1, 2011 at 7:17 PM, Victor Stinner wrote: > I'm writing this email to ask you if this type solves a real issue, or if we > can just prove the super-fast str.join(list of str). I'm -1 on hacking += to be fast again because having the two loops below perform wildly differently is *very* surprising to me: s = '' for x in loops: s += x s = '' for x in loops: s = s + x Schiavo Simon From hodgestar+pythondev at gmail.com Sun Oct 2 19:27:46 2011 From: hodgestar+pythondev at gmail.com (Simon Cross) Date: Sun, 2 Oct 2011 19:27:46 +0200 Subject: [Python-Dev] RFC: Add a new builtin strarray type to Python? In-Reply-To: References: <201110011917.56330.victor.stinner@haypocalc.com> Message-ID: On Sun, Oct 2, 2011 at 7:23 PM, Simon Cross wrote: > I'm -1 on hacking += to be fast again because having the two loops > below perform wildly differently is *very* surprising to me: > > s = '' > for x in loops: > ? 
?s += x > > s = '' > for x in loops: >     s = s + x Erk. Bad example. The second example should be: s = '' for x in loops: b = s s += x (I misunderstood the details, but I knew the reference counting hackiness would lead to surprises somewhere :). Schiavo Simon From victor.stinner at haypocalc.com Mon Oct 3 04:19:53 2011 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Mon, 3 Oct 2011 04:19:53 +0200 Subject: [Python-Dev] RFC: Add a new builtin strarray type to Python? In-Reply-To: <20111002152521.63cbaf75@pitrou.net> References: <201110011917.56330.victor.stinner@haypocalc.com> <201110021500.01139.victor.stinner@haypocalc.com> <20111002152521.63cbaf75@pitrou.net> Message-ID: <201110030419.54038.victor.stinner@haypocalc.com> On Sunday 2 October 2011 at 15:25:21, Antoine Pitrou wrote: > I don't know why you're saying that. The concatenation optimization > worked in 2.x where the "str" type also used only one memory block. You > just have to check that the refcount is about to drop to zero. > Of course, resizing only works if the two unicode objects are of the > same "kind". Oh, I see. In Python 2.7, bytes+=bytes calls PyMem_Realloc() and then writes the new characters to the result. It doesn't overallocate the way bytearray does (bytearray overallocates by 12.5%). I restored this hack in Python 3.3 using PyUnicode_Append() in ceval.c and by optimizing PyUnicode_Append() (trying to append in-place). str+=str is again closer to ''.join: str += str: 696 ms ''.join(): 547 ms I temporarily disabled the optimization for wstr strings in PyUnicode_Resize() because of a bug. I disabled resizing completely on Windows because of another bug. Victor From hrvoje.niksic at avl.com Mon Oct 3 10:31:07 2011 From: hrvoje.niksic at avl.com (Hrvoje Niksic) Date: Mon, 03 Oct 2011 10:31:07 +0200 Subject: [Python-Dev] RFC: Add a new builtin strarray type to Python? 
In-Reply-To: References: <201110011917.56330.victor.stinner@haypocalc.com> <201110012206.11806.victor.stinner@haypocalc.com> <20111001222101.2aa6aaa7@pitrou.net> <201110021500.01139.victor.stinner@haypocalc.com> <20111002152521.63cbaf75@pitrou.net> <871uuvh2zr.fsf@uwakimon.sk.tsukuba.ac.jp> <1317566476.3562.2.camel@localhost.localdomain> <20111002175214.42fbe504@pitrou.net> Message-ID: <4E8972CB.20000@avl.com> On 10/02/2011 06:34 PM, Alex Gaynor wrote: > There are a number of issues that are being conflated by this thread. > > 1) Should str += str be fast. In my opinion, the answer is an obvious and > resounding no. Strings are immutable, thus repeated string addition is > O(n**2). This is a natural and obvious conclusion. Attempts to change this > are only truly possible on CPython, and thus create a worse enviroment for > other Pythons, as well as a quite misleading, as they'll be extremely > brittle. It's worth noting that, to my knowledge, JVMs haven't attempted > hacks like this. CPython is already misleading and ahead of JVM, because the str += str optimization has been applied to Python 2 some years ago - see http://hg.python.org/cpython-fullhistory/rev/fb6ffd290cfb?revcount=480 I like Python's immutable strings and consider it a good default for strings. Nevertheless a mutable string would be useful for those situations when you know you are about to manipulate a string-like object a number of times, where immutable strings require too many allocations. I don't think Python needs a StringBuilder - constructing strings using a list of strings or StringIO is well-known and easy. Mutable strings are useful for the cases where StringBuilder doesn't suffice because you need modifications other than appends. This is analogous to file writes - in practice most of them are appends, but sometimes you also need to be able to seek and write stuff in the middle. 
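The two "well-known and easy" idioms mentioned above can be sketched in a few lines. This is an illustrative comparison, not code from any of the patches discussed; the `build_with_join` and `build_with_stringio` names are made up:

```python
import io

def build_with_join(chunks):
    # Collect the substrings in a list, then concatenate once at the end:
    # O(total length) instead of O(n**2) repeated immutable concatenation.
    parts = []
    for chunk in chunks:
        parts.append(chunk)
    return "".join(parts)

def build_with_stringio(chunks):
    # io.StringIO behaves like an in-memory text file: mostly appends,
    # but seek()/write() also allow overwriting in the middle.
    buf = io.StringIO()
    for chunk in chunks:
        buf.write(chunk)
    buf.seek(0)        # rewind and overwrite the first character
    buf.write("X")
    return buf.getvalue()

print(build_with_join(["spam", "ham", "eggs"]))      # spamhameggs
print(build_with_stringio(["spam", "ham", "eggs"]))  # Xpamhameggs
```

The join form only ever appends; the StringIO form, like a file, also supports seeking back and patching earlier content, which is the extra flexibility the paragraph above asks of a mutable string.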
Hrvoje From rndblnch at gmail.com Mon Oct 3 10:43:20 2011 From: rndblnch at gmail.com (renaud) Date: Mon, 3 Oct 2011 08:43:20 +0000 (UTC) Subject: [Python-Dev] Python Core Tools References: Message-ID: Maciej Fijalkowski gmail.com> writes: > https://bitbucket.org/pypy/pypy/src/59460302c713/lib_pypy/disassembler.py > > this might be of interest. It's like dis module except it creates > objects instead of printing them > I think that Issue11816 (under review) aims at extending the dis module in a similar direction: Refactor the dis module to provide better building blocks for bytecode analysis http://bugs.python.org/issue11816 renaud From L.J.Buitinck at uva.nl Mon Oct 3 12:12:47 2011 From: L.J.Buitinck at uva.nl (Lars Buitinck) Date: Mon, 3 Oct 2011 12:12:47 +0200 Subject: [Python-Dev] counterintuitive behavior (bug?) in Counter with += Message-ID: Hello, [First off, I'm not a member of this list, so please Cc: me in a reply!] I've found some counterintuitive behavior in collections.Counter while hacking on the scikit-learn project [1]. I wanted to use a bunch of Counters to do some simple term counting in a set of documents, roughly as follows: count_total = Counter() for doc in documents: count_current = Counter(analyze(doc)) count_total += count_current count_per_doc.append(count_current) Because we target Python 2.5+, I implemented a lightweight replacement with just the functionality we need, including __iadd__, but then my co-developer ran the above code on Python 2.7 and performance was horrible. After some digging, I found out that Counter [2] does not have __iadd__ and += copies the entire left-hand side in __add__! I also figured out that I should use the update method instead, which I will, but I still find that uglier than +=. 
I would submit a patch to implement __iadd__, but I first want to know if that's considered the right behavior, since it changes the semantics of +=: >>> from collections import Counter >>> a = Counter([1,2,3]) >>> b = a >>> a += Counter([3,4,5]) >>> a is b False would become # snip >>> a is b True TIA, Lars [1] https://github.com/scikit-learn/scikit-learn/commit/de6e93094499e4d81b8e3b15fc66b6b9252945af [2] http://hg.python.org/cpython/file/tip/Lib/collections/__init__.py#l399 -- Lars Buitinck Scientific programmer, ILPS University of Amsterdam From songofacandy at gmail.com Mon Oct 3 13:57:45 2011 From: songofacandy at gmail.com (INADA Naoki) Date: Mon, 3 Oct 2011 20:57:45 +0900 Subject: [Python-Dev] counterintuitive behavior (bug?) in Counter with += In-Reply-To: References: Message-ID: +1 Because Counter is mutable object, I think += should mutate left side object. On Mon, Oct 3, 2011 at 7:12 PM, Lars Buitinck wrote: > Hello, > > [First off, I'm not a member of this list, so please Cc: me in a reply!] > > I've found some counterintuitive behavior in collections.Counter while > hacking on the scikit-learn project [1]. I wanted to use a bunch of > Counters to do some simple term counting in a set of documents, > roughly as follows: > > ? ?count_total = Counter() > ? ?for doc in documents: > ? ? ? ?count_current = Counter(analyze(doc)) > ? ? ? ?count_total += count_current > ? ? ? ?count_per_doc.append(count_current) > > Because we target Python 2.5+, I implemented a lightweight replacement > with just the functionality we need, including __iadd__, but then my > co-developer ran the above code on Python 2.7 and performance was > horrible. After some digging, I found out that Counter [2] does not > have __iadd__ and += copies the entire left-hand side in __add__! > > I also figured out that I should use the update method instead, which > I will, but I still find that uglier than +=. 
I would submit a patch > to implement __iadd__, but I first want to know if that's considered > the right behavior, since it changes the semantics of +=: > > ? ?>>> from collections import Counter > ? ?>>> a = Counter([1,2,3]) > ? ?>>> b = a > ? ?>>> a += Counter([3,4,5]) > ? ?>>> a is b > ? ?False > > would become > > ? ?# snip > ? ?>>> a is b > ? ?True > > TIA, > Lars > > > [1] https://github.com/scikit-learn/scikit-learn/commit/de6e93094499e4d81b8e3b15fc66b6b9252945af > [2] http://hg.python.org/cpython/file/tip/Lib/collections/__init__.py#l399 > > > -- > Lars Buitinck > Scientific programmer, ILPS > University of Amsterdam > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/songofacandy%40gmail.com > -- INADA Naoki? From victor.stinner at haypocalc.com Mon Oct 3 15:31:23 2011 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Mon, 03 Oct 2011 15:31:23 +0200 Subject: [Python-Dev] RFC: Add a new builtin strarray type to Python? In-Reply-To: <201110030419.54038.victor.stinner@haypocalc.com> References: <201110011917.56330.victor.stinner@haypocalc.com> <201110021500.01139.victor.stinner@haypocalc.com> <20111002152521.63cbaf75@pitrou.net> <201110030419.54038.victor.stinner@haypocalc.com> Message-ID: <4E89B92B.2070300@haypocalc.com> Le 03/10/2011 04:19, Victor Stinner a ?crit : > I restored this hack in Python 3.3 using PyUnicode_Append() in ceval.c and by > optimizing PyUnicode_Append() (try to append in-place). str+=str is closer > again to ''.join: > > str += str: 696 ms > ''.join(): 547 ms > > I disabled temporary the optimization for wstr string in PyUnicode_Resize() > because of a bug. I disabled completly resize on Windows because of another > bug. 
Ok, bugs fixed, all "resize" optimizations are now enabled: Python 3.3 str += str : 119 ms ''.join() : 130 ms StringIO.join : 147 ms StringBuilder : 404 ms array('u') : 979 ms Hum, str+=str is now the fastest method, even faster than ''.join() !? It's maybe time to optimize str.join ;-) Victor From martin at v.loewis.de Mon Oct 3 18:04:57 2011 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon, 03 Oct 2011 18:04:57 +0200 Subject: [Python-Dev] RFC: Add a new builtin strarray type to Python? In-Reply-To: <201110030419.54038.victor.stinner@haypocalc.com> References: <201110011917.56330.victor.stinner@haypocalc.com> <201110021500.01139.victor.stinner@haypocalc.com> <20111002152521.63cbaf75@pitrou.net> <201110030419.54038.victor.stinner@haypocalc.com> Message-ID: <4E89DD29.4090905@v.loewis.de> > I restored this hack in Python 3.3 using PyUnicode_Append() in ceval.c and by > optimizing PyUnicode_Append() (try to append in-place). str+=str is closer > again to ''.join: Why are you checking, in unicode_resizable, whether the string is from unicode_latin1? If it is, then it should have a refcount of at least 2, so the very first test in the function should already exclude it. Regards, Martin From martin at v.loewis.de Mon Oct 3 18:15:15 2011 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon, 03 Oct 2011 18:15:15 +0200 Subject: [Python-Dev] PEP-393: request for keeping PyUnicode_EncodeDecimal() In-Reply-To: <20111002110416.GA18205@sleipnir.bytereef.org> References: <20111001130603.GA16027@sleipnir.bytereef.org> <4E87205B.905@v.loewis.de> <20111001145859.GA16431@sleipnir.bytereef.org> <4E87346C.3060109@v.loewis.de> <20111002110416.GA18205@sleipnir.bytereef.org> Message-ID: <4E89DF93.6010405@v.loewis.de> > What I'm looking for is a public function that is silently updated if > python-dev decides to accept other numerical input. 
As I understand > from your comments below, PyUnicode_EncodeDecimal() is frozen, so > that function does indeed not help. > > I would consider it reasonable for PyUnicode_TransformDecimalAndSpaceToASCII() > to be documented as: > > "This function might accept different numerical input in the future." > > > The reason is that some people would like to accept additional input > (see #6632), while others would like to restrict input. If there is > a function that will always track whatever will be decided, extension > authors don't have to worry about being up-to-date. I don't think it's possible to promise such a thing. Predictions are difficult and all that. If somebody knew what the function should do in the future, then it would be best to change it now to do that. Rather expect that whenever functions get deprecated but not removed that they keep the semantics they had, for use in code that relies not only on the function name, but also on the function semantics (or else there may not be a point in keeping the function at all if it suddenly changes its behavior). >>> Will PyUnicode_TransformDecimalAndSpaceToASCII() be public? >> >> It's already included in 3.2, so it can't be removed that easily. >> I wish it had been private, though - we have way too many API functions >> dealing with Unicode. > > I can find PyUnicode_TransformDecimalToASCII() in 3.2, but not > PyUnicode_TransformDecimalAndSpaceToASCII(). Ah, so there is still a chance to make it private then. I plan to go over all new APIs for 3.3, and will propose to make all those private where nobody can argue for general utility. The longer the function name, the less is the utility. 
PyMartin_SendThisMessageAboutDecimalParsingToStefan-ly y'rs Martin From martin at v.loewis.de Mon Oct 3 18:23:02 2011 From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=) Date: Mon, 03 Oct 2011 18:23:02 +0200 Subject: [Python-Dev] cpython: PyUnicode_FromKindAndData() raises a ValueError if the kind is unknown In-Reply-To: References: Message-ID: <4E89E166.7020307@v.loewis.de> Am 02.10.2011 17:46, schrieb Benjamin Peterson: > 2011/10/2 Georg Brandl : >> On 10/02/11 01:14, victor.stinner wrote: >>> http://hg.python.org/cpython/rev/9124a00df142 >>> changeset: 72573:9124a00df142 >>> parent: 72571:fa0b1e50270f >>> user: Victor Stinner >>> date: Sat Oct 01 23:48:37 2011 +0200 >>> summary: >>> PyUnicode_FromKindAndData() raises a ValueError if the kind is unknown > > Also, could I remind you that a better commit message is probably > "make PyUnicode_FromKindAndData raise a ValueError if the kind is > unknown". I think this is asking too much. If we really want correct English in all commit messages, we need to employ an editor who edits all commit messages from non-native speakers. I believe Victor formulated this in the spirit of PyUnicode_FromKindAndData() "now" raises a ValueError if the kind is unknown which, to me, is equivalent to the (to me) correct formulations PyUnicode_FromKindAndData() will now raise a ValueError if the kind is unknown and PyUnicode_FromKindAndData() is now raising a ValueError if the kind is unknown Of course, I'll encourage Victor to keep on mastering the English language as much as anybody else. Regards, Martin From solipsis at pitrou.net Mon Oct 3 19:56:04 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 3 Oct 2011 19:56:04 +0200 Subject: [Python-Dev] PEP 3151 state Message-ID: <20111003195604.7bff4370@pitrou.net> Hello, I am back from holiday ;) and we haven't heard from other implementations whether there was any difficulty for them in implementing PEP 3151. 
Did I miss something (it's difficult to keep up with many messages on a small netbook with a screen broken by a batman-like pattern obscuring the right 20% :-)) ? Regards Antoine. From victor.stinner at haypocalc.com Mon Oct 3 20:07:36 2011 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Mon, 3 Oct 2011 20:07:36 +0200 Subject: [Python-Dev] RFC: Add a new builtin strarray type to Python? In-Reply-To: <4E89DD29.4090905@v.loewis.de> References: <201110011917.56330.victor.stinner@haypocalc.com> <201110030419.54038.victor.stinner@haypocalc.com> <4E89DD29.4090905@v.loewis.de> Message-ID: <201110032007.36128.victor.stinner@haypocalc.com> Le lundi 3 octobre 2011 18:04:57, vous avez ?crit : > Why are you checking, in unicode_resizable, whether the string is from > unicode_latin1? If it is, then it should have a refcount of at least 2, > so the very first test in the function should already exclude it. There is also a test on unicode_empty. Singletons should not be modified, but you are right, ref count is always at least 2 (when calling unicode_resizable). Changeset 6fbc5e9141fc replaces tests by assertions. Victor From stephen at xemacs.org Mon Oct 3 20:28:39 2011 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Tue, 04 Oct 2011 03:28:39 +0900 Subject: [Python-Dev] cpython: PyUnicode_FromKindAndData() raises a ValueError if the kind is unknown In-Reply-To: <4E89E166.7020307@v.loewis.de> References: <4E89E166.7020307@v.loewis.de> Message-ID: <87boty0w14.fsf@uwakimon.sk.tsukuba.ac.jp> "Martin v. L?wis" writes: > > Also, could I remind you that a better commit message is probably > > "make PyUnicode_FromKindAndData raise a ValueError if the kind is > > unknown". > > I think this is asking too much. This distinction is important enough that it's worth asking non-native speakers to *learn* this one idiom ("make Python do"), and all developers to *use* it (or an equally unambiguous form, if they feel like being original). 
Whether that's a reasonable burden for individual non-natives is going to depend on the individual, of course. But asking is not out of line. > I believe Victor formulated this in the spirit of Sure, one can figure that out -- but that's a lot of effort to ask of readers of logs. In general it requires familiarity with the patch being documented. From solipsis at pitrou.net Mon Oct 3 20:32:08 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 3 Oct 2011 20:32:08 +0200 Subject: [Python-Dev] PEP 3151 state References: <20111003195604.7bff4370@pitrou.net> Message-ID: <20111003203208.54d668f9@pitrou.net> On Mon, 3 Oct 2011 19:56:04 +0200 Antoine Pitrou wrote: > > Hello, > > I am back from holiday ;) and we haven't heard from other > implementations whether there was any difficulty for them in > implementing PEP 3151. Alex Gaynor and Jim Baker (thank you!) just told me on IRC that there shouldn't be any problem for their respective projects (PyPy and Jython, if I'm not totally confused). Regards Antoine. From jdhardy at gmail.com Mon Oct 3 23:22:29 2011 From: jdhardy at gmail.com (Jeff Hardy) Date: Mon, 3 Oct 2011 14:22:29 -0700 Subject: [Python-Dev] PEP 3151 state In-Reply-To: <20111003203208.54d668f9@pitrou.net> References: <20111003195604.7bff4370@pitrou.net> <20111003203208.54d668f9@pitrou.net> Message-ID: On Mon, Oct 3, 2011 at 11:32 AM, Antoine Pitrou wrote: > On Mon, 3 Oct 2011 19:56:04 +0200 > Antoine Pitrou wrote: > > > > Hello, > > > > I am back from holiday ;) and we haven't heard from other > > implementations whether there was any difficulty for them in > > implementing PEP 3151. > > Alex Gaynor and Jim Baker (thank you!) just told me on IRC that there > shouldn't be any problem for their respective projects (PyPy and > Jython, if I'm not totally confused). > I don't see any issues for IronPython either. - Jeff -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From tjreedy at udel.edu Mon Oct 3 23:38:01 2011 From: tjreedy at udel.edu (Terry Reedy) Date: Mon, 03 Oct 2011 17:38:01 -0400 Subject: [Python-Dev] cpython: PyUnicode_FromKindAndData() raises a ValueError if the kind is unknown In-Reply-To: <4E89E166.7020307@v.loewis.de> References: <4E89E166.7020307@v.loewis.de> Message-ID: On 10/3/2011 12:23 PM, "Martin v. L?wis" wrote: > Am 02.10.2011 17:46, schrieb Benjamin Peterson: >>> On 10/02/11 01:14, victor.stinner wrote: >>>> PyUnicode_FromKindAndData() raises a ValueError if the kind is unknown >> >> Also, could I remind you that a better commit message is probably >> "make PyUnicode_FromKindAndData raise a ValueError if the kind is >> unknown". > > I think this is asking too much. If we really want correct English > in all commit messages, we need to employ an editor who edits all > commit messages from non-native speakers. Some months ago we discussed the fact that 'x does y' is ambiguous in commit messages because a) it could describe behavior 'now' either before or after the patch and b) it has been used by developers *both* ways. On the tracker, especially in titles, it routinely describes behavior 'now' before the patch. 'Make x do y' is not much of a change. Guido approved of such clarification. Personnally, *I* know what Victor means as I believe he is consistent. But a new person might not. On the other hand, I do not want to discourage Victor from the great work he is doing. Since both forms are correct English, I have not thought of this as a native versus non-native issue. But I could imagine that the translation into X might be less ambiguous to a native speaker of X. Is it both technically possible (with hg) and socially permissible (with us) to edit another's commit message? 
-- Terry Jan Reedy From martin at v.loewis.de Mon Oct 3 23:57:09 2011 From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=) Date: Mon, 03 Oct 2011 23:57:09 +0200 Subject: [Python-Dev] cpython: PyUnicode_FromKindAndData() raises a ValueError if the kind is unknown In-Reply-To: References: <4E89E166.7020307@v.loewis.de> Message-ID: <4E8A2FB5.7020509@v.loewis.de> > Is it both technically possible (with hg) and socially permissible (with > us) to edit another's commit message? It's not technically possible, but it would be socially permissible to fix spelling mistakes. With hg, editing commit messages would require some sort of patch queue system, where the editor approves and manufactures commits out of the data submitted by the actual author. As any patch queue system, it would mean that commits aren't immediately available. Once they are available, their data cannot be changed due to the distributed nature of the DVCS (something that a centralized system would have no issues with). Regards, Martin From victor.stinner at haypocalc.com Tue Oct 4 00:09:28 2011 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Tue, 4 Oct 2011 00:09:28 +0200 Subject: [Python-Dev] cpython: PyUnicode_FromKindAndData() raises a ValueError if the kind is unknown In-Reply-To: References: Message-ID: <201110040009.28213.victor.stinner@haypocalc.com> > > - assert(0); > > + PyErr_SetString(PyExc_ValueError, "invalid kind"); > > > > return NULL; > > > > } > > Is that really a ValueError? It should only be a ValueError if the user > could trigger that error. Otherwise it should be a SystemError. You are right, ValueError is not best exception here. I used SystemError instead: see my commit 721bb2e59815. 
PyUnicode_FromFormat() still uses ValueError in PyUnicode_FromFormatV: PyErr_SetString(PyExc_ValueError, "incomplete format key"); PyErr_SetString(PyExc_ValueError, "width too big"); PyErr_SetString(PyExc_ValueError, "prec too big"); PyErr_SetString(PyExc_ValueError, "incomplete format"); PyErr_Format(PyExc_ValueError, "unsupported format character '%c' (0x%x) " "at index %zd", (31<=c && c<=126) ? (char)c : '?', (int)c, fmtpos - 1); PyErr_Format(PyExc_ValueError, "PyUnicode_FromFormatV() expects an ASCII-encoded format " "string, got a non-ASCII byte: 0x%02x", (unsigned char)*f); Should we also replace them with SystemError? It might break backward compatibility, but I really do hope that nobody relies on these errors ;-) Victor From v-rywel at microsoft.com Tue Oct 4 01:32:48 2011 From: v-rywel at microsoft.com (Ryan Wells (MP Tech Consulting LLC)) Date: Mon, 3 Oct 2011 23:32:48 +0000 Subject: [Python-Dev] Python compatibility issue with Windows Developer Preview Message-ID: Hello Python Developers, I am a Program Manager with the Ecosystem Engineering team at Microsoft. We are tracking an issue with Python 3.2.2 on Windows Developer Preview when using Internet Explorer. At //BUILD/ in September, Microsoft announced the availability of the Windows Developer Preview, which includes IE10. We encourage you to download the Windows Developer Preview (http://msdn.microsoft.com/en-us/windows/apps/br229516) and to begin testing. I'd like to connect directly with a developer on the project so that we can work closely to resolve this issue. Regards, Ryan Wells Microsoft PC Ecosystem Engineering Team v-rywel at microsoft.com -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From brian.curtin at gmail.com Tue Oct 4 03:20:58 2011 From: brian.curtin at gmail.com (Brian Curtin) Date: Mon, 3 Oct 2011 20:20:58 -0500 Subject: [Python-Dev] Python compatibility issue with Windows Developer Preview In-Reply-To: References: Message-ID: On Mon, Oct 3, 2011 at 18:32, Ryan Wells (MP Tech Consulting LLC) < v-rywel at microsoft.com> wrote: > Hello Python Developers,**** > > ** ** > > I am a Program Manager with the Ecosystem Engineering team at Microsoft. We > are tracking a issue with Python 3.2.2 on Windows Developer Preview when > using Internet Explorer. > Is there any public bug tracker or other information for this on your end? Sounds weird. I?d like to connect directly with a developer on the project so that we can > work closesly to resolve this issue. > There aren't many Windows devs around here, but while I'm one of them, I don't currently have the bandwidth to devote to getting a Windows 8 setup and working on this issue at the time. I think your best bet would be to post as much information as you have and we can go from there, either from myself or anyone available. If you think you've nailed down a specific issue in Python, http://bugs.python.org is our bug tracker. -------------- next part -------------- An HTML attachment was scrubbed... URL: From stephen at xemacs.org Tue Oct 4 04:59:17 2011 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Tue, 04 Oct 2011 11:59:17 +0900 Subject: [Python-Dev] cpython: PyUnicode_FromKindAndData() raises a ValueError if the kind is unknown In-Reply-To: <4E8A2FB5.7020509@v.loewis.de> References: <4E89E166.7020307@v.loewis.de> <4E8A2FB5.7020509@v.loewis.de> Message-ID: <87aa9h1myi.fsf@uwakimon.sk.tsukuba.ac.jp> "Martin v. L?wis" writes: [Terry Reedy wrote:] > > Is it both technically possible (with hg) and socially permissible (with > > us) to edit another's commit message? > > It's not technically possible, Currently, in hg. 
git has a mechanism for adding notes which are automatically displayed along with the original commit message, and bzr is considering introducing such a mechanism. I'm not familiar with the hg dev process (I use hg a lot, but so far it Just Works for me :), but I would imagine they will move in that direction as well. > but it would be socially permissible to fix spelling mistakes. The notes mechanism is not useful for fixing spelling mistakes unless they make the message unintelligible, but I suppose it might expand the range of socially permissible additions. From victor.stinner at haypocalc.com Tue Oct 4 11:21:15 2011 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Tue, 04 Oct 2011 11:21:15 +0200 Subject: [Python-Dev] [Python-checkins] cpython: fix formatting In-Reply-To: References: Message-ID: <4E8AD00B.4020208@haypocalc.com> On 04/10/2011 at 01:35, benjamin.peterson wrote: > http://hg.python.org/cpython/rev/64495ad8aa54 > changeset: 72634:64495ad8aa54 > user: Benjamin Peterson > date: Mon Oct 03 19:35:07 2011 -0400 > summary: > fix formatting > > +++ b/Objects/unicodeobject.c > @@ -1362,8 +1362,8 @@ > return -1; > _PyUnicode_CheckConsistency(*p_unicode); > return 0; > - } else > - return resize_inplace((PyUnicodeObject*)unicode, length); > + } > + return resize_inplace((PyUnicodeObject*)unicode, length); > } I deliberately chose to use "else return ..."; it's more readable to me. 
Victor From victor.stinner at haypocalc.com Tue Oct 4 11:38:57 2011 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Tue, 04 Oct 2011 11:38:57 +0200 Subject: [Python-Dev] [Python-checkins] cpython: fix compiler warnings In-Reply-To: References: Message-ID: <4E8AD431.7090600@haypocalc.com> Le 04/10/2011 01:34, benjamin.peterson a ?crit : > http://hg.python.org/cpython/rev/afb60b190f1c > changeset: 72633:afb60b190f1c > user: Benjamin Peterson > date: Mon Oct 03 19:34:12 2011 -0400 > summary: > fix compiler warnings > > +++ b/Objects/unicodeobject.c > @@ -369,6 +369,12 @@ > } > return 1; > } > +#else > +static int > +_PyUnicode_CheckConsistency(void *op) > +{ > + return 1; > +} > #endif Oh no, please don't do that. Calling _PyUnicode_CheckConsistency() is reserved to debug builds. In release mode, we should not check string consistency (it would slow down Python). Yes, there was a warning: Objects/unicodeobject.c:539:13: warning: statement with no effect _PyUnicode_CHECK(unicode); I added these checks recently to ensure that strings are consistent just before exiting (to help me to track down a bug). The right fix is just to replace _PyUnicode_CHECK(unicode) by assert(_PyUnicode_CHECK(unicode)). 
Victor From victor.stinner at haypocalc.com Tue Oct 4 11:48:42 2011 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Tue, 04 Oct 2011 11:48:42 +0200 Subject: [Python-Dev] [Python-checkins] cpython: pyexpat uses the new Unicode API In-Reply-To: References: Message-ID: <4E8AD67A.6030900@haypocalc.com> On 03/10/2011 at 11:10, Amaury Forgeot d'Arc wrote: >> changeset: 72548:a1be34457ccf >> user: Victor Stinner >> date: Sat Oct 01 01:05:40 2011 +0200 >> summary: >> pyexpat uses the new Unicode API >> >> files: >> Modules/pyexpat.c | 12 +++++++----- >> 1 files changed, 7 insertions(+), 5 deletions(-) >> >> >> diff --git a/Modules/pyexpat.c b/Modules/pyexpat.c >> --- a/Modules/pyexpat.c >> +++ b/Modules/pyexpat.c >> @@ -1234,11 +1234,13 @@ >> static PyObject * >> xmlparse_getattro(xmlparseobject *self, PyObject *nameobj) >> { >> - const Py_UNICODE *name; >> + Py_UCS4 first_char; >> int handlernum = -1; >> >> if (!PyUnicode_Check(nameobj)) >> goto generic; >> + if (PyUnicode_READY(nameobj)) >> + return NULL; > > Why is this PyUnicode_READY necessary? > Can tp_getattro pass unfinished unicode objects? > I hope we don't have to update all extension modules? The Unicode API is supposed to only deliver ready strings. But all extensions written for Python 3.2 use the "legacy" API (PyUnicode_FromUnicode and PyUnicode_FromStringAndSize(NULL, size)) and so no string is ready. But *no*, you don't have to update extensions that read strings to add a call to PyUnicode_READY. You only have to call PyUnicode_READY if you use the new API (e.g. PyUnicode_READ_CHAR), i.e. if you modify your code. 
Another extract of my commit (on pyexpat): - name = PyUnicode_AS_UNICODE(nameobj); + first_char = PyUnicode_READ_CHAR(nameobj, 0); Victor From benjamin at python.org Tue Oct 4 13:56:36 2011 From: benjamin at python.org (Benjamin Peterson) Date: Tue, 4 Oct 2011 07:56:36 -0400 Subject: [Python-Dev] [Python-checkins] cpython: fix formatting In-Reply-To: <4E8AD00B.4020208@haypocalc.com> References: <4E8AD00B.4020208@haypocalc.com> Message-ID: 2011/10/4 Victor Stinner : > Le 04/10/2011 01:35, benjamin.peterson a ?crit : >> >> http://hg.python.org/cpython/rev/64495ad8aa54 >> changeset: ? 72634:64495ad8aa54 >> user: ? ? ? ?Benjamin Peterson >> date: ? ? ? ?Mon Oct 03 19:35:07 2011 -0400 >> summary >> ? fix formatting >> >> +++ b/Objects/unicodeobject.c >> @@ -1362,8 +1362,8 @@ >> ? ? ? ? ? ? ?return -1; >> ? ? ? ? ?_PyUnicode_CheckConsistency(*p_unicode); >> ? ? ? ? ?return 0; >> - ? ?} else >> - ? ? ? ?return resize_inplace((PyUnicodeObject*)unicode, length); >> + ? ?} >> + ? ?return resize_inplace((PyUnicodeObject*)unicode, length); >> ?} > > I chose deliberately to use "else return ...", it's more readable for me. Then there should be braces around it. -- Regards, Benjamin From benjamin at python.org Tue Oct 4 13:57:57 2011 From: benjamin at python.org (Benjamin Peterson) Date: Tue, 4 Oct 2011 07:57:57 -0400 Subject: [Python-Dev] [Python-checkins] cpython: fix compiler warnings In-Reply-To: <4E8AD431.7090600@haypocalc.com> References: <4E8AD431.7090600@haypocalc.com> Message-ID: 2011/10/4 Victor Stinner : > Le 04/10/2011 01:34, benjamin.peterson a ?crit : >> >> http://hg.python.org/cpython/rev/afb60b190f1c >> changeset: ? 72633:afb60b190f1c >> user: ? ? ? ?Benjamin Peterson >> date: ? ? ? ?Mon Oct 03 19:34:12 2011 -0400 >> summary: >> ? fix compiler warnings >> >> +++ b/Objects/unicodeobject.c >> @@ -369,6 +369,12 @@ >> ? ? ?} >> ? ? ?return 1; >> ?} >> +#else >> +static int >> +_PyUnicode_CheckConsistency(void *op) >> +{ >> + ? 
return 1; >> +} >> #endif > > Oh no, please don't do that. Calling _PyUnicode_CheckConsistency() is > reserved to debug builds. In release mode, we should not check string > consistency (it would slow down Python). It should be optimized out. > > Yes, there was a warning: > > Objects/unicodeobject.c:539:13: warning: statement with no effect > _PyUnicode_CHECK(unicode); > > I added these checks recently to ensure that strings are consistent just > before exiting (to help me to track down a bug). > > The right fix is just to replace _PyUnicode_CHECK(unicode) by > assert(_PyUnicode_CHECK(unicode)). But _PyUnicode_CheckConsistency is just a string of assertions. What sense does it make to check the return value? -- Regards, Benjamin From pete.alex.harris at gmail.com Tue Oct 4 16:23:18 2011 From: pete.alex.harris at gmail.com (Peter Harris) Date: Tue, 4 Oct 2011 15:23:18 +0100 Subject: [Python-Dev] Python-Dev Digest, Vol 99, Issue 7 In-Reply-To: References: Message-ID: > > Hello Python Developers, > > I am a Program Manager with the Ecosystem Engineering team at Microsoft. We are tracking a issue with Python 3.2.2 on Windows Developer Preview when > using Internet Explorer. > [...] > I'd like to connect directly with a developer on the project so that we can > work closesly to resolve this issue. You know, without any specifics given about the issue, this smells like comment spam. If it wasn't from such a reputable source, I'd almost think someone is just contacting projects at random with vague reports of "issues" relating to IE10 to pump up some interest in the new browser, whether those projects are anything to do with web browsing or not. Only kidding, they aren't that reputable ;) I Googled the phrase "I am a Program Manager with the Ecosystem Engineering team at Microsoft", and it seems this scattershot approach is not new. -------------- next part -------------- An HTML attachment was scrubbed...
URL: From amauryfa at gmail.com Tue Oct 4 18:18:27 2011 From: amauryfa at gmail.com (Amaury Forgeot d'Arc) Date: Tue, 4 Oct 2011 18:18:27 +0200 Subject: [Python-Dev] Using PEP384 Stable ABI for the lzma extension module Message-ID: Hi, Has someone already tried to *really* use Py_LIMITED_API for some "serious" extension module? I wanted to give it a try for the _lzma module (see issue 6715) because liblzma does not compile with Microsoft compilers; an alternative could be to use mingw to (pre)build _lzma.pyd, which would link with a static liblzma.a also compiled with mingw. Mixing compilers in a Python process is one of the reasons of PEP384, so I added #define Py_LIMITED_API on top of the module, and "fixed" the issues one by one: - Py_LIMITED_API is incompatible with --with-pydebug, and compilation stops. I skipped the check to continue. - I replaced PyBytes_GET_SIZE() with Py_SIZE(), which is OK, and PyBytes_AS_STRING() with PyBytes_AsString(), which may have a slight performance impact. - I replaced Py_TYPE(self)->tp_free((PyObject *)self); with PyObject_Del(self), I hope this is the same thing (for a non-GC object) - _PyBytes_Resize() is missing; I moved it under a Py_LIMITED_API section. - For the "y*" argument spec, the Py_buffer structure is required (only for two fields: buf and len), as well as PyBuffer_Release() - PyType_FromSpec() does not call PyType_Ready(), which caused crashes in __new__. Now the module seems to work correctly and passes tests... at least on Linux in a standard environment. I will do other tests on Windows. What do you think about using the stable ABI even in shipped extensions? Have you already used it somewhere else? 
Cheers, -- Amaury Forgeot d'Arc From fuzzyman at voidspace.org.uk Tue Oct 4 18:27:44 2011 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Tue, 04 Oct 2011 17:27:44 +0100 Subject: [Python-Dev] Python compatibility issue with Windows Developer Preview In-Reply-To: References: Message-ID: <4E8B3400.6030400@voidspace.org.uk> On 04/10/2011 02:20, Brian Curtin wrote: > On Mon, Oct 3, 2011 at 18:32, Ryan Wells (MP Tech Consulting LLC) > > wrote: > > Hello Python Developers, > > I am a Program Manager with the Ecosystem Engineering team at > Microsoft. We are tracking a issue with Python 3.2.2 on Windows > Developer Preview when using Internet Explorer. > > > Is there any public bug tracker or other information for this on your > end? Sounds weird. How would one use Python 3.2.2 with Internet explorer? It would be possible with the pywin32 extensions, but then the correct place for support would be the pywin32 bug tracker and mailing lists (as they're not part of core Python). Michael > > I'd like to connect directly with a developer on the project so > that we can work closesly to resolve this issue. > > > There aren't many Windows devs around here, but while I'm one of them, > I don't currently have the bandwidth to devote to getting a Windows 8 > setup and working on this issue at the time. I think your best bet > would be to post as much information as you have and we can go from > there, either from myself or anyone available. > > If you think you've nailed down a specific issue in Python, > http://bugs.python.org is our bug tracker. > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/fuzzyman%40voidspace.org.uk -- http://www.voidspace.org.uk/ May you do good and not evil May you find forgiveness for yourself and forgive others May you share freely, never taking more than you give. 
-- the sqlite blessing http://www.sqlite.org/different.html -------------- next part -------------- An HTML attachment was scrubbed... URL: From martin at v.loewis.de Tue Oct 4 18:45:55 2011 From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=) Date: Tue, 04 Oct 2011 18:45:55 +0200 Subject: [Python-Dev] [Python-checkins] cpython: Migrate str.expandtabs to the new API In-Reply-To: References: Message-ID: <4E8B3843.4030505@v.loewis.de> > Migrate str.expandtabs to the new API This needs if (PyUnicode_READY(self) == -1) return NULL; right after the ParseTuple call. In most cases, the check will be a noop. But if it's not, omitting it will make expandtabs have no effect, since the string length will be 0 (in a debug build, you also get an assertion failure). Regards, Martin From ncoghlan at gmail.com Tue Oct 4 19:05:58 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 4 Oct 2011 13:05:58 -0400 Subject: [Python-Dev] Using PEP384 Stable ABI for the lzma extension module In-Reply-To: References: Message-ID: (My comments are based on the assumption Amaury started with http://hg.python.org/sandbox/nvawda/file/09d984063fca/Modules/_lzmamodule.c) On Tue, Oct 4, 2011 at 12:18 PM, Amaury Forgeot d'Arc wrote: > - Py_LIMITED_API is incompatible with --with-pydebug, and compilation stops. > I skipped the check to continue. That seems like an odd (and undesirable) restriction. If different Python versions are going to expose the same ABI, it seems strange that debug and release versions can't do the same. > - I replaced PyBytes_GET_SIZE() with Py_SIZE(), which is OK, > and PyBytes_AS_STRING() with PyBytes_AsString(), which may > have a slight performance impact. Yes, the price of using the stable ABI is that performance tricks that depend on exact memory layouts are no longer available. > - I replaced >
Py_TYPE(self)->tp_free((PyObject *)self); > with PyObject_Del(self), I hope this is the same thing > (for a non-GC object) That looks right in this particular case, but problematic in general. The stable ABI probably needs a better solution for tp_new slots invoking tp_alloc and tp_dealloc slots invoking tp_free. In fact, a systematic review of the slot documentation is probably needed, pointing out the stable ABI alternatives to all of the recommended "cross slot" invocations (and creating them if they don't already exist). > - _PyBytes_Resize() is missing; I moved it under a Py_LIMITED_API > section. No, that's not valid. Bytes are officially immutable - mutating them when the reference count is only 1 is private for a reason. The correct way to do this without relying on that implementation detail is to use a byte array instead. > - For the "y*" argument spec, the Py_buffer structure is required > (only for two fields: buf and len), as well as PyBuffer_Release() Yeah, PEP 3118 support will eventually appear in the stable ABI, but we need to fix it first (see issue 10181). > - PyType_FromSpec() does not call PyType_Ready(), which caused > crashes in __new__. That sounds like it may just be a bug. Although looking at the C API docs, PEP 384 documentation appears to be basically non-existent... > Now the module seems to work correctly and passes tests... at least on > Linux in a standard environment. I will do other tests on Windows. > > What do you think about using the stable ABI even in shipped extensions? It's probably not a bad idea, otherwise we may compilation without realising it. This is especially so for extension modules that *don't* need access to any of the interpreter internals. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com |
Brisbane, Australia From ncoghlan at gmail.com Tue Oct 4 19:10:36 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 4 Oct 2011 13:10:36 -0400 Subject: [Python-Dev] Using PEP384 Stable ABI for the lzma extension module In-Reply-To: References: Message-ID: On Tue, Oct 4, 2011 at 1:05 PM, Nick Coghlan wrote: > It's probably not a bad idea, otherwise we may compilation without > realising it. s/may/may break/ Actually testing the ABI stability would be much harder - somehow building an extension module against 3.2 with the limited API then testing it against a freshly built 3.3. Perhaps we could manage something like that by building against a system installation of Python 3.2 on builders that have it available. All in all, I think PEP 384 laid the foundations, but there's still plenty of work to be done in the documentation and testing space (and perhaps a few API additions) before the majority of extensions can realistically switch to the stable ABI. A bit of "eating our own dogfood" in the extension modules we ship may be a good place to start (especially new ones that are added). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From martin at v.loewis.de Tue Oct 4 19:28:41 2011 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 04 Oct 2011 19:28:41 +0200 Subject: [Python-Dev] Using PEP384 Stable ABI for the lzma extension module In-Reply-To: References: Message-ID: <4E8B4249.5030809@v.loewis.de> Amaury: thanks for your experiment and your report. > - I replaced PyBytes_GET_SIZE() with Py_SIZE(), which is OK, > and PyBytes_AS_STRING() with PyBytes_AsString(), which may > have a slight performance impact. That's the whole point of the stable ABI: AS_STRING assumes that there is an ob_sval field at a certain offset of the bytes object, which may not be there in a future release. PyBytes_AsString is indeed slower, but also more future-proof.
> - I replaced > Py_TYPE(self)->tp_free((PyObject *)self); > with PyObject_Del(self), I hope this is the same thing > (for a non-GC object) If a subtype of self.__class__ would override tp_free, it wouldn't be the same anymore. I guess the API needs a way to read a slot from a type object. > - _PyBytes_Resize() is missing; I moved it under a Py_LIMITED_API > section. ??? Are you proposing to add _PyBytes_Resize to the Py_LIMITED_API set of functions? It's not even an API function in the first place (it starts with an underscore), so how can it be a limited API function? I think this whole notion of resizing immutable objects in the Python C API is flawed. If you can't know how large a buffer is in advance, first allocate a regular memory block, and then copy it into an object when done. > - For the "y*" argument spec, the Py_buffer structure is required > (only for two fields: buf and len), as well as PyBuffer_Release() Yes, this was a debate in the API PEP. I originally had the buffer API in the stable ABI, but was then advised to remove it, as it may not be that stable at all. I'll start a thread about extending the stable ABI soon; if people now want to reconsider, it would be possible. However, taking something out of the stable ABI is not possible, so if we decide the buffer API is stable, the structure is locked until Python 4. > - PyType_FromSpec() does not call PyType_Ready(), which caused > crashes in __new__. Oops :-) Regards, Martin From martin at v.loewis.de Tue Oct 4 19:39:52 2011 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 04 Oct 2011 19:39:52 +0200 Subject: [Python-Dev] Using PEP384 Stable ABI for the lzma extension module In-Reply-To: References: Message-ID: <4E8B44E8.7080809@v.loewis.de> >> - Py_LIMITED_API is incompatible with --with-pydebug, and compilation stops. >> I skipped the check to continue. > > That seems like an odd (and undesirable) restriction. It's deliberate, though.
> If different > Python versions are going to expose the same ABI, it seems strange that > debug and release versions can't do the same. You'll have to specify a lot of details about what precisely constitutes a debug build, and what fields precisely belong to it. Nobody volunteered to specify what it should do, so I excluded it. It's also not the objective of the PEP to support loading debug-built extensions in alternative interpreter versions. I fail to see why this is undesirable, also. It's very easy to write an extension module that only uses the limited API, and still builds fine in a debug build: just don't define Py_LIMITED_API when compiling for debug mode. > The stable ABI probably needs a better solution for tp_new slots > invoking tp_alloc and tp_dealloc slots invoking tp_free. In fact, a > systematic review of the slot documentation is probably needed, > pointing out the stable ABI alternatives to all of the recommended > "cross slot" invocations (and creating them if they don't already > exist). Doing so would probably be better than my proposed approach of just providing a generic access function that reads a slot as a void* from a type object. >> What do you think about using the stable ABI even in shipped extensions? > > It's probably not a bad idea, otherwise we may compilation without > realising it. Missing a word in the first sentence? There is the xxlimited module that is there to test that it keeps compiling under the limited API. I'll review all API additions before the next release, and will exclude a) anything that shouldn't be used by extension modules at all. There was a tradition of exposing all helper functions, but I think this tradition needs to stop. Instead, adding to the API should be conservative, and only add what is positively useful to extension modules.
b) anything that is not sufficiently stable from the limited API (in particular stuff that refers to new structures). The DLL .def file for Windows will make sure that nothing gets added unintentionally to the stable ABI, unfortunately, there is no easy technique for Unix achieving the same. Regards, Martin From martin at v.loewis.de Tue Oct 4 19:49:09 2011 From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=) Date: Tue, 04 Oct 2011 19:49:09 +0200 Subject: [Python-Dev] [Python-checkins] cpython: Optimize string slicing to use the new API In-Reply-To: References: Message-ID: <4E8B4715.6020907@v.loewis.de> > + result = PyUnicode_New(slicelength, PyUnicode_MAX_CHAR_VALUE(self)); This is incorrect: the maxchar of the slice might be smaller than the maxchar of the input string. So you'll need to iterate over the input string first, compute the maxchar, and then allocate the result string. Or you allocate a temporary buffer of (1<<(kind-1)) * slicelength bytes, copy the slice, allocate the target object with PyUnicode_FromKindAndData, and release the temporary buffer. Regards, Martin From brian.curtin at gmail.com Tue Oct 4 20:07:23 2011 From: brian.curtin at gmail.com (Brian Curtin) Date: Tue, 4 Oct 2011 13:07:23 -0500 Subject: [Python-Dev] Python compatibility issue with Windows Developer Preview In-Reply-To: <4E8B3400.6030400@voidspace.org.uk> References: <4E8B3400.6030400@voidspace.org.uk> Message-ID: On Tue, Oct 4, 2011 at 11:27, Michael Foord wrote: > On 04/10/2011 02:20, Brian Curtin wrote: > > On Mon, Oct 3, 2011 at 18:32, Ryan Wells (MP Tech Consulting LLC) < > v-rywel at microsoft.com> wrote: > >> Hello Python Developers, >> >> >> >> I am a Program Manager with the Ecosystem Engineering team at Microsoft. >> We are tracking a issue with Python 3.2.2 on Windows Developer Preview when >> using Internet Explorer. >> > > Is there any public bug tracker or other information for this on your > end? Sounds weird. 
> > How would one use Python 3.2.2 with Internet explorer? It would be possible > with the pywin32 extensions, but then the correct place for support would be > the pywin32 bug tracker and mailing lists (as they're not part of core > Python). > I took the original message as Python is screwing up because Internet Explorer is running, which is ridiculous. Until they follow up with details, I think there's nothing to see here. -------------- next part -------------- An HTML attachment was scrubbed... URL: From martin at v.loewis.de Tue Oct 4 20:09:06 2011 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 04 Oct 2011 20:09:06 +0200 Subject: [Python-Dev] [Python-checkins] cpython: Optimize string slicing to use the new API In-Reply-To: <20111004195030.6ecaf999@pitrou.net> References: <4E8B4715.6020907@v.loewis.de> <20111004195030.6ecaf999@pitrou.net> Message-ID: <4E8B4BC2.8030804@v.loewis.de> Am 04.10.11 19:50, schrieb Antoine Pitrou: > On Tue, 04 Oct 2011 19:49:09 +0200 > "Martin v. Löwis" wrote: > >>> + result = PyUnicode_New(slicelength, PyUnicode_MAX_CHAR_VALUE(self)); >> >> This is incorrect: the maxchar of the slice might be smaller than the >> maxchar of the input string. > > I thought that heuristic would be good enough. I'll try to fix it. No - strings must always be in the canonical form. For example, PyUnicode_RichCompare considers strings unequal if they have different kinds. As a consequence, your slice result may not compare equal to a canonical variant of itself.
From nad at acm.org Tue Oct 4 20:14:12 2011 From: nad at acm.org (Ned Deily) Date: Tue, 04 Oct 2011 11:14:12 -0700 Subject: [Python-Dev] cpython (3.2): Issue #11956: Skip test_import.test_unwritable_directory on FreeBSD when run as References: Message-ID: In article , charles-francois.natali wrote: > http://hg.python.org/cpython/rev/7697223df6df > changeset: 72670:7697223df6df > branch: 3.2 > parent: 72658:2484b2b8876e > user: Charles-François Natali > date: Tue Oct 04 19:17:26 2011 +0200 > summary: > Issue #11956: Skip test_import.test_unwritable_directory on FreeBSD when run as > root (directory permissions are ignored). The same directory permission semantics apply to other (all?) BSD-derived systems, not just FreeBSD. For example, the test still fails in the same way on OS X when run via sudo. -- Ned Deily, nad at acm.org From brian.curtin at gmail.com Tue Oct 4 20:43:46 2011 From: brian.curtin at gmail.com (Brian Curtin) Date: Tue, 4 Oct 2011 13:43:46 -0500 Subject: [Python-Dev] Python compatibility issue with Windows Developer Preview In-Reply-To: References: <4E8B3400.6030400@voidspace.org.uk> Message-ID: On Tue, Oct 4, 2011 at 13:24, Ryan Wells (MP Tech Consulting LLC) wrote: > Please let me know if you have an estimated timeframe to address this issue, > and if our team can further assist in this process. No idea about an estimated time frame, but I've entered http://bugs.python.org/issue13101 into our tracker so we don't lose the issue and its details in our inboxes. If you're interested in tracking the results of any discussion or fixes, you could add yourself to the "nosy" list on that bug report (you'd have to register or use OpenID). If it is just a difference in return value like you suspected, the fix is probably pretty easy. The bigger barrier is just finding time to get a Windows 8 setup. Thanks for the report.
From cf.natali at gmail.com Tue Oct 4 20:44:43 2011 From: cf.natali at gmail.com (=?ISO-8859-1?Q?Charles=2DFran=E7ois_Natali?=) Date: Tue, 4 Oct 2011 20:44:43 +0200 Subject: [Python-Dev] cpython (3.2): Issue #11956: Skip test_import.test_unwritable_directory on FreeBSD when run as In-Reply-To: References: Message-ID: >> summary: >> Issue #11956: Skip test_import.test_unwritable_directory on FreeBSD when >> >> run as >> root (directory permissions are ignored). > > The same directory permission semantics apply to other (all?) > BSD-derived systems, not just FreeBSD. For example, the test still > fails in the same way on OS X when run via sudo. > Thanks, I didn't know: I only noticed this on the FreeBSD buildbots (I guess OS-X buildbots don't run as root). Note that it does behave as "expected" on Linux (note the use of quotation marks, I'm not sure whether this behavior is authorized by POSIX). I changed the test to skip when the effective UID is 0, regardless of the OS, to stay on the safe side. From v-rywel at microsoft.com Tue Oct 4 20:24:04 2011 From: v-rywel at microsoft.com (Ryan Wells (MP Tech Consulting LLC)) Date: Tue, 4 Oct 2011 18:24:04 +0000 Subject: [Python-Dev] Python compatibility issue with Windows Developer Preview In-Reply-To: References: <4E8B3400.6030400@voidspace.org.uk> Message-ID: Hello, I apologize for the confusion or if this is the wrong mailing listing, I wanted to get in contact with someone before I sent the bug information that I have. We do not have a public bug tracking system that I can direct you to. Based on preliminary testing, the following compatibility issue has been identified: Reference #: 70652 Description of the Problem: The application Python Module Doc is automatically closed when Internet Explorer 10 is closed. Steps to Reproduce: 1. Install Windows Developer Preview 2. Install Python 3.2.2 3. Launch Module Doc. Start Menu -> All Program -> Python -> Manual Docs 4. Click on the button open browser 5. 
It should open the site http://localhost:7464/ In Internet Explorer 10 and the contents should be displayed 6. Should be able to view list of Modules, Scripts, DLLs, and Libraries etc. 7. Close Internet Explorer Expected Result: Internet Explorer 10 should only get closed and we should be able to work with the application Module Doc. Actual Result: The application Module Doc is closed with Internet Explorer 10. Developer Notes: There is likely a difference in return values between IE8 and IE9/10 when launched from the app. Please let me know if you have an estimated timeframe to address this issue, and if our team can further assist in this process. Regards, Ryan Wells Microsoft PC Ecosystem Engineering Team v-rywel at microsoft.com From: Brian Curtin [mailto:brian.curtin at gmail.com] Sent: Tuesday, October 04, 2011 11:07 AM To: Michael Foord Cc: Ryan Wells (MP Tech Consulting LLC); Ecosystem Engineering IE; python-dev at python.org Subject: Re: [Python-Dev] Python compatibility issue with Windows Developer Preview On Tue, Oct 4, 2011 at 11:27, Michael Foord > wrote: On 04/10/2011 02:20, Brian Curtin wrote: On Mon, Oct 3, 2011 at 18:32, Ryan Wells (MP Tech Consulting LLC) > wrote: Hello Python Developers, I am a Program Manager with the Ecosystem Engineering team at Microsoft. We are tracking a issue with Python 3.2.2 on Windows Developer Preview when using Internet Explorer. Is there any public bug tracker or other information for this on your end? Sounds weird. How would one use Python 3.2.2 with Internet explorer? It would be possible with the pywin32 extensions, but then the correct place for support would be the pywin32 bug tracker and mailing lists (as they're not part of core Python). I took the original message as Python is screwing up because Internet Explorer is running, which is ridiculous. Until they follow up with details, I think there's nothing to see here. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From amauryfa at gmail.com Tue Oct 4 21:06:21 2011 From: amauryfa at gmail.com (Amaury Forgeot d'Arc) Date: Tue, 4 Oct 2011 21:06:21 +0200 Subject: [Python-Dev] Using PEP384 Stable ABI for the lzma extension module In-Reply-To: <4E8B4249.5030809@v.loewis.de> References: <4E8B4249.5030809@v.loewis.de> Message-ID: 2011/10/4 "Martin v. Löwis" : > >> - _PyBytes_Resize() is missing; I moved it under a Py_LIMITED_API >> section. > > ??? Are you proposing to add _PyBytes_Resize to the Py_LIMITED_API > set of functions? It's not even an API function in the first place > (it starts with an underscore), so how can it be a limited API function? It's not a proposal of any kind; it's just the workaround I used to compile and test. OTOH, it seems that many modules already use this function. Is there another method that does not need to copy data? -- Amaury Forgeot d'Arc From solipsis at pitrou.net Tue Oct 4 21:12:04 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 4 Oct 2011 21:12:04 +0200 Subject: [Python-Dev] Using PEP384 Stable ABI for the lzma extension module References: Message-ID: <20111004211204.79f0c1ff@pitrou.net> On Tue, 4 Oct 2011 13:05:58 -0400 Nick Coghlan wrote: > > > - _PyBytes_Resize() is missing; I moved it under a Py_LIMITED_API > > section. > > No, that's not valid. Bytes are officially immutable - mutating them > when the reference count is only 1 is private for a reason. The > correct way to do this without relying on that implementation detail > is to use a byte array instead. Uh, no, it depends what you're doing. There's no reason not to allow people to resize a bytes object which they've just allocated and is still private to their code. That's the whole reason why _PyBytes_Resize() exists, and the use case is not exotic. Telling people to "first create a bytearray and then create a bytes object from that when you're finished" would be a shame. Regards Antoine.
From martin at v.loewis.de Tue Oct 4 21:33:34 2011 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 04 Oct 2011 21:33:34 +0200 Subject: [Python-Dev] Using PEP384 Stable ABI for the lzma extension module In-Reply-To: References: <4E8B4249.5030809@v.loewis.de> Message-ID: <4E8B5F8E.1070306@v.loewis.de> Am 04.10.11 21:06, schrieb Amaury Forgeot d'Arc: > 2011/10/4 "Martin v. Löwis": >> >>> - _PyBytes_Resize() is missing; I moved it under a Py_LIMITED_API >>> section. >> >> ??? Are you proposing to add _PyBytes_Resize to the Py_LIMITED_API >> set of functions? It's not even an API function in the first place >> (it starts with an underscore), so how can it be a limited API function? > > It's not a proposal of any kind; it's just the workaround I used to compile > and test. > OTOH, it seems that many modules already use this function. Is there > another method that does not need to copy data? Not sure what you are using it for. If you need to extend the buffer in case it is too small, there is absolutely no way this could work without copies in the general case because of how computers use address space. Even _PyBytes_Resize will copy the data. The only way to avoid copying is to run over the input twice: once to determine how large the output will have to be, and then another time to actually produce the output. Whether or not that's actually faster than copying the output depends on how much work this size computation requires. It would be nice if LZMA had "output size" information embedded in it, but it may not. Regards, Martin From ncoghlan at gmail.com Tue Oct 4 21:37:42 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 4 Oct 2011 15:37:42 -0400 Subject: [Python-Dev] Using PEP384 Stable ABI for the lzma extension module In-Reply-To: <20111004211204.79f0c1ff@pitrou.net> References: <20111004211204.79f0c1ff@pitrou.net> Message-ID: On Tue, Oct 4, 2011 at 3:12 PM, Antoine Pitrou wrote: > Uh, no, it depends what you're doing.
There's no reason not to allow > people to resize a bytes object which they've just allocated and is > still private to their code. That's the whole reason why > _PyBytes_Resize() exists, and the use case is not exotic. > > Telling people to "first create a bytearray and then create a bytes > object from that when you're finished" would be a shame. If developers want to use private CPython functions, then they can't use the stable API - the whole point of having private APIs is that we don't even promise *source* compatibility for those, let alone binary compatibility. If they want the stability guarantee, then they have to eschew hacks that rely on implementation details (like the ability to resize "immutable" objects). That seems pretty reasonable to me. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From solipsis at pitrou.net Tue Oct 4 21:45:49 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 4 Oct 2011 21:45:49 +0200 Subject: [Python-Dev] Using PEP384 Stable ABI for the lzma extension module References: <4E8B4249.5030809@v.loewis.de> <4E8B5F8E.1070306@v.loewis.de> Message-ID: <20111004214549.66360def@pitrou.net> On Tue, 04 Oct 2011 21:33:34 +0200 "Martin v. Löwis" wrote: > Am 04.10.11 21:06, schrieb Amaury Forgeot d'Arc: > > 2011/10/4 "Martin v. Löwis": > >> > >>> - _PyBytes_Resize() is missing; I moved it under a Py_LIMITED_API > >>> section. > >> > >> ??? Are you proposing to add _PyBytes_Resize to the Py_LIMITED_API > >> set of functions? It's not even an API function in the first place > >> (it starts with an underscore), so how can it be a limited API function? > > > > It's not a proposal of any kind; it's just the workaround I used to compile > > and test. > > OTOH, it seems that many modules already use this function. Is there > > another method that does not need to copy data? > > Not sure what you are using it for.
If you need to extend the buffer > in case it is too small, there is absolutely no way this could work > without copies in the general case because of how computers use > address space. Even _PyBytes_Resize will copy the data. That's not a given. Depending on the memory allocator, a copy can be avoided. That's why the "str += str" hack is much more efficient under Linux than Windows, AFAIK. Regards Antoine. From g.brandl at gmx.net Tue Oct 4 23:41:54 2011 From: g.brandl at gmx.net (Georg Brandl) Date: Tue, 04 Oct 2011 23:41:54 +0200 Subject: [Python-Dev] cpython: PyUnicode_Join() checks output length in debug mode In-Reply-To: References: Message-ID: On 10/03/11 23:35, victor.stinner wrote: > http://hg.python.org/cpython/rev/bfd8b5d35f9c > changeset: 72623:bfd8b5d35f9c > user: Victor Stinner > date: Mon Oct 03 23:36:02 2011 +0200 > summary: > PyUnicode_Join() checks output length in debug mode > > PyUnicode_CopyCharacters() may copies less character than requested size, if > the input string is smaller than the argument. (This is very unlikely, but who > knows!?) > > Avoid also calling PyUnicode_CopyCharacters() if the string is empty. > > files: > Objects/unicodeobject.c | 34 +++++++++++++++++++--------- > 1 files changed, 23 insertions(+), 11 deletions(-) > > > diff --git a/Objects/unicodeobject.c b/Objects/unicodeobject.c > --- a/Objects/unicodeobject.c > +++ b/Objects/unicodeobject.c > @@ -8890,20 +8890,32 @@ > > /* Catenate everything. */ > for (i = 0, res_offset = 0; i < seqlen; ++i) { > - Py_ssize_t itemlen; > + Py_ssize_t itemlen, copied; > item = items[i]; > + /* Copy item, and maybe the separator. */ > + if (i && seplen != 0) { > + copied = PyUnicode_CopyCharacters(res, res_offset, > + sep, 0, seplen); > + if (copied < 0) > + goto onError; > +#ifdef Py_DEBUG > + res_offset += copied; > +#else > + res_offset += seplen; > +#endif > + } > itemlen = PyUnicode_GET_LENGTH(item); > - /* Copy item, and maybe the separator. 
*/ > - if (i) { > - if (PyUnicode_CopyCharacters(res, res_offset, > - sep, 0, seplen) < 0) > + if (itemlen != 0) { > + copied = PyUnicode_CopyCharacters(res, res_offset, > + item, 0, itemlen); > + if (copied < 0) > goto onError; > - res_offset += seplen; > - } > - if (PyUnicode_CopyCharacters(res, res_offset, > - item, 0, itemlen) < 0) > - goto onError; > - res_offset += itemlen; > +#ifdef Py_DEBUG > + res_offset += copied; > +#else > + res_offset += itemlen; > +#endif > + } > } > assert(res_offset == PyUnicode_GET_LENGTH(res)); I don't understand this change. Why would you not always add "copied" once you already have it? It seems to be the more correct version anyway. Georg From riscutiavlad at gmail.com Wed Oct 5 00:30:54 2011 From: riscutiavlad at gmail.com (Vlad Riscutia) Date: Tue, 4 Oct 2011 15:30:54 -0700 Subject: [Python-Dev] [Python-checkins] cpython: fix compiler warnings In-Reply-To: References: <4E8AD431.7090600@haypocalc.com> Message-ID: Why does the function even return a value? As Benjamin said, it is just a bunch of asserts with return 1 at the end. I believe another way you can get rid of "statement with no effect" is to cast return value to void, like (void)_PyUnicode_CHECK(unicode). Thank you, Vlad On Tue, Oct 4, 2011 at 4:57 AM, Benjamin Peterson wrote: > 2011/10/4 Victor Stinner : > > Le 04/10/2011 01:34, benjamin.peterson a ?crit : > >> > >> http://hg.python.org/cpython/rev/afb60b190f1c > >> changeset: 72633:afb60b190f1c > >> user: Benjamin Peterson > >> date: Mon Oct 03 19:34:12 2011 -0400 > >> summary: > >> fix compiler warnings > >> > >> +++ b/Objects/unicodeobject.c > >> @@ -369,6 +369,12 @@ > >> } > >> return 1; > >> } > >> +#else > >> +static int > >> +_PyUnicode_CheckConsistency(void *op) > >> +{ > >> + return 1; > >> +} > >> #endif > > > > Oh no, please don't do that. Calling _PyUnicode_CheckConsistency() is > > reserved to debug builds. In release mode, we should not check string > > consistency (it would slow down Python). 
> > It should be optimized out. > > > > > Yes, there was a warning: > > > > Objects/unicodeobject.c:539:13: warning: statement with no effect > > _PyUnicode_CHECK(unicode); > > > > I added these checks recently to ensure that strings are consistent just > > before exiting (to help me to track down a bug). > > > > The right fix is just to replace _PyUnicode_CHECK(unicode) by > > assert(_PyUnicode_CHECK(unicode)). > > But _PyUnicode_CheckConsistency is just a string of assertions. What > sense does it make to check the return value? > > > -- > Regards, > Benjamin > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > http://mail.python.org/mailman/options/python-dev/riscutiavlad%40gmail.com > -------------- next part -------------- An HTML attachment was scrubbed... URL: From victor.stinner at haypocalc.com Wed Oct 5 00:31:27 2011 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Wed, 05 Oct 2011 00:31:27 +0200 Subject: [Python-Dev] cpython: PyUnicode_Join() checks output length in debug mode In-Reply-To: References: Message-ID: <4E8B893F.5060106@haypocalc.com> Le 04/10/2011 23:41, Georg Brandl a écrit : > I don't understand this change. Why would you not always add "copied" once you > already have it? It seems to be the more correct version anyway. If you use copied instead of seplen/itemlen, you suppose that the string has been overallocated in some cases, and that you have to resize the string (in-place or, more probably, with a copy).
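The copy loop being discussed can be restated as a pure-Python model of what PyUnicode_Join computes (`join_model` is an illustrative helper written for this sketch, not CPython code): for every item after the first, the separator is copied, then the item itself.

```python
def join_model(sep, items):
    # Pure-Python model of the copy loop in PyUnicode_Join: for each item
    # after the first, copy the separator, then copy the item itself.
    if not items:
        return ""
    parts = [items[0]]
    for item in items[1:]:
        if sep:                 # the C code skips the copy when seplen == 0
            parts.append(sep)
        if item:                # ... and when the item is empty
            parts.append(item)
    return "".join(parts)

assert join_model("A", ["Bob"] * 3) == "A".join(["Bob"] * 3) == "BobABobABob"
```

In the C version, `copied` and `seplen`/`itemlen` can only differ if an input string changed size mid-copy, which is why Victor treats the distinction as a debug-only check.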
Victor From victor.stinner at haypocalc.com Wed Oct 5 01:39:56 2011 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Wed, 05 Oct 2011 01:39:56 +0200 Subject: [Python-Dev] [Python-checkins] cpython: fix compiler warnings In-Reply-To: References: <4E8AD431.7090600@haypocalc.com> Message-ID: <4E8B994C.7040807@haypocalc.com> Le 05/10/2011 00:30, Vlad Riscutia a écrit : > Why does the function even return a value? As Benjamin said, it is just > a bunch of asserts with return 1 at the end. It's just to be able to write assert(_PyUnicode_CheckConsistency(...)). assert() is just used to remove the instruction in release mode. Victor From victor.stinner at haypocalc.com Wed Oct 5 01:44:31 2011 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Wed, 05 Oct 2011 01:44:31 +0200 Subject: [Python-Dev] [Python-checkins] cpython: Migrate str.expandtabs to the new API In-Reply-To: <4E8B3843.4030505@v.loewis.de> References: <4E8B3843.4030505@v.loewis.de> Message-ID: <4E8B9A5F.4020104@haypocalc.com> Le 04/10/2011 18:45, "Martin v. Löwis" a écrit : > >> Migrate str.expandtabs to the new API > > This needs > > if (PyUnicode_READY(self) == -1) > return NULL; > > right after the ParseTuple call. In most cases, the > check will be a noop. But if it's not, omitting it will > make expandtabs have no effect, since the string length > will be 0 (in a debug build, you also get an assertion > failure). This "make input string ready" code path is not well tested because all functions creating strings in unicodeobject.c ensure that the string is ready. I disabled the call to PyUnicode_READY() on result in debug mode in unicodeobject.c (define DONT_MAKE_RESULT_READY). It helped me to fix bugs in various functions, see my commit b66033a0f140.
Victor From victor.stinner at haypocalc.com Wed Oct 5 01:59:35 2011 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Wed, 05 Oct 2011 01:59:35 +0200 Subject: [Python-Dev] [Python-checkins] cpython: Optimize string slicing to use the new API In-Reply-To: <4E8B4BC2.8030804@v.loewis.de> References: <4E8B4715.6020907@v.loewis.de> <20111004195030.6ecaf999@pitrou.net> <4E8B4BC2.8030804@v.loewis.de> Message-ID: <4E8B9DE7.60206@haypocalc.com> Le 04/10/2011 20:09, "Martin v. Löwis" a écrit : > Am 04.10.11 19:50, schrieb Antoine Pitrou: >> On Tue, 04 Oct 2011 19:49:09 +0200 >> "Martin v. Löwis" wrote: >> >>>> + result = PyUnicode_New(slicelength, PyUnicode_MAX_CHAR_VALUE(self)); >>> >>> This is incorrect: the maxchar of the slice might be smaller than the >>> maxchar of the input string. >> >> I thought that heuristic would be good enough. I'll try to fix it. > > No - strings must always be in the canonical form. I added a check in _PyUnicode_CheckConsistency() (debug mode) to ensure that newly created strings always use the most efficient storage. > For example, PyUnicode_RichCompare considers string unequal if they > have different kinds. As a consequence, your slice > result may not compare equal to a canonical variant of itself. I see this as a micro-optimization. IMO we should *not* rely on these assumptions because we cannot expect that all developers of third party modules will be able to write perfect code, and some (lazy developers!) may prefer to use a fixed maximum character (e.g. 0xFFFF). To be able to rely on such assumption, we have to make sure that strings are in canonical forms (always check before using a string?). But it would slow down Python because you have to scan the whole string to get the maximum characters (see my change in _PyUnicode_CheckConsistency). I would prefer to drop such micro-optimization and tolerate non-canonical strings (strings not using the most efficient storage).
Even if PEP 393 is fully backward compatible (except that PyUnicode_AS_UNICODE and PyUnicode_AsUnicode may now return NULL), it's already a big change (developers may want to move to the new API to benefit from the advantages of PEP 393), and very few developers correctly understand Unicode. It's safer to see PEP 393 as a best-effort method. Hopefully, most (or all?) strings created by Python itself are in canonical form. Victor From martin at v.loewis.de Wed Oct 5 18:11:15 2011 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 05 Oct 2011 18:11:15 +0200 Subject: [Python-Dev] [Python-checkins] cpython: Optimize string slicing to use the new API In-Reply-To: <4E8B9DE7.60206@haypocalc.com> References: <4E8B4715.6020907@v.loewis.de> <20111004195030.6ecaf999@pitrou.net> <4E8B4BC2.8030804@v.loewis.de> <4E8B9DE7.60206@haypocalc.com> Message-ID: <4E8C81A3.6000602@v.loewis.de> > I see this as a micro-optimization. IMO we should *not* rely on these > assumptions because we cannot expect that all developers of third party > modules will be able to write perfect code, and some (lazy developers!) > may prefer to use a fixed maximum character (e.g. 0xFFFF). Hmm. I'd like to declare that it is incorrect usage of the API, only allowing maxchar to be at the next boundary (i.e. 127, 255, 65536, larger). There are always cases of incorrect usage of all API. For example, not filling out a list entirely (i.e. leaving NULL in some fields) may also cause strange results. Users will need to learn what the API is. At the first approximation, maxchar should be the true maximum character. > To be able to rely on such assumption, we have to make sure that strings > are in canonical forms (always check before using a string?). No, we don't need that: garbage in, garbage out. If people use the API incorrectly, they will get incorrect results. It's useful to have checks in debug mode, but that's the most that people should reasonably expect.
Regards, Martin From martin at v.loewis.de Wed Oct 5 18:12:54 2011 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 05 Oct 2011 18:12:54 +0200 Subject: [Python-Dev] Using PEP384 Stable ABI for the lzma extension module In-Reply-To: <20111004214549.66360def@pitrou.net> References: <4E8B4249.5030809@v.loewis.de> <4E8B5F8E.1070306@v.loewis.de> <20111004214549.66360def@pitrou.net> Message-ID: <4E8C8206.9070302@v.loewis.de> >> Not sure what you are using it for. If you need to extend the buffer >> in case it is too small, there is absolutely no way this could work >> without copies in the general case because of how computers use >> address space. Even _PyBytes_Resize will copy the data. > > That's not a given. Depending on the memory allocator, a copy can be > avoided. That's why the "str += str" hack is much more efficient under > Linux than Windows, AFAIK. Even Linux will have to copy a block on realloc in certain cases, no? Regards, Martin From solipsis at pitrou.net Wed Oct 5 18:14:08 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 05 Oct 2011 18:14:08 +0200 Subject: [Python-Dev] Using PEP384 Stable ABI for the lzma extension module In-Reply-To: <4E8C8206.9070302@v.loewis.de> References: <4E8B4249.5030809@v.loewis.de> <4E8B5F8E.1070306@v.loewis.de> <20111004214549.66360def@pitrou.net> <4E8C8206.9070302@v.loewis.de> Message-ID: <1317831248.3713.1.camel@localhost.localdomain> Le mercredi 05 octobre 2011 à 18:12 +0200, "Martin v. Löwis" a écrit : > >> Not sure what you are using it for. If you need to extend the buffer > >> in case it is too small, there is absolutely no way this could work > >> without copies in the general case because of how computers use > >> address space. Even _PyBytes_Resize will copy the data. > > > > That's not a given. Depending on the memory allocator, a copy can be > > avoided. That's why the "str += str" hack is much more efficient under > > Linux than Windows, AFAIK.
> > Even Linux will have to copy a block on realloc in certain cases, no? Probably so. How often is totally unknown to me :) Regards Antoine. From a.badger at gmail.com Wed Oct 5 18:38:10 2011 From: a.badger at gmail.com (Toshio Kuratomi) Date: Wed, 5 Oct 2011 09:38:10 -0700 Subject: [Python-Dev] Using PEP384 Stable ABI for the lzma extension module In-Reply-To: <1317831248.3713.1.camel@localhost.localdomain> References: <4E8B4249.5030809@v.loewis.de> <4E8B5F8E.1070306@v.loewis.de> <20111004214549.66360def@pitrou.net> <4E8C8206.9070302@v.loewis.de> <1317831248.3713.1.camel@localhost.localdomain> Message-ID: <20111005163810.GI5476@unaka.lan> On Wed, Oct 05, 2011 at 06:14:08PM +0200, Antoine Pitrou wrote: > Le mercredi 05 octobre 2011 à 18:12 +0200, "Martin v. Löwis" a écrit : > > >> Not sure what you are using it for. If you need to extend the buffer > > >> in case it is too small, there is absolutely no way this could work > > >> without copies in the general case because of how computers use > > >> address space. Even _PyBytes_Resize will copy the data. > > > > > > That's not a given. Depending on the memory allocator, a copy can be > > > avoided. That's why the "str += str" hack is much more efficient under > > > Linux than Windows, AFAIK. > > > > Even Linux will have to copy a block on realloc in certain cases, no? > > Probably so. How often is totally unknown to me :) > http://www.gnu.org/software/libc/manual/html_node/Changing-Block-Size.html It depends on whether there's enough free memory after the buffer you currently have allocated. I suppose that this becomes a question of what people consider "the general case" :-) -Toshio -------------- next part -------------- A non-text attachment was scrubbed...
Name: not available Type: application/pgp-signature Size: 198 bytes Desc: not available URL: From solipsis at pitrou.net Wed Oct 5 19:02:58 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 5 Oct 2011 19:02:58 +0200 Subject: [Python-Dev] Using PEP384 Stable ABI for the lzma extension module In-Reply-To: <20111005163810.GI5476@unaka.lan> References: <4E8B4249.5030809@v.loewis.de> <4E8B5F8E.1070306@v.loewis.de> <20111004214549.66360def@pitrou.net> <4E8C8206.9070302@v.loewis.de> <1317831248.3713.1.camel@localhost.localdomain> <20111005163810.GI5476@unaka.lan> Message-ID: <20111005190258.53e5fdab@pitrou.net> On Wed, 5 Oct 2011 09:38:10 -0700 Toshio Kuratomi wrote: > On Wed, Oct 05, 2011 at 06:14:08PM +0200, Antoine Pitrou wrote: > > Le mercredi 05 octobre 2011 à 18:12 +0200, "Martin v. Löwis" a écrit : > > > >> Not sure what you are using it for. If you need to extend the buffer > > > >> in case it is too small, there is absolutely no way this could work > > > >> without copies in the general case because of how computers use > > > >> address space. Even _PyBytes_Resize will copy the data. > > > > > > > > That's not a given. Depending on the memory allocator, a copy can be > > > > avoided. That's why the "str += str" hack is much more efficient under > > > > Linux than Windows, AFAIK. > > > > > > Even Linux will have to copy a block on realloc in certain cases, no? > > > > Probably so. How often is totally unknown to me :) > > > http://www.gnu.org/software/libc/manual/html_node/Changing-Block-Size.html > > It depends on whether there's enough free memory after the buffer you > currently have allocated. I suppose that this becomes a question of what > people consider "the general case" :-) But under certain circumstances (if a large block is requested), the allocator uses mmap(), no? In which case mremap() should allow to resize without copying anything. Regards Antoine.
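For extension code that cannot touch the private _PyBytes_Resize(), the pattern Nick suggested earlier in the thread can be sketched in Python terms: accumulate into a mutable bytearray (which grows in place, possibly via realloc under the hood) and take one immutable bytes snapshot at the end. `read_all` is a hypothetical helper written for this sketch, not a stdlib function.

```python
def read_all(chunks):
    buf = bytearray()
    for chunk in chunks:   # bytearray grows in place (amortized O(1) appends)
        buf += chunk
    return bytes(buf)      # a single final copy into an immutable object

assert read_all([b"ab", b"cd", b"e"]) == b"abcde"
```

The trade-off debated above is exactly this final copy: it is one predictable copy, instead of hoping that realloc (or mremap for large mmap-backed blocks) can avoid moving the buffer.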
From tjreedy at udel.edu Wed Oct 5 21:25:22 2011 From: tjreedy at udel.edu (Terry Reedy) Date: Wed, 05 Oct 2011 15:25:22 -0400 Subject: [Python-Dev] [Python-checkins] cpython: Document requierements of Unicode kinds In-Reply-To: References: Message-ID: <4E8CAF22.8040501@udel.edu> On 10/5/2011 1:43 PM, victor.stinner wrote: > http://hg.python.org/cpython/rev/055174308822 > changeset: 72699:055174308822 > user: Victor Stinner > date: Wed Oct 05 01:31:05 2011 +0200 > summary: > Document requierements of Unicode kinds > > files: > Include/unicodeobject.h | 24 ++++++++++++++++++++---- > 1 files changed, 20 insertions(+), 4 deletions(-) > > > diff --git a/Include/unicodeobject.h b/Include/unicodeobject.h > --- a/Include/unicodeobject.h > +++ b/Include/unicodeobject.h > @@ -288,10 +288,26 @@ > unsigned int interned:2; > /* Character size: > > - PyUnicode_WCHAR_KIND (0): wchar_t* > - PyUnicode_1BYTE_KIND (1): Py_UCS1* > - PyUnicode_2BYTE_KIND (2): Py_UCS2* > - PyUnicode_4BYTE_KIND (3): Py_UCS4* > + - PyUnicode_WCHAR_KIND (0): > + > + * character type = wchar_t (16 or 32 bits, depending on the > + platform) > + > + - PyUnicode_1BYTE_KIND (1): > + > + * character type = Py_UCS1 (8 bits, unsigned) > + * if ascii is 1, at least one character must be in range > + U+80-U+FF, otherwise all characters must be in range U+00-U+7F Given that 1==True, this looks backwards. > + > + - PyUnicode_2BYTE_KIND (2): > + > + * character type = Py_UCS2 (16 bits, unsigned) > + * at least one character must be in range U+0100-U+1FFFF /U+1FFFF/U+FFFF/ ? Terry From francisco.martin at web.de Wed Oct 5 22:29:55 2011 From: francisco.martin at web.de (Francisco Martin Brugue) Date: Wed, 05 Oct 2011 22:29:55 +0200 Subject: [Python-Dev] What it takes to change a single keyword. In-Reply-To: References: <4E87373D.2030503@v.loewis.de> Message-ID: <4E8CBE43.6050601@web.de> Just Info on the links: > http://docs.python.org/devguide/compiler.html provides info on how it > all hangs together). 
Those: [1] Skip Montanaro's Peephole Optimizer Paper (http://www.foretec.com/python/workshops/1998-11/proceedings/papers/montanaro/montanaro.html) [Wang97] Daniel C. Wang, Andrew W. Appel, Jeff L. Korn, and Chris S. Serra. The Zephyr Abstract Syntax Description Language. In Proceedings of the Conference on Domain-Specific Languages, pp. 213-227, 1997. are a 404 Cheers, francis From brett at python.org Thu Oct 6 00:00:55 2011 From: brett at python.org (Brett Cannon) Date: Wed, 5 Oct 2011 15:00:55 -0700 Subject: [Python-Dev] What it takes to change a single keyword. In-Reply-To: <4E8CBE43.6050601@web.de> References: <4E87373D.2030503@v.loewis.de> <4E8CBE43.6050601@web.de> Message-ID: Please file a bug about the dead links so we can fix/remove them. On Wed, Oct 5, 2011 at 13:29, Francisco Martin Brugue < francisco.martin at web.de> wrote: > Just Info on the links: > > http://docs.python.org/devguide/compiler.html provides info on how it >> all hangs together). >> > > Those: > > [1] > > Skip Montanaro's Peephole Optimizer Paper (http://www.foretec.com/python/workshops/1998-11/proceedings/papers/montanaro/montanaro.html ) > > [Wang97] > > Daniel C. Wang, Andrew W. Appel, Jeff L. Korn, and Chris S. Serra. The > Zephyr Abstract Syntax Description Language. < http://www.cs.princeton.edu/%7Edanwang/Papers/dsl97/dsl97.html> > In Proceedings of the Conference on Domain-Specific Languages, pp. 213-227, > 1997. > > > are a 404 > > Cheers, > > francis > > > > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/brett%40python.org > -------------- next part -------------- An HTML attachment was scrubbed...
URL: From guido at python.org Thu Oct 6 00:51:34 2011 From: guido at python.org (Guido van Rossum) Date: Wed, 5 Oct 2011 15:51:34 -0700 Subject: [Python-Dev] Heads up: Apple llvm gcc 4.2 miscompiles PEP 393 In-Reply-To: References: <20110928132422.Horde.OvQBCtjz9kROgwPm5ZwktiA@webmail.df.eu> <4E835E1C.8090700@v.loewis.de> <74F6ADFA-874D-4BAC-B304-CE8B12D80126@masklinn.net> Message-ID: Is anyone on this thread interested in other weird Mac bugs? I had a user complaining that on their Mac, with Python 2.5.6 from macports, 2**63 was a negative number! That sounds like a compiler bug to me... http://code.google.com/p/appengine-ndb-experiment/issues/detail?id=65 (details about the versions involved are in comment 6) -- --Guido van Rossum (python.org/~guido) From nad at acm.org Thu Oct 6 01:05:39 2011 From: nad at acm.org (Ned Deily) Date: Wed, 05 Oct 2011 16:05:39 -0700 Subject: [Python-Dev] Heads up: Apple llvm gcc 4.2 miscompiles PEP 393 References: <20110928132422.Horde.OvQBCtjz9kROgwPm5ZwktiA@webmail.df.eu> <4E835E1C.8090700@v.loewis.de> <74F6ADFA-874D-4BAC-B304-CE8B12D80126@masklinn.net> Message-ID: In article , Guido van Rossum wrote: > Is anyone on this thread interested in other weird Mac bugs? I had a > user complaining that on their Mac, with Python 2.5.6 from macports, > 2**63 was a negative number! That sounds like a compiler bug to me... > > http://code.google.com/p/appengine-ndb-experiment/issues/detail?id=65 > (details about the versions involved are in comment 6) Thanks for the pointer. That looks like a duplicate of Issue11149 (and Issue12701). Another manifestation of this was reported in Issue13061 which also originated from MacPorts. I'll remind them that the configure change is likely needed for all Pythons. It's still safest to stick with good old gcc-4.2 on OS X at the moment. 
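On a correctly built interpreter, the failing expression from the report is easy to sanity-check from Python itself, since Python ints are arbitrary precision and 2**63 must be a positive 64-bit value (a small illustrative check, not part of the bug report):

```python
# Checks a miscompiled build like the MacPorts one described above would fail:
assert 2 ** 63 == 9223372036854775808   # 2**63 is positive, never negative
assert 2 ** 63 > 0
assert (2 ** 63).bit_length() == 64     # needs 64 bits, the overflow boundary
```

The negative result users saw is consistent with the C long arithmetic in the interpreter being miscompiled so that 2**63 wrapped around as a signed 64-bit value.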
-- Ned Deily, nad at acm.org From guido at python.org Thu Oct 6 01:26:24 2011 From: guido at python.org (Guido van Rossum) Date: Wed, 5 Oct 2011 16:26:24 -0700 Subject: [Python-Dev] Heads up: Apple llvm gcc 4.2 miscompiles PEP 393 In-Reply-To: References: <20110928132422.Horde.OvQBCtjz9kROgwPm5ZwktiA@webmail.df.eu> <4E835E1C.8090700@v.loewis.de> <74F6ADFA-874D-4BAC-B304-CE8B12D80126@masklinn.net> Message-ID: Thanks! More proof that debugging crosses *all* abstractions... On Wed, Oct 5, 2011 at 4:05 PM, Ned Deily wrote: > In article > , > Guido van Rossum wrote: >> Is anyone on this thread interested in other weird Mac bugs? I had a >> user complaining that on their Mac, with Python 2.5.6 from macports, >> 2**63 was a negative number! That sounds like a compiler bug to me... >> >> http://code.google.com/p/appengine-ndb-experiment/issues/detail?id=65 >> (details about the versions involved are in comment 6) > > Thanks for the pointer. That looks like a duplicate of Issue11149 (and > Issue12701). Another manifestation of this was reported in Issue13061 > which also originated from MacPorts. I'll remind them that the > configure change is likely needed for all Pythons. It's still safest to > stick with good old gcc-4.2 on OS X at the moment.
> > -- > Ned Deily, > nad at acm.org > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/guido%40python.org > -- --Guido van Rossum (python.org/~guido) From victor.stinner at haypocalc.com Thu Oct 6 01:53:19 2011 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Thu, 06 Oct 2011 01:53:19 +0200 Subject: [Python-Dev] [Python-checkins] cpython: Document requierements of Unicode kinds In-Reply-To: <4E8CAF22.8040501@udel.edu> References: <4E8CAF22.8040501@udel.edu> Message-ID: <201110060153.19068.victor.stinner@haypocalc.com> Le mercredi 5 octobre 2011 21:25:22, Terry Reedy a écrit : > > + - PyUnicode_1BYTE_KIND (1): > > + > > + * character type = Py_UCS1 (8 bits, unsigned) > > + * if ascii is 1, at least one character must be in range > > + U+80-U+FF, otherwise all characters must be in range > > U+00-U+7F > > Given that 1==True, this looks backwards. I changed the doc to: PyUnicode_1BYTE_KIND (1): * character type = Py_UCS1 (8 bits, unsigned) * if ascii is set, all characters must be in range U+0000-U+007F, otherwise at least one character must be in range U+0080-U+00FF Is it better? > > + - PyUnicode_2BYTE_KIND (2): > > + > > + * character type = Py_UCS2 (16 bits, unsigned) > > + * at least one character must be in range U+0100-U+1FFFF > > /U+1FFFF/U+FFFF/ ? Oops, correct. I fixed the doc, thanks for the review.
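The ranges discussed above can be restated as a small Python helper (`kind_for` is an illustrative function written for this sketch, not a CPython API): the element size a string needs is determined by its largest code point.

```python
def kind_for(s):
    # Which PEP 393 representation a string needs, per the documented ranges.
    maxchar = max(map(ord, s), default=0)
    if maxchar <= 0xFF:
        return 1    # PyUnicode_1BYTE_KIND (the ascii flag is set if <= 0x7F)
    if maxchar <= 0xFFFF:
        return 2    # PyUnicode_2BYTE_KIND
    return 4        # PyUnicode_4BYTE_KIND

assert kind_for("Andrew") == 1
assert kind_for("\u0101") == 2
assert kind_for("\U0001F40D") == 4
```

The canonical-form requirement debated earlier in the thread is exactly that a string's stored kind matches what this computation gives, rather than some wider kind.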
Victor From cs at zip.com.au Thu Oct 6 01:55:07 2011 From: cs at zip.com.au (Cameron Simpson) Date: Thu, 6 Oct 2011 10:55:07 +1100 Subject: [Python-Dev] cpython (3.2): Issue #11956: Skip test_import.test_unwritable_directory on FreeBSD when run as In-Reply-To: References: Message-ID: <20111005235507.GA32295@cskk.homeip.net> On 04Oct2011 20:44, Charles-François Natali wrote: | >> summary: | >> Issue #11956: Skip test_import.test_unwritable_directory on FreeBSD when | >> | >> run as | >> root (directory permissions are ignored). | > | > The same directory permission semantics apply to other (all?) | > BSD-derived systems, not just FreeBSD. For example, the test still | > fails in the same way on OS X when run via sudo. | > | | Thanks, I didn't know: I only noticed this on the FreeBSD buildbots (I | guess OS-X buildbots don't run as root). Note that it does behave as | "expected" on Linux (note the use of quotation marks, I'm not sure | whether this behavior is authorized by POSIX). | I changed the test to skip when the effective UID is 0, regardless of | the OS, to stay on the safe side. I'd have expected this test to fail on _any_ UNIX system if run as root. Root's allowed to write to stuff! Any stuff! About the only permission with any effect on root is the eXecute bit for the exec call, to prevent blindly running random data files. Equally, why on earth are you running tests as root!?!?!?!?! Madness. It's as bad as compiling stuff as root etc etc. A bad idea all around, securitywise. Especially, I would think, a buildbot. "Oh, let's fetch some shiny new code and run it as the system superuser." I know this post sounds shouty, but I've just reread it a few times and still cannot bring myself to tone it down. Cheers, -- Cameron Simpson DoD#743 http://www.cskk.ezoshosting.com/cs/ If he's not one thing, he's another.
- Buckaroo Banzai From solipsis at pitrou.net Thu Oct 6 02:07:32 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 6 Oct 2011 02:07:32 +0200 Subject: [Python-Dev] cpython (3.2): Issue #11956: Skip test_import.test_unwritable_directory on FreeBSD when run as References: <20111005235507.GA32295@cskk.homeip.net> Message-ID: <20111006020732.54d4df4c@pitrou.net> On Thu, 6 Oct 2011 10:55:07 +1100 Cameron Simpson wrote: > > Equally, why on earth are you running tests as root!?!?!?!?! Madness. > It's as bad as compiling stuff as root etc etc. A bad idea all around, > securitywise. > > Especially, I would think, a buildbot. "Oh, let's fetch some shiny new > code and run it as the system superuser." Said buildbot probably runs in a VM. Regards Antoine. From victor.stinner at haypocalc.com Thu Oct 6 02:06:30 2011 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Thu, 6 Oct 2011 02:06:30 +0200 Subject: [Python-Dev] New stringbench benchmark results Message-ID: <201110060206.30819.victor.stinner@haypocalc.com> Hi, I optimized unicodeobject.c a little bit more where I saw major performance regressions from Python 3.2 to 3.3 using stringbench. Here are new results: see attachments. Example of tests where Python 3.3 is much slower: "A".join(["Bob"]*100)): 2.11 => 0.92 ("C"+"AB"*300).rfind("CA"): 0.57 => 1.03 ("A" + ("Z"*128*1024)).replace("A", "BB", 1): 0.25 => 0.50 The rfind case is really strange: the code between Python 3.2 and 3.3 is exactly the same. Even in Python 3.2: rfind looks twice as fast as find: ("AB"*300+"C").find("BC") (*1000) : 1.21 ("C"+"AB"*300).rfind("CA") (*1000) : 0.57 stringbench is ASCII only, I expect worse performance with non-ASCII characters. Python 3.3 is now faster for pure ASCII (faster than other "kinds" of Unicode string).
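Individual cases like these can be re-measured with the stdlib timeit module (a minimal sketch; the resulting number is machine dependent, unlike the fixed tables attached below):

```python
# Reproduce one stringbench-style measurement: total time, in seconds,
# for 1000 runs of the join micro-benchmark.
import timeit

elapsed = timeit.timeit('"A".join(["Bob"] * 100)', number=1000)
print("join of 100 words, 1000 runs: %.2f ms" % (elapsed * 1000.0))
```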
Hopefully, Python 3.3 is faster in some stringbench tests, sometimes 2 times faster ;-)

Victor
-------------- next part --------------
stringbench v2.0
3.3.0a0 (default:341c3002ffb2, Oct 6 2011, 01:52:36) [GCC 4.6.0 20110603 (Red Hat 4.6.0-10)]
2011-10-06 01:53:06.975247
bytes   unicode
(in ms) (in ms) %       comment
========== case conversion -- dense
0.40    1.59    25.2    ("WHERE IN THE WORLD IS CARMEN SAN DEIGO?"*10).lower() (*1000)
0.42    1.55    27.3    ("where in the world is carmen san deigo?"*10).upper() (*1000)
========== case conversion -- rare
0.53    1.49    35.8    ("Where in the world is Carmen San Deigo?"*10).lower() (*1000)
0.44    1.55    28.5    ("wHERE IN THE WORLD IS cARMEN sAN dEIGO?"*10).upper() (*1000)
========== concat 20 strings of words length 4 to 15
1.25    1.44    86.8    s1+s2+s3+s4+...+s20 (*1000)
========== concat two strings
0.06    0.07    92.5    "Andrew"+"Dalke" (*1000)
========== count AACT substrings in DNA example
1.27    1.32    96.5    dna.count("AACT") (*10)
========== count newlines
0.53    0.51    103.3   ...text.with.2000.newlines.count("\n") (*10)
========== early match, single character
0.11    0.12    89.2    ("A"*1000).find("A") (*1000)
0.31    0.03    887.7   "A" in "A"*1000 (*1000)
0.11    0.12    90.7    ("A"*1000).index("A") (*1000)
0.13    0.14    94.4    ("A"*1000).partition("A") (*1000)
0.12    0.13    90.3    ("A"*1000).rfind("A") (*1000)
0.12    0.13    91.3    ("A"*1000).rindex("A") (*1000)
0.12    0.13    94.7    ("A"*1000).rpartition("A") (*1000)
0.28    0.27    102.0   ("A"*1000).rsplit("A", 1) (*1000)
0.29    0.28    103.4   ("A"*1000).split("A", 1) (*1000)
========== early match, two characters
0.11    0.13    90.5    ("AB"*1000).find("AB") (*1000)
0.31    0.04    799.5   "AB" in "AB"*1000 (*1000)
0.11    0.13    89.2    ("AB"*1000).index("AB") (*1000)
0.15    0.16    97.9    ("AB"*1000).partition("AB") (*1000)
0.12    0.14    90.0    ("AB"*1000).rfind("AB") (*1000)
0.12    0.13    92.8    ("AB"*1000).rindex("AB") (*1000)
0.14    0.15    93.3    ("AB"*1000).rpartition("AB") (*1000)
0.31    0.30    101.4   ("AB"*1000).rsplit("AB", 1) (*1000)
0.32    0.32    101.0   ("AB"*1000).split("AB", 1) (*1000)
========== endswith multiple characters
0.13    0.13    96.1    "Andrew".endswith("Andrew") (*1000)
========== endswith multiple characters - not!
0.13    0.12    108.4   "Andrew".endswith("Anders") (*1000)
========== endswith single character
0.13    0.13    94.9    "Andrew".endswith("w") (*1000)
========== formatting a string type with a dict
N/A     0.52    0.0     "The %(k1)s is %(k2)s the %(k3)s."%{"k1":"x","k2":"y","k3":"z",} (*1000)
========== join empty string, with 1 character sep
N/A     0.04    0.0     "A".join("") (*100)
========== join empty string, with 5 character sep
N/A     0.04    0.0     "ABCDE".join("") (*100)
========== join list of 100 words, with 1 character sep
1.10    2.11    52.3    "A".join(["Bob"]*100)) (*1000)
========== join list of 100 words, with 5 character sep
1.16    2.29    50.9    "ABCDE".join(["Bob"]*100)) (*1000)
========== join list of 26 characters, with 1 character sep
0.31    0.60    51.3    "A".join(list("ABC..Z")) (*1000)
========== join list of 26 characters, with 5 character sep
0.34    0.60    56.5    "ABCDE".join(list("ABC..Z")) (*1000)
========== join string with 26 characters, with 1 character sep
N/A     1.23    0.0     "A".join("ABC..Z") (*1000)
========== join string with 26 characters, with 5 character sep
N/A     1.21    0.0     "ABCDE".join("ABC..Z") (*1000)
========== late match, 100 characters
8.88    8.52    104.1   s="ABC"*33; ((s+"D")*500+s+"E").find(s+"E") (*100)
3.48    3.53    98.5    s="ABC"*33; ((s+"D")*500+"E"+s).find("E"+s) (*100)
5.33    5.28    100.9   s="ABC"*33; (s+"E") in ((s+"D")*300+s+"E") (*100)
8.88    8.60    103.2   s="ABC"*33; ((s+"D")*500+s+"E").index(s+"E") (*100)
9.24    9.24    100.0   s="ABC"*33; ((s+"D")*500+s+"E").partition(s+"E") (*100)
7.86    7.61    103.4   s="ABC"*33; ("E"+s+("D"+s)*500).rfind("E"+s) (*100)
1.76    1.98    88.9    s="ABC"*33; (s+"E"+("D"+s)*500).rfind(s+"E") (*100)
7.67    7.53    101.9   s="ABC"*33; ("E"+s+("D"+s)*500).rindex("E"+s) (*100)
7.88    7.74    101.9   s="ABC"*33; ("E"+s+("D"+s)*500).rpartition("E"+s) (*100)
8.22    8.22    100.0   s="ABC"*33; ("E"+s+("D"+s)*500).rsplit("E"+s, 1) (*100)
9.12    9.09    100.3   s="ABC"*33; ((s+"D")*500+s+"E").split(s+"E", 1) (*100)
========== late match, two characters
1.19    1.16    103.1   ("AB"*300+"C").find("BC") (*1000)
1.18    1.18    99.2    ("AB"*300+"CA").find("CA") (*1000)
1.38    1.10    125.7   "BC" in ("AB"*300+"C") (*1000)
1.21    1.16    104.2   ("AB"*300+"C").index("BC") (*1000)
1.18    1.21    98.2    ("AB"*300+"C").partition("BC") (*1000)
1.06    1.03    102.3   ("C"+"AB"*300).rfind("CA") (*1000)
0.70    0.71    98.7    ("BC"+"AB"*300).rfind("BC") (*1000)
1.04    1.03    100.9   ("C"+"AB"*300).rindex("CA") (*1000)
1.03    1.05    97.6    ("C"+"AB"*300).rpartition("CA") (*1000)
1.24    1.19    103.8   ("C"+"AB"*300).rsplit("CA", 1) (*1000)
1.29    1.34    96.0    ("AB"*300+"C").split("BC", 1) (*1000)
========== no match, single character
0.67    0.68    98.1    ("A"*1000).find("B") (*1000)
0.86    0.59    145.4   "B" in "A"*1000 (*1000)
0.60    0.61    98.2    ("A"*1000).partition("B") (*1000)
0.68    0.68    98.8    ("A"*1000).rfind("B") (*1000)
0.61    0.62    99.5    ("A"*1000).rpartition("B") (*1000)
0.71    0.71    100.2   ("A"*1000).rsplit("B", 1) (*1000)
0.70    0.70    99.9    ("A"*1000).split("B", 1) (*1000)
========== no match, two characters
3.64    3.50    103.9   ("AB"*1000).find("BC") (*1000)
3.70    3.59    103.1   ("AB"*1000).find("CA") (*1000)
3.86    3.56    108.4   "BC" in "AB"*1000 (*1000)
3.57    3.69    96.9    ("AB"*1000).partition("BC") (*1000)
2.04    2.04    100.0   ("AB"*1000).rfind("BC") (*1000)
3.17    3.14    100.7   ("AB"*1000).rfind("CA") (*1000)
1.99    2.12    94.2    ("AB"*1000).rpartition("BC") (*1000)
2.31    2.33    99.2    ("AB"*1000).rsplit("BC", 1) (*1000)
3.75    3.74    100.1   ("AB"*1000).split("BC", 1) (*1000)
========== quick replace multiple character match
0.06    0.49    11.3    ("A" + ("Z"*128*1024)).replace("AZZ", "BBZZ", 1) (*10)
========== quick replace single character match
0.05    0.50    11.0    ("A" + ("Z"*128*1024)).replace("A", "BB", 1) (*10)
========== repeat 1 character 10 times
0.06    0.07    86.0    "A"*10 (*1000)
========== repeat 1 character 1000 times
0.11    0.12    91.4    "A"*1000 (*1000)
========== repeat 5 characters 10 times
0.07    0.08    85.3    "ABCDE"*10 (*1000)
========== repeat 5 characters 1000 times
0.21    0.23    91.4    "ABCDE"*1000 (*1000)
========== replace and expand multiple characters, big string
0.83    1.80    46.0    "...text.with.2000.newlines...replace("\n", "\r\n") (*10)
========== replace multiple characters, dna
1.63    2.35    69.3    dna.replace("ATC", "ATT") (*10)
========== replace single character
0.13    0.17    73.9    "This is a test".replace(" ", "\t") (*1000)
========== replace single character, big string
0.23    1.37    16.5    "...text.with.2000.lines...replace("\n", " ") (*10)
========== replace/remove multiple characters
0.20    0.27    74.7    "When shall we three meet again?".replace("ee", "") (*1000)
========== split 1 whitespace
0.08    0.11    78.6    ("Here are some words. "*2).partition(" ") (*1000)
0.07    0.08    84.3    ("Here are some words. "*2).rpartition(" ") (*1000)
0.19    0.21    91.0    ("Here are some words. "*2).rsplit(None, 1) (*1000)
0.18    0.20    93.2    ("Here are some words. "*2).split(None, 1) (*1000)
========== split 2000 newlines
1.27    1.51    83.9    "...text...".rsplit("\n") (*10)
1.21    1.32    91.4    "...text...".split("\n") (*10)
1.44    1.71    84.1    "...text...".splitlines() (*10)
========== split newlines
0.22    0.27    82.9    "this\nis\na\ntest\n".rsplit("\n") (*1000)
0.22    0.24    91.4    "this\nis\na\ntest\n".split("\n") (*1000)
0.20    0.24    82.4    "this\nis\na\ntest\n".splitlines() (*1000)
========== split on multicharacter separator (dna)
1.09    1.10    98.8    dna.rsplit("ACTAT") (*10)
1.48    1.43    103.4   dna.split("ACTAT") (*10)
========== split on multicharacter separator (small)
0.40    0.46    88.4    "this--is--a--test--of--the--emergency--broadcast--system".rsplit("--") (*1000)
0.39    0.45    87.0    "this--is--a--test--of--the--emergency--broadcast--system".split("--") (*1000)
========== split whitespace (huge)
1.24    1.44    86.1    human_text.rsplit() (*10)
1.11    1.35    82.6    human_text.split() (*10)
========== split whitespace (small)
0.34    0.38    89.5    ("Here are some words. "*2).rsplit() (*1000)
0.32    0.37    85.8    ("Here are some words. "*2).split() (*1000)
========== startswith multiple characters
0.12    0.13    94.5    "Andrew".startswith("Andrew") (*1000)
========== startswith multiple characters - not!
0.12    0.11    109.7   "Andrew".startswith("Anders") (*1000)
========== startswith single character
0.12    0.13    95.4    "Andrew".startswith("A") (*1000)
========== strip terminal newline
0.06    0.16    39.1    s="Hello!\n"; s[:-1] if s[-1]=="\n" else s (*1000)
0.05    0.06    77.6    "\nHello!".rstrip() (*1000)
0.05    0.06    77.0    "Hello!\n".rstrip() (*1000)
0.05    0.07    74.4    "\nHello!\n".strip() (*1000)
0.05    0.06    75.8    "\nHello!".strip() (*1000)
0.05    0.06    75.3    "Hello!\n".strip() (*1000)
========== strip terminal spaces and tabs
0.05    0.07    70.1    "\t \tHello".rstrip() (*1000)
0.05    0.07    76.5    "Hello\t \t".rstrip() (*1000)
0.03    0.04    78.7    "Hello\t \t".strip() (*1000)
========== tab split
0.36    0.48    75.4    GFF3_example.rsplit("\t", 8) (*1000)
0.34    0.46    73.2    GFF3_example.rsplit("\t") (*1000)
0.30    0.42    70.5    GFF3_example.split("\t", 8) (*1000)
0.32    0.43    73.4    GFF3_example.split("\t") (*1000)
152.32  166.34  91.6    TOTAL
-------------- next part --------------
stringbench v2.0
3.2.2+ (3.2:125887a41a6f, Oct 5 2011, 22:29:03) [GCC 4.6.0 20110603 (Red Hat 4.6.0-10)]
2011-10-05 22:48:28.819039
bytes   unicode
(in ms) (in ms) %       comment
========== case conversion -- dense
0.44    1.46    30.0    ("WHERE IN THE WORLD IS CARMEN SAN DEIGO?"*10).lower() (*1000)
0.46    1.38    33.3    ("where in the world is carmen san deigo?"*10).upper() (*1000)
========== case conversion -- rare
0.59    1.40    42.1    ("Where in the world is Carmen San Deigo?"*10).lower() (*1000)
0.48    1.40    34.1    ("wHERE IN THE WORLD IS cARMEN sAN dEIGO?"*10).upper() (*1000)
========== concat 20 strings of words length 4 to 15
1.26    1.50    83.7    s1+s2+s3+s4+...+s20 (*1000)
========== concat two strings
0.06    0.05    129.1   "Andrew"+"Dalke" (*1000)
========== count AACT substrings in DNA example
1.26    1.27    99.6    dna.count("AACT") (*10)
========== count newlines
0.53    0.53    99.1    ...text.with.2000.newlines.count("\n") (*10)
========== early match, single character
0.12    0.12    101.5   ("A"*1000).find("A") (*1000)
0.32    0.03    1139.8  "A" in "A"*1000 (*1000)
0.12    0.12    104.4   ("A"*1000).index("A") (*1000)
0.13    0.20    66.6
("A"*1000).partition("A") (*1000) 0.13 0.12 103.5 ("A"*1000).rfind("A") (*1000) 0.13 0.12 102.9 ("A"*1000).rindex("A") (*1000) 0.13 0.20 63.2 ("A"*1000).rpartition("A") (*1000) 0.29 0.34 84.8 ("A"*1000).rsplit("A", 1) (*1000) 0.29 0.35 82.4 ("A"*1000).split("A", 1) (*1000) ========== early match, two characters 0.12 0.12 103.2 ("AB"*1000).find("AB") (*1000) 0.32 0.03 1050.5 "AB" in "AB"*1000 (*1000) 0.12 0.12 103.4 ("AB"*1000).index("AB") (*1000) 0.15 0.26 60.0 ("AB"*1000).partition("AB") (*1000) 0.13 0.12 104.1 ("AB"*1000).rfind("AB") (*1000) 0.13 0.13 101.9 ("AB"*1000).rindex("AB") (*1000) 0.15 0.25 60.7 ("AB"*1000).rpartition("AB") (*1000) 0.31 0.42 73.4 ("AB"*1000).rsplit("AB", 1) (*1000) 0.32 0.42 76.7 ("AB"*1000).split("AB", 1) (*1000) ========== endswith multiple characters 0.13 0.15 89.1 "Andrew".endswith("Andrew") (*1000) ========== endswith multiple characters - not! 0.13 0.11 111.1 "Andrew".endswith("Anders") (*1000) ========== endswith single character 0.12 0.13 95.1 "Andrew".endswith("w") (*1000) ========== formatting a string type with a dict N/A 0.43 0.0 "The %(k1)s is %(k2)s the %(k3)s."%{"k1":"x","k2":"y","k3":"z",} (*1000) ========== join empty string, with 1 character sep N/A 0.04 0.0 "A".join("") (*100) ========== join empty string, with 5 character sep N/A 0.04 0.0 "ABCDE".join("") (*100) ========== join list of 100 words, with 1 character sep 1.13 0.92 122.6 "A".join(["Bob"]*100)) (*1000) ========== join list of 100 words, with 5 character sep 1.17 0.96 122.0 "ABCDE".join(["Bob"]*100)) (*1000) ========== join list of 26 characters, with 1 character sep 0.32 0.27 118.2 "A".join(list("ABC..Z")) (*1000) ========== join list of 26 characters, with 5 character sep 0.32 0.32 100.9 "ABCDE".join(list("ABC..Z")) (*1000) ========== join string with 26 characters, with 1 character sep N/A 0.93 0.0 "A".join("ABC..Z") (*1000) ========== join string with 26 characters, with 5 character sep N/A 1.00 0.0 "ABCDE".join("ABC..Z") (*1000) ========== late match, 
100 characters 8.73 8.80 99.2 s="ABC"*33; ((s+"D")*500+s+"E").find(s+"E") (*100) 3.44 3.49 98.5 s="ABC"*33; ((s+"D")*500+"E"+s).find("E"+s) (*100) 5.41 5.37 100.8 s="ABC"*33; (s+"E") in ((s+"D")*300+s+"E") (*100) 9.04 9.01 100.3 s="ABC"*33; ((s+"D")*500+s+"E").index(s+"E") (*100) 8.91 10.90 81.7 s="ABC"*33; ((s+"D")*500+s+"E").partition(s+"E") (*100) 7.49 3.48 215.0 s="ABC"*33; ("E"+s+("D"+s)*500).rfind("E"+s) (*100) 1.81 1.95 92.7 s="ABC"*33; (s+"E"+("D"+s)*500).rfind(s+"E") (*100) 7.71 3.90 197.4 s="ABC"*33; ("E"+s+("D"+s)*500).rindex("E"+s) (*100) 7.77 4.23 184.0 s="ABC"*33; ("E"+s+("D"+s)*500).rpartition("E"+s) (*100) 8.18 4.25 192.4 s="ABC"*33; ("E"+s+("D"+s)*500).rsplit("E"+s, 1) (*100) 9.22 9.93 92.9 s="ABC"*33; ((s+"D")*500+s+"E").split(s+"E", 1) (*100) ========== late match, two characters 1.23 1.21 101.5 ("AB"*300+"C").find("BC") (*1000) 1.19 1.16 101.9 ("AB"*300+"CA").find("CA") (*1000) 1.38 1.08 128.6 "BC" in ("AB"*300+"C") (*1000) 1.19 1.18 100.6 ("AB"*300+"C").index("BC") (*1000) 1.19 1.38 86.1 ("AB"*300+"C").partition("BC") (*1000) 1.05 0.57 184.5 ("C"+"AB"*300).rfind("CA") (*1000) 0.70 0.70 99.7 ("BC"+"AB"*300).rfind("BC") (*1000) 1.04 0.58 181.4 ("C"+"AB"*300).rindex("CA") (*1000) 1.03 0.61 168.3 ("C"+"AB"*300).rpartition("CA") (*1000) 1.24 0.77 161.1 ("C"+"AB"*300).rsplit("CA", 1) (*1000) 1.32 1.39 95.1 ("AB"*300+"C").split("BC", 1) (*1000) ========== no match, single character 0.68 0.68 100.2 ("A"*1000).find("B") (*1000) 0.89 0.59 151.2 "B" in "A"*1000 (*1000) 0.60 0.60 99.9 ("A"*1000).partition("B") (*1000) 0.69 0.68 100.8 ("A"*1000).rfind("B") (*1000) 0.61 0.60 100.9 ("A"*1000).rpartition("B") (*1000) 0.71 0.72 99.2 ("A"*1000).rsplit("B", 1) (*1000) 0.71 0.70 101.7 ("A"*1000).split("B", 1) (*1000) ========== no match, two characters 3.69 3.65 101.3 ("AB"*1000).find("BC") (*1000) 3.64 3.63 100.3 ("AB"*1000).find("CA") (*1000) 3.87 3.54 109.2 "BC" in "AB"*1000 (*1000) 3.58 4.12 87.0 ("AB"*1000).partition("BC") (*1000) 2.05 2.02 101.6 
("AB"*1000).rfind("BC") (*1000) 3.16 1.62 195.0 ("AB"*1000).rfind("CA") (*1000) 2.02 2.03 99.6 ("AB"*1000).rpartition("BC") (*1000) 2.36 2.15 109.8 ("AB"*1000).rsplit("BC", 1) (*1000) 3.66 3.67 99.9 ("AB"*1000).split("BC", 1) (*1000) ========== quick replace multiple character match 0.06 0.28 19.9 ("A" + ("Z"*128*1024)).replace("AZZ", "BBZZ", 1) (*10) ========== quick replace single character match 0.05 0.25 21.2 ("A" + ("Z"*128*1024)).replace("A", "BB", 1) (*10) ========== repeat 1 character 10 times 0.06 0.06 90.5 "A"*10 (*1000) ========== repeat 1 character 1000 times 0.10 0.19 54.6 "A"*1000 (*1000) ========== repeat 5 characters 10 times 0.07 0.08 90.7 "ABCDE"*10 (*1000) ========== repeat 5 characters 1000 times 0.20 0.51 39.9 "ABCDE"*1000 (*1000) ========== replace and expand multiple characters, big string 0.84 1.69 49.7 "...text.with.2000.newlines...replace("\n", "\r\n") (*10) ========== replace multiple characters, dna 1.66 1.90 87.3 dna.replace("ATC", "ATT") (*10) ========== replace single character 0.13 0.14 95.0 "This is a test".replace(" ", "\t") (*1000) ========== replace single character, big string 0.22 0.65 34.6 "...text.with.2000.lines...replace("\n", " ") (*10) ========== replace/remove multiple characters 0.20 0.23 90.5 "When shall we three meet again?".replace("ee", "") (*1000) ========== split 1 whitespace 0.08 0.08 91.6 ("Here are some words. "*2).partition(" ") (*1000) 0.06 0.08 83.8 ("Here are some words. "*2).rpartition(" ") (*1000) 0.21 0.24 87.0 ("Here are some words. "*2).rsplit(None, 1) (*1000) 0.20 0.21 95.1 ("Here are some words. 
"*2).split(None, 1) (*1000) ========== split 2000 newlines 1.29 1.90 67.9 "...text...".rsplit("\n") (*10) 1.22 1.75 70.1 "...text...".split("\n") (*10) 1.45 1.99 72.8 "...text...".splitlines() (*10) ========== split newlines 0.24 0.20 117.2 "this\nis\na\ntest\n".rsplit("\n") (*1000) 0.23 0.20 115.6 "this\nis\na\ntest\n".split("\n") (*1000) 0.21 0.20 105.6 "this\nis\na\ntest\n".splitlines() (*1000) ========== split on multicharacter separator (dna) 1.09 0.85 128.6 dna.rsplit("ACTAT") (*10) 1.47 1.60 91.5 dna.split("ACTAT") (*10) ========== split on multicharacter separator (small) 0.42 0.45 93.5 "this--is--a--test--of--the--emergency--broadcast--system".rsplit("--") (*1000) 0.42 0.38 110.1 "this--is--a--test--of--the--emergency--broadcast--system".split("--") (*1000) ========== split whitespace (huge) 1.21 1.61 75.5 human_text.rsplit() (*10) 1.13 1.67 67.3 human_text.split() (*10) ========== split whitespace (small) 0.35 0.34 100.9 ("Here are some words. "*2).rsplit() (*1000) 0.33 0.31 106.1 ("Here are some words. "*2).split() (*1000) ========== startswith multiple characters 0.13 0.14 91.7 "Andrew".startswith("Andrew") (*1000) ========== startswith multiple characters - not! 
0.13	0.11	114.9	"Andrew".startswith("Anders") (*1000)
========== startswith single character
0.13	0.13	97.9	"Andrew".startswith("A") (*1000)
========== strip terminal newline
0.06	0.13	47.5	s="Hello!\n"; s[:-1] if s[-1]=="\n" else s (*1000)
0.04	0.04	107.5	"\nHello!".rstrip() (*1000)
0.05	0.04	105.7	"Hello!\n".rstrip() (*1000)
0.05	0.05	100.0	"\nHello!\n".strip() (*1000)
0.04	0.04	100.8	"\nHello!".strip() (*1000)
0.05	0.04	105.8	"Hello!\n".strip() (*1000)
========== strip terminal spaces and tabs
0.05	0.04	102.1	"\t \tHello".rstrip() (*1000)
0.05	0.05	104.7	"Hello\t \t".rstrip() (*1000)
0.03	0.03	90.3	"Hello\t \t".strip() (*1000)
========== tab split
0.35	0.37	92.7	GFF3_example.rsplit("\t", 8) (*1000)
0.33	0.36	92.9	GFF3_example.rsplit("\t") (*1000)
0.30	0.31	96.5	GFF3_example.split("\t", 8) (*1000)
0.31	0.33	94.3	GFF3_example.split("\t") (*1000)
152.31	146.96	103.6	TOTAL

From tjreedy at udel.edu  Thu Oct  6 03:28:58 2011
From: tjreedy at udel.edu (Terry Reedy)
Date: Wed, 05 Oct 2011 21:28:58 -0400
Subject: [Python-Dev] [Python-checkins] cpython: Document requirements of Unicode kinds
In-Reply-To: <201110060153.19068.victor.stinner@haypocalc.com>
References: <4E8CAF22.8040501@udel.edu> <201110060153.19068.victor.stinner@haypocalc.com>
Message-ID: 

On 10/5/2011 7:53 PM, Victor Stinner wrote:
> Le mercredi 5 octobre 2011 21:25:22, Terry Reedy a écrit :
>>> + - PyUnicode_1BYTE_KIND (1):
>>> +
>>> +   * character type = Py_UCS1 (8 bits, unsigned)
>>> +   * if ascii is 1, at least one character must be in range
>>> +     U+80-U+FF, otherwise all characters must be in range
>>>     U+00-U+7F
>>
>> Given that 1==True, this looks backwards.
>
> I changed the doc to:
>
>    PyUnicode_1BYTE_KIND (1):
>
>    * character type = Py_UCS1 (8 bits, unsigned)
>    * if ascii is set, all characters must be in range
>      U+0000-U+007F, otherwise at least one character must be in range
>      U+0080-U+00FF
>
> Is it better?
yes

>>> + - PyUnicode_2BYTE_KIND (2):
>>> +
>>> +   * character type = Py_UCS2 (16 bits, unsigned)
>>> +   * at least one character must be in range U+0100-U+1FFFF
>>
>> /U+1FFFF/U+FFFF/ ?
>
> Oops, correct. I fixed the doc, thanks for the review.

Glad I could help with that, even though not the code details.

-- 
Terry Jan Reedy

From tjreedy at udel.edu  Thu Oct  6 03:33:06 2011
From: tjreedy at udel.edu (Terry Reedy)
Date: Wed, 05 Oct 2011 21:33:06 -0400
Subject: [Python-Dev] cpython (3.2): Issue #11956: Skip test_import.test_unwritable_directory on FreeBSD when run as
In-Reply-To: <20111006020732.54d4df4c@pitrou.net>
References: <20111005235507.GA32295@cskk.homeip.net> <20111006020732.54d4df4c@pitrou.net>
Message-ID: 

On 10/5/2011 8:07 PM, Antoine Pitrou wrote:
> On Thu, 6 Oct 2011 10:55:07 +1100
> Cameron Simpson wrote:
>>
>> Equally, why on earth are you running tests as root!?!?!?!?! Madness.
>> It's as bad as compiling stuff as root etc etc. A bad idea all around,
>> securitywise.
>>
>> Especially, I would think, a buildbot. "Oh, let's fetch some shiny new
>> code and run it as the system superuser."
>
> Said buildbot probably runs in a VM.

A different perspective is that running buildbot tests as root does not
reflect the experience of non-root users. It seems some tests need to be
run both ways just for correctness testing.

-- 
Terry Jan Reedy

From cs at zip.com.au  Thu Oct  6 04:46:46 2011
From: cs at zip.com.au (Cameron Simpson)
Date: Thu, 6 Oct 2011 13:46:46 +1100
Subject: [Python-Dev] cpython (3.2): Issue #11956: Skip test_import.test_unwritable_directory on FreeBSD when run as
In-Reply-To: References: 
Message-ID: <20111006024646.GA17029@cskk.homeip.net>

On 05Oct2011 21:33, Terry Reedy wrote:
| On 10/5/2011 8:07 PM, Antoine Pitrou wrote:
| >On Thu, 6 Oct 2011 10:55:07 +1100 Cameron Simpson wrote:
| >>Equally, why on earth are you running tests as root!?!?!?!?! Madness.
| >>It's as bad as compiling stuff as root etc etc. A bad idea all around,
| >>securitywise.
| >>
| >>Especially, I would think, a buildbot. "Oh, let's fetch some shiny new
| >>code and run it as the system superuser."
| >
| >Said buildbot probably runs in a VM.

Which gets post-validated from outside via some magic sanity test?
Which gets wiped and rerun from a clean image for the next build?

| A different perspective is that running buildbot tests as root does
| not reflect the experience of non-root users. It seems some tests
| need to be run both ways just for correctness testing.

Surely VERY FEW tests need to be run as root, and they need careful
consideration. The whole thing (build, full test suite) should
not run as root.

Am I really the only person who feels unease about this scenario?

Cheers,
-- 
Cameron Simpson DoD#743
http://www.cskk.ezoshosting.com/cs/

Observing the first balloon ascent in Paris, [Ben] Franklin heard a
scoffer ask, "What good is it?" He spoke for a generation of scientists
in his retort, "What good is a newly born infant?" - John F. Kasson

From stephen at xemacs.org  Thu Oct  6 06:54:20 2011
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Thu, 06 Oct 2011 13:54:20 +0900
Subject: [Python-Dev] cpython (3.2): Issue #11956: Skip test_import.test_unwritable_directory on FreeBSD when run as
In-Reply-To: <20111006024646.GA17029@cskk.homeip.net>
References: <20111006024646.GA17029@cskk.homeip.net>
Message-ID: <87y5wyg1oj.fsf@uwakimon.sk.tsukuba.ac.jp>

Cameron Simpson writes:

 > Am I really the only person who feels unease about this scenario?

No, you are not alone. Though in practice with all the "Welcome,
Cracker!" boxes out there, one more less-secure-than-it-could-be VM
probably doesn't matter all that much.

More important to Python is Terry's point that running as root is not
the normal scenario for most users, so the buildbot client should be
set up to run as root only for tests that are specifically intended to
test running as root (or as a separate run from the "normal"
unprivileged user test suite).
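[Editor's note: the guard Stephen describes, running permission-dependent tests only in the intended context, is what the stdlib skip decorators make easy. A minimal sketch; the test class and names are hypothetical, not the actual test_import code:]

```python
import os
import unittest


class UnwritableDirectoryTest(unittest.TestCase):
    """Hypothetical sketch: skip a permission-bit test when run as root."""

    @unittest.skipIf(hasattr(os, "geteuid") and os.geteuid() == 0,
                     "root ignores file-mode permission bits")
    def test_unwritable_directory(self):
        # Under an unprivileged user, writing into a directory without
        # write permission fails; as root the write would succeed, so
        # the test is skipped instead of being reported as a failure.
        ...


if __name__ == "__main__":
    unittest.main()
```

Run as root, the test is recorded as a skip rather than a spurious red result, which addresses the "buildbots always red" complaint later in the thread.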
From neologix at free.fr  Thu Oct  6 09:29:53 2011
From: neologix at free.fr (=?ISO-8859-1?Q?Charles=2DFran=E7ois_Natali?=)
Date: Thu, 6 Oct 2011 09:29:53 +0200
Subject: [Python-Dev] cpython (3.2): Issue #11956: Skip test_import.test_unwritable_directory on FreeBSD when run as
In-Reply-To: <20111005235507.GA32295@cskk.homeip.net>
References: <20111005235507.GA32295@cskk.homeip.net>
Message-ID: 

> I'd have expected this test to fail on _any_ UNIX system if run as root.
> Root's allowed to write to stuff! Any stuff! About the only permission
> with any effect on root is the eXecute bit for the exec call, to prevent
> blindly running random data files.

You're right, here's another test on Linux (I must have screwed up when
I tested this on my box):

# mkdir /tmp/foo
# chmod -w /tmp/foo
# touch /tmp/foo/bar
# ls /tmp/foo
bar

You can still set the directory immutable if you really want to deny
write to root:

# chattr +i /tmp/foo
# touch /tmp/foo/spam
touch: cannot touch `/tmp/foo/spam': Permission denied

> Equally, why on earth are you running tests as root!?!?!?!?! Madness.
> It's as bad as compiling stuff as root etc etc. A bad idea all around,
> securitywise.

Agreed, I would personally never run a buildbot as root. I just changed
this because I was tired of seeing the same buildbots always red (thus
masking real failures).

From neologix at free.fr  Thu Oct  6 10:09:34 2011
From: neologix at free.fr (=?ISO-8859-1?Q?Charles=2DFran=E7ois_Natali?=)
Date: Thu, 6 Oct 2011 10:09:34 +0200
Subject: [Python-Dev] Using PEP384 Stable ABI for the lzma extension module
In-Reply-To: <20111005190258.53e5fdab@pitrou.net>
References: <4E8B4249.5030809@v.loewis.de> <4E8B5F8E.1070306@v.loewis.de> <20111004214549.66360def@pitrou.net> <4E8C8206.9070302@v.loewis.de> <1317831248.3713.1.camel@localhost.localdomain> <20111005163810.GI5476@unaka.lan> <20111005190258.53e5fdab@pitrou.net>
Message-ID: 

>> > > > That's not a given. Depending on the memory allocator, a copy can be
>> > > > That's why the "str += str" hack is much more efficient under
>> > > > Linux than Windows, AFAIK.
>> > >
>> > > Even Linux will have to copy a block on realloc in certain cases, no?
>> >
>> > Probably so. How often is totally unknown to me :)
>> >
>> http://www.gnu.org/software/libc/manual/html_node/Changing-Block-Size.html
>>
>> It depends on whether there's enough free memory after the buffer you
>> currently have allocated. I suppose that this becomes a question of what
>> people consider "the general case" :-)
>
> But under certain circumstances (if a large block is requested), the
> allocator uses mmap(), no?

That's right, if the block requested is bigger than mmap_threshold
(256K by default with glibc, forgetting the sliding window algorithm):
I'm not sure of what percentage of strings/buffers are concerned in a
"typical" program.

> In which case mremap() should allow to resize without copying anything.

Yes, there's no copying. Note however that it doesn't come for free,
the kernel will still zero-fill the pages before handling them to
user-space. It is still way faster than on, let's say, Solaris.

cf

From amauryfa at gmail.com  Thu Oct  6 10:16:19 2011
From: amauryfa at gmail.com (Amaury Forgeot d'Arc)
Date: Thu, 6 Oct 2011 10:16:19 +0200
Subject: [Python-Dev] Using PEP384 Stable ABI for the lzma extension module
In-Reply-To: References: <4E8B4249.5030809@v.loewis.de> <4E8B5F8E.1070306@v.loewis.de> <20111004214549.66360def@pitrou.net> <4E8C8206.9070302@v.loewis.de> <1317831248.3713.1.camel@localhost.localdomain> <20111005163810.GI5476@unaka.lan> <20111005190258.53e5fdab@pitrou.net>
Message-ID: 

Le 6 octobre 2011 10:09, Charles-François Natali a écrit :
>> But under certain circumstances (if a large block is requested), the
>> allocator uses mmap(), no?
> > That's right, if the block requested is bigger than mmap_threshold
> > (256K by default with glibc, forgetting the sliding window algorithm):
> > I'm not sure of what percentage of strings/buffers are concerned in a
> > "typical" program.

Most usages of _PyBytes_Resize() are in compression libraries. 256K
payloads are not rare in this area.

-- 
Amaury Forgeot d'Arc

From glyph at twistedmatrix.com  Thu Oct  6 10:26:24 2011
From: glyph at twistedmatrix.com (Glyph)
Date: Thu, 6 Oct 2011 04:26:24 -0400
Subject: [Python-Dev] cpython (3.2): Issue #11956: Skip test_import.test_unwritable_directory on FreeBSD when run as
In-Reply-To: <20111006024646.GA17029@cskk.homeip.net>
References: <20111006024646.GA17029@cskk.homeip.net>
Message-ID: 

On Oct 5, 2011, at 10:46 PM, Cameron Simpson wrote:

> Surely VERY FEW tests need to be run as root, and they need careful
> consideration. The whole thing (build, full test suite) should
> not run as root.

This is news to me - is most of Python not supported to run as root? I
was under the impression that Python was supposed to run correctly as
root, and therefore there should be some buildbots dedicated to running
it that way. If only a few small parts of the API are supposed to work
perhaps this should be advertised more clearly in the documentation?

Ahem. Sorry for the snark, I couldn't resist. As Terry more reasonably
put it:

>> running buildbot tests as root does not reflect the experience of
>> non-root users. It seems some tests need to be run both ways just
>> for correctness testing.

(except I'd say "all", not "some")

> Am I really the only person who feels unease about this scenario?

More seriously: apparently you are not, but I am quite surprised by
that revelation. You should be :). The idea of root as a special,
magical place where real ultimate power resides is quite silly. "root"
is a title, like "king". You're not just "root", you're root _of_
something.
If the thing that you are root of is a dedicated virtual machine with no
interesting data besides the code under test, then this is quite a lot
like being a regular user in a similarly boring place. It's like having
the keys to an empty safe.

Similarly, if you're a normal "unprivileged" user - let's say, www-data
- on a system with large amounts of sensitive data owned by that user,
becoming root will rarely grant you any really interesting privileges
beyond what you've already got. Most public web-based systems fall into
this category, as you've got one user (the application deployment user)
running almost all of your code, with privileges to read and write to
the only interesting data source (the database).

So if these tests were running on somebody's public-facing production
system in an "unprivileged" context, I'd be far more concerned about
that than about it having root on some throwaway VM.

-glyph

From victor.stinner at haypocalc.com  Thu Oct  6 12:42:26 2011
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Thu, 6 Oct 2011 12:42:26 +0200
Subject: [Python-Dev] New stringbench benchmark results
In-Reply-To: <201110060206.30819.victor.stinner@haypocalc.com>
References: <201110060206.30819.victor.stinner@haypocalc.com>
Message-ID: <201110061242.26526.victor.stinner@haypocalc.com>

Hum, copy-paste failure, I wrote numbers in the wrong order, it's:
(test: Python 3.2 => Python 3.3)

"A".join(["Bob"]*100)): 0.92 => 2.11
("C"+"AB"*300).rfind("CA"): 0.57 => 1.03
("A" + ("Z"*128*1024)).replace("A", "BB", 1): 0.25 => 0.50

I improved str.replace(): it's now 5 times faster instead of 2 times
slower for this specific benchmark :-) (or 10 times faster in Python 3.3
before/after my patch)

Victor

From amauryfa at gmail.com  Thu Oct  6 14:57:03 2011
From: amauryfa at gmail.com (Amaury Forgeot d'Arc)
Date: Thu, 6 Oct 2011 14:57:03 +0200
Subject: [Python-Dev] check for PyUnicode_READY look
 backwards
Message-ID: 

Hi,

with the new Unicode API, there are many checks like:
+    if (PyUnicode_READY(*filename))
+        goto handle_error;

Every time I read it, I get it wrong:
"If filename is ready, then fail"
then I have to remember that the function returns either 0 or -1.

I'd prefer it was written:
    if (PyUnicode_READY(*filename) < 0)
because "< 0" clearly indicates an error condition.
That's how all calls to PyType_Ready are written, for example.

Am I the only one to be distracted by this idiom?

-- 
Amaury Forgeot d'Arc

From merwok at netwok.org  Thu Oct  6 15:20:38 2011
From: merwok at netwok.org (Éric Araujo)
Date: Thu, 06 Oct 2011 15:20:38 +0200
Subject: [Python-Dev] cpython: PyUnicode_FromKindAndData() raises a ValueError if the kind is unknown
In-Reply-To: References: <4E89E166.7020307@v.loewis.de>
Message-ID: <4E8DAB26.1050507@netwok.org>

Hi,

Le 03/10/2011 23:38, Terry Reedy a écrit :
> Is it both technically possible (with hg) and socially permissible (with
> us) to edit another's commit message?

Not easily. A changeset identifier is a hash of date, user, parent
changesets hashes, commit message and diff content; editing the commit
message makes a new changeset.

I've read about a company where they use a script or an extension to
send changesets to a colleague for review and destroy them locally, so
that when they pull the changeset edited by the reviewer, they don't get
duplicates. It sounds complicated.

Regards

From merwok at netwok.org  Thu Oct  6 15:28:04 2011
From: merwok at netwok.org (Éric Araujo)
Date: Thu, 06 Oct 2011 15:28:04 +0200
Subject: [Python-Dev] cpython: PyUnicode_FromKindAndData() raises a ValueError if the kind is unknown
In-Reply-To: <87aa9h1myi.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <4E89E166.7020307@v.loewis.de> <4E8A2FB5.7020509@v.loewis.de> <87aa9h1myi.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <4E8DACE4.8020005@netwok.org>

Hi,

Le 04/10/2011 04:59, Stephen J. Turnbull a écrit :
> Currently, in hg.
> git has a mechanism for adding notes which are
> automatically displayed along with the original commit message, and
> bzr is considering introducing such a mechanism.

Mercurial commits can contain an "extra" dictionary, but that feature is
not yet exposed on the command line.

> I'm not familiar with the hg dev process (I use hg a lot, but so far
> it Just Works for me :), but I would imagine they will move in that
> direction as well.

I doubt it; Mercurial has a very strong view of "history is sacred"; it
has taken many, many requests for an optional rebase feature to be
added, for example.

Regards

From merwok at netwok.org  Thu Oct  6 15:42:34 2011
From: merwok at netwok.org (Éric Araujo)
Date: Thu, 06 Oct 2011 15:42:34 +0200
Subject: [Python-Dev] [Python-checkins] cpython: When expandtabs() would be a no-op, don't create a duplicate string
In-Reply-To: References: 
Message-ID: <4E8DB04A.4010202@netwok.org>

Hi,

> http://hg.python.org/cpython/rev/447f521ac6d9
> user:        Antoine Pitrou
> date:        Tue Oct 04 16:04:01 2011 +0200
> summary:
>   When expandtabs() would be a no-op, don't create a duplicate string
>
> diff --git a/Lib/test/test_unicode.py b/Lib/test/test_unicode.py
> --- a/Lib/test/test_unicode.py
> +++ b/Lib/test/test_unicode.py
> @@ -1585,6 +1585,10 @@
>              return
>          self.assertRaises(OverflowError, 't\tt\t'.expandtabs, sys.maxsize)
>
> +    def test_expandtabs_optimization(self):
> +        s = 'abc'
> +        self.assertIs(s.expandtabs(), s)

Shouldn't that be marked CPython-specific?

From solipsis at pitrou.net  Thu Oct  6 15:52:59 2011
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Thu, 6 Oct 2011 15:52:59 +0200
Subject: [Python-Dev] Rename PyUnicode_KIND_SIZE ?
Message-ID: <20111006155259.4832bd75@pitrou.net>

Hello,

The PyUnicode_KIND_SIZE macro is defined as follows. Its name looks
rather mysterious or misleading to me. Could we rename it to something
else?

(also, is it useful?
PEP 393 has added a flurry of new macros to unicodeobject.h and it's
getting hard to know which ones are genuinely useful, or well-performing)

/* Compute (index * char_size) where char_size is 2 ** (kind - 1).
   The index is a character index, the result is a size in bytes.
   See also PyUnicode_CHARACTER_SIZE(). */
#define PyUnicode_KIND_SIZE(kind, index) \
    ((Py_ssize_t) ((index) << ((kind) - 1)))

Regards

Antoine.

From merwok at netwok.org  Thu Oct  6 16:12:21 2011
From: merwok at netwok.org (Éric Araujo)
Date: Thu, 06 Oct 2011 16:12:21 +0200
Subject: [Python-Dev] Status of the built-in virtualenv functionality in 3.3
Message-ID: <4E8DB745.2090406@netwok.org>

Hi,

I started to play with virtualenv recently and wondered about the status
of the similar feature in 3.3 (cpythonv). The last thread mentioned two
bugs; one has been fixed since.

Apart from the implicit vs. explicit download of distribute, are there
design issues to discuss? Can we do that with a patch on a bug report?

Oh, let's not forget naming. We can't reuse the module name virtualenv
as it would shadow the third-party module name, and I'm not fond of
"virtualize": it brings OS-level virtualization to my mind, not isolated
Python environments.

Cheers

From martin at v.loewis.de  Thu Oct  6 16:10:20 2011
From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=)
Date: Thu, 06 Oct 2011 16:10:20 +0200
Subject: [Python-Dev] [Python-checkins] cpython: Fix find_module_path(): make the string ready
In-Reply-To: References: 
Message-ID: <4E8DB6CC.7070104@v.loewis.de>

> +    if (PyUnicode_READY(path_unicode))
> +        return -1;
> +

I think we need to discuss/reconsider the return value of
PyUnicode_READY. It's defined to give -1 on error currently. If that
sounds good, then the check for error should be a check that it is -1.
Regards,
Martin

From lists at cheimes.de  Thu Oct  6 16:49:21 2011
From: lists at cheimes.de (Christian Heimes)
Date: Thu, 06 Oct 2011 16:49:21 +0200
Subject: [Python-Dev] Status of the built-in virtualenv functionality in 3.3
In-Reply-To: <4E8DB745.2090406@netwok.org>
References: <4E8DB745.2090406@netwok.org>
Message-ID: 

Am 06.10.2011 16:12, schrieb Éric Araujo:
> Oh, let's not forget naming. We can't reuse the module name virtualenv
> as it would shadow the third-party module name, and I'm not fond of
> "virtualize": it brings OS-level virtualization to my mind, not isolated
> Python environments.

How about clutch? A virtualenv is a clutch of Python eggs, all ready to
hatch. (Pun intended). :)

Christian

From brian.curtin at gmail.com  Thu Oct  6 17:06:17 2011
From: brian.curtin at gmail.com (Brian Curtin)
Date: Thu, 6 Oct 2011 10:06:17 -0500
Subject: [Python-Dev] Status of the built-in virtualenv functionality in 3.3
In-Reply-To: <4E8DB745.2090406@netwok.org>
References: <4E8DB745.2090406@netwok.org>
Message-ID: 

On Thu, Oct 6, 2011 at 09:12, Éric Araujo wrote:
> Oh, let's not forget naming. We can't reuse the module name virtualenv
> as it would shadow the third-party module name, and I'm not fond of
> "virtualize": it brings OS-level virtualization to my mind, not isolated
> Python environments.

How about we just drop the "virtual" part of the name and make it
"env"? (or something non-virtual)

From ronaldoussoren at mac.com  Thu Oct  6 16:31:00 2011
From: ronaldoussoren at mac.com (Ronald Oussoren)
Date: Thu, 06 Oct 2011 16:31:00 +0200
Subject: [Python-Dev] check for PyUnicode_READY look backwards
In-Reply-To: References: 
Message-ID: 

On 6 Oct, 2011, at 14:57, Amaury Forgeot d'Arc wrote:

> Hi,
>
> with the new Unicode API, there are many checks like:
> +    if (PyUnicode_READY(*filename))
> +        goto handle_error;
>
> Every time I read it, I get it wrong:
> "If filename is ready, then fail"
> then I have to remember that the function returns either 0 or -1.
> > I'd prefer it was written : > if (PyUnicode_READY(*filename) < 0) > because "< 0" clearly indicates an error condition. > That's how all calls to PyType_Ready are written, for example. > > Am I the only one to be distracted by this idiom? I prefer the '< 0' variant as well, for the same reason as you. Ronald > > -- > Amaury Forgeot d'Arc > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/ronaldoussoren%40mac.com -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 4788 bytes Desc: not available URL: From barry at python.org Thu Oct 6 17:31:24 2011 From: barry at python.org (Barry Warsaw) Date: Thu, 6 Oct 2011 11:31:24 -0400 Subject: [Python-Dev] Status of the built-in virtualenv functionality in 3.3 In-Reply-To: <4E8DB745.2090406@netwok.org> References: <4E8DB745.2090406@netwok.org> Message-ID: <20111006113124.56c18f6a@resist.wooz.org> On Oct 06, 2011, at 04:12 PM, Éric Araujo wrote: >I started to play with virtualenv recently and wondered about the status >of the similar feature in 3.3 (cpythonv). The last thread mentioned two >bugs; one has been fixed since. > >Apart from the implicit vs. explicit download of distribute, are there >design issues to discuss? Can we do that with a patch on a bug report? > >Oh, let's not forget naming. We can't reuse the module name virtualenv >as it would shadow the third-party module name, and I'm not fond of >'virtualize': it brings OS-level virtualization to my mind, not isolated >Python environments. Time to hit the hardware store and stock up on bikeshed paint! I agree we can't use virtualenv, and shouldn't use virtualize. I'm afraid that picking something cute might make it harder to discover. `pythonv` or `cpythonv` seem like good choices to me.
Maybe the former, so we could potentially have jythonv, etc. -Barry From solipsis at pitrou.net Thu Oct 6 17:42:34 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 6 Oct 2011 17:42:34 +0200 Subject: [Python-Dev] Status of the built-in virtualenv functionality in 3.3 References: <4E8DB745.2090406@netwok.org> Message-ID: <20111006174234.35ba4238@pitrou.net> On Thu, 6 Oct 2011 10:06:17 -0500 Brian Curtin wrote: > On Thu, Oct 6, 2011 at 09:12, Éric Araujo wrote: > > Oh, let's not forget naming. We can't reuse the module name virtualenv > > as it would shadow the third-party module name, and I'm not fond of > > 'virtualize': it brings OS-level virtualization to my mind, not isolated > > Python environments. > > How about we just drop the "virtual" part of the name and make it "env"? "pythonenv"? From merwok at netwok.org Thu Oct 6 17:46:27 2011 From: merwok at netwok.org (=?UTF-8?B?w4lyaWMgQXJhdWpv?=) Date: Thu, 06 Oct 2011 17:46:27 +0200 Subject: [Python-Dev] Status of the built-in virtualenv functionality in 3.3 In-Reply-To: <20111006113124.56c18f6a@resist.wooz.org> References: <4E8DB745.2090406@netwok.org> <20111006113124.56c18f6a@resist.wooz.org> Message-ID: <4E8DCD53.6030306@netwok.org> Le 06/10/2011 17:31, Barry Warsaw a écrit : > I agree we can't use virtualenv, and shouldn't use virtualize. I'm afraid > that picking something cute might make it harder to discover. `pythonv` or > `cpythonv` seem like good choices to me. Maybe the former, so we could > potentially have jythonv, etc. I'm not sure we would. The feature is two-fold: - changes to getpath.c, site.py and other usual suspects so that CPython supports being run in an isolated environment; - a new module used to create isolated environments. The first part is implemented in CPython; the second part needs a module name to replace virtualenv. python -m pythonv doesn't seem right. python -m makeenv? python -m workon? (idea from virtualenvwrapper) python -m nest?
Cheers From petri at digip.org Thu Oct 6 17:46:37 2011 From: petri at digip.org (Petri Lehtinen) Date: Thu, 6 Oct 2011 18:46:37 +0300 Subject: [Python-Dev] counterintuitive behavior (bug?) in Counter with += In-Reply-To: References: Message-ID: <20111006154637.GF23957@p16> Lars Buitinck wrote: > >>> from collections import Counter > >>> a = Counter([1,2,3]) > >>> b = a > >>> a += Counter([3,4,5]) > >>> a is b > False > > would become > > # snip > >>> a is b > True Sounds like a good idea to me. You should open an issue in the tracker at http://bugs.python.org/. From victor.stinner at haypocalc.com Thu Oct 6 17:52:05 2011 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Thu, 06 Oct 2011 17:52:05 +0200 Subject: [Python-Dev] Rename PyUnicode_KIND_SIZE ? In-Reply-To: <20111006155259.4832bd75@pitrou.net> References: <20111006155259.4832bd75@pitrou.net> Message-ID: <4E8DCEA5.8010305@haypocalc.com> Le 06/10/2011 15:52, Antoine Pitrou a écrit : > The PyUnicode_KIND_SIZE macro is defined as follows. Its name looks > rather mysterious or misleading to me. Could we rename it to something > else? What do you propose? > also, is it useful? index << (kind - 1) and index * PyUnicode_CHARACTER_SIZE(str) were used in unicodeobject.c. It's not easy to understand this formula, and it heavily depends on how kind is defined. I wrote a patch to use enum for kind using different values, but the gain was minor so I didn't commit it. We may move it to unicodeobject.c. index * PyUnicode_CHARACTER_SIZE(str) is enough for the public API. (PyUnicode_KIND_SIZE() is also a micro-optimization, it uses shift instead of multiply.) > PEP 393 has added a flurry of new macros to > unicodeobject.h and it's getting hard to know which ones are genuinely > useful, or well-performing) Yes, we have to review new functions and macros.
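The shift/multiply equivalence Victor describes is easy to sanity-check. A minimal Python sketch of the macro quoted at the top of this thread (purely illustrative, using the kind values 1, 2 and 3 for the UCS1, UCS2 and UCS4 representations, so that char_size is 2 ** (kind - 1)):

```python
def kind_size(kind, index):
    # Mirror of PyUnicode_KIND_SIZE: shifting the character index left by
    # (kind - 1) multiplies it by the per-character byte width.
    return index << (kind - 1)

# kind -> character width in bytes (1-byte, 2-byte and 4-byte representations)
widths = {1: 1, 2: 2, 3: 4}

for kind, width in widths.items():
    for index in range(64):
        # The shift form and the plain multiply form always agree.
        assert kind_size(kind, index) == index * width
```

Whether a C compiler emits a shift or a multiply for the index * char_size spelling is usually up to the optimizer in any case.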
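Returning to Lars's Counter question from earlier in this digest: the in-place behaviour being requested can be sketched with a hypothetical __iadd__ override. This illustrates the proposal only, not an actual patch, and it ignores the detail that real Counter addition discards non-positive counts:

```python
from collections import Counter

class InPlaceCounter(Counter):
    """Hypothetical sketch: merge counts in place and return self,
    so that `a += other` keeps `a` bound to the same object."""
    def __iadd__(self, other):
        for elem, count in other.items():
            self[elem] += count   # missing keys default to 0 in a Counter
        return self

a = InPlaceCounter([1, 2, 3])
b = a
a += Counter([3, 4, 5])
assert a is b      # same object, the behaviour Lars asks for
assert a[3] == 2   # counts merged in place
```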
Victor From brian.curtin at gmail.com Thu Oct 6 17:54:21 2011 From: brian.curtin at gmail.com (Brian Curtin) Date: Thu, 6 Oct 2011 10:54:21 -0500 Subject: [Python-Dev] Status of the built-in virtualenv functionality in 3.3 In-Reply-To: <4E8DCD53.6030306@netwok.org> References: <4E8DB745.2090406@netwok.org> <20111006113124.56c18f6a@resist.wooz.org> <4E8DCD53.6030306@netwok.org> Message-ID: On Thu, Oct 6, 2011 at 10:46, Éric Araujo wrote: > Le 06/10/2011 17:31, Barry Warsaw a écrit : >> I agree we can't use virtualenv, and shouldn't use virtualize. I'm afraid >> that picking something cute might make it harder to discover. `pythonv` or >> `cpythonv` seem like good choices to me. Maybe the former, so we could >> potentially have jythonv, etc. > > I'm not sure we would. The feature is two-fold: > - changes to getpath.c, site.py and other usual suspects so that CPython > supports being run in an isolated environment; > - a new module used to create isolated environments. > > The first part is implemented in CPython; the second part needs a module > name to replace virtualenv. python -m pythonv doesn't seem right. > > python -m makeenv? > python -m workon? (idea from virtualenvwrapper) > python -m nest? develop? devenv? From barry at python.org Thu Oct 6 18:02:05 2011 From: barry at python.org (Barry Warsaw) Date: Thu, 6 Oct 2011 12:02:05 -0400 Subject: [Python-Dev] Status of the built-in virtualenv functionality in 3.3 In-Reply-To: <4E8DCD53.6030306@netwok.org> References: <4E8DB745.2090406@netwok.org> <20111006113124.56c18f6a@resist.wooz.org> <4E8DCD53.6030306@netwok.org> Message-ID: <20111006120205.061a50ed@resist.wooz.org> On Oct 06, 2011, at 05:46 PM, Éric Araujo wrote: >Le 06/10/2011 17:31, Barry Warsaw a écrit : >> I agree we can't use virtualenv, and shouldn't use virtualize. I'm afraid >> that picking something cute might make it harder to discover. `pythonv` or >> `cpythonv` seem like good choices to me.
Maybe the former, so we could >> potentially have jythonv, etc. > >I'm not sure we would. The feature is two-fold: >- changes to getpath.c, site.py and other usual suspects so that CPython >supports being run in an isolated environment; >- a new module used to create isolated environments. While the other implementations might not be able to share any of CPython's code, it's still a worthy feature for any Python implementation I think. >The first part is implemented in CPython; the second part needs a module >name to replace virtualenv. python -m pythonv doesn't seem right. Nope, although `python -m virtualize` seems about perfect. I don't particularly like the -m interface though. Yes, it should work, but I also think there should be a command that basically wraps whatever the -m invocation is, just for user friendliness. >python -m makeenv? >python -m workon? (idea from virtualenvwrapper) >python -m nest? Well, I have to be honest, I've *always* thought "nest" would be a good choice for a feature like this, but years ago (IIRC) PJE wanted to reserve that term for something else, which I'm not sure ever happened. There's a PyNEST project here: http://www.nest-initiative.uni-freiburg.de/index.php/PyNEST which might cause problems with a built-in `nest` module. Still, I'm a bit fond of `python -m nest` and a `pynest` wrapper. Barring that, `python -m virtualize` with an appropriate cli shortcut (`pysolate`? - say it out loud :) seems good. -Barry From solipsis at pitrou.net Thu Oct 6 18:03:51 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 6 Oct 2011 18:03:51 +0200 Subject: [Python-Dev] Rename PyUnicode_KIND_SIZE ? References: <20111006155259.4832bd75@pitrou.net> <4E8DCEA5.8010305@haypocalc.com> Message-ID: <20111006180351.283f4091@pitrou.net> On Thu, 06 Oct 2011 17:52:05 +0200 Victor Stinner wrote: > index << (kind - 1) and index * PyUnicode_CHARACTER_SIZE(str) were used > in unicodeobject.c.
> It's not easy to understand this formula. index * PyUnicode_CHARACTER_SIZE(str) is quite easy to understand to me. I find it less cryptic than PyUnicode_KIND_SIZE(kind, index), actually, and I would advocate using the former and removing the latter. > (PyUnicode_KIND_SIZE() is also a micro-optimization, it uses shift > instead of multiply.) I don't know, but I think the compiler should be able to do that for you. Also, I don't think PyUnicode_KIND_SIZE would be used in a critical loop. You would use PyUnicode_READ when doing one-character-at-a-time stuff. Regards Antoine. From solipsis at pitrou.net Thu Oct 6 18:04:40 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 6 Oct 2011 18:04:40 +0200 Subject: [Python-Dev] Status of the built-in virtualenv functionality in 3.3 References: <4E8DB745.2090406@netwok.org> <20111006113124.56c18f6a@resist.wooz.org> <4E8DCD53.6030306@netwok.org> <20111006120205.061a50ed@resist.wooz.org> Message-ID: <20111006180440.319c52c3@pitrou.net> On Thu, 6 Oct 2011 12:02:05 -0400 Barry Warsaw wrote: > > >The first part is implemented in CPython; the second part needs a module > >name to replace virtualenv. python -m pythonv doesn't seem right. > > Nope, although `python -m virtualize` seems about perfect. `python -m sandbox` ? From carl at oddbird.net Thu Oct 6 18:14:28 2011 From: carl at oddbird.net (Carl Meyer) Date: Thu, 06 Oct 2011 10:14:28 -0600 Subject: [Python-Dev] Status of the built-in virtualenv functionality in 3.3 In-Reply-To: <4E8DB745.2090406@netwok.org> References: <4E8DB745.2090406@netwok.org> Message-ID: <4E8DD3E4.70309@oddbird.net> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Hi Éric, Vinay is more up to date than I am on the current status of the implementation. I need to update the PEP draft we worked on last spring and get it posted (the WIP is at https://bitbucket.org/carljm/pythonv-pep but is out of date with the latest implementation work).
On 10/06/2011 08:12 AM, Éric Araujo wrote: > Oh, let's not forget naming. We can't reuse the module name virtualenv > as it would shadow the third-party module name, and I'm not fond of > 'virtualize': it brings OS-level virtualization to my mind, not isolated > Python environments. What about "venv"? It's short, it's already commonly used colloquially to refer to virtualenv so it makes an accurate and unambiguous mental association, but AFAIK it is currently unused as a script or module name. Carl -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAk6N0+QACgkQ8W4rlRKtE2fCOwCg1YOWcMCZH6HOdyKepcQG3RgB T48AoIIqol+sUpOAFI+4HJH/dAdX5Xwm =DLjq -----END PGP SIGNATURE----- From vinay_sajip at yahoo.co.uk Thu Oct 6 18:46:34 2011 From: vinay_sajip at yahoo.co.uk (Vinay Sajip) Date: Thu, 6 Oct 2011 17:46:34 +0100 (BST) Subject: [Python-Dev] Status of the built-in virtualenv functionality in 3.3 References: <4E8DB745.2090406@netwok.org> Message-ID: <1317919594.45229.YahooMailNeo@web25805.mail.ukl.yahoo.com> ----- Original Message ----- > I started to play with virtualenv recently and wondered about the status > of the similar feature in 3.3 (cpythonv). The last thread mentioned two > bugs; one has been fixed since. The pythonv branch is pretty much up to date with the default branch (3.3). I regularly merge with default and post the results of running the Python regression suite in a virtual environment - in fact I'm running such a test right now :-). The test results and a screencast are linked from the project's BitBucket page at https://bitbucket.org/vinay.sajip/pythonv/ > Apart from the implicit vs. explicit download of distribute, are there > design issues to discuss? Can we do that with a patch on a bug report? I've made changes to get packaging to work well with virtualenv, some of which I raised as packaging issues on the tracker.
In some cases, I've fixed them in the pythonv branch. The last bug is a problem with test_user_similar which has an existing issue (#9100) in a non-virtual environment. This is not a show-stopper, but I'm not really sure how the user scheme is supposed to work in venvs: perhaps Carl has a view on this. BTW there have been intermittent failures in test_packaging, too, but they've generally been fixed by changes in core. In terms of design issues, it would be useful if someone (apart from me, that is) could look at how the pythonv fork differs from the core and comment on any issues they find. (Much of the change can be summed up as replacing occurrences of "sys.prefix" with "getattr(sys, 'site_prefix', sys.prefix)".) BitBucket makes this type of comparison fairly easy to do; I'm not sure if there's a lot of value in adding a patch on the tracker for Rietveld review, until (and if) the PEP is accepted. Re. distribute: At the moment the pythonv branch downloads a private version of distribute. The only significant difference from the vanilla distribute is the use of sys.site_prefix and sys.site_exec_prefix (falling back to sys.prefix and sys.exec_prefix if we're not running distribute in a virtual env, so it's backward compatible) - but I didn't ask anyone in the packaging/distribute team to port this small change across. The only reference to sys.site_prefix is this code in setuptools/command/easy_install.py: if hasattr(sys, 'site_prefix'): prefixes = [sys.site_prefix] else: prefixes = [sys.prefix] if sys.exec_prefix != sys.prefix: prefixes.append(sys.exec_prefix) If this were ported to distribute, that would be nice :-) I think the plan is to remove the distribute-downloading functionality from the stdlib. However, I am working on a companion project, "nemo", which will have this functionality and in addition provides virtualenvwrapper-like functionality for Linux, Mac and Windows.
(This is based on the stdlib API, is WIP, not released yet, though shown in the screencast I mentioned earlier). > Oh, let's not forget naming. We can't reuse the module name virtualenv > as it would shadow the third-party module name, and I'm not fond of > 'virtualize': it brings OS-level virtualization to my mind, not isolated > Python environments. I'm OK with Carl's suggestion of "venv", and prefer it to Brian's suggestion of "env". Regards, Vinay Sajip From barry at python.org Thu Oct 6 18:50:43 2011 From: barry at python.org (Barry Warsaw) Date: Thu, 6 Oct 2011 12:50:43 -0400 Subject: [Python-Dev] Status of the built-in virtualenv functionality in 3.3 In-Reply-To: <20111006180440.319c52c3@pitrou.net> References: <4E8DB745.2090406@netwok.org> <20111006113124.56c18f6a@resist.wooz.org> <4E8DCD53.6030306@netwok.org> <20111006120205.061a50ed@resist.wooz.org> <20111006180440.319c52c3@pitrou.net> Message-ID: <20111006125043.16d1c462@rivendell> On Oct 06, 2011, at 06:04 PM, Antoine Pitrou wrote: >On Thu, 6 Oct 2011 12:02:05 -0400 >Barry Warsaw wrote: >> >> >The first part is implemented in CPython; the second part needs a module >> >name to replace virtualenv. python -m pythonv doesn't seem right. >> >> Nope, although `python -m virtualize` seems about perfect. > >`python -m sandbox` ? That's nice too. -Barry From regebro at gmail.com Thu Oct 6 18:53:11 2011 From: regebro at gmail.com (Lennart Regebro) Date: Thu, 6 Oct 2011 18:53:11 +0200 Subject: [Python-Dev] Status of the built-in virtualenv functionality in 3.3 In-Reply-To: <20111006125043.16d1c462@rivendell> References: <4E8DB745.2090406@netwok.org> <20111006113124.56c18f6a@resist.wooz.org> <4E8DCD53.6030306@netwok.org> <20111006120205.061a50ed@resist.wooz.org> <20111006180440.319c52c3@pitrou.net> <20111006125043.16d1c462@rivendell> Message-ID: +1 for env or sandbox or something else with "box" in it. pythonbox? envbox? boxenv?
From pje at telecommunity.com Thu Oct 6 18:38:37 2011 From: pje at telecommunity.com (PJ Eby) Date: Thu, 6 Oct 2011 12:38:37 -0400 Subject: [Python-Dev] Status of the built-in virtualenv functionality in 3.3 In-Reply-To: <20111006120205.061a50ed@resist.wooz.org> References: <4E8DB745.2090406@netwok.org> <20111006113124.56c18f6a@resist.wooz.org> <4E8DCD53.6030306@netwok.org> <20111006120205.061a50ed@resist.wooz.org> Message-ID: On Thu, Oct 6, 2011 at 12:02 PM, Barry Warsaw wrote: > Well, I have to be honest, I've *always* thought "nest" would be a good > choice > for a feature like this, but years ago (IIRC) PJE wanted to reserve that > term > for something else, which I'm not sure ever happened. > Actually, it was pretty much for this exact purpose -- i.e. it was the idea of a virtual environment. Ian just implemented it first, with some different ideas about configuration and activation. Since this is basically the replacement for that, I don't have any objection to using the term here. (In my vision, "nest" was also the name of a package management tool for creating such nests and manipulating their contents, though.) -------------- next part -------------- An HTML attachment was scrubbed... URL: From g.brandl at gmx.net Thu Oct 6 19:23:19 2011 From: g.brandl at gmx.net (Georg Brandl) Date: Thu, 06 Oct 2011 19:23:19 +0200 Subject: [Python-Dev] Status of the built-in virtualenv functionality in 3.3 In-Reply-To: <20111006120205.061a50ed@resist.wooz.org> References: <4E8DB745.2090406@netwok.org> <20111006113124.56c18f6a@resist.wooz.org> <4E8DCD53.6030306@netwok.org> <20111006120205.061a50ed@resist.wooz.org> Message-ID: On 10/06/11 18:02, Barry Warsaw wrote: > On Oct 06, 2011, at 05:46 PM, Éric Araujo wrote: > >>Le 06/10/2011 17:31, Barry Warsaw a écrit : >>> I agree we can't use virtualenv, and shouldn't use virtualize. I'm afraid >>> that picking something cute might make it harder to discover. `pythonv` or >>> `cpythonv` seem like good choices to me.
Maybe the former, so we could >>> potentially have jythonv, etc. >> >>I'm not sure we would. The feature is two-fold: >>- changes to getpath.c, site.py and other usual suspects so that CPython >>supports being run in an isolated environment; >>- a new module used to create isolated environments. > > While the other implementations might not be able to share any of CPython's > code, it's still a worthy feature for any Python implementation I think. > >>The first part is implemented in CPython; the second part needs a module >>name to replace virtualenv. python -m pythonv doesn't seem right. > > Nope, although `python -m virtualize` seems about perfect. Hmm, with proper interpreter support I don't see what would be so "virtual" about it anymore. Georg From p.f.moore at gmail.com Thu Oct 6 19:23:37 2011 From: p.f.moore at gmail.com (Paul Moore) Date: Thu, 6 Oct 2011 18:23:37 +0100 Subject: [Python-Dev] Status of the built-in virtualenv functionality in 3.3 In-Reply-To: <20111006120205.061a50ed@resist.wooz.org> References: <4E8DB745.2090406@netwok.org> <20111006113124.56c18f6a@resist.wooz.org> <4E8DCD53.6030306@netwok.org> <20111006120205.061a50ed@resist.wooz.org> Message-ID: On 6 October 2011 17:02, Barry Warsaw wrote: > I don't particularly like the -m interface though. Yes, it should work, but I > also think there should be a command that basically wraps whatever the -m > invocation is, just for user friendliness. No problem with a wrapper, but the nice thing about the -m form is that it's portable. On Unix, shell script wrappers are pretty portable (no idea if C-shell users would agree...) On Windows, though, there are all sorts of problems. BAT files don't nest, so you end up having to use atrocities like "CALL virtualenv" within a batch file. Powershell users prefer .ps1 files. The only common form is an EXE, but nobody really likes having to use a compiled form every time. Paul.
From vinay_sajip at yahoo.co.uk Thu Oct 6 19:34:37 2011 From: vinay_sajip at yahoo.co.uk (Vinay Sajip) Date: Thu, 6 Oct 2011 17:34:37 +0000 (UTC) Subject: [Python-Dev] Status of the built-in virtualenv functionality in 3.3 References: <4E8DB745.2090406@netwok.org> Message-ID: Éric Araujo <merwok at netwok.org> writes: > Oh, let's not forget naming. We can't reuse the module name virtualenv > as it would shadow the third-party module name, and I'm not fond of > 'virtualize': it brings OS-level virtualization to my mind, not isolated > Python environments. Another possible name would be "isolate": python -m isolate /project/env doesn't look too bad. There's no eponymous package on PyPI, and note also that in addition to the common usage of isolate as a verb, it's also a noun with an appropriate meaning in this context. Regards, Vinay Sajip From ncoghlan at gmail.com Thu Oct 6 19:55:47 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 6 Oct 2011 13:55:47 -0400 Subject: [Python-Dev] check for PyUnicode_READY look backwards In-Reply-To: References: Message-ID: On Thu, Oct 6, 2011 at 10:31 AM, Ronald Oussoren wrote: > On 6 Oct, 2011, at 14:57, Amaury Forgeot d'Arc wrote: >> I'd prefer it was written : >> if (PyUnicode_READY(*filename) < 0) >> because "< 0" clearly indicates an error condition. >> That's how all calls to PyType_Ready are written, for example. >> >> Am I the only one to be distracted by this idiom? > > I prefer the '< 0' variant as well, for the same reason as you. +1 here as well. The Unix/C "0 as success" idiom breaks my Python conditioned brain, so including the explicit "< 0" in the C code helps resolve that impedance mismatch. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com |
Brisbane, Australia From ncoghlan at gmail.com Thu Oct 6 20:12:26 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 6 Oct 2011 14:12:26 -0400 Subject: [Python-Dev] Status of the built-in virtualenv functionality in 3.3 In-Reply-To: <20111006125043.16d1c462@rivendell> References: <4E8DB745.2090406@netwok.org> <20111006113124.56c18f6a@resist.wooz.org> <4E8DCD53.6030306@netwok.org> <20111006120205.061a50ed@resist.wooz.org> <20111006180440.319c52c3@pitrou.net> <20111006125043.16d1c462@rivendell> Message-ID: On Thu, Oct 6, 2011 at 12:50 PM, Barry Warsaw wrote: > On Oct 06, 2011, at 06:04 PM, Antoine Pitrou wrote: > >>On Thu, 6 Oct 2011 12:02:05 -0400 >>Barry Warsaw wrote: >>> >>> >The first part is implemented in CPython; the second part needs a module >>> >name to replace virtualenv. python -m pythonv doesn't seem right. >>> >>> Nope, although `python -m virtualize` seems about perfect. >> >>`python -m sandbox` ? > > That's nice too. sandbox is a bit close to Victor's pysandbox for restricted execution environments. 'nest' would probably work, although I don't recall the 'egg' nomenclature featuring heavily in the current zipimport or packaging docs, so it may be a little obscure. 'pyenv' is another possible colour for the shed, although a quick Google search suggests that it may have a few name clash problems. 'appenv' would be yet another colour, since that focuses on the idea of 'environment per application'. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From stephen at xemacs.org Thu Oct 6 20:24:26 2011 From: stephen at xemacs.org (Stephen J.
Turnbull) Date: Fri, 07 Oct 2011 03:24:26 +0900 Subject: [Python-Dev] cpython: PyUnicode_FromKindAndData() raises a ValueError if the kind is unknown In-Reply-To: <4E8DACE4.8020005@netwok.org> References: <4E89E166.7020307@v.loewis.de> <4E8A2FB5.7020509@v.loewis.de> <87aa9h1myi.fsf@uwakimon.sk.tsukuba.ac.jp> <4E8DACE4.8020005@netwok.org> Message-ID: <8762k2rnad.fsf@uwakimon.sk.tsukuba.ac.jp> Éric Araujo writes: > Le 04/10/2011 04:59, Stephen J. Turnbull a écrit : > > I'm not familiar with the hg dev process (I use hg a lot, but so far > > it Just Works for me :), but I would imagine they will move in that > > direction as well. "That direction" being "ability to attach notes to existing commits". It is technically possible to design a VCS in which log messages are ahistorical, and therefore editable at will. I think this would socially be a disaster (though I don't have strong evidence for that view, the fact that the three most popular systems all implement log messages as part of history is very suggestive that some important consideration is involved). The alternative is attaching notes, which are automatically displayed by the VCS's history-viewing tools. > I doubt it; Mercurial has a very strong view of "history is sacred"; it > has taken many, many requests for an optional rebase feature to be > added, for example. While such cultists may prefer Mercurial to git because of the former's conservative position on features like rebase, the devs are apparently pragmatic about editing the DAG. On the one hand, it is now well-understood that rebase is very dangerous. Eg, *nobody* has blanket permission to push rebased branches to Linus's repos, and exceptions are rarely if ever granted. Many reasons for restricting rebase are technical, and AFAICS the Mercurial devs initially took the conservative position that it's more trouble than it's worth. They were forced to change their minds.
On the other hand, the "Mercurial queues" feature is nothing more nor less than history manipulation (in fact, its history manipulations are theoretically equivalent to those of rebase), but in the form of mq it has always been considered acceptable. The reason that mq is acceptable while rebase is not is simply that *references* to the parts of history maintained by mq are not propagated by Mercurial itself (except that now there is a limited ability to do so, but still very restricted and only on explicit request). A "notes" feature should therefore be acceptable, since it doesn't involve any dangerous changing of refs; the only issue is designing the UI. From carl at oddbird.net Thu Oct 6 20:33:34 2011 From: carl at oddbird.net (Carl Meyer) Date: Thu, 06 Oct 2011 12:33:34 -0600 Subject: [Python-Dev] Status of the built-in virtualenv functionality in 3.3 In-Reply-To: References: <4E8DB745.2090406@netwok.org> <20111006113124.56c18f6a@resist.wooz.org> <4E8DCD53.6030306@netwok.org> <20111006120205.061a50ed@resist.wooz.org> <20111006180440.319c52c3@pitrou.net> <20111006125043.16d1c462@rivendell> Message-ID: <4E8DF47E.9070109@oddbird.net> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 10/06/2011 12:12 PM, Nick Coghlan wrote: > sandbox is a bit close to Victor's pysandbox for restricted execution > environments. > > 'nest' would probably work, although I don't recall the 'egg' > nomenclature featuring heavily in the current zipimport or packaging > docs, so it may be a little obscure. > > 'pyenv' is another possible colour for the shed, although a quick > Google search suggests that may have few name clash problems. > > 'appenv' would be yet another colour, since that focuses on the idea > of 'environment per application'. I still think 'venv' is preferable to any of the other options proposed thus far. 
It makes the virtualenv "ancestry" clearer, doesn't repeat "py" (which seems entirely unnecessary in the name of a stdlib module, though it could be prepended to a script name if we do a script), and doesn't try to introduce new semantic baggage to the concept, which is already familiar to most Python devs under the name "virtualenv". Carl -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAk6N9H4ACgkQ8W4rlRKtE2doVACcChCim7CNS0czZisjEmw9NblS MqkAn1FyT+A/UiKodCh1siHrQXf2/yZQ =TAUV -----END PGP SIGNATURE----- From barry at python.org Thu Oct 6 21:06:04 2011 From: barry at python.org (Barry Warsaw) Date: Thu, 6 Oct 2011 15:06:04 -0400 Subject: [Python-Dev] Status of the built-in virtualenv functionality in 3.3 In-Reply-To: <4E8DF47E.9070109@oddbird.net> References: <4E8DB745.2090406@netwok.org> <20111006113124.56c18f6a@resist.wooz.org> <4E8DCD53.6030306@netwok.org> <20111006120205.061a50ed@resist.wooz.org> <20111006180440.319c52c3@pitrou.net> <20111006125043.16d1c462@rivendell> <4E8DF47E.9070109@oddbird.net> Message-ID: <20111006150604.5e64d912@resist.wooz.org> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA256 On Oct 06, 2011, at 12:33 PM, Carl Meyer wrote: >I still think 'venv' is preferable to any of the other options proposed >thus far. It's also nicely unique for googling. Funnily enough, the top hit right now for 'venv' is apparently Lua's project of the same name for the same purpose. 
- -Barry -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) iQIcBAEBCAAGBQJOjfwcAAoJEBJutWOnSwa/JRwQALLgtxOAmI9dhgaxudZgsDS7 ZeyaKFTDp2gnteMwXPFv2KK0n1SCsCIzOj807xel8id2Mr5R4FeoBC/HmySSsbxv 2U29QxXHAl/a3pZ50C5a3O7ZV4DShXuPQC/Y430PkwAkrP1ydb/I7KQEgmvHe2A1 UJQ1zbdX+gadk6n9sUo7NuEtvL7lMQdDeGLXvflqRJrVLeGxYV3AHwjByBE6MggU n5tZNv5G/H4fr/pD0SoMlI7MKoWRJkdqAeo3ASySVDn8LXe9UNjJb21YW18RyZPv qcdTzD4NYLJy5FQCmG9N2mOGwUjvrt82kLkcoKcNBekY5TfELshlqgqn31n+bxPo yV4Mr2IFOpRFTW218hwPbpEGK6Mfe03AV59Qey0zSKOsiJPUiPhXu9XdozvlF+Bx 7OdvhTe2nQBf8lp/KaKj4uZnnAvA9C8mMoukn6ly0Kk0EcDw81Ls9nMYxg/3gH7b 0SfvJH+8uQBjXF24Ce/xQkm6D7cB5e9GeilruH9VZ4LfZXYpP7jMdvWhW0g2S4PD KJWChqetOajJBZmWWK0vSBuLAjVxJTZ0Y5q0k9BNOiFj8/5v3QvpCAXQFFv7LTcX RDPI8rk73qjZiyAsVHMOmjSZfpY3aJnhPVMBSn0++yCzWx+YQbPwWkt6VihC6Ve4 Od0WX6XSEEu5BxJYaM/v =/f38 -----END PGP SIGNATURE----- From francisco.martin at web.de Thu Oct 6 22:15:03 2011 From: francisco.martin at web.de (Francisco Martin Brugue) Date: Thu, 06 Oct 2011 22:15:03 +0200 Subject: [Python-Dev] What it takes to change a single keyword. In-Reply-To: References: <4E87373D.2030503@v.loewis.de> <4E8CBE43.6050601@web.de> Message-ID: <4E8E0C47.6090308@web.de> On 10/06/2011 12:00 AM, Brett Cannon wrote: > Please file a bug about the dead links so we can fix/remove them. > it's done in http://bugs.python.org/issue13117 (and I've also tried with a patch). Cheers, francis From greg.ewing at canterbury.ac.nz Thu Oct 6 22:31:00 2011 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 07 Oct 2011 09:31:00 +1300 Subject: [Python-Dev] Status of the built-in virtualenv functionality in 3.3 In-Reply-To: References: <4E8DB745.2090406@netwok.org> <20111006113124.56c18f6a@resist.wooz.org> <4E8DCD53.6030306@netwok.org> <20111006120205.061a50ed@resist.wooz.org> <20111006180440.319c52c3@pitrou.net> <20111006125043.16d1c462@rivendell> Message-ID: <4E8E1004.7000508@canterbury.ac.nz> Lennart Regebro wrote: > +1 for env or sandbox or something else with "box" in it. Eggbox? 
Eggcrate? Incubator? -- Greg From ncoghlan at gmail.com Thu Oct 6 22:45:22 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 6 Oct 2011 16:45:22 -0400 Subject: [Python-Dev] Status of the built-in virtualenv functionality in 3.3 In-Reply-To: <20111006150604.5e64d912@resist.wooz.org> References: <4E8DB745.2090406@netwok.org> <20111006113124.56c18f6a@resist.wooz.org> <4E8DCD53.6030306@netwok.org> <20111006120205.061a50ed@resist.wooz.org> <20111006180440.319c52c3@pitrou.net> <20111006125043.16d1c462@rivendell> <4E8DF47E.9070109@oddbird.net> <20111006150604.5e64d912@resist.wooz.org> Message-ID: On Thu, Oct 6, 2011 at 3:06 PM, Barry Warsaw wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA256 > > On Oct 06, 2011, at 12:33 PM, Carl Meyer wrote: > >>I still think 'venv' is preferable to any of the other options proposed >>thus far. > > It's also nicely unique for googling. Funnily enough, the top hit right now > for 'venv' is apparently Lua's project of the same name for the same purpose. Yeah, I meant to say that 'venv' also sounded like a reasonable choice to me (for the reasons Carl listed). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From benjamin at python.org Thu Oct 6 22:47:22 2011 From: benjamin at python.org (Benjamin Peterson) Date: Thu, 6 Oct 2011 20:47:22 +0000 (UTC) Subject: [Python-Dev] =?utf-8?q?check_for_PyUnicode=5FREADY_look_backwards?= References: Message-ID: Amaury Forgeot d'Arc gmail.com> writes: > I'd prefer it was written : > if (PyUnicode_READY(*filename) < 0) > because "< 0" clearly indicates an error condition. Why not just have it return 0 on error? This would be more consistent with API functions that return "false" values like NULL and would just be if (!PyUnicode_READY(s)) return NULL;
Regards, Benjamin From cs at zip.com.au Thu Oct 6 23:27:01 2011 From: cs at zip.com.au (Cameron Simpson) Date: Fri, 7 Oct 2011 08:27:01 +1100 Subject: [Python-Dev] cpython (3.2): Issue #11956: Skip test_import.test_unwritable_directory on FreeBSD when run as In-Reply-To: References: Message-ID: <20111006212701.GA10627@cskk.homeip.net> On 06Oct2011 04:26, Glyph wrote: | On Oct 5, 2011, at 10:46 PM, Cameron Simpson wrote: | > Surely VERY FEW tests need to be run as root, and they need careful | > consideration. The whole thing (build, full test suite) should | > not run as root. | | This is news to me - is most of Python not supported to run as root? | I was under the impression that Python was supposed to run correctly as | root, and therefore there should be some buildbots dedicated to running | it that way. If only a few small parts of the API are supposed to work | perhaps this should be advertised more clearly in the documentation? Pretending the snark to be slightly serious: you've missed the point. The buildbots are building unreliable code, that being the point of the test suite. Doing unpredictable stuff as root is bad juju. Running the buildbots and their tests should not be run as root except for a very few special tests, and those few need careful consideration and sandboxing. | Ahem. Sorry for the snark, I couldn't resist. As Terry more reasonably put it: | | >> running buildbot tests as root does not reflect the experience of | >> non-root users. It seems some tests need to be run both ways just for | >> correctness testing. | | (except I'd say "all", not "some") No. Terry is right and you are ... not. Most tests need no special privileges - they're testing language/library semantics that do not depend on the system facilities much, and when they do they should work for unprivileged users.
Of course they _should_ work as root (barring the few tests like the issue cited, where things are expected to fail but don't because root is unconstrained by the permission system). HOWEVER, the whole suite should not be _tested_ as root because the code being tested is by definition untrusted. | > Am I really the only person who feels unease about this scenario? | | More seriously: apparently you are not, but I am quite surprised by | that revelation. You should be :). The idea of root as a special, | magical place where real ultimate power resides is quite silly. "root" | is a title, like "king". You're not just "root", you're root _of_ | something. If the thing that you are root of is a dedicated virtual | machine with no interesting data besides the code under test, then this | is quite a lot like being a regular user in a similarly boring place. | It's like having the keys to an empty safe. Sadly, _no_. Root _is_ special, within the host and with scope to misbehave beyond the host. 1: The permission system does _not_ behave the same for root as for other users. 2: Root _can_ corrupt things anywhere in the system (within the VM, of course, but the buildbot is a subset of it). A normal unprivileged user will not have write permission to things like: the OS image, the compilers, the system commands, other user data areas, all of which offer avenues to corrupt the build/test scenario. And if it is not a special purpose VM, it can corrupt things for other uses and users of the system. 3: Root can also do other fun things like modify the network interfaces, including changing/adding IP addresses and MAC addresses. Which means that unless the VM (_if_ it is a VM) is running on a totally unroutable special purpose virtual network, it is possible to use the VM to pretend to be other machines on the same net and so forth.
The prudent way to run the buildbots, especially if they cycle (refetch newer codebase, rebuilt, retest) instead of (scrub VM, reinstall, install built system, etc) is: - a user to fetch source and dispatch builds - possibly a distinct user to run the builds - definitely a distinct user to run the test suite And none of those be root. Cheers, -- Cameron Simpson DoD#743 http://www.cskk.ezoshosting.com/cs/ Sorry, but at DoD minimum speed it is impossible to speak. There is just too much wind noise. At that speed I am spending all my concentration allowance on riding, and cannot afford anymore thought for words. However, when I finish a ride and the bike is in the garage cooling down, the single word that comes to mind is: BEER. - Jack Tavares, tavares at balrog, DoD#0570 From ncoghlan at gmail.com Thu Oct 6 23:40:20 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 6 Oct 2011 17:40:20 -0400 Subject: [Python-Dev] check for PyUnicode_READY look backwards In-Reply-To: References: Message-ID: On Thu, Oct 6, 2011 at 4:47 PM, Benjamin Peterson wrote: > Amaury Forgeot d'Arc gmail.com> writes: > >> I'd prefer it was written : >> ? ? ? ?if (PyUnicode_READY(*filename) < 0) >> because "< 0" clearly indicates an error condition. > > Why not just have it return 0 on error? This would be more consistent with API > functions that return "false" values like NULL and would just be > > if (!PyUnicode_READY(s)) return NULL; > > in code. Alas, that isn't the convention in C - courtesy of Unix, the convention is that for integer return codes, "0" means success. Yes, this is annoying, but violating it means you're not writing idiomatic C any more, you're trying to write Python-in-C. Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? 
Brisbane, Australia From solipsis at pitrou.net Thu Oct 6 23:37:48 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 6 Oct 2011 23:37:48 +0200 Subject: [Python-Dev] cpython (3.2): Issue #11956: Skip test_import.test_unwritable_directory on FreeBSD when run as References: <20111006212701.GA10627@cskk.homeip.net> Message-ID: <20111006233748.3f265eb7@pitrou.net> On Fri, 7 Oct 2011 08:27:01 +1100 Cameron Simpson wrote: > > 2: Root _can_ corrupt things anywhere in the system (within the VM, of > course, but the builtbot is a subset of it). A normal unprivileged user > will not have write permission to thing like: > the OS image > the compilers > the system commands > other user data areas > all of which offer avenues to corrupt the built/test scenario. > And if it is not a special purpose VM, the corrupt things for other > uses and users of the system. Why do you think it is not a special purpose VM? Also, if you think there's a security problem, why don't you take it in private with the buildbot owner instead of making such a fuss on a public mailing-list? > The prudent way to run the buildbots, especially if they cycle > (refetch newer codebase, rebuilt, retest) instead of (scrub VM, > reinstall, install built system, etc) is: > > - a user to fetch source and dispatch builds > - possibly a distinct user to run the builds > - definitely a distinct user to run the test suite Your contribution is definitely welcome. Thanks Antoine. From solipsis at pitrou.net Thu Oct 6 23:40:36 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 6 Oct 2011 23:40:36 +0200 Subject: [Python-Dev] check for PyUnicode_READY look backwards References: Message-ID: <20111006234036.2e744473@pitrou.net> On Thu, 6 Oct 2011 17:40:20 -0400 Nick Coghlan wrote: > On Thu, Oct 6, 2011 at 4:47 PM, Benjamin Peterson wrote: > > Amaury Forgeot d'Arc gmail.com> writes: > > > >> I'd prefer it was written : > >> ? ? ? 
?if (PyUnicode_READY(*filename) < 0) > >> because "< 0" clearly indicates an error condition. > > > > Why not just have it return 0 on error? This would be more consistent with API > > functions that return "false" values like NULL and would just be > > > > if (!PyUnicode_READY(s)) return NULL; > > > > in code. > > Alas, that isn't the convention in C - courtesy of Unix, the > convention is that for integer return codes, "0" means success. C is quite inconsistent, and so is our own C API. if (PyUnicode_READY(s)) { ...} definitely looks like the code block will be executed if the unicode string is ready, though. Regards Antoine. From amauryfa at gmail.com Fri Oct 7 00:07:58 2011 From: amauryfa at gmail.com (Amaury Forgeot d'Arc) Date: Fri, 7 Oct 2011 00:07:58 +0200 Subject: [Python-Dev] check for PyUnicode_READY look backwards In-Reply-To: References: Message-ID: 2011/10/6 Benjamin Peterson : > Why not just have it return 0 on error? This would be more consistent with API > functions that return "false" values like NULL and would just be > > if (!PyUnicode_READY(s)) return NULL; Most functions of the Python C API seems to follow one of two ways to indicate an error: - functions that return PyObject* will return NULL - functions that return an int will return -1 -- Amaury Forgeot d'Arc From martin at v.loewis.de Fri Oct 7 00:20:00 2011 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Fri, 07 Oct 2011 00:20:00 +0200 Subject: [Python-Dev] check for PyUnicode_READY look backwards In-Reply-To: References: Message-ID: <4E8E2990.9060806@v.loewis.de> Am 06.10.11 14:57, schrieb Amaury Forgeot d'Arc: > Hi, > > with the new Unicode API, there are many checks like: > + if (PyUnicode_READY(*filename)) > + goto handle_error; I think you are misinterpreting what you are seeing. There are not *many* such checks. Of the PyUnicode_READY checks, 106 take the form if (PyUnicode_READY(foo) == -1) return NULL; 30 tests take the form that you mention. 
I believe all of those have been added by Victor, who just didn't follow the convention. So, Victor: please correct them. Regards, Martin From martin at v.loewis.de Fri Oct 7 00:21:59 2011 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Fri, 07 Oct 2011 00:21:59 +0200 Subject: [Python-Dev] Rename PyUnicode_KIND_SIZE ? In-Reply-To: <20111006155259.4832bd75@pitrou.net> References: <20111006155259.4832bd75@pitrou.net> Message-ID: <4E8E2A07.5060008@v.loewis.de> > (also, is it useful? PEP 393 has added a flurry of new macros to > unicodeobject.h and it's getting hard to know which ones are genuinely > useful, or well-performing) Please understand that not all of them have been added by PEP 393 genuinely. Some have only be added by individual committers, and we have to review and revoke those that are not useful. (Of course, some of those that did get added by the PEP may also not be useful). Regards, Martin From steve at pearwood.info Fri Oct 7 03:19:20 2011 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 07 Oct 2011 12:19:20 +1100 Subject: [Python-Dev] New stringbench benchmark results In-Reply-To: <201110061242.26526.victor.stinner@haypocalc.com> References: <201110060206.30819.victor.stinner@haypocalc.com> <201110061242.26526.victor.stinner@haypocalc.com> Message-ID: <4E8E5398.6020601@pearwood.info> Victor Stinner wrote: > Hum, copy-paste failure, I wrote numbers in the wrong order, it's: > > (test: Python 3.2 => Python 3.3) > "A".join(["Bob"]*100)): 0.92 => 2.11 > ("C"+"AB"*300).rfind("CA"): 0.57 => 1.03 > ("A" + ("Z"*128*1024)).replace("A", "BB", 1): 0.25 => 0.50 > > I improved str.replace(): it's now 5 times faster instead of 2 times slower > for this specific benchmark :-) (or 10 times faster in Python 3.3 before/after > my patch) Talking about str.replace, I was surprised to see this behaviour in 3.2: >>> s = 'spam' >>> t = s.replace('a', 'a') >>> s is t False Given that strings are immutable, would it not be an obvious 
optimization for replace to return the source string unchanged if the old and new substrings are equal, and avoid making a potentially expensive copy? I note that if count is zero, the source string is returned unchanged: >>> t = s.replace('a', 'b', 0) >>> t is s True -- Steven From andrew at bemusement.org Fri Oct 7 03:46:48 2011 From: andrew at bemusement.org (Andrew Bennetts) Date: Fri, 7 Oct 2011 12:46:48 +1100 Subject: [Python-Dev] cpython (3.2): Issue #11956: Skip test_import.test_unwritable_directory on FreeBSD when run as In-Reply-To: <20111006212701.GA10627@cskk.homeip.net> References: <20111006212701.GA10627@cskk.homeip.net> Message-ID: <20111007014648.GE25682@flay.puzzling.org> On Fri, Oct 07, 2011 at 08:27:01AM +1100, Cameron Simpson wrote: [?] > | >> running buildbot tests as root does not reflect the experience of > | >> non-root users. It seems some tests need to be run both ways just for > | >> correctness testing. > | > | (except I'd say "all", not "some") > > No. Terry is right and you are ... not. Most tests need no special > privileges - they're testing language/library semantics that do not > depend on the system facilities much, and when they do they should work > for unprivileged users. You could also say that most tests that work on Linux work on FreeBSD too, so when they work on Linux they should work for FreeBSD too? so why bother running tests on FreeBSD at all? The reason is because the assumptions behind that ?should? are wrong frequently enough to make it worth running tests in both environments. Like Glyph, I think that ?running as root? is sufficiently different environment to ?running as typical user? (in terms of how POSIX-like systems behave w.r.t. to things like permissions checks) to make it worthwhile to regularly run the whole test suite as root. > HOWEVER, the whole suite should not be _tested_ as root because the code > being testing is by definition untrusted. No, that just means you shouldn't trust *root*. 
Which is where a VM is a very useful tool. You can have the ?as root? environment for your tests without the need to have anything important trust it. [?] > Root _is_ special, within the host and with scope to misbehave beyond > the host. > > 1: The permission system does _not_ behave the same for root as for > other users. Those are arguments *for* running tests as root! > 2: Root _can_ corrupt things anywhere in the system (within the VM, of > course, but the builtbot is a subset of it). A normal unprivileged user This appears to be a key error in your logic. There's no fundamental reason why ?tests run as root inside a VM? must necessarily imply ?buildbot process is run inside that same VM and is therefore vulnerable to code in that test run.? It may be more convenient to deploy it that way, but I'm sure it's possible to have a buildslave configured to e.g. start a pristine VM (from a snapshot with all the necessary build dependencies installed) and via SSH copy the the source into it, build it, run it, and report the results. The VM could be fully isolated from the real network and filesystem etc if you like. Given that it is certainly possible to run tests as root about as securely as running them without root, do you still feel it is not worth running the tests as root? > The prudent way to run the buildbots, especially if they cycle (refetch > newer codebase, rebuilt, retest) instead of (scrub VM, reinstall, > install built system, etc) is: > > - a user to fetch source and dispatch builds > - possibly a distinct user to run the builds > - definitely a distinct user to run the test suite If we're talking prudence, then surely s/user/VM/ is even better :) -Andrew. 
From cs at zip.com.au Fri Oct 7 04:11:39 2011 From: cs at zip.com.au (Cameron Simpson) Date: Fri, 7 Oct 2011 13:11:39 +1100 Subject: [Python-Dev] cpython (3.2): Issue #11956: Skip test_import.test_unwritable_directory on FreeBSD when run as In-Reply-To: <20111007014648.GE25682@flay.puzzling.org> References: <20111007014648.GE25682@flay.puzzling.org> Message-ID: <20111007021139.GA19224@cskk.homeip.net> On 07Oct2011 12:46, Andrew Bennetts wrote: | On Fri, Oct 07, 2011 at 08:27:01AM +1100, Cameron Simpson wrote: | [?] | > | >> running buildbot tests as root does not reflect the experience of | > | >> non-root users. It seems some tests need to be run both ways just for | > | >> correctness testing. | > | | > | (except I'd say "all", not "some") | > | > No. Terry is right and you are ... not. Most tests need no special | > privileges - they're testing language/library semantics that do not | > depend on the system facilities much, and when they do they should work | > for unprivileged users. | | You could also say that most tests that work on Linux work on FreeBSD | too, so when they work on Linux they should work for FreeBSD too? so why | bother running tests on FreeBSD at all? The reason is because the | assumptions behind that ?should? are wrong frequently enough to make it | worth running tests in both environments. For thoroughness, yes. | Like Glyph, I think that ?running as root? is sufficiently different | environment to ?running as typical user? (in terms of how POSIX-like | systems behave w.r.t. to things like permissions checks) to make it | worthwhile to regularly run the whole test suite as root. Hmm. Glyph seemed to be arguing both ways - that everything should be tested as root, and also that root is not special. I have unease over the former and disagreement over the latter. | > HOWEVER, the whole suite should not be _tested_ as root because the code | > being testing is by definition untrusted. | | No, that just means you shouldn't trust *root*. 
Which is where a VM is | a very useful tool. You can have the "as root" environment for your | tests without the need to have anything important trust it. | | > Root _is_ special, within the host and with scope to misbehave beyond | > the host. | > 1: The permission system does _not_ behave the same for root as for | > other users. | | Those are arguments *for* running tests as root! I think they're arguments for running _specific_ tests as root. File I/O based tests primarily I would suppose, though my first instinct would be to constrain even these to permission related tests. | > 2: Root _can_ corrupt things anywhere in the system (within the VM, of | > course, but the buildbot is a subset of it). A normal unprivileged user | | This appears to be a key error in your logic. There's no fundamental | reason why "tests run as root inside a VM" must necessarily imply | "buildbot process is run inside that same VM and is therefore vulnerable | to code in that test run." Indeed, I had not considered that the tests might be run in a special VM distinct from the build environment. In that setup most of my concerns are moot. | It may be more convenient to deploy it that way, but I'm sure it's | possible to have a buildslave configured to e.g. start a pristine VM | (from a snapshot with all the necessary build dependencies installed) and | via SSH copy the source into it, build it, run it, and report the | results. The VM could be fully isolated from the real network and | filesystem etc if you like. | | Given that it is certainly possible to run tests as root about as | securely as running them without root, do you still feel it is not worth | running the tests as root? Not in the style you describe above. To clarify: I agree with your suggestion.
| > The prudent way to run the buildbots, especially if they cycle (refetch | > newer codebase, rebuilt, retest) instead of (scrub VM, reinstall, | > install built system, etc) is: | > | > - a user to fetch source and dispatch builds | > - possibly a distinct user to run the builds | > - definitely a distinct user to run the test suite | | If we're talking prudence, then surely s/user/VM/ is even better :) Yes. And a distinct VM instance for the root and non-root tests, too. Cheers, -- Cameron Simpson DoD#743 http://www.cskk.ezoshosting.com/cs/ all coders are created equal; that they are endowed with certain unalienable rights, of these are beer, net connectivity, and the pursuit of bugfixes... - Gregory R Block From steve at pearwood.info Fri Oct 7 04:42:26 2011 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 07 Oct 2011 13:42:26 +1100 Subject: [Python-Dev] cpython (3.2): Issue #11956: Skip test_import.test_unwritable_directory on FreeBSD when run as In-Reply-To: <20111006212701.GA10627@cskk.homeip.net> References: <20111006212701.GA10627@cskk.homeip.net> Message-ID: <4E8E6712.8050004@pearwood.info> Cameron Simpson wrote: > On 06Oct2011 04:26, Glyph wrote: > | On Oct 5, 2011, at 10:46 PM, Cameron Simpson wrote: > | > Surely VERY FEW tests need to be run as root, and they need careful > | > consideration. The whole thing (build, full test suite) should > | > not run as root. > | > | This is news to me - is most of Python not supported to run as root? > | I was under the impression that Python was supposed to run correctly as > | root, and therefore there should be some buildbots dedicated to running > | it that way. If only a few small parts of the API are supposed to work > | perhaps this should be advertised more clearly in the documentation? > > Pretending the snark to be slightly serious: you've missed the point. > The builtbots are building unreliable code, that being the point of the > test suite. Doing unpredictable stuff as root is bad juju. 
Sorry Cameron, it seems to me that you have missed the point, not Glyph. If the builtbots were predictable, there would be no point in running them because we would already know the answer. But since they *might* fail, they need to be run: that is an argument in favour of tests run explicitly as root, not against it. Since Python running as root is supported[1], then it must be tested running as root. What's the alternative? Wait for some user to report that Python nuked their production system when run as root? Better that we find out first hand. It would be embarrassing enough if (say) list.append crashed our test system. How much worse would if be if the first time we found out about it was after it did so to Google's production servers? To put it another way: Doing unpredictable stuff as root on a production machine is bad juju. Doing unpredictable stuff as root in order to find out what it will do *before* putting it into production is absolutely vital. > Running the builtbots and their tests should not be run as root except > for a very few special tests, and those few need careful consideration > and sandboxing. Are you suggested that they aren't currently sandboxed? > | Ahem. Sorry for the snark, I couldn't resist. As terry more reasonably put it: > | > | >> running buildbot tests as root does not reflect the experience of > | >> non-root users. It seems some tests need to be run both ways just for > | >> correctness testing. > | > | (except I'd say "all", not "some") > > No. Terry is right and you are ... not. Most tests need no special > privileges - they're testing language/library semantics that do not > depend on the system facilities much, and when they do they should work > for unprivileged users. > > Of course they _should_ work as root (barring the few tests like the > issue cited, where things are expected to fail but don't because root is > unconstrained by the permission system). 
> > HOWEVER, the whole suite should not be _tested_ as root because the code > being testing is by definition untrusted. So what you are saying is that the most critical situation, with the greatest consequences if there is a failure, should *not* be tested, but taken on trust that it will just work as expected. This makes no sense to me. I would say that testing as root is more important, not less, because the consequences of unexpected failure is so much worse. It seems to me that you are putting the security of the build-bot ahead of people's real-life production systems, that you expect them to run *untested code* in production as root. To me, this seems wrong. What exactly is your fear? That Python run as root will be able to escape the jail it is running in and do bad things to the host machine? That's a legitimate concern, but that's an argument to be taken up with the sys admins who set up the jail. It's not an argument against testing as root. It's an argument against testing as root *badly*. [1] If not, that will come as a mighty big surprise to Red Hat, among others, who use Python for system tools run as root. -- Steven From cs at zip.com.au Fri Oct 7 05:01:12 2011 From: cs at zip.com.au (Cameron Simpson) Date: Fri, 7 Oct 2011 14:01:12 +1100 Subject: [Python-Dev] cpython (3.2): Issue #11956: Skip test_import.test_unwritable_directory on FreeBSD when run as In-Reply-To: <4E8E6712.8050004@pearwood.info> References: <4E8E6712.8050004@pearwood.info> Message-ID: <20111007030112.GA25985@cskk.homeip.net> On 07Oct2011 13:42, Steven D'Aprano wrote: | Cameron Simpson wrote: | >On 06Oct2011 04:26, Glyph wrote: | >| On Oct 5, 2011, at 10:46 PM, Cameron Simpson wrote: | >| > Surely VERY FEW tests need to be run as root, and they need careful | >| > consideration. The whole thing (build, full test suite) should | >| > not run as root. | >| | This is news to me - is most of Python not supported to run as | >root? 
| >| I was under the impression that Python was supposed to run correctly as | >| root, and therefore there should be some buildbots dedicated to running | >| it that way. If only a few small parts of the API are supposed to work | >| perhaps this should be advertised more clearly in the documentation? | > | >Pretending the snark to be slightly serious: you've missed the point. | >The builtbots are building unreliable code, that being the point of the | >test suite. Doing unpredictable stuff as root is bad juju. | | Sorry Cameron, it seems to me that you have missed the point, not | Glyph. We're probably both aiming badly. See my reply to Andrew Bennetts; I'm less concerned if his described scenario is typical. [...snip...] | Doing unpredictable stuff as root on a production machine is bad | juju. Doing unpredictable stuff as root in order to find out what it | will do *before* putting it into production is absolutely vital. Yes yes yes. | >Running the builtbots and their tests should not be run as root except | >for a very few special tests, and those few need careful consideration | >and sandboxing. | | Are you suggested that they aren't currently sandboxed? No, but it was my instinctive fear. Please see my reply to Andrew Bennetts. I find nothing to disagree with in your reply. Cheers, -- Cameron Simpson DoD#743 http://www.cskk.ezoshosting.com/cs/ The word is not the thing. The map is not the territory. The symbol is not the thing symbolized. - S.I. Hayakawa From greg.ewing at canterbury.ac.nz Fri Oct 7 06:26:05 2011 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 07 Oct 2011 17:26:05 +1300 Subject: [Python-Dev] check for PyUnicode_READY look backwards In-Reply-To: References: Message-ID: <4E8E7F5D.4070605@canterbury.ac.nz> Benjamin Peterson wrote: > Why not just have it return 0 on error? 
This would be more consistent with API > functions that return "false" values like NULL But that would make it confusingly different from all the other functions that return ints. The NULL convention is only used when the function returns a pointer. -- Greg From greg.ewing at canterbury.ac.nz Fri Oct 7 06:53:26 2011 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 07 Oct 2011 17:53:26 +1300 Subject: [Python-Dev] New stringbench benchmark results In-Reply-To: <4E8E5398.6020601@pearwood.info> References: <201110060206.30819.victor.stinner@haypocalc.com> <201110061242.26526.victor.stinner@haypocalc.com> <4E8E5398.6020601@pearwood.info> Message-ID: <4E8E85C6.4090403@canterbury.ac.nz> Steven D'Aprano wrote: > Given that strings are immutable, would it not be an obvious > optimization for replace to return the source string unchanged if the > old and new substrings are equal, Only if this situation occurs frequently enough to outweigh the overhead of comparing the target and replacement strings. This check could be performed very cheaply when both strings are interned, so it might be worth doing in that case. -- Greg From glyph at twistedmatrix.com Fri Oct 7 07:48:07 2011 From: glyph at twistedmatrix.com (Glyph) Date: Fri, 7 Oct 2011 01:48:07 -0400 Subject: [Python-Dev] cpython (3.2): Issue #11956: Skip test_import.test_unwritable_directory on FreeBSD when run as In-Reply-To: <20111007021139.GA19224@cskk.homeip.net> References: <20111007014648.GE25682@flay.puzzling.org> <20111007021139.GA19224@cskk.homeip.net> Message-ID: <1E3F8BD0-73B0-48AE-8315-878BE7490FAA@twistedmatrix.com> On Oct 6, 2011, at 10:11 PM, Cameron Simpson wrote: > Hmm. Glyph seemed to be arguing both ways - that everything should be > tested as root, and also that root is not special. I have unease over the > former and disagreement over the latter. 
Your reply to Stephen suggests that we are actually in agreement, but just to be clear: I completely understand that root is special in that the environment allows for several behaviors which are not true for a normal user. Which is precisely why it must be tested by a (properly sandboxed) buildbot :). It's just not special in the sense that having root on a throwaway VM would allow you to do non-throwaway things. The one thing one must always be careful of, of course, is having your bandwidth chewed up for some nefarious purpose (spam, phishing) but that sort of thing should be caught with other monitoring tools. Plus, there are lots of other impediments to getting Python's buildbots to do something nasty. Only people with a commit bit should be able to actually push changes that buildbot will see. So avoiding root is more about avoiding mistakes than avoiding attacks. (After all, if this process isn't completely secure, then neither is the Python that's shipped in various OSes: in which case, game over _everywhere_.) Finally, and unfortunately, there are so many privilege escalation exploits in so many different daemons and applications that it's foolish to treat root as too terribly special: unless you're a real hardening expert and you spend a lot of effort keeping up to the second on security patches, the ability to execute completely arbitrary untrusted code as a unprivileged local user on your system can likely be converted with little effort into the ability to execute arbitrary untrusted code as root. Although, ironically, buildbots are often minimally configured and don't run any other services, so maybe these environments are one of the few places where it actually does make a difference :-). (Which is precisely why all daemons everywhere should be written in Python. Buffer overflows are dumb, it's 2011 already, come on. Use Twisted.) -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ezio.melotti at gmail.com Fri Oct 7 09:46:12 2011 From: ezio.melotti at gmail.com (Ezio Melotti) Date: Fri, 07 Oct 2011 10:46:12 +0300 Subject: [Python-Dev] [Python-checkins] r88904 - tracker/instances/python-dev/html/issue.item.js In-Reply-To: <3SFMGm1PxJzPLm@mail.python.org> References: <3SFMGm1PxJzPLm@mail.python.org> Message-ID: <4E8EAE44.8040908@gmail.com> Hi, On 07/10/2011 10.02, ezio.melotti wrote: > Author: ezio.melotti > Date: Fri Oct 7 09:02:07 2011 > New Revision: 88904 > > Log: > #422: add keyword shortcuts to navigate through the messages and to reply. > I added keyboard shortcut to navigate through the messages in the tracker (yes, keyboard, not keyword, that's a typo :). There are two groups of shortcuts available: * mnemonics: f: first message; p: previous message; n: next message; l: last message; r: reply (jumps on the comment field and focuses it); * vim-style: h: left (first message); k: up (previous message); j: down (next message); l: right (last message); i: insert-mode (jumps on the comment field and focuses it); esc: normal-mode (unfocus the field and re-enables the commands); The two groups don't conflict with each other, so all the keys always work. The shortcuts don't require key combinations like ctrl+f/alt+f -- 'f' is enough. The shortcuts are available only in the issue page, and not in the main page with the list of issues. The shortcuts use javascript, so they won't work if js is disabled. The issue is tracked here: http://psf.upfronthosting.co.za/roundup/meta/issue422 If you have any problem/feedback reply either there or here. A few notes about the change: * The 'end' key doesn't jump to the last message anymore. The normal browser behavior (i.e. go to the end of the page) is now restored. Use 'l' (last) to jump to the last message. * The patch conflicts with the browser 'find-as-you-type' if the first letter is a shortcut. If you are using the find-as-you-type, use ctrl+f instead. 
* f/l *always* jump to the first/last message, regardless of the position of the page. p/n use an index that does *not* change when you scroll, and do nothing on the first/last message respectively. If you are at the second-last message, scroll to the top, and hit 'n', you will still jump to the last message. If you are at the last message, scroll to the top, and hit 'n', the page won't scroll (you can use 'l' instead). * I added the shortcuts to the left sidebar but I plan to move them to the devguide eventually. * It might be useful to add a shortcut to submit. 's' would be a good candidate, but it might be hit accidentally (an "are you sure?" popup might solve this). ctrl+enter/ctrl+s might be better, but they might conflict with the browser commands. * While replying (i.e. while writing in the comment textarea), the shortcuts are disabled. You can hit ESC to unfocus the textarea and then use them. You can then press 'r' again to continue editing. Best Regards, Ezio Melotti From victor.stinner at haypocalc.com Fri Oct 7 09:56:51 2011 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Fri, 07 Oct 2011 09:56:51 +0200 Subject: [Python-Dev] check for PyUnicode_READY look backwards In-Reply-To: <4E8E2990.9060806@v.loewis.de> References: <4E8E2990.9060806@v.loewis.de> Message-ID: <4E8EB0C3.80100@haypocalc.com> Le 07/10/2011 00:20, "Martin v. Löwis" a écrit : > Am 06.10.11 14:57, schrieb Amaury Forgeot d'Arc: >> Hi, >> >> with the new Unicode API, there are many checks like: >> + if (PyUnicode_READY(*filename)) >> + goto handle_error; > > I think you are misinterpreting what you are seeing. > There are not *many* such checks. Of the PyUnicode_READY > checks, 106 take the form > > if (PyUnicode_READY(foo) == -1) > return NULL; > > 30 tests take the form that you mention. > > I believe all of those have been added by Victor, who > just didn't follow the convention. 
Yes, I wrote if (PyUnicode_READY(foo)), but I agree that it is confusing when you read the code, especially because we have also a PyUnicode_IS_READY(foo) macro! if (!PyUnicode_READY(foo)) is not better, also because of PyUnicode_IS_READY(foo). I prefer PyUnicode_IS_READY(foo) < 0 over PyUnicode_IS_READY(foo) == -1. Victor From stefan at bytereef.org Fri Oct 7 10:07:55 2011 From: stefan at bytereef.org (Stefan Krah) Date: Fri, 7 Oct 2011 10:07:55 +0200 Subject: [Python-Dev] check for PyUnicode_READY look backwards In-Reply-To: <4E8EB0C3.80100@haypocalc.com> References: <4E8E2990.9060806@v.loewis.de> <4E8EB0C3.80100@haypocalc.com> Message-ID: <20111007080755.GA1918@sleipnir.bytereef.org> Victor Stinner wrote: > Yes, I wrote if (PyUnicode_READY(foo)), but I agree that it is confusing > when you read the code, especially because we have also a > PyUnicode_IS_READY(foo) macro! > > if (!PyUnicode_READY(foo)) is not better, also because of > PyUnicode_IS_READY(foo). > > I prefer PyUnicode_IS_READY(foo) < 0 over PyUnicode_IS_READY(foo) == -1. Do you mean PyUnicode_READY(foo) < 0? I also prefer that idiom. Stefan Krah From victor.stinner at haypocalc.com Fri Oct 7 10:49:35 2011 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Fri, 07 Oct 2011 10:49:35 +0200 Subject: [Python-Dev] check for PyUnicode_READY look backwards In-Reply-To: <20111007080755.GA1918@sleipnir.bytereef.org> References: <4E8E2990.9060806@v.loewis.de> <4E8EB0C3.80100@haypocalc.com> <20111007080755.GA1918@sleipnir.bytereef.org> Message-ID: <4E8EBD1F.3060100@haypocalc.com> Le 07/10/2011 10:07, Stefan Krah a écrit : > Victor Stinner wrote: >> Yes, I wrote if (PyUnicode_READY(foo)), but I agree that it is confusing >> when you read the code, especially because we have also a >> PyUnicode_IS_READY(foo) macro! >> >> if (!PyUnicode_READY(foo)) is not better, also because of >> PyUnicode_IS_READY(foo). >> >> I prefer PyUnicode_IS_READY(foo) < 0 over PyUnicode_IS_READY(foo) == -1. 
> > Do you mean PyUnicode_READY(foo) < 0? I also prefer that idiom. Oops, yes I mean PyUnicode_READY(foo) < 0. Victor From L.J.Buitinck at uva.nl Fri Oct 7 10:53:47 2011 From: L.J.Buitinck at uva.nl (Lars Buitinck) Date: Fri, 7 Oct 2011 10:53:47 +0200 Subject: [Python-Dev] counterintuitive behavior (bug?) in Counter with += In-Reply-To: <20111006154637.GF23957@p16> References: <20111006154637.GF23957@p16> Message-ID: 2011/10/6 Petri Lehtinen : > Lars Buitinck wrote: >> >>> from collections import Counter >> >>> a = Counter([1,2,3]) >> >>> b = a >> >>> a += Counter([3,4,5]) >> >>> a is b >> False > > Sounds like a good idea to me. You should open an issue in the tracker > at http://bugs.python.org/. Done that: http://bugs.python.org/issue13121 -- Lars Buitinck Scientific programmer, ILPS University of Amsterdam From stephen at xemacs.org Fri Oct 7 11:10:58 2011 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Fri, 07 Oct 2011 18:10:58 +0900 Subject: [Python-Dev] cpython (3.2): Issue #11956: Skip test_import.test_unwritable_directory on FreeBSD when run as In-Reply-To: <20111007014648.GE25682@flay.puzzling.org> References: <20111006212701.GA10627@cskk.homeip.net> <20111007014648.GE25682@flay.puzzling.org> Message-ID: <87wrchqi8t.fsf@uwakimon.sk.tsukuba.ac.jp> Andrew Bennetts writes: > No, that just means you shouldn't trust *root*. Which is where a > VM is a very useful tool. You can have the "as root" environment > for your tests without the need to have anything important trust it. Cameron acknowledges that he missed that. So maybe he was right for the wrong reason; he's still right. But in the current context, it is not an argument for not worrying, because there is no evidence at all that the OP set up his buildbot in a secure sandbox. As I read his followups, he simply "didn't bother" to set up an unprivileged user and run the 'bot as that user. 
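The guard at the root of this thread -- skipping a permissions test when the suite runs with root privileges -- can be sketched with unittest's skip machinery. This is an illustrative sketch only, not the actual CPython test code: the predicate name and test class below are hypothetical stand-ins for test_import.test_unwritable_directory.

```python
import os
import unittest

def running_as_root():
    """Return True when the effective UID is 0 (POSIX only)."""
    # os.geteuid only exists on POSIX; assume non-root elsewhere.
    return hasattr(os, "geteuid") and os.geteuid() == 0

class UnwritableDirectoryTest(unittest.TestCase):
    # Hypothetical stand-in for test_import.test_unwritable_directory.
    @unittest.skipIf(running_as_root(),
                     "root bypasses directory permission checks")
    def test_unwritable_directory(self):
        # As root, writing into a mode-0o555 directory succeeds anyway,
        # so the failure this test expects never happens.
        self.assertFalse(running_as_root())
```

A predicate like this is what lets a buildbot run the full suite both ways: as an ordinary user the test runs normally, while as root it is reported as skipped instead of as a spurious failure.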
> > 2: Root _can_ corrupt things anywhere in the system (within the > > VM, of course, but the buildbot is a subset of it). A normal > > unprivileged user > > This appears to be a key error in your logic. There's no > fundamental reason why "tests run as root inside a VM" must > necessarily imply "buildbot process is run inside that same VM and > is therefore vulnerable to code in that test run." Cameron's logic is correct. When security is in question, one must assume that *everything* is vulnerable until proven otherwise. "Is secure" requires a universal quantifier; "is insecure" only an existential one. The principle here is "ran as root" without further explanation is a litmus test for "not bothering about security", even today. It's worth asking for explanation, or at least a comment that "all the buildbot contributors I've talked to have put a lot of effort into security configuration". > It may be more convenient to deploy it that way, You bet, and hundreds of thousands of viruses exploiting IE thank Microsoft for its devotion to convenience. While much can be done to make secure configuration more convenient than it is, nevertheless the state of the art is that convenience is the enemy of security. > but I'm sure it's possible to have a buildslave configured to > e.g. start a pristine VM (from a snapshot with all the necessary > build dependencies installed) and via SSH copy the source into > it, build it, run it, and report the results. The VM could be > fully isolated from the real network and filesystem etc if you > like. Sure it's possible. In security, the question is never "can it be done?"; it's "was it done?" We have *no* evidence justifying an *assumption* that it was done in the current case, or for other buildbots, for that matter. In fact, as I read the OP's followups, that was *not* the case; the assumption is falsified. 
Nevertheless, several people who I would have thought would know better are *all* arguing from the assumption that the OP configured his test system with security (rather than convenience) in mind, and are castigating Cameron for *not* making that same assumption. To my mind, every post is increasing justification for his unease. :-( And that's why this thread belongs on this list, rather than on Bruce Schneier's blog. It's very easy these days to set up a basic personal VM, and folk of goodwill will do so to help the project with buildbots to provide platform coverage in testing new code. But this contribution involves certain risks (however low probability, some Very Bad Things *could* happen). Contributors should get help in evaluating the potential threats and corresponding risks, and in proper configuration. Not assurances that nothing will go wrong "because you probably run the 'bot in a VM." From glyph at twistedmatrix.com Fri Oct 7 12:18:55 2011 From: glyph at twistedmatrix.com (Glyph) Date: Fri, 7 Oct 2011 06:18:55 -0400 Subject: [Python-Dev] cpython (3.2): Issue #11956: Skip test_import.test_unwritable_directory on FreeBSD when run as In-Reply-To: <87wrchqi8t.fsf@uwakimon.sk.tsukuba.ac.jp> References: <20111006212701.GA10627@cskk.homeip.net> <20111007014648.GE25682@flay.puzzling.org> <87wrchqi8t.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <477C9A0C-5058-422D-A0D2-48060A7C947D@twistedmatrix.com> On Oct 7, 2011, at 5:10 AM, Stephen J. Turnbull wrote: > The principle here is "ran as root" without further explanation is a > litmus test for "not bothering about security", even today. It's > worth asking for explanation, or at least a comment that "all the > buildbot contributors I've talked to have put a lot of effort into > security configuration". This is a valid point. I think that Cameron and I may have had significantly different assumptions about the environment being discussed here. 
I may have brought some assumptions about the build farm here that don't actually apply to the way Python does it. To sum up what I believe is now the consensus from this thread: Anyone setting up a buildslave should take care to invoke the build in an environment where an out-of-control buildbot, potentially executing arbitrarily horrible and/or malicious code, should not damage anything. Builders should always be isolated from valuable resources, although the specific mechanism of isolation may differ. A virtual machine is a good default, but may not be sufficient; other tools for cutting off the builder from the outside world would be chroot jails, solaris zones, etc. Code runs differently as privileged vs. unprivileged users. Therefore builders should be set up in both configurations, running the full test suite, to ensure that all code runs as expected in both configurations. Some tests, as the start of this thread indicates, must have some special logic to make sure they do or do not run, or run differently, in privileged vs. unprivileged configurations, but generally speaking most things should work in both places. Access to root may provide access to slightly surprising resources, even within a VM (such as the ability to send spoofed IP packets, change the MAC address of even virtual ethernet cards, etc), and administrators should be aware that this is the case when configuring the host environment for a run-as-root builder. You don't want to end up with a compromised test VM that can snoop on your network. Have I left anything out? :-) -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From cs at zip.com.au Fri Oct 7 12:40:18 2011 From: cs at zip.com.au (Cameron Simpson) Date: Fri, 7 Oct 2011 21:40:18 +1100 Subject: [Python-Dev] cpython (3.2): Issue #11956: Skip test_import.test_unwritable_directory on FreeBSD when run as In-Reply-To: <477C9A0C-5058-422D-A0D2-48060A7C947D@twistedmatrix.com> References: <477C9A0C-5058-422D-A0D2-48060A7C947D@twistedmatrix.com> Message-ID: <20111007104018.GA18106@cskk.homeip.net> On 07Oct2011 06:18, Glyph wrote: | On Oct 7, 2011, at 5:10 AM, Stephen J. Turnbull wrote: | | > The principle here is "ran as root" without further explanation is a | > litmus test for "not bothering about security", even today. It's | > worth asking for explanation, or at least a comment that "all the | > buildbot contributors I've talked to have put a lot of effort into | > security configuration". | | This is a valid point. I think that Cameron and I may have | had significantly different assumptions about the environment being | discussed here. I may have brought some assumptions about the build | farm here that don't actually apply to the way Python does it. Likewise. I state now that I have no actual knowledge of the practices in the build farm(s). | To sum up what I believe is now the consensus from this thread: | | Anyone setting up a buildslave should take care to invoke the build in | an environment where an out-of-control buildbot, potentially executing | arbitrarily horrible and/or malicious code, should not damage anything. | Builders should always be isolated from valuable resources, although | the specific mechanism of isolation may differ. A virtual machine is a | good default, but may not be sufficient; other tools for cutting off the | builder from the outside world would be chroot jails, solaris zones, etc. | | Code runs differently as privileged vs. unprivileged users. 
Therefore | builders should be set up in both configurations, running the full test | suite, to ensure that all code runs as expected in both configurations. | Some tests, as the start of this thread indicates, must have some | special logic to make sure they do or do not run, or run differently, | in privileged vs. unprivileged configurations, but generally speaking | most things should work in both places. | | Access to root may provide access to slightly surprising resources, | even within a VM (such as the ability to send spoofed IP packets, | change the MAC address of even virtual ethernet cards, etc), and | administrators should be aware that this is the case when configuring | the host environment for a run-as-root builder. You don't want to end | up with a compromised test VM that can snoop on your network. | | Have I left anything out? :-) I think that the build and the tests should be different security scopes/zones/levels: different users or different VMs. Andrew's suggestion of a VM-for-tests sounds especially good. And that I think the as-root test suite shouldn't run unless the not-root test suite passes. Cheers, -- Cameron Simpson DoD#743 http://www.cskk.ezoshosting.com/cs/ It is not true that life is one damn thing after another -- it's one damn thing over and over. - Edna St. Vincent Millay From glyph at twistedmatrix.com Fri Oct 7 12:50:14 2011 From: glyph at twistedmatrix.com (Glyph) Date: Fri, 7 Oct 2011 06:50:14 -0400 Subject: [Python-Dev] cpython (3.2): Issue #11956: Skip test_import.test_unwritable_directory on FreeBSD when run as In-Reply-To: <20111007104018.GA18106@cskk.homeip.net> References: <477C9A0C-5058-422D-A0D2-48060A7C947D@twistedmatrix.com> <20111007104018.GA18106@cskk.homeip.net> Message-ID: <1F9D0431-B24C-410E-9282-0E498EF024BF@twistedmatrix.com> On Oct 7, 2011, at 6:40 AM, Cameron Simpson wrote: > I think that the build and the tests should be different security > scopes/zones/levels: different users or different VMs. 
Andrew's > suggestion of a VM-for-tests sounds especially good. To me, "build" and "test" are largely the same function, since a build whose tests haven't been run is just a bag of bits :). But in the sense that root should never be required to do a build, I don't see a reason to bother supporting that configuration: it makes sense to always do the build as a regular user. > And that I think the as-root tests suite shouldn't run unless the > not-root test suite passes. Why's that? The as-root VM needs to be equally secure either way, and it's a useful data point to see that the as-root tests *didn't* break, if they didn't; this way a developer can tell at a glance that the failure is either a test that needs to be marked as 'root only' or a change that causes permissions to be required that it shouldn't have. (In general I object to suggestions of the form "don't run the tests unless X", unless X is a totally necessary pre-requisite like "the compile finished".) From cs at zip.com.au Fri Oct 7 13:10:36 2011 From: cs at zip.com.au (Cameron Simpson) Date: Fri, 7 Oct 2011 22:10:36 +1100 Subject: [Python-Dev] cpython (3.2): Issue #11956: Skip test_import.test_unwritable_directory on FreeBSD when run as In-Reply-To: <1F9D0431-B24C-410E-9282-0E498EF024BF@twistedmatrix.com> References: <1F9D0431-B24C-410E-9282-0E498EF024BF@twistedmatrix.com> Message-ID: <20111007111036.GA22483@cskk.homeip.net> On 07Oct2011 06:50, Glyph wrote: | On Oct 7, 2011, at 6:40 AM, Cameron Simpson wrote: | > I think that the build and the tests should be different security | > scopes/zones/levels: different users or different VMs. Andrew's | > suggestion of a VM-for-tests sounds especially good. | | To me, "build" and "test" are largely the same function, since a build | whose tests haven't been run is just a bag of bits :). 
But in the sense | that root should never be required to do a build, I don't see a reason | to bother supporting that configuration: it makes sense to always do | the build as a regular user. I don't mean build as root and test as regular user, I mean build as regular user and test as different user. This can be used to prevent the test user from having write permission to the built code. My thinking is that the "build" is a well defined set of "safe" operations: copy the source (safe, just a data copy), compile the source (safe, presuming bug-free compiler). Of course I'm glossing over any autoconfiguration shell scripts and makefiles full of source-code-supplied shell commands - nasty nasty. Basically I was taking the view that a "build" should be a safe "source code to machine code" translation process. By contrast, the tests _run_ the somewhat-unknown test suite. Not an inherently "safe" procedure. Think of the build being like a PDF viewer rendering a document to the display. And the tests as being the user reading a list of instructions off that display and doing stuff. It ought to be safe to render the PDF; the user's actions are "unsafe". | > And that I think the as-root tests suite shouldn't run unless the | > not-root test suite passes. | | Why's that? The as-root VM needs to be equally secure either way, | and it's a useful data point to see that the as-root tests *didn't* | break, if they didn't; this way a developer can tell at a glance that | the failure is either a test that needs to be marked as 'root only' | or a change that causes permissions to be required that it shouldn't have. | | (In general I object to suggestions of the form "don't run the tests | unless X", unless X is a totally necessary pre-requisite like "the | compile finished".) Suppose a test is dangerously broken through ineptitude or even malice. Extreme example: a test makes a bunch of test files and cleans up with "rm -r /". 
(Non-malicious scenario: "rm -r ${testdatatree}/", with $testdatatree accidentally undefined, e.g. through a typo.) Such a test will fail when unprivileged. (Of course a malicious test might say "do not set off the bomb unless I am root":-) The point here is security, not test coverage: if a procedure is known to be broken as a regular user, is it not highly unsafe to then run it as root? Cheers, -- Cameron Simpson DoD#743 http://www.cskk.ezoshosting.com/cs/ 1st Law Economists: For every economist there exists an equal and opposite economist. 2nd Law Economists: They're both always wrong! From vinay_sajip at yahoo.co.uk Fri Oct 7 13:10:53 2011 From: vinay_sajip at yahoo.co.uk (Vinay Sajip) Date: Fri, 7 Oct 2011 11:10:53 +0000 (UTC) Subject: [Python-Dev] socket module build failure Message-ID: I work on Ubuntu Jaunty for my cpython development work - an old version, I know, but still quite serviceable and has worked well for me over many months. With the latest default cpython repository, however, I can't run the regression suite because the socket module now fails to build: gcc -pthread -fPIC -g -O0 -Wall -Wstrict-prototypes -IInclude -I. -I./Include -I/usr/local/include -I/home/vinay/projects/python/default -c /home/vinay/projects/python/default/Modules/socketmodule.c -o build/temp.linux-i686-3.3-pydebug/home/vinay/projects/python/default /Modules/socketmodule.o .../Modules/socketmodule.c: In function 'makesockaddr': .../Modules/socketmodule.c:1224: error: 'AF_CAN' undeclared (first use in this function) .../Modules/socketmodule.c:1224: error: (Each undeclared identifier is reported only once .../Modules/socketmodule.c:1224: error: for each function it appears in.) .../Modules/socketmodule.c: In function 'getsockaddrarg': .../Modules/socketmodule.c:1610: error: 'AF_CAN' undeclared (first use in this function) .../Modules/socketmodule.c: In function 'getsockaddrlen': .../Modules/socketmodule.c:1750: error: 'AF_CAN' 
undeclared (first use in this function) On this system, AF_CAN *is* defined, but in linux/socket.h, not in sys/socket.h. From what I can see, sys/socket.h includes bits/socket.h which includes asm/socket.h, but apparently linux/socket.h isn't included. Is this a bug which doesn't show up on more recent Linux versions, or is Jaunty no longer supported for Python development, or could something be wrong with my configuration? BTW nothing has changed on the machine other than updates to Jenkins and the cpython repo. Any advice would be appreciated! Regards, Vinay Sajip From glyph at twistedmatrix.com Fri Oct 7 13:19:38 2011 From: glyph at twistedmatrix.com (Glyph) Date: Fri, 7 Oct 2011 07:19:38 -0400 Subject: [Python-Dev] cpython (3.2): Issue #11956: Skip test_import.test_unwritable_directory on FreeBSD when run as In-Reply-To: <20111007111036.GA22483@cskk.homeip.net> References: <1F9D0431-B24C-410E-9282-0E498EF024BF@twistedmatrix.com> <20111007111036.GA22483@cskk.homeip.net> Message-ID: <9BAA539B-8128-4D08-8D84-9AE065D4176E@twistedmatrix.com> On Oct 7, 2011, at 7:10 AM, Cameron Simpson wrote: > The point here is security, not test coverage: if a procedure is known > to be broken as a regular user, is it not highly unsafe to then run it > as root? No. As I mentioned previously, any environment where the tests are run should be isolated from any resources that are even safety-relevant, let alone safety-critical, whether they're running as a regular user _or_ root. In theory, one might automatically restore the run-as-root buildslave VM from a snapshot before every single test run. In practice this is probably too elaborate to bother with and an admin can just hit the 'restore' button in the fairly unlikely case that something does happen to break the buildslave. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From rosslagerwall at gmail.com Fri Oct 7 13:31:46 2011 From: rosslagerwall at gmail.com (Ross Lagerwall) Date: Fri, 07 Oct 2011 13:31:46 +0200 Subject: [Python-Dev] socket module build failure In-Reply-To: References: Message-ID: <1317987106.1989.1.camel@hobo> > Is this a bug which doesn't show up on more recent Linux versions Probably. AF_CAN was introduced in e767318baccd. Cheers Ross From neologix at free.fr Fri Oct 7 13:36:50 2011 From: neologix at free.fr (=?ISO-8859-1?Q?Charles=2DFran=E7ois_Natali?=) Date: Fri, 7 Oct 2011 13:36:50 +0200 Subject: [Python-Dev] socket module build failure In-Reply-To: References: Message-ID: Hello, 2011/10/7 Vinay Sajip : > I work on Ubuntu Jaunty for my cpython development work - an old version, I > know, but still quite serviceable and has worked well for me over many months. > With the latest default cpython repository, however, I can't run the regression > suite because the socket module now fails to build: > It's due to the recent inclusion of PF_CAN support: http://hg.python.org/cpython/rev/e767318baccd It looks like your header files are different from what's found in other distributions. Please reopen issue #10141, we'll try to go from there. Cheers, cf From raymond.hettinger at gmail.com Fri Oct 7 13:39:32 2011 From: raymond.hettinger at gmail.com (Raymond Hettinger) Date: Fri, 7 Oct 2011 07:39:32 -0400 Subject: [Python-Dev] counterintuitive behavior (bug?) in Counter with += In-Reply-To: References: Message-ID: <87720B70-D7E6-460A-B269-A9D5254B47A5@gmail.com> On Oct 3, 2011, at 6:12 AM, Lars Buitinck wrote: > After some digging, I found out that Counter [2] does not > have __iadd__ and += copies the entire left-hand side in __add__! This seems like a reasonable change for Py3.3. > I also figured out that I should use the update method instead, which > I will, but I still find that uglier than +=. 
I would submit a patch > to implement __iadd__, but I first want to know if that's considered > the right behavior, since it changes the semantics of +=: Yes, update() is the fastest way. Raymond From victor.stinner at haypocalc.com Fri Oct 7 13:44:13 2011 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Fri, 07 Oct 2011 13:44:13 +0200 Subject: [Python-Dev] New stringbench benchmark results In-Reply-To: <4E8E5398.6020601@pearwood.info> References: <201110060206.30819.victor.stinner@haypocalc.com> <201110061242.26526.victor.stinner@haypocalc.com> <4E8E5398.6020601@pearwood.info> Message-ID: <4E8EE60D.1010207@haypocalc.com> Le 07/10/2011 03:19, Steven D'Aprano a écrit : > Given that strings are immutable, would it not be an obvious > optimization for replace to return the source string unchanged if the > old and new substrings are equal, and avoid making a potentially > expensive copy? I just implemented this optimization in 9c1b76936b79, but only if old and new substrings are the same object (old is new). *Comparing* substrings (content) would slow down .replace() in most cases. Victor From martin at v.loewis.de Fri Oct 7 15:21:36 2011 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Fri, 07 Oct 2011 15:21:36 +0200 Subject: [Python-Dev] check for PyUnicode_READY look backwards In-Reply-To: <4E8EB0C3.80100@haypocalc.com> References: <4E8E2990.9060806@v.loewis.de> <4E8EB0C3.80100@haypocalc.com> Message-ID: <4E8EFCE0.7060005@v.loewis.de> > if (!PyUnicode_READY(foo)) is not better, also because of > PyUnicode_IS_READY(foo). > > I prefer PyUnicode_IS_READY(foo) < 0 over PyUnicode_IS_READY(foo) == -1. > Ok, so feel free to replace all == -1 tests with < 0 tests as well. I'll point out that the test for -1 is also widespread in Python, e.g. 
when checking return values from PyObject_SetAttrString, BaseException_init, PyThread_create_key, PyObject_DelAttrString, etc. Regards, Martin From techtonik at gmail.com Fri Oct 7 15:57:31 2011 From: techtonik at gmail.com (anatoly techtonik) Date: Fri, 7 Oct 2011 16:57:31 +0300 Subject: [Python-Dev] SimpleHTTPServer slashdot (Was: Python Core Tools) Message-ID: On Sun, Oct 2, 2011 at 3:17 PM, Maciej Fijalkowski wrote: > On Sun, Oct 2, 2011 at 8:05 AM, Maciej Fijalkowski wrote: >> On Sun, Oct 2, 2011 at 5:02 AM, anatoly techtonik wrote: >>> Hello, >>> >>> I've stumbled upon Dave Beazley's article [1] about trying ancient GIL >>> removal patch at >>> http://dabeaz.blogspot.com/2011/08/inside-look-at-gil-removal-patch-of.html >>> and looking at the output of Python dis module thought that it would >>> be cool if there were tools to inspect, explain and play with Python >>> bytecode. Little visual assembler, that shows bytecode and disassembly >>> side by side and annotates the listing with useful hints (like >>> interpreter code optimization decisions). That will greatly help many >>> new people understand how Python works and explain complicated stuff >>> like GIL and stackless by copy/pasting pictures from there. PyPy has a >>> tool named 'jitviewer' [2] that may be what I am looking for, but the >>> demo is offline. >> >> I put demo back online. >> > > It's just that SimpleHTTPServer doesn't quite survive slashdot effect. > Where do I fill a bug report :) http://bugs.python.org Is the demo address still http://wyvern.cs.uni-duesseldorf.de:5000/ ? Still can't connect. =| -- anatoly t. From brian.curtin at gmail.com Fri Oct 7 16:04:08 2011 From: brian.curtin at gmail.com (Brian Curtin) Date: Fri, 7 Oct 2011 09:04:08 -0500 Subject: [Python-Dev] SimpleHTTPServer slashdot (Was: Python Core Tools) In-Reply-To: References: Message-ID: On Fri, Oct 7, 2011 at 08:57, anatoly techtonik wrote: >> It's just that SimpleHTTPServer doesn't quite survive slashdot effect. 
>> Where do I fill a bug report :) > http://bugs.python.org http://www.theonion.com/ From ncoghlan at gmail.com Fri Oct 7 16:06:16 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 7 Oct 2011 10:06:16 -0400 Subject: [Python-Dev] check for PyUnicode_READY look backwards In-Reply-To: <4E8EFCE0.7060005@v.loewis.de> References: <4E8E2990.9060806@v.loewis.de> <4E8EB0C3.80100@haypocalc.com> <4E8EFCE0.7060005@v.loewis.de> Message-ID: On Fri, Oct 7, 2011 at 9:21 AM, "Martin v. Löwis" wrote: > > if (!PyUnicode_READY(foo)) is not better, also because of >> >> PyUnicode_IS_READY(foo). >> >> I prefer PyUnicode_IS_READY(foo) < 0 over PyUnicode_IS_READY(foo) == -1. >> > > Ok, so feel free to replace all == -1 tests with < 0 tests as well. > > I'll point out that the test for -1 is also widespread in Python, > e.g. when checking return values from PyObject_SetAttrString, > BaseException_init, PyThread_create_key, PyObject_DelAttrString, etc. FWIW, I don't mind whether it's "< 0" or "== -1", so long as there's a comparison there to kick my brain out of Python boolean logic mode. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From solipsis at pitrou.net Fri Oct 7 16:07:26 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 7 Oct 2011 16:07:26 +0200 Subject: [Python-Dev] SimpleHTTPServer slashdot (Was: Python Core Tools) References: Message-ID: <20111007160726.1aba072e@pitrou.net> On Fri, 7 Oct 2011 09:04:08 -0500 Brian Curtin wrote: > On Fri, Oct 7, 2011 at 08:57, anatoly techtonik wrote: > >> It's just that SimpleHTTPServer doesn't quite survive slashdot effect. > >> Where do I fill a bug report :) > > > > http://bugs.python.org > > http://www.theonion.com/ Does theonion.com really run a SimpleHTTPServer, or am I missing something? Regards Antoine. From stephen at xemacs.org Fri Oct 7 16:49:55 2011 From: stephen at xemacs.org (Stephen J. 
Turnbull) Date: Fri, 07 Oct 2011 23:49:55 +0900 Subject: [Python-Dev] cpython (3.2): Issue #11956: Skip test_import.test_unwritable_directory on FreeBSD when run as In-Reply-To: <477C9A0C-5058-422D-A0D2-48060A7C947D@twistedmatrix.com> References: <20111006212701.GA10627@cskk.homeip.net> <20111007014648.GE25682@flay.puzzling.org> <87wrchqi8t.fsf@uwakimon.sk.tsukuba.ac.jp> <477C9A0C-5058-422D-A0D2-48060A7C947D@twistedmatrix.com> Message-ID: <87vcs0rh4c.fsf@uwakimon.sk.tsukuba.ac.jp> Glyph writes: > Have I left anything out? :-) Probably. That's the nature of the problem. But you caught enough that if all our buildbots are set up that way, the Bad Guys' scripts will probably conclude there's nothing to see here, and move along. From victor.stinner at haypocalc.com Fri Oct 7 17:36:05 2011 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Fri, 07 Oct 2011 17:36:05 +0200 Subject: [Python-Dev] New stringbench benchmark results In-Reply-To: <201110061242.26526.victor.stinner@haypocalc.com> References: <201110060206.30819.victor.stinner@haypocalc.com> <201110061242.26526.victor.stinner@haypocalc.com> Message-ID: <4E8F1C65.9070607@haypocalc.com> Le 06/10/2011 12:42, Victor Stinner a écrit : > "A".join(["Bob"]*100)): 0.92 => 2.11 I just optimized PyUnicode_Join() for such dummy benchmark. It's now 1.2x slower instead of 2.3x slower on this dummy benchmark. With longer *ASCII* strings, Python 3.3 is now 2x (narrow 3.2) or 4x (wide 3.2) faster than Python 3.2. 
For example with this micro-benchmark: ./python -m timeit 'x=["x"*500]*5000; y="\n"; z=y.join' 'z(x)' Victor From status at bugs.python.org Fri Oct 7 18:07:28 2011 From: status at bugs.python.org (Python tracker) Date: Fri, 7 Oct 2011 18:07:28 +0200 (CEST) Subject: [Python-Dev] Summary of Python tracker Issues Message-ID: <20111007160728.C17E61D1DB@psf.upfronthosting.co.za> ACTIVITY SUMMARY (2011-09-30 - 2011-10-07) Python tracker at http://bugs.python.org/ To view or respond to any of the issues listed below, click on the issue. Do NOT respond to this message. Issues counts and deltas: open 3052 ( +6) closed 21853 (+40) total 24905 (+46) Open issues with patches: 1302 Issues opened (33) ================== #4147: xml.dom.minidom toprettyxml: omit whitespace for text-only ele http://bugs.python.org/issue4147 reopened by ezio.melotti #10141: SocketCan support http://bugs.python.org/issue10141 reopened by neologix #11250: 2to3 truncates files at formfeed character http://bugs.python.org/issue11250 reopened by barry #12210: test_smtplib: intermittent failures on FreeBSD http://bugs.python.org/issue12210 reopened by skrah #12804: "make test" fails on systems without internet access http://bugs.python.org/issue12804 reopened by pitrou #13078: Python Crashes When Saving Or Opening http://bugs.python.org/issue13078 opened by Ash.Sparks #13081: Crash in Windows with unknown cause http://bugs.python.org/issue13081 opened by rlibiez #13083: _sre: getstring() releases the buffer before using it http://bugs.python.org/issue13083 opened by haypo #13086: Update howto/cporting.rst so it talks about 3.x instead of 3.0 http://bugs.python.org/issue13086 opened by larry #13088: Add Py_hexdigits constant: use one unique constant to format a http://bugs.python.org/issue13088 opened by haypo #13089: parsetok.c: memory leak http://bugs.python.org/issue13089 opened by skrah #13090: posix_read: memory leak http://bugs.python.org/issue13090 opened by skrah #13091: ctypes: memory leak 
http://bugs.python.org/issue13091 opened by skrah #13092: pep-393: memory leaks #2 http://bugs.python.org/issue13092 opened by skrah #13093: Redundant code in PyUnicode_EncodeDecimal() http://bugs.python.org/issue13093 opened by skrah #13094: Need Programming FAQ entry for the behavior of closures http://bugs.python.org/issue13094 opened by Tomáš.Dvořák #13096: ctypes: segfault with large POINTER type names http://bugs.python.org/issue13096 opened by meador.inge #13097: ctypes: segfault with large number of callback arguments http://bugs.python.org/issue13097 opened by meador.inge #13100: sre_compile._optimize_unicode() needs a cleanup http://bugs.python.org/issue13100 opened by haypo #13101: Module Doc viewer closes when browser window closes on Windows http://bugs.python.org/issue13101 opened by brian.curtin #13102: xml.dom.minidom does not support default namespaces http://bugs.python.org/issue13102 opened by crass #13103: copy of an asyncore dispatcher causes infinite recursion http://bugs.python.org/issue13103 opened by xdegaye #13105: Please elaborate on how 2.x and 3.x are different heads http://bugs.python.org/issue13105 opened by larry #13107: Text width in optparse.py can become negative http://bugs.python.org/issue13107 opened by adambyrtek #13111: Error 2203 when installing Python/Perl? http://bugs.python.org/issue13111 opened by MA.S #13114: check -r fails with non-ASCII unicode long_description http://bugs.python.org/issue13114 opened by Cykooz #13115: tp_as_{number,sequence,mapping} can't be set using PyType_From http://bugs.python.org/issue13115 opened by awilkins #13116: setup.cfg in [sb]dists should be static http://bugs.python.org/issue13116 opened by eric.araujo #13119: Newline for print() is \n on Windows, and not \r\n as expected http://bugs.python.org/issue13119 opened by M..Z. 
#13120: Default nosigint option to pdb.Pdb() prevents use in non-main http://bugs.python.org/issue13120 opened by bpb #13121: collections.Counter's += copies the entire object http://bugs.python.org/issue13121 opened by larsmans #13122: Out of date links in the sidebar of the documentation index of http://bugs.python.org/issue13122 opened by smarnach #13123: bdist_wininst uninstaller does not remove pycache directories http://bugs.python.org/issue13123 opened by pmoore Most recent 15 issues with no replies (15) ========================================== #13123: bdist_wininst uninstaller does not remove pycache directories http://bugs.python.org/issue13123 #13122: Out of date links in the sidebar of the documentation index of http://bugs.python.org/issue13122 #13120: Default nosigint option to pdb.Pdb() prevents use in non-main http://bugs.python.org/issue13120 #13116: setup.cfg in [sb]dists should be static http://bugs.python.org/issue13116 #13115: tp_as_{number,sequence,mapping} can't be set using PyType_From http://bugs.python.org/issue13115 #13111: Error 2203 when installing Python/Perl? 
http://bugs.python.org/issue13111 #13107: Text width in optparse.py can become negative http://bugs.python.org/issue13107 #13100: sre_compile._optimize_unicode() needs a cleanup http://bugs.python.org/issue13100 #13097: ctypes: segfault with large number of callback arguments http://bugs.python.org/issue13097 #13093: Redundant code in PyUnicode_EncodeDecimal() http://bugs.python.org/issue13093 #13092: pep-393: memory leaks #2 http://bugs.python.org/issue13092 #13090: posix_read: memory leak http://bugs.python.org/issue13090 #13089: parsetok.c: memory leak http://bugs.python.org/issue13089 #13088: Add Py_hexdigits constant: use one unique constant to format a http://bugs.python.org/issue13088 #13083: _sre: getstring() releases the buffer before using it http://bugs.python.org/issue13083 Most recent 15 issues waiting for review (15) ============================================= #13121: collections.Counter's += copies the entire object http://bugs.python.org/issue13121 #13114: check -r fails with non-ASCII unicode long_description http://bugs.python.org/issue13114 #13103: copy of an asyncore dispatcher causes infinite recursion http://bugs.python.org/issue13103 #13093: Redundant code in PyUnicode_EncodeDecimal() http://bugs.python.org/issue13093 #13092: pep-393: memory leaks #2 http://bugs.python.org/issue13092 #13088: Add Py_hexdigits constant: use one unique constant to format a http://bugs.python.org/issue13088 #13077: Unclear behavior of daemon threads on main thread exit http://bugs.python.org/issue13077 #13075: PEP-0001 contains dead links http://bugs.python.org/issue13075 #13063: test_concurrent_futures failures on Windows: IOError('[Errno 2 http://bugs.python.org/issue13063 #13062: Introspection generator and function closure state http://bugs.python.org/issue13062 #13057: Thread not working for python 2.7.1 built with HP Compiler on http://bugs.python.org/issue13057 #13055: Distutils tries to handle null versions but fails http://bugs.python.org/issue13055 
#13053: Add Capsule migration documentation to "cporting" http://bugs.python.org/issue13053 #13051: Infinite recursion in curses.textpad.Textbox http://bugs.python.org/issue13051 #13045: socket.getsockopt may require custom buffer contents http://bugs.python.org/issue13045 Top 10 most discussed issues (10) ================================= #6715: xz compressor support http://bugs.python.org/issue6715 23 msgs #12753: \N{...} neglects formal aliases and named sequences from Unico http://bugs.python.org/issue12753 17 msgs #10141: SocketCan support http://bugs.python.org/issue10141 12 msgs #12804: "make test" fails on systems without internet access http://bugs.python.org/issue12804 9 msgs #13071: IDLE accepts, then crashes, on invalid key bindings. http://bugs.python.org/issue13071 9 msgs #4147: xml.dom.minidom toprettyxml: omit whitespace for text-only ele http://bugs.python.org/issue4147 8 msgs #12880: ctypes: clearly document how structure bit fields are allocate http://bugs.python.org/issue12880 8 msgs #13081: Crash in Windows with unknown cause http://bugs.python.org/issue13081 7 msgs #13053: Add Capsule migration documentation to "cporting" http://bugs.python.org/issue13053 6 msgs #13103: copy of an asyncore dispatcher causes infinite recursion http://bugs.python.org/issue13103 6 msgs Issues closed (41) ================== #3163: module struct support for ssize_t and size_t http://bugs.python.org/issue3163 closed by pitrou #7367: pkgutil.walk_packages fails on write-only directory in sys.pat http://bugs.python.org/issue7367 closed by ned.deily #7425: Improve the robustness of "pydoc -k" in the face of broken mod http://bugs.python.org/issue7425 closed by ned.deily #7689: Pickling of classes with a metaclass and copy_reg http://bugs.python.org/issue7689 closed by pitrou #8037: multiprocessing.Queue's put() not atomic thread wise http://bugs.python.org/issue8037 closed by neologix #10348: multiprocessing: use SysV semaphores on FreeBSD 
http://bugs.python.org/issue10348 closed by jnoller #11841: Bug in the verson comparison http://bugs.python.org/issue11841 closed by eric.araujo #11914: pydoc modules/help('modules') crash in dirs with unreadable su http://bugs.python.org/issue11914 closed by ned.deily #11956: 3.3 : test_import.py causes 'make test' to fail http://bugs.python.org/issue11956 closed by neologix #12167: test_packaging reference leak http://bugs.python.org/issue12167 closed by eric.araujo #12222: All pysetup commands should respect exit codes http://bugs.python.org/issue12222 closed by python-dev #12696: pydoc error page due to lacking permissions on ./* http://bugs.python.org/issue12696 closed by ned.deily #12823: Broken link in "SSL wrapper for socket objects" document http://bugs.python.org/issue12823 closed by pitrou #12881: ctypes: segfault with large structure field names http://bugs.python.org/issue12881 closed by meador.inge #12911: Expose a private accumulator C API http://bugs.python.org/issue12911 closed by pitrou #12943: tokenize: add python -m tokenize support back http://bugs.python.org/issue12943 closed by meador.inge #13001: test_socket.testRecvmsgTrunc failure on FreeBSD 7.2 buildbot http://bugs.python.org/issue13001 closed by neologix #13034: Python does not read Alternative Subject Names from some SSL c http://bugs.python.org/issue13034 closed by pitrou #13040: call to tkinter.messagebox.showinfo hangs the script on timer http://bugs.python.org/issue13040 closed by ned.deily #13054: sys.maxunicode value after PEP-393 http://bugs.python.org/issue13054 closed by ezio.melotti #13070: segmentation fault in pure-python multi-threaded server http://bugs.python.org/issue13070 closed by neologix #13073: message_body argument of HTTPConnection.endheaders is undocume http://bugs.python.org/issue13073 closed by orsenthil #13076: Bad links to 'time' in datetime documentation http://bugs.python.org/issue13076 closed by ezio.melotti #13079: Wrong datetime format in PEP3101 
http://bugs.python.org/issue13079 closed by eric.smith #13080: test_email fails in refleak mode http://bugs.python.org/issue13080 closed by skrah #13082: Can't open new window in python http://bugs.python.org/issue13082 closed by ned.deily #13084: test_signal failure http://bugs.python.org/issue13084 closed by neologix #13085: pep-393: memory leaks http://bugs.python.org/issue13085 closed by loewis #13087: C BufferedReader seek() is inconsistent with UnsupportedOperat http://bugs.python.org/issue13087 closed by pitrou #13095: Support for splitting lists/tuples into chunks http://bugs.python.org/issue13095 closed by rhettinger #13098: the struct module should support storage for size_t / Py_ssize http://bugs.python.org/issue13098 closed by pitrou #13099: Sqlite3 & turkish locale http://bugs.python.org/issue13099 closed by pitrou #13104: urllib.request.thishost() returns a garbage value http://bugs.python.org/issue13104 closed by orsenthil #13106: Incorrect pool.py distributed with Python 2.7 windows 32bit http://bugs.python.org/issue13106 closed by Aaron.Staley #13108: test_urllib: buildbot failure http://bugs.python.org/issue13108 closed by skrah #13109: telnetlib insensitive to connection loss http://bugs.python.org/issue13109 closed by eric.smith #13110: test_socket.py failures on ARM http://bugs.python.org/issue13110 closed by barry #13112: backreferences in comprehensions http://bugs.python.org/issue13112 closed by mark.dickinson #13113: Wrong error message on class instance, when giving too little http://bugs.python.org/issue13113 closed by benjamin.peterson #13117: Broken links in the “compiler” page, section “references http://bugs.python.org/issue13117 closed by ned.deily #13118: Py_BuildValue format f incorrect description. 
http://bugs.python.org/issue13118 closed by felixantoinefortin From merwok at netwok.org Fri Oct 7 18:56:57 2011 From: merwok at netwok.org (=?UTF-8?B?w4lyaWMgQXJhdWpv?=) Date: Fri, 07 Oct 2011 18:56:57 +0200 Subject: [Python-Dev] Status of the built-in virtualenv functionality in 3.3 In-Reply-To: References: <4E8DB745.2090406@netwok.org> <20111006113124.56c18f6a@resist.wooz.org> <4E8DCD53.6030306@netwok.org> <20111006120205.061a50ed@resist.wooz.org> <20111006180440.319c52c3@pitrou.net> <20111006125043.16d1c462@rivendell> <4E8DF47E.9070109@oddbird.net> <20111006150604.5e64d912@resist.wooz.org> Message-ID: <4E8F2F59.6070807@netwok.org> Hi, I too prefer venv (module) and pyvenv (script). Regards From merwok at netwok.org Fri Oct 7 19:00:25 2011 From: merwok at netwok.org (=?UTF-8?B?w4lyaWMgQXJhdWpv?=) Date: Fri, 07 Oct 2011 19:00:25 +0200 Subject: [Python-Dev] [Python-checkins] cpython (2.7): Issue #7367: Add test case to test_pkgutil for walking path with In-Reply-To: References: Message-ID: <4E8F3029.2000608@netwok.org> Hi Ned, > Issue #7367: Add test case to test_pkgutil for walking path with > an unreadable directory. Kudos for fixing this bug, the pydoc one and cleaning the duplicate reports! > diff --git a/Lib/test/test_pkgutil.py b/Lib/test/test_pkgutil.py > --- a/Lib/test/test_pkgutil.py > +++ b/Lib/test/test_pkgutil.py > @@ -78,6 +78,17 @@ > + def test_unreadable_dir_on_syspath(self): > + # issue7367 - walk_packages failed if unreadable dir on sys.path > + package_name = "unreadable_package" > + d = os.path.join(self.dirname, package_name) > + # this does not appear to create an unreadable dir on Windows > + # but the test should not fail anyway > + os.mkdir(d, 0) > + for t in pkgutil.walk_packages(path=[self.dirname]): > + self.fail("unexpected package found") > + os.rmdir(d) This should use a try/finally block (or self.addCleanup, my preference) to make sure rmdir is always called. 
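The addCleanup pattern Éric suggests can be sketched in a self-contained test case (this is not the actual test_pkgutil patch; the test name and assertion here are made up for illustration):

```python
import os
import tempfile
import unittest


class CleanupExample(unittest.TestCase):
    def test_dir_removed_even_if_test_fails(self):
        d = tempfile.mkdtemp()
        # addCleanup registers the removal up front, so it runs whether
        # the test body passes, fails, or raises -- unlike a trailing
        # os.rmdir(d), which is skipped as soon as anything above it fails.
        self.addCleanup(os.rmdir, d)
        self.assertTrue(os.path.isdir(d))
```

The same effect can be had with try/finally, but addCleanup keeps the teardown next to the resource creation and composes when several resources are involved.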
Regards From ericsnowcurrently at gmail.com Fri Oct 7 20:21:38 2011 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Fri, 7 Oct 2011 12:21:38 -0600 Subject: [Python-Dev] More Buildbot Information in Devguide (Was: Re: cpython (3.2): Issue #11956: Skip test_import.test_unwritable_directory on FreeBSD when run as) Message-ID: On Fri, Oct 7, 2011 at 4:18 AM, Glyph wrote: > On Oct 7, 2011, at 5:10 AM, Stephen J. Turnbull wrote: > > The principle here is "ran as root" without further explanation is a > litmus test for "not bothering about security", even today. It's > worth asking for explanation, or at least a comment that "all the > buildbot contributors I've talked to have put a lot of effort into > security configuration". > > This is a valid point. I think that Cameron and I may have had > significantly different assumptions about the environment being discussed > here. I may have brought some assumptions about the build farm here that > don't actually apply to the way Python does it. > To sum up what I believe is now the consensus from this thread: > > Anyone setting up a buildslave should take care to invoke the build in an > environment where an out-of-control buildbot, potentially executing > arbitrarily horrible and/or malicious code, should not damage anything. > Builders should always be isolated from valuable resources, although the > specific mechanism of isolation may differ. A virtual machine is a good > default, but may not be sufficient; other tools for cutting off the builder > from the outside world would be chroot jails, solaris zones, etc. > Code runs differently as privileged vs. unprivileged users. Therefore > builders should be set up in both configurations, running the full test > suite, to ensure that all code runs as expected in both configurations. > Some tests, as the start of this thread indicates, must have some special > logic to make sure they do or do not run, or run differently, in privileged > vs. 
unprivileged configurations, but generally speaking most things should > work in both places. > Access to root may provide access to slightly surprising resources, even > within a VM (such as the ability to send spoofed IP packets, change the MAC > address of even virtual ethernet cards, etc), and administrators should be > aware that this is the case when configuring the host environment for a > run-as-root builder. You don't want to end up with a compromised test VM > that can snoop on your network. > > Have I left anything out? :-) > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > http://mail.python.org/mailman/options/python-dev/ericsnowcurrently%40gmail.com > > I've created an issue with a patch for a dedicated page in the devguide on running a build slave[1]. I've included the information from this thread on that page. I realize that the thread still has some juice in it, so the info I copied from this thread is likely incomplete and/or too much detail, but I wanted to get the devguide page rolling. -eric [1] http://bugs.python.org/issue13124 From p.f.moore at gmail.com Fri Oct 7 20:49:19 2011 From: p.f.moore at gmail.com (Paul Moore) Date: Fri, 7 Oct 2011 19:49:19 +0100 Subject: [Python-Dev] Packaging and binary distributions for Python 3.3 Message-ID: I see that the Packaging documentation is now more complete (at least at docs.python.org) - I don't know if it's deemed fully complete yet, but I scanned the documentation and "Installing Python Projects" looks pretty much converted (and very good!!), but "Distributing Python Projects" still has quite a lot of distutils-related text in, and I need to read more deeply to understand if that's because it remains unchanged, or if it is still to be updated. 
But one thing struck me - the "Installing Python Projects" document talks about source distributions, but not much about binary distributions. On Windows, binary distributions are significantly more important than on Unix, because not all users have easy access to a compiler, and more importantly, C library dependencies can be difficult to build, hard to set up, and generally a pain to deal with. The traditional solution was always bdist_wininst installers, and with the advent of setuptools binary eggs started to become common. I've noticed that since pip became more widely used, with its focus on source builds, binary eggs seemed to fade away somewhat. I don't know what format packaging favours. The problem when Python 3.3 comes out is that bdist_wininst/bdist_msi installers do not interact well with pysetup. And if native virtual environment support becomes part of Python 3.3, they won't work well there either (they don't deal well with today's virtualenv, for that matter). So there will be a need for a pysetup-friendly binary format. I assume that the egg format will fill this role - or is that not the case? What is the current thinking on binary distribution formats for Python 3.3? The main reason I am asking is that I would like to write an article (or maybe a series of articles) for Python Insider, introducing the new packaging facilities from the point of view of an end user with straightforward needs (whether a package user just looking to manage a set of installed packages, or a module author who just wants to publish his code in a form that satisfies as many people as possible). What I'd hope to do is, as well as showing people all the nice things they can expect to see in Python 3.3, to also start package authors thinking about what they need to do to support their users under the new system. 
If we get the message out early, and make people aware of the benefits of the new end user tools, then I'm hoping more authors will see the advantage of switching to the new format rather than just sticking with bdist_xxx because "it's always worked". I suspect I should (re-)join the distutils SIG and take this discussion there. But honestly, I'm not sure I have the time - the traffic was always fairly high, and the number of relevant posts for a casual observer was quite low. So even if that's the right place to go, some pointers to some "high spots" to get me up to speed on the current state of affairs would help. Thanks, Paul. From martin at v.loewis.de Fri Oct 7 21:02:00 2011 From: martin at v.loewis.de (=?ISO-8859-15?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Fri, 07 Oct 2011 21:02:00 +0200 Subject: [Python-Dev] PyUnicode_KIND changed Message-ID: <4E8F4CA8.7060409@v.loewis.de> After discussion with several people, I changed PyUnicode_KIND to have values of 1,2,4, respectively, thus reflecting the element size of the string numerically. As a consequence, the PyUnicode_CHARACTER_SIZE and PyUnicode_KIND_SIZE macros are now gone. Regards, Martin From guido at python.org Fri Oct 7 21:13:01 2011 From: guido at python.org (Guido van Rossum) Date: Fri, 7 Oct 2011 12:13:01 -0700 Subject: [Python-Dev] PyUnicode_KIND changed In-Reply-To: <4E8F4CA8.7060409@v.loewis.de> References: <4E8F4CA8.7060409@v.loewis.de> Message-ID: On Fri, Oct 7, 2011 at 12:02 PM, "Martin v. Löwis" wrote: > After discussion with several people, I changed > PyUnicode_KIND to have values of 1,2,4, respectively, > thus reflecting the element size of the string numerically. Hah! I suggested this when first reviewing the PEP. 
:-) -- --Guido van Rossum (python.org/~guido) From fijall at gmail.com Fri Oct 7 21:31:23 2011 From: fijall at gmail.com (Maciej Fijalkowski) Date: Fri, 7 Oct 2011 21:31:23 +0200 Subject: [Python-Dev] SimpleHTTPServer slashdot (Was: Python Core Tools) In-Reply-To: References: Message-ID: On Fri, Oct 7, 2011 at 3:57 PM, anatoly techtonik wrote: > On Sun, Oct 2, 2011 at 3:17 PM, Maciej Fijalkowski wrote: >> On Sun, Oct 2, 2011 at 8:05 AM, Maciej Fijalkowski wrote: >>> On Sun, Oct 2, 2011 at 5:02 AM, anatoly techtonik wrote: >>>> Hello, >>>> >>>> I've stumbled upon Dave Beazley's article [1] about trying ancient GIL >>>> removal patch at >>>> http://dabeaz.blogspot.com/2011/08/inside-look-at-gil-removal-patch-of.html >>>> and looking at the output of Python dis module thought that it would >>>> be cool if there were tools to inspect, explain and play with Python >>>> bytecode. Little visual assembler, that shows bytecode and disassembly >>>> side by side and annotates the listing with useful hints (like >>>> interpreter code optimization decisions). That will greatly help many >>>> new people understand how Python works and explain complicated stuff >>>> like GIL and stackless by copy/pasting pictures from there. PyPy has a >>>> tool named 'jitviewer' [2] that may be what I am looking for, but the >>>> demo is offline. >>> >>> I put demo back online. >>> >> >> It's just that SimpleHTTPServer doesn't quite survive the slashdot effect. >> Where do I file a bug report :) > > http://bugs.python.org don't have a reproducible workload, it has something to do with stalling on sendall though (Something!) > > Is the demo address still http://wyvern.cs.uni-duesseldorf.de:5000/ > ? Still can't connect. =| Restarted > > -- > anatoly t. 
> From victor.stinner at haypocalc.com Fri Oct 7 21:41:30 2011 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Fri, 7 Oct 2011 21:41:30 +0200 Subject: [Python-Dev] PyUnicode_KIND changed In-Reply-To: <4E8F4CA8.7060409@v.loewis.de> References: <4E8F4CA8.7060409@v.loewis.de> Message-ID: <201110072141.30474.victor.stinner@haypocalc.com> Le vendredi 7 octobre 2011 21:02:00, Martin v. Löwis a écrit : > After discussion with several people, I changed > PyUnicode_KIND to have values of 1,2,4, respectively, > thus reflecting the element size of the string numerically. You may rename it to "character size" (char_size) ;-) Victor From victor.stinner at haypocalc.com Fri Oct 7 21:46:15 2011 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Fri, 7 Oct 2011 21:46:15 +0200 Subject: [Python-Dev] New stringbench benchmark results In-Reply-To: <201110060206.30819.victor.stinner@haypocalc.com> References: <201110060206.30819.victor.stinner@haypocalc.com> Message-ID: <201110072146.15926.victor.stinner@haypocalc.com> Le jeudi 6 octobre 2011 02:06:30, Victor Stinner a écrit : > The rfind case is really strange: the code between Python 3.2 and 3.3 is > exactly the same. Even in Python 3.2: rfind looks twice faster than find: > > ("AB"*300+"C").find("BC") (*1000) : 1.21 > ("C"+"AB"*300).rfind("CA") (*1000) : 0.57 It looks to be a gcc bug: using the attached patch (written by Antoine), str.find() is a little bit faster. With the patch, the function does the same memory accesses, but it generates different machine code. I don't know exactly what the difference is yet, but it may be related to the CMOVNE instruction (which looks to be slower than a classical conditional jump, JNE). Victor -------------- next part -------------- A non-text attachment was scrubbed... 
Name: fastsearch_gcc_bug.patch Type: text/x-patch Size: 1431 bytes Desc: not available URL: From fijall at gmail.com Fri Oct 7 22:50:26 2011 From: fijall at gmail.com (Maciej Fijalkowski) Date: Fri, 7 Oct 2011 22:50:26 +0200 Subject: [Python-Dev] Disabling cyclic GC in timeit module Message-ID: Hi Can we disable by default disabling the cyclic gc in timeit module? Often posts on pypy-dev or on pypy bugs contain usage of timeit module which might change the performance significantly. A good example is json benchmarks - you would rather not disable cyclic GC when running a web app, so encoding/decoding json in benchmark with the cyclic GC disabled does not make sense. What do you think? Cheers, fijal From ncoghlan at gmail.com Fri Oct 7 23:47:54 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 7 Oct 2011 17:47:54 -0400 Subject: [Python-Dev] Disabling cyclic GC in timeit module In-Reply-To: References: Message-ID: On Fri, Oct 7, 2011 at 4:50 PM, Maciej Fijalkowski wrote: > Hi > > Can we disable by default disabling the cyclic gc in timeit module? > Often posts on pypy-dev or on pypy bugs contain usage of timeit module > which might change the performance significantly. A good example is > json benchmarks - you would rather not disable cyclic GC when running > a web app, so encoding/decoding json in benchmark with the cyclic GC > disabled does not make sense. > > What do you think? No, it's disabled by default for a reason (to avoid irrelevant noise in microbenchmarks), and other cases don't trump those original use cases. A command line switch to leave it enabled would probably be reasonable, though. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | 
Brisbane, Australia From fijall at gmail.com Sat Oct 8 00:13:40 2011 From: fijall at gmail.com (Maciej Fijalkowski) Date: Sat, 8 Oct 2011 00:13:40 +0200 Subject: [Python-Dev] Disabling cyclic GC in timeit module In-Reply-To: References: Message-ID: On Fri, Oct 7, 2011 at 11:47 PM, Nick Coghlan wrote: > On Fri, Oct 7, 2011 at 4:50 PM, Maciej Fijalkowski wrote: >> Hi >> >> Can we disable by default disabling the cyclic gc in timeit module? >> Often posts on pypy-dev or on pypy bugs contain usage of timeit module >> which might change the performance significantly. A good example is >> json benchmarks - you would rather not disable cyclic GC when running >> a web app, so encoding/decoding json in benchmark with the cyclic GC >> disabled does not make sense. >> >> What do you think? > > No, it's disabled by default for a reason (to avoid irrelevant noise > in microbenchmarks), and other cases don't trump those original use > cases. People don't use it only for microbenchmarks though. Also, you can't call noise a thing that adds something every now and then I think. Er. How is disabling the GC for microbenchmarks any good by the way? Cheers, fijal From tjreedy at udel.edu Sat Oct 8 01:13:10 2011 From: tjreedy at udel.edu (Terry Reedy) Date: Fri, 07 Oct 2011 19:13:10 -0400 Subject: [Python-Dev] check for PyUnicode_READY look backwards In-Reply-To: References: <4E8E2990.9060806@v.loewis.de> <4E8EB0C3.80100@haypocalc.com> <4E8EFCE0.7060005@v.loewis.de> Message-ID: On 10/7/2011 10:06 AM, Nick Coghlan wrote: > On Fri, Oct 7, 2011 at 9:21 AM, "Martin v. Löwis" wrote: >> > if (!PyUnicode_READY(foo)) is not better, also because of >>> >>> PyUnicode_IS_READY(foo). >>> >>> I prefer PyUnicode_IS_READY(foo) < 0 over PyUnicode_IS_READY(foo) == -1. >>> >> >> Ok, so feel free to replace all == -1 tests with < 0 tests as well. >> >> I'll point out that the test for -1 is also widespread in Python, >> e.g. 
when checking return values from PyObject_SetAttrString, BaseException_init, PyThread_create_key, PyObject_DelAttrString, etc. > FWIW, I don't mind whether it's "< 0" or "== -1", so long as there's a > comparison there to kick my brain out of Python boolean logic mode. Is there any speed difference (on common x86/64 processors and compilers)? I would expect that '< 0' should be optimized to just check the sign bit and 'if n < 0' to 'load n; jump-non-negative'. -- Terry Jan Reedy From tjreedy at udel.edu Sat Oct 8 01:19:44 2011 From: tjreedy at udel.edu (Terry Reedy) Date: Fri, 07 Oct 2011 19:19:44 -0400 Subject: [Python-Dev] cpython (3.2): Issue #11956: Skip test_import.test_unwritable_directory on FreeBSD when run as In-Reply-To: <477C9A0C-5058-422D-A0D2-48060A7C947D@twistedmatrix.com> References: <20111006212701.GA10627@cskk.homeip.net> <20111007014648.GE25682@flay.puzzling.org> <87wrchqi8t.fsf@uwakimon.sk.tsukuba.ac.jp> <477C9A0C-5058-422D-A0D2-48060A7C947D@twistedmatrix.com> Message-ID: On 10/7/2011 6:18 AM, Glyph wrote: > To sum up what I believe is now the consensus from this thread: > > 1. Anyone setting up a buildslave should take care to invoke the build > in an environment where an out-of-control buildbot, potentially > executing arbitrarily horrible and/or malicious code, should not > damage anything. Builders should always be isolated from valuable > resources, although the specific mechanism of isolation may differ. > A virtual machine is a good default, but may not be sufficient; > other tools for cutting off the builder from the outside world would > be chroot jails, solaris zones, etc. > 2. Code runs differently as privileged vs. unprivileged users. My particular concern with testing as an unprivileged user comes from experience with too many (commercial, post-XP) Windows programs that only run correctly as admin (without an obvious good reason). 
> Therefore builders should be set up in both configurations, running > the full test suite, to ensure that all code runs as expected in > both configurations. Some tests, as the start of this thread > indicates, must have some special logic to make sure they do or do > not run, or run differently, in privileged vs. unprivileged > configurations, but generally speaking most things should work in > both places. > 3. Access to root may provide access to slightly surprising resources, > even within a VM (such as the ability to send spoofed IP packets, > change the MAC address of even virtual ethernet cards, etc), and > administrators should be aware that this is the case when > configuring the host environment for a run-as-root builder. You > don't want to end up with a compromised test VM that can snoop on > your network. -- Terry Jan Reedy From solipsis at pitrou.net Sat Oct 8 01:47:28 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sat, 8 Oct 2011 01:47:28 +0200 Subject: [Python-Dev] Disabling cyclic GC in timeit module References: Message-ID: <20111008014728.7f0916ea@pitrou.net> On Sat, 8 Oct 2011 00:13:40 +0200 Maciej Fijalkowski wrote: > On Fri, Oct 7, 2011 at 11:47 PM, Nick Coghlan wrote: > > On Fri, Oct 7, 2011 at 4:50 PM, Maciej Fijalkowski wrote: > >> Hi > >> > >> Can we disable by default disabling the cyclic gc in timeit module? > >> Often posts on pypy-dev or on pypy bugs contain usage of timeit module > >> which might change the performance significantly. A good example is > >> json benchmarks - you would rather not disable cyclic GC when running > >> a web app, so encoding/decoding json in benchmark with the cyclic GC > >> disabled does not make sense. > >> > >> What do you think? > > > > No, it's disabled by default for a reason (to avoid irrelevant noise > > in microbenchmarks), and other cases don't trump those original use > > cases. > > People don't use it only for microbenchmarks though. 
Also, you can't > call noise a thing that adds something every now and then I think. > > Er. How is disabling the GC for microbenchmarks any good by the way? In CPython, looking for reference cycles is a parasitic task that interferes with what you are trying to measure. It is not critical in any way, and you can schedule it much less often if it takes too much CPU, without any really adverse consequences. timeit takes the safe way and disables it completely. In PyPy, it doesn't seem gc.disable() should do anything, since you'd lose all automatic memory management if the GC was disabled. Regards Antoine. From fuzzyman at voidspace.org.uk Sat Oct 8 02:13:45 2011 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Sat, 08 Oct 2011 01:13:45 +0100 Subject: [Python-Dev] cpython (3.2): Issue #11956: Skip test_import.test_unwritable_directory on FreeBSD when run as In-Reply-To: References: <20111006212701.GA10627@cskk.homeip.net> <20111007014648.GE25682@flay.puzzling.org> <87wrchqi8t.fsf@uwakimon.sk.tsukuba.ac.jp> <477C9A0C-5058-422D-A0D2-48060A7C947D@twistedmatrix.com> Message-ID: <4E8F95B9.8030508@voidspace.org.uk> On 08/10/2011 00:19, Terry Reedy wrote: > On 10/7/2011 6:18 AM, Glyph wrote: > >> To sum up what I believe is now the consensus from this thread: >> >> 1. Anyone setting up a buildslave should take care to invoke the build >> in an environment where an out-of-control buildbot, potentially >> executing arbitrarily horrible and/or malicious code, should not >> damage anything. Builders should always be isolated from valuable >> resources, although the specific mechanism of isolation may differ. >> A virtual machine is a good default, but may not be sufficient; >> other tools for cutting off the builder from the outside world would >> be chroot jails, solaris zones, etc. >> 2. Code runs differently as privileged vs. unprivileged users. 
> > My particular concern with testing as an unprivileged user comes from > experience with too many (commercial, post-XP) Windows programs that > only run correctly as admin (without an obvious good reason). It would seem that for this use case it is more important that all tests pass when run as a *non-admin* user. Michael > >> Therefore builders should be set up in both configurations, running >> the full test suite, to ensure that all code runs as expected in >> both configurations. Some tests, as the start of this thread >> indicates, must have some special logic to make sure they do or do >> not run, or run differently, in privileged vs. unprivileged >> configurations, but generally speaking most things should work in >> both places. >> 3. Access to root may provide access to slightly surprising resources, >> even within a VM (such as the ability to send spoofed IP packets, >> change the MAC address of even virtual ethernet cards, etc), and >> administrators should be aware that this is the case when >> configuring the host environment for a run-as-root builder. You >> don't want to end up with a compromised test VM that can snoop on >> your network. > -- http://www.voidspace.org.uk/ May you do good and not evil May you find forgiveness for yourself and forgive others May you share freely, never taking more than you give. 
-- the sqlite blessing http://www.sqlite.org/different.html From fijall at gmail.com Sat Oct 8 02:14:38 2011 From: fijall at gmail.com (Maciej Fijalkowski) Date: Sat, 8 Oct 2011 02:14:38 +0200 Subject: [Python-Dev] Disabling cyclic GC in timeit module In-Reply-To: <20111008014728.7f0916ea@pitrou.net> References: <20111008014728.7f0916ea@pitrou.net> Message-ID: On Sat, Oct 8, 2011 at 1:47 AM, Antoine Pitrou wrote: > On Sat, 8 Oct 2011 00:13:40 +0200 > Maciej Fijalkowski wrote: >> On Fri, Oct 7, 2011 at 11:47 PM, Nick Coghlan wrote: >> > On Fri, Oct 7, 2011 at 4:50 PM, Maciej Fijalkowski wrote: >> >> Hi >> >> >> >> Can we disable by default disabling the cyclic gc in timeit module? >> >> Often posts on pypy-dev or on pypy bugs contain usage of timeit module >> >> which might change the performance significantly. A good example is >> >> json benchmarks - you would rather not disable cyclic GC when running >> >> a web app, so encoding/decoding json in benchmark with the cyclic GC >> >> disabled does not make sense. >> >> >> >> What do you think? >> > >> > No, it's disabled by default for a reason (to avoid irrelevant noise >> > in microbenchmarks), and other cases don't trump those original use >> > cases. >> >> People don't use it only for microbenchmarks though. Also, you can't >> call noise a thing that adds something every now and then I think. >> >> Er. How is disabling the GC for microbenchmarks any good by the way? > > In CPython, looking for reference cycles is a parasitic task that > interferes with what you are trying to measure. It is not critical in > any way, and you can schedule it much less often if it takes too much > CPU, without any really adverse consequences. timeit takes the safe way > and disables it completely. > > In PyPy, it doesn't seem gc.disable() should do anything, since you'd > lose all automatic memory management if the GC was disabled. > it disables finalizers but this is besides the point. 
the point is that people use the timeit module to compute the absolute time it takes for CPython to do things, among other things comparing it to PyPy. While I do agree that in microbenchmarks you don't lose much by just disabling it, it does affect larger applications. So answering a question like "how much time will json encoding take in my application" should take cyclic GC time into account. Cheers, fijal From solipsis at pitrou.net Sat Oct 8 02:18:20 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sat, 08 Oct 2011 02:18:20 +0200 Subject: [Python-Dev] Disabling cyclic GC in timeit module In-Reply-To: References: <20111008014728.7f0916ea@pitrou.net> Message-ID: <1318033100.3697.10.camel@localhost.localdomain> > > > > In CPython, looking for reference cycles is a parasitic task that > > interferes with what you are trying to measure. It is not critical in > > any way, and you can schedule it much less often if it takes too much > > CPU, without any really adverse consequences. timeit takes the safe way > > and disables it completely. > > > > In PyPy, it doesn't seem gc.disable() should do anything, since you'd > > lose all automatic memory management if the GC was disabled. > > > > it disables finalizers but this is besides the point. the point is > that people use the timeit module to compute the absolute time it takes for > CPython to do things, among other things comparing it to PyPy. While I > do agree that in microbenchmarks you don't lose much by just > disabling it, it does affect larger applications. So answering a > question like "how much time will json encoding take in my > application" should take cyclic GC time into account. If you are only measuring json encoding of a few select pieces of data then it's a microbenchmark. If you are measuring the whole application (or a significant part of it) then I'm not sure timeit is the right tool for that. Regards Antoine. 
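[Editorial aside: the GC disabling discussed above is not hard-wired; the timeit documentation notes that putting gc.enable() in the setup string re-enables collection during the measurement. A minimal sketch — the workload below is illustrative, not taken from the thread:]

```python
import timeit

# timeit disables cyclic GC around the timed statement by default.
# Re-enable it via the setup string so the measurement includes
# collection overhead, and compare against the default behaviour.
setup_gc = "import gc; gc.enable(); data = [[] for _ in range(1000)]"
setup_no_gc = "data = [[] for _ in range(1000)]"

with_gc = min(timeit.repeat("list(data)", setup=setup_gc,
                            repeat=3, number=1000))
without_gc = min(timeit.repeat("list(data)", setup=setup_no_gc,
                               repeat=3, number=1000))
print(with_gc, without_gc)
```

Running both variants side by side shows how much of the measured time, if any, the cyclic collector contributes for a given workload.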
From fijall at gmail.com Sat Oct 8 02:49:23 2011 From: fijall at gmail.com (Maciej Fijalkowski) Date: Sat, 8 Oct 2011 02:49:23 +0200 Subject: [Python-Dev] Disabling cyclic GC in timeit module In-Reply-To: <1318033100.3697.10.camel@localhost.localdomain> References: <20111008014728.7f0916ea@pitrou.net> <1318033100.3697.10.camel@localhost.localdomain> Message-ID: On Sat, Oct 8, 2011 at 2:18 AM, Antoine Pitrou wrote: > >> > >> > In CPython, looking for reference cycles is a parasitic task that >> > interferes with what you are trying to measure. It is not critical in >> > any way, and you can schedule it much less often if it takes too much >> > CPU, without any really adverse consequences. timeit takes the safe way >> > and disables it completely. >> > >> > In PyPy, it doesn't seem gc.disable() should do anything, since you'd >> > lose all automatic memory management if the GC was disabled. >> > >> >> it disables finalizers but this is besides the point. the point is >> that people use timeit module to compute absolute time it takes for >> CPython to do things, among other things comparing it to PyPy. While I >> do agree that in microbenchmarks you don't loose much by just >> disabling it, it does affect larger applications. So answering the >> question like "how much time will take json encoding in my >> application" should take cyclic GC time into account. > > If you are only measuring json encoding of a few select pieces of data > then it's a microbenchmark. > If you are measuring the whole application (or a significant part of it) > then I'm not sure timeit is the right tool for that. > > Regards > > Antoine. > When you're measuring how much time it takes to encode json, this is a microbenchmark and yet the time that timeit gives you is misleading, because it'll take different amount of time in your application. 
I guess my proposition would be to not disable gc by default and disable it when requested, but well, I guess I'll give up given the strong push against it. Cheers, fijal From steve at pearwood.info Sat Oct 8 02:51:04 2011 From: steve at pearwood.info (Steven D'Aprano) Date: Sat, 08 Oct 2011 11:51:04 +1100 Subject: [Python-Dev] Disabling cyclic GC in timeit module In-Reply-To: <1318033100.3697.10.camel@localhost.localdomain> References: <20111008014728.7f0916ea@pitrou.net> <1318033100.3697.10.camel@localhost.localdomain> Message-ID: <4E8F9E78.8070302@pearwood.info> Antoine Pitrou wrote: >>> In CPython, looking for reference cycles is a parasitic task that >>> interferes with what you are trying to measure. It is not critical in >>> any way, and you can schedule it much less often if it takes too much >>> CPU, without any really adverse consequences. timeit takes the safe way >>> and disables it completely. >>> >>> In PyPy, it doesn't seem gc.disable() should do anything, since you'd >>> lose all automatic memory management if the GC was disabled. >>> >> it disables finalizers but this is besides the point. the point is >> that people use timeit module to compute absolute time it takes for >> CPython to do things, among other things comparing it to PyPy. While I >> do agree that in microbenchmarks you don't loose much by just >> disabling it, it does affect larger applications. So answering the >> question like "how much time will take json encoding in my >> application" should take cyclic GC time into account. > > If you are only measuring json encoding of a few select pieces of data > then it's a microbenchmark. > If you are measuring the whole application (or a significant part of it) > then I'm not sure timeit is the right tool for that. Perhaps timeit should grow a macro-benchmark tool too? I find myself often using timeit to time macro-benchmarks simply because it's more convenient at the interactive interpreter than the alternatives. 
Something like this idea perhaps? http://preshing.com/20110924/timing-your-code-using-pythons-with-statement -- Steven From cs at zip.com.au Sat Oct 8 03:40:41 2011 From: cs at zip.com.au (Cameron Simpson) Date: Sat, 8 Oct 2011 12:40:41 +1100 Subject: [Python-Dev] cpython (3.2): Issue #11956: Skip test_import.test_unwritable_directory on FreeBSD when run as In-Reply-To: <4E8F95B9.8030508@voidspace.org.uk> References: <4E8F95B9.8030508@voidspace.org.uk> Message-ID: <20111008014041.GA28402@cskk.homeip.net> On 08Oct2011 01:13, Michael Foord wrote: | On 08/10/2011 00:19, Terry Reedy wrote: | >On 10/7/2011 6:18 AM, Glyph wrote: | > | >>To sum up what I believe is now the consensus from this thread: | >> | >> 1. Anyone setting up a buildslave should take care to invoke the build | >> in an environment where an out-of-control buildbot, potentially | >> executing arbitrarily horrible and/or malicious code, should not | >> damage anything. Builders should always be isolated from valuable | >> resources, although the specific mechanism of isolation may differ. | >> A virtual machine is a good default, but may not be sufficient; | >> other tools for cutting of the builder from the outside world would | >> be chroot jails, solaris zones, etc. | >> 2. Code runs differently as privileged vs. unprivileged users. | > | >My particular concern with testing as an unprivileged user comes | >from experience with too many (commercial, post-XP) Windows | >programs that only run correctly as admin (without an obvious good | >reason). | | It would seem that for this use case it is more important that all | tests pass when run as a *non-admin* user. I'm pretty sure that's what Terry meant; if these apps were tested non-admin they wouldn't need to run as "admin (without an obvious good reason". Cheers, -- Cameron Simpson DoD#743 http://www.cskk.ezoshosting.com/cs/ It is easier to optimize correct code than to correct optimized code. 
- Bill Harlan From eliben at gmail.com Sat Oct 8 05:47:19 2011 From: eliben at gmail.com (Eli Bendersky) Date: Sat, 8 Oct 2011 05:47:19 +0200 Subject: [Python-Dev] Disabling cyclic GC in timeit module In-Reply-To: <4E8F9E78.8070302@pearwood.info> References: <20111008014728.7f0916ea@pitrou.net> <1318033100.3697.10.camel@localhost.localdomain> <4E8F9E78.8070302@pearwood.info> Message-ID: > Perhaps timeit should grow a macro-benchmark tool too? I find myself often > using timeit to time macro-benchmarks simply because it's more convenient at > the interactive interpreter than the alternatives. > > Something like this idea perhaps? > > http://preshing.com/20110924/timing-your-code-using-pythons-with-statement I have essentially the same snippet (with the addition of being able to provide names for timers, thus allowing to have several executing in the code and knowing which is which) lying in my toolbox for a long time now, and I find it very useful. There's also an alternative approach, having a decorator that marks a function for benchmarking. David Beazley has one good example of this here: http://www.dabeaz.com/python3io/timethis.py Eli From andrew at bemusement.org Sat Oct 8 14:27:53 2011 From: andrew at bemusement.org (Andrew Bennetts) Date: Sat, 8 Oct 2011 23:27:53 +1100 Subject: [Python-Dev] cpython (3.2): Issue #11956: Skip test_import.test_unwritable_directory on FreeBSD when run as In-Reply-To: <87wrchqi8t.fsf@uwakimon.sk.tsukuba.ac.jp> References: <20111006212701.GA10627@cskk.homeip.net> <20111007014648.GE25682@flay.puzzling.org> <87wrchqi8t.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <20111008122753.GA2550@aihal.home.puzzling.org> Stephen J. Turnbull wrote: > Andrew Bennetts writes: > > > No, that just means you shouldn't trust *root*. Which is where a > > VM is a very useful tool. You can have the ?as root? environment > > for your tests without the need to have anything important trust it. > > Cameron acknowledges that he missed that. 
So maybe he was right for > the wrong reason; he's still right. But in the current context, it is > not an argument for not worrying, because there is no evidence at all > that the OP set up his buildbot in a secure sandbox. As I read his > followups, he simply "didn't bother" to set up an unprivileged user > and run the 'bot as that user. I made no claim about how the bot was deployed. The point I was disputing was more general than how one specific bot is deployed. To quote the mail I was replying to again: "HOWEVER, the whole suite should not be _tested_ as root because the code being testing is by definition untrusted." This sentiment was expressed strongly and repeatedly in several mails. It was this overly broad assertion I was addressing, and happily my argument was apparently convincing. I'm fine with "It's not worth running the tests as root because the overhead of making a secure setup for it with a VM etc is too hard with our very limited volunteer resources." I'm not fine with "We mustn't run them as root because it's impossible to do it safely." That's all I'm saying. [...] > that was *not* the case; the assumption is falsified. Nevertheless, > several people who I would have thought would know better are *all* > arguing from the assumption that the OP configured his test system > with security (rather than convenience) in mind, and are castigating > Cameron for *not* making that same assumption. To my mind, every post > is increasing justification for his unease. :-( I certainly hope I wasn't so severe as to be castigating! If I was, Cameron has been kind enough to not show any offense. > And that's why this thread belongs on this list, rather than on Bruce > Schneier's blog. It's very easy these days to set up a basic personal > VM, and folk of goodwill will do so to help the project with buildbots > to provide platform coverage in testing new code. 
But this > contribution involves certain risks (however low probability, some > Very Bad Things *could* happen). Contributors should get help in > evaluating the potential threats and corresponding risks, and in > proper configuration. Not assurances that nothing will go wrong > "because you probably run the 'bot in a VM." For the record, in case it isn't obvious, I think a buildslave that runs the tests as root and doesn't take precautions like using a VM dedicated to just running the tests (and not running the buildslave) is a bad idea. Although given that there's a very limited supply of volunteer labour involved in configuring and administering buildslaves I'm not surprised to hear this has happened. :( I don't object at all to folks like Cameron asking questions to ensure that these systems are secure enough. I think that's a good thing! I don't even object to treating someone saying "run as root" as a red flag requiring further explanation. What I was objecting to was an apparent willingness to make an unnecessary compromise on software quality. I care about the security of contributors' buildslaves. I also care about the reliability of Python. -Andrew. From martin at v.loewis.de Sat Oct 8 16:54:06 2011 From: martin at v.loewis.de (=?ISO-8859-15?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat, 08 Oct 2011 16:54:06 +0200 Subject: [Python-Dev] Identifier API Message-ID: <4E90640E.2040301@v.loewis.de> In benchmarking PEP 393, I noticed that many UTF-8 decode calls originate from C code with static strings, in particular PyObject_CallMethod. Many such calls have already been optimized to cache a string object, however, PyObject_CallMethod remains unoptimized since it requires a char*. I find the ad-hoc approach of declaring and initializing variables inadequate, in particular since it is difficult to clean up all those string objects at interpreter shutdown. I propose to add an explicit API to deal with such identifiers. 
With this API,

    tmp = PyObject_CallMethod(result, "update", "O", other);

would be replaced with

    PyObject *tmp;
    Py_identifier(update);
    ...
    tmp = PyObject_CallMethodId(result, &PyId_update, "O", other);

Py_identifier expands to a struct

    typedef struct Py_Identifier {
        struct Py_Identifier *next;
        const char* string;
        PyObject *object;
    } Py_Identifier;

string will be initialized by the compiler, next and object on first use. The new API for that will be

    PyObject* PyUnicode_FromId(Py_Identifier*);
    PyObject* PyObject_CallMethodId(PyObject*, Py_Identifier*, char*, ...);
    PyObject* PyObject_GetAttrId(PyObject*, Py_Identifier*);
    int PyObject_SetAttrId(PyObject*, Py_Identifier*, PyObject*);
    int PyObject_HasAttrId(PyObject*, Py_Identifier*);

I have micro-benchmarked this; for

    import time
    d={}
    i=d.items()
    t=time.time()
    for _ in range(10**6):
        i | d
    print(time.time()-t)

I get a speed-up of 30% (notice that "i | d" invokes the above PyObject_CallMethod call). Regards, Martin From martin at v.loewis.de Sat Oct 8 17:03:34 2011 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat, 08 Oct 2011 17:03:34 +0200 Subject: [Python-Dev] cpython (3.2): Issue #11956: Skip test_import.test_unwritable_directory on FreeBSD when run as In-Reply-To: <20111006212701.GA10627@cskk.homeip.net> References: <20111006212701.GA10627@cskk.homeip.net> Message-ID: <4E906646.7040408@v.loewis.de> > Pretending the snark to be slightly serious: you've missed the point. > The buildbots are building unreliable code, that being the point of the > test suite. Doing unpredictable stuff as root is bad juju. > > Running the buildbots and their tests should not be run as root except > for a very few special tests, and those few need careful consideration > and sandboxing. No no no no no. Running as a non-"privileged" user does not gain much. The code may be un*reliable*, but it is not un*trusted*. 
If the code disturbs the system, it can do so nearly as much as an unprivileged user as it can as the superuser. The critical part of the file system is the build area, and the build slave has full access to that either way. > HOWEVER, the whole suite should not be _tested_ as root because the code > being testing is by definition untrusted. No, you got that definition wrong. "unreliable" is correct; we don't have any untrusted code in Python. We trust all committers, as do we trust the integrity of the repository server. Regards, Martin From victor.stinner at haypocalc.com Sat Oct 8 17:14:55 2011 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Sat, 08 Oct 2011 17:14:55 +0200 Subject: [Python-Dev] [Python-ideas] PEP 3101 (Advanced string formatting) base 36 integer presentation type In-Reply-To: <20111008150336.3839a98c@pitrou.net> References: <4E8FC024.9000009@gmail.com> <20111008150336.3839a98c@pitrou.net> Message-ID: <4E9068EF.3050800@haypocalc.com> Le 08/10/2011 15:03, Antoine Pitrou a écrit : > On Fri, 07 Oct 2011 21:14:44 -0600 > Jeffrey wrote: >> I would like to suggest adding an integer presentation type for base 36 >> to PEP 3101. I can't imagine that it would be a whole lot more >> difficult than the existing types. Python's built-in long integers >> provide a nice way to prototype and demonstrate cryptographic >> operations, especially with asymmetric cryptography. (Alice and Bob >> stuff.) Built-in functions provide modular reduction, modular >> exponentiation, and lots of nice number theory stuff that supports a >> variety of protocols and algorithms. A frequent need is to represent a >> message by a number. Base 36 provides a way to represent all 26 letters >> in a semi-standard way, and simple string transformations can >> efficiently make zeros into spaces or vice versa. > > Why base 36 rather than, say, base 64 or even base 80? Base 85 is the most efficient base to format IPv6 addresses! 
http://tools.ietf.org/html/rfc1924 And Python doesn't provide a builtin function for this base! Victor From solipsis at pitrou.net Sat Oct 8 17:27:16 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sat, 8 Oct 2011 17:27:16 +0200 Subject: [Python-Dev] [Python-ideas] PEP 3101 (Advanced string formatting) base 36 integer presentation type References: <4E8FC024.9000009@gmail.com> <20111008150336.3839a98c@pitrou.net> <4E9068EF.3050800@haypocalc.com> Message-ID: <20111008172716.43de2ace@pitrou.net> On Sat, 08 Oct 2011 17:14:55 +0200 Victor Stinner wrote: > Le 08/10/2011 15:03, Antoine Pitrou a écrit : > > On Fri, 07 Oct 2011 21:14:44 -0600 > > Jeffrey wrote: > >> I would like to suggest adding an integer presentation type for base 36 > >> to PEP 3101. I can't imagine that it would be a whole lot more > >> difficult than the existing types. Python's built-in long integers > >> provide a nice way to prototype and demonstrate cryptographic > >> operations, especially with asymmetric cryptography. (Alice and Bob > >> stuff.) Built-in functions provide modular reduction, modular > >> exponentiation, and lots of nice number theory stuff that supports a > >> variety of protocols and algorithms. A frequent need is to represent a > >> message by a number. Base 36 provides a way to represent all 26 letters > >> in a semi-standard way, and simple string transformations can > >> efficiently make zeros into spaces or vice versa. > > > > Why base 36 rather than, say, base 64 or even base 80? > > Base 85 is the most efficient base to format IPv6 addresses! > > http://tools.ietf.org/html/rfc1924 Indeed, I meant base 85. There's a specification that doesn't rely on long integer division: http://en.wikipedia.org/wiki/Ascii85 Regards Antoine. 
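[Editorial aside: as the thread notes, there is no builtin base-36 *formatter*, although int() can already parse base 36. A small sketch of the encoding side — the to_base36 helper name is made up for illustration:]

```python
import string

# Digit alphabet for base 36: 0-9 then a-z.
DIGITS = string.digits + string.ascii_lowercase

def to_base36(n):
    """Encode a non-negative integer in base 36 (hypothetical helper)."""
    if n < 0:
        raise ValueError("negative values not handled in this sketch")
    if n == 0:
        return "0"
    out = []
    while n:
        n, r = divmod(n, 36)
        out.append(DIGITS[r])
    return "".join(reversed(out))

# The round trip works because int() accepts an explicit base up to 36.
assert int(to_base36(123456789), 36) == 123456789
```

The same divmod loop generalizes to any alphabet, which is exactly why proposals like this boil down to choosing a base and a digit set.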
From solipsis at pitrou.net Sat Oct 8 17:43:33 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sat, 8 Oct 2011 17:43:33 +0200 Subject: [Python-Dev] Identifier API References: <4E90640E.2040301@v.loewis.de> Message-ID: <20111008174333.5d6723b5@pitrou.net> On Sat, 08 Oct 2011 16:54:06 +0200 "Martin v. Löwis" wrote: > > I find the ad-hoc approach of declaring and initializing variables > inadequate, in particular since it is difficult to clean up all > those string objects at interpreter shutdown. > > I propose to add an explicit API to deal with such identifiers. That sounds like a good idea. > With this API, > > tmp = PyObject_CallMethod(result, "update", "O", other); > > would be replaced with > > PyObject *tmp; > Py_identifier(update); > ... > tmp = PyObject_CallMethodId(result, &PyId_update, "O", other); Surely there is something missing to initialize the "const char *" in the structure? Or is "Py_identifier()" actually a macro? > string will be initialized by the compiler, next and object on > first use. The new API for that will be > > PyObject* PyUnicode_FromId(Py_Identifier*); > PyObject* PyObject_CallMethodId(PyObject*, Py_Identifier*, char*, ...); > PyObject* PyObject_GetAttrId(PyObject*, Py_Identifier*); > int PyObject_SetAttrId(PyObject*, Py_Identifier*, PyObject*); > int PyObject_HasAttrId(PyObject*, Py_Identifier*); Do we want to make these APIs public? Regards Antoine. From g.brandl at gmx.net Sat Oct 8 18:15:23 2011 From: g.brandl at gmx.net (Georg Brandl) Date: Sat, 08 Oct 2011 18:15:23 +0200 Subject: [Python-Dev] Identifier API In-Reply-To: <20111008174333.5d6723b5@pitrou.net> References: <4E90640E.2040301@v.loewis.de> <20111008174333.5d6723b5@pitrou.net> Message-ID: Am 08.10.2011 17:43, schrieb Antoine Pitrou: > On Sat, 08 Oct 2011 16:54:06 +0200 > "Martin v. 
L?wis" wrote: >> >> I find the ad-hoc approach of declaring and initializing variables >> inadequate, in particular since it is difficult to clean up all >> those string objects at interpreter shutdown. >> >> I propose to add an explicit API to deal with such identifiers. > > That sounds like a good idea. > >> With this API, >> >> tmp = PyObject_CallMethod(result, "update", "O", other); >> >> would be replaced with >> >> PyObject *tmp; >> Py_identifier(update); >> ... >> tmp = PyObject_CallMethodId(result, &PyId_update, "O", other); > > Surely there is something missing to initialize the "const char *" in > the structure? Or is "Py_identifier()" actually a macro? Yes (note the parenthesized usage). >> string will be initialized by the compiler, next and object on >> first use. The new API for that will be >> >> PyObject* PyUnicode_FromId(Py_Identifier*); >> PyObject* PyObject_CallMethodId(PyObject*, Py_Identifier*, char*, ...); >> PyObject* PyObject_GetAttrId(PyObject*, Py_Identifier*); >> int PyObject_SetAttrId(PyObject*, Py_Identifier*, PyObject*); >> int PyObject_HasAttrId(PyObject*, Py_Identifier*); > > Do we want to make these APIs public? Probably not at first. I'd suggest making them private for Python 3.3, and if the approach proves satisfying, we can think about adding them to the public API in Python 3.4. Georg From g.rodola at gmail.com Sat Oct 8 18:57:47 2011 From: g.rodola at gmail.com (=?ISO-8859-1?Q?Giampaolo_Rodol=E0?=) Date: Sat, 8 Oct 2011 18:57:47 +0200 Subject: [Python-Dev] Bring new features to older python versions Message-ID: Hello everybody, at work we're using different versions of python, from 2.4 to 2.7. Because of the differences between the various versions in terms of features we have a "util.pycompat" module which basically is a copy & paste of different features which were added to stdlib in every new major version throughout the years. What we do is basically this. 
Instead of:

    from collections import namedtuple, OrderedDict
    import fractions

...we do:

    from util.pycompat.collections import namedtuple, OrderedDict
    from util.pycompat import fractions  # py 2.6
    from util.pycompat.builtins import all, any  # py 2.5
    # etc...

This lets us use different stdlib features which appeared in the latest Python versions (including 3.2) throughout all our code base. Now, what I have in mind is to release this as a public module so that everyone who cannot upgrade to a recent python version can benefit from newer features. By taking a quick look at the various "what's new" documents, this is a brief list of what this module would include:

functools (2.5)
any, all builtins (2.5)
collections.defaultdict (2.5)
property setters/deleters (2.6)
abc (2.6)
fractions (2.6)
collections.OrderedDict (2.7)
collections.Counter (2.7)
unittest2 (2.7)
functools.lru_cache (3.2)
functools.total_ordering (3.2)
itertools.accumulate (3.2)
reprlib (3.2)
contextlib.ContextDecorator (3.2)

I have a couple of doubts about this though. The first one is about licensing. What I would be doing is basically copy & paste pieces of the python stdlib modules (including tests) and, where needed, adjust them so that they work with older python versions. Would this represent a problem? My second doubt is about morality. Although this might be useful to those people who are forced to use older python versions, on the other hand it might represent an incentive for not upgrading (and there will be python 3.X features as well). Or maybe it won't, I don't know; the point is I feel kind of guilty. =) I'd like to hear your opinions, especially about the second point. 
Best regards, --- Giampaolo http://code.google.com/p/pyftpdlib/ http://code.google.com/p/psutil/ From solipsis at pitrou.net Sat Oct 8 20:26:13 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sat, 8 Oct 2011 20:26:13 +0200 Subject: [Python-Dev] Bring new features to older python versions References: Message-ID: <20111008202613.1a8565b8@pitrou.net> Ciao Giampaolo, > I have a couple of doubts about this though. > The first one is about licensing. > What I would be doing is basically copy & paste pieces of the python > stdlib modules (including tests) and, where needed, adjust them so > that they work with older python versions. > Would this represent problem? I don't think so. Python is distributed under a free non-copyleft license, which means you are basically free to do what you want as long as you don't try to change that license, or misrepresent authorship. (you can also mix that code with code under another license, such as the BSD, the GPL or even a proprietary license) > My second doubt is about morality. > Although this might be useful to those people who are forced to use > older python versions, on the other hand it might represent an > incentive for not upgrading (and there will be python 3.X features as > well). > Or maybe it won't, I don't know, point is I feel kind of guilty. =) I don't know. Certainly we would prefer people to upgrade. Also, the kind of support you will be able to provide as a single maintainer of that package may not be as good as what we collectively provide for the official Python distribution. There's also some stuff there that is coded in C, or that will rely on some functionality of the core interpreter that is not easily emulated on previous versions. But I suppose you'll find that out by yourself. But I wouldn't describe that as "immoral"; rather, suboptimal. cheers Antoine. 
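[Editorial aside: the bundling question above boils down to an import-time fallback. A minimal sketch of the pattern being proposed — the `pycompat` package name echoes the proposal and is hypothetical, not a real distribution:]

```python
import sys

# Prefer the stdlib implementation; reach for the bundled backport only
# on interpreters too old to provide it. An explicit version check (as
# opposed to try/except ImportError) makes the fallback easy to audit.
if sys.version_info >= (2, 7):
    from collections import OrderedDict
else:
    from pycompat.collections import OrderedDict  # hypothetical backport

d = OrderedDict([("b", 2), ("a", 1)])
assert list(d) == ["b", "a"]  # insertion order is preserved
```

Centralizing this check inside the compat package itself, rather than in every consumer, is what keeps downstream code (and downstream auditors) from having to repeat it.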
From martin at v.loewis.de Sat Oct 8 20:35:05 2011 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat, 08 Oct 2011 20:35:05 +0200 Subject: [Python-Dev] Bring new features to older python versions In-Reply-To: References: Message-ID: <4E9097D9.7010900@v.loewis.de> > The first one is about licensing. > What I would be doing is basically copy& paste pieces of the python > stdlib modules (including tests) and, where needed, adjust them so > that they work with older python versions. > Would this represent problem? You have a "nonexclusive, royalty-free, world-wide license to ..." "prepare derivative works, distribute, and otherwise use Python alone or in any derivative version," so: no, this is no problem ... "provided, however, that PSF's License Agreement and PSF's notice of copyright, i.e., "Copyright (c) 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011 Python Software Foundation; All Rights Reserved" are retained in Python alone or in any derivative version prepared by Licensee." > My second doubt is about morality. > Although this might be useful to those people who are forced to use > older python versions, on the other hand it might represent an > incentive for not upgrading (and there will be python 3.X features as > well). Don't worry about that. I'm not sure how many people would be interested in your approach in the first place - if I have to support old versions of Python, I personally just don't use newer features, and don't even have the desire to do so. If I want to use newer features, I decide to drop support for older versions. That I get both with a hack as such a module is just something that I *personally* would never consider (there are other reasons for me to consider hacks like this, such as when supporting multiple versions is just not feasible, but I wouldn't use a hack for convenience reasons). 
People that do feel the same way as you have probably started their own emulation layers already, so by publishing your emulation layer, it's not getting worse. Regards, Martin From fijall at gmail.com Sat Oct 8 21:38:26 2011 From: fijall at gmail.com (Maciej Fijalkowski) Date: Sat, 8 Oct 2011 21:38:26 +0200 Subject: [Python-Dev] Bring new features to older python versions In-Reply-To: <4E9097D9.7010900@v.loewis.de> References: <4E9097D9.7010900@v.loewis.de> Message-ID: On Sat, Oct 8, 2011 at 8:35 PM, "Martin v. Löwis" wrote: >> The first one is about licensing. >> What I would be doing is basically copy & paste pieces of the python >> stdlib modules (including tests) and, where needed, adjust them so >> that they work with older python versions. >> Would this represent a problem? > > You have a "nonexclusive, royalty-free, world-wide license to ..." > "prepare derivative works, distribute, and otherwise use Python alone or in > any derivative version," so: no, this is no problem ... > > "provided, however, that PSF's License Agreement and PSF's notice of > copyright, i.e., "Copyright (c) 2001, 2002, 2003, 2004, 2005, 2006, 2007, > 2008, 2009, 2010, 2011 Python Software Foundation; All Rights Reserved" are > retained in Python alone or in any derivative version prepared by Licensee." > >> My second doubt is about morality. >> Although this might be useful to those people who are forced to use >> older python versions, on the other hand it might represent an >> incentive for not upgrading (and there will be python 3.X features as >> well). > > Don't worry about that. I'm not sure how many people would be interested > in your approach in the first place - if I have to support old versions > of Python, I personally just don't use newer features, and don't even > have the desire to do so. If I want to use newer features, I decide to > drop support for older versions. 
That I get both with a hack as such > a module is just something that I *personally* would never consider > (there are other reasons for me to consider hacks like this, such as when > supporting multiple versions is just not feasible, but I wouldn't > use a hack for convenience reasons). > > People that do feel the same way as you have probably started their > own emulation layers already, so by publishing your emulation layer, > it's not getting worse. > > Regards, > Martin Most programs I know have their own imperfect version of such a thing, so I would definitely use it. Not everyone can drop support for older versions of python at will. Cheers, fijal From a.badger at gmail.com Sat Oct 8 21:47:01 2011 From: a.badger at gmail.com (Toshio Kuratomi) Date: Sat, 8 Oct 2011 12:47:01 -0700 Subject: [Python-Dev] Bring new features to older python versions In-Reply-To: References: Message-ID: <20111008194701.GD17566@unaka.lan> I have some similar code in kitchen: http://packages.python.org/kitchen/api-overview.html It wasn't as ambitious as your initial goals sound (I was only working on pulling out what was necessary for what people requested rather than an all-inclusive set of changes). You're welcome to join me and work on this aspect of kitchen if you'd like or you can go your own way and I'll probably start pointing people at your library (like I do with hashlib, bunch, iterutils, ordereddict, etc). I have a need to support a small amount of code as far back as python-2.3. I don't suppose you're interested in that as well? ;-) On Sat, Oct 08, 2011 at 06:57:47PM +0200, Giampaolo Rodolà
wrote: > functools (2.5) > any, all builtins (2.5) > collections.defaultdict (2.5) > property setters/deleters (2.6) > abc (2.6) > fractions (2.6) > collections.OrderedDict (2.7) > collections.Counter (2.7) > unittest2 (2.7) > functools.lru_cache (3.2) > functools.total_ordering (3.2) > itertools.accumulate (3.2) > reprlib (3.2) > contextlib.ContextDecorator (3.2) > You can also add subprocess to this list. There are various methods and functions that were added to subprocess since its first appearance in python-2.4 (check the library docs page for notes about this [1]_). hashlib (which has a pypi backport already) is another one. hmac is a third which you probably won't notice if you're just perusing docs. It's an issue because if someone tries to use the stdlib's hmac together with the pypi hashlib, hmac will fail unless it's from a recent enough python. .. [1] http://docs.python.org/library/subprocess.html Speaking as someone who works on a Linux distribution, one thing that I'd appreciate is if you could take care to make it so the copied code doesn't get used if the stdlib already provides the necessary code. If you do this, it makes it easier for people who have to audit the code to do their jobs. Instead of having to check every consumer of the compat library to make sure they use something like this::

    try:
        import functools
    except ImportError:
        from pycompat import functools

    import sys
    if sys.version_info >= (2, 5):
        import hmac
    else:
        from pycompat import hmac

you can depend on roughly the same logic having been performed in the library itself, which greatly eases their burden. You can look at the kitchen pycompat code for some examples of doing this [2]_. .. [2] http://bzr.fedorahosted.org/bzr/kitchen/devel/files/head:/kitchen/ -Toshio -------------- next part -------------- A non-text attachment was scrubbed...
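The guard Toshio describes can be centralized so each consumer needs only one import; a minimal, runnable sketch of that idea (the `_backports` package name and `load_compat` helper are hypothetical, for illustration only; kitchen's actual layout differs):

```python
import importlib
import sys

def load_compat(name, min_version, backport_package="_backports"):
    """Return the stdlib module when the interpreter is new enough,
    otherwise the bundled copy from a (hypothetical) backport package."""
    if sys.version_info >= min_version:
        return importlib.import_module(name)
    return importlib.import_module("%s.%s" % (backport_package, name))

# On any interpreter newer than 2.5 this resolves to the stdlib hmac,
# so auditors only need to check the version logic once, in the library.
hmac = load_compat("hmac", (2, 5))
```

With something like this inside the compat library itself, importing from it always yields the stdlib module when one is available, which is the auditing property Toshio asks for.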
Name: not available Type: application/pgp-signature Size: 198 bytes Desc: not available URL: From dirkjan at ochtman.nl Sat Oct 8 22:45:08 2011 From: dirkjan at ochtman.nl (Dirkjan Ochtman) Date: Sat, 8 Oct 2011 22:45:08 +0200 Subject: [Python-Dev] Bring new features to older python versions In-Reply-To: <20111008194701.GD17566@unaka.lan> References: <20111008194701.GD17566@unaka.lan> Message-ID: On Sat, Oct 8, 2011 at 21:47, Toshio Kuratomi wrote: > I have some similar code in kitchen: > http://packages.python.org/kitchen/api-overview.html It also sounds similar to six: http://pypi.python.org/pypi/six Avoiding all the duplicate effort would certainly make sense. Cheers, Dirkjan From benjamin at python.org Sat Oct 8 22:54:53 2011 From: benjamin at python.org (Benjamin Peterson) Date: Sat, 8 Oct 2011 16:54:53 -0400 Subject: [Python-Dev] Bring new features to older python versions In-Reply-To: References: <20111008194701.GD17566@unaka.lan> Message-ID: 2011/10/8 Dirkjan Ochtman : > On Sat, Oct 8, 2011 at 21:47, Toshio Kuratomi wrote: >> I have some similar code in kitchen: >> http://packages.python.org/kitchen/api-overview.html > > It also sounds similar to six: > > http://pypi.python.org/pypi/six Though six tries to be a bit minimalist and doesn't strive to include the "Kitchen sink" as it were. :) -- Regards, Benjamin From tjreedy at udel.edu Sun Oct 9 06:09:58 2011 From: tjreedy at udel.edu (Terry Reedy) Date: Sun, 09 Oct 2011 00:09:58 -0400 Subject: [Python-Dev] Bring new features to older python versions In-Reply-To: References: Message-ID: On 10/8/2011 12:57 PM, Giampaolo Rodolà wrote: > I have a couple of doubts about this though. > The first one is about licensing. Others have answered -- follow the license in giving credit, etc. > My second doubt is about morality.
> Although this might be useful to those people who are forced to use > older python versions, on the other hand it might represent an > incentive for not upgrading (and there will be python 3.X features as > well). > Or maybe it won't, I don't know, point is I feel kind of guilty. =) > > I'd like to hear your opinions, especially about the second point. While I would personally prefer that everyone else upgrade, I never expected that and I consider it a feature that one does not have to. The only thing that has really bothered me is people spreading FUD about Python 3, and that has mostly ended. Just be accurate in promoting your package. It will backport some newer stdlib features, but it will not remove old cruft, fix bugs, add other new features, or include doc improvements. -- Terry Jan Reedy From merwok at netwok.org Sun Oct 9 08:24:09 2011 From: merwok at netwok.org (=?UTF-8?B?w4lyaWMgQXJhdWpv?=) Date: Sun, 09 Oct 2011 08:24:09 +0200 Subject: [Python-Dev] Bring new features to older python versions In-Reply-To: References: Message-ID: <4E913E09.80209@netwok.org> Hi, > abc (2.6) I'm not sure this module is very useful without built-in support in isinstance and issubclass. > collections.OrderedDict (2.7) > unittest2 (2.7) Why not depend on the backports available on PyPI instead of re-backporting these in your project? > My second doubt is about morality. > Although this might be useful to those people who are forced to use > older python versions, on the other hand it might represent an > incentive for not upgrading (and there will be python 3.X features as > well). It's more about marketing than morality IMO :) As other people have said, many projects already have manual backports, so converging efforts on six (for a minimal compat layer) or your lib (for a fat layer) is just rationalization of existing practices. New versions of Python can fend for themselves IMO, they're not threatened that much by one lib with backports.
The issues I foresee with your lib are more technical: First, it looks like a big bag of backported modules, classes and functions without a defined criterion for inclusion ("cool new stuff"?). Second, will you keep on lumping new things until Python 3.4? 3.8? Won't that become unmanageable (boring/huge/hard)? Cheers From merwok at netwok.org Sun Oct 9 09:15:21 2011 From: merwok at netwok.org (=?UTF-8?B?w4lyaWMgQXJhdWpv?=) Date: Sun, 09 Oct 2011 09:15:21 +0200 Subject: [Python-Dev] Packaging and binary distributions for Python 3.3 In-Reply-To: References: Message-ID: <4E914A09.50209@netwok.org> Hi Paul, Thanks for raising this during the development phase. > I see that the Packaging documentation is now more complete (at least > at docs.python.org) - I don't know if it's deemed fully complete yet, > but I scanned the documentation and "Installing Python Projects" looks > pretty much converted (and very good!!), but "Distributing Python > Projects" still has quite a lot of distutils-related text in, and I > need to read more deeply to understand if that's because it remains > unchanged, or if it is still to be updated. The basic structure is in place (distributing/installing/library reference), but the docs are far from up-to-date. I have nearly finished a first patch that fixes a ton of markup issues and updates distutils idioms (setup.py snippets e.g.) to packaging ones (setup.cfg snippets, using pysetup, etc.); it's already a thousand changed lines. Then I will work on another patch to move things around, consolidate, expand and rephrase. See http://bugs.python.org/issue12779 if you want to follow along and review patches. > But one thing struck me - the "Installing Python Projects" document > talks about source distributions, but not much about binary > distributions. This is inherited from distutils docs, not a deliberate choice. We just haven't thought much, if at all, about binary distributions.
> On Windows, binary distributions are significantly more important than > on Unix, because not all users have easy access to a compiler, and > more importantly, C library dependencies can be difficult to build, > hard to set up, and generally a pain to deal with. Are there that many distributions with extension modules? sdists should work well even on Windows for pure Python projects. > I don't know what format packaging favours. As a direct distutils descendant, packaging can create bdist_wininst and bdist_msi. For installing, I was not aware of the problem you reported ("does not interact well with pysetup"); can you give more info? I'm guessing it boils down to the fact that Windows binary installers are meant to be clicked by users, not managed with command-line tools. IIRC the current behavior in pysetup is to favor source distributions, but bdists should probably be favored for non-pure distributions on Windows. > So there will be a need for a pysetup-friendly binary format. > I assume that the egg format will fill this role - or is that not the > case? What is the current thinking on binary distribution formats for > Python 3.3? First, we don't want to include wholesale support for setuptools eggs in packaging. We have options to support egg/egg-info metadata in the PEP 376 implementation (packaging.database), because we need that to provide a useful tool for users and help them switch, but eggs are another matter. After all, if setuptools and then pkg_resources were turned down for inclusion in Python 2.5, it's not now that we have packaging that we'll change our mind and just bless eggs. What we can do, however, is to see what bdist_egg does and define a new bdist command inspired by it, but without zipping, pkg_resource calls, etc.
> The main reason I am asking is that I would like to write an article > (or maybe a series of articles) for Python Insider, introducing the > new packaging facilities from the point of view of an end user with > straightforward needs (whether a package user just looking to manage a > set of installed packages, or a module author who just wants to > publish his code in a form that satisfies as many people as possible). That's excellent. I too thought about writing something about packaging for that blog, but an outside end-user viewpoint like yours would best match the readership. I can write a shorter piece just for packaging tool developers (i.e. how to use packaging as a library), or you can write that one too and act as a tester for our doc and API. > What I'd hope to do is, as well as showing people all the nice things > they can expect to see in Python 3.3, to also start package authors > thinking about what they need to do to support their users under the > new system. Yes! We need feedback to provide a much better tool than distutils, before the API is locked by backward compatibility rules. I actually wanted to talk about that, so let me take the opportunity. What if we released packaging in Python 3.3 (and distutils2 1.0 on PyPI) as a not-quite-final release? (Something like Python 3.0, which was not considered a real version and not supported as much as the other ones.) The goal would be to expose it to a large range of users to get bug reports and feature requests, but without locking us forever into one API or design, which was the death of distutils a year ago. The idea is not to scare people with warnings that we'll break APIs on a whim, but that we keep the option to change parts of packaging and release a 2.0 with Python 3.4, with documented changes from 3.3. Opinions?
Regards From victor.stinner at haypocalc.com Sun Oct 9 11:50:35 2011 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Sun, 09 Oct 2011 11:50:35 +0200 Subject: [Python-Dev] [Python-ideas] PEP 3101 (Advanced string formatting) base 36 integer presentation type In-Reply-To: <4E9068EF.3050800@haypocalc.com> References: <4E8FC024.9000009@gmail.com> <20111008150336.3839a98c@pitrou.net> <4E9068EF.3050800@haypocalc.com> Message-ID: <4E916E6B.3080506@haypocalc.com> Le 08/10/2011 17:14, Victor Stinner a écrit : > Le 08/10/2011 15:03, Antoine Pitrou a écrit : >> On Fri, 07 Oct 2011 21:14:44 -0600 >> Jeffrey wrote: >>> I would like to suggest adding an integer presentation type for base 36 >>> to PEP 3101. I can't imagine that it would be a whole lot more >>> difficult than the existing types. Python's built-in long integers >>> provide a nice way to prototype and demonstrate cryptographic >>> operations, especially with asymmetric cryptography. (Alice and Bob >>> stuff.) Built-in functions provide modular reduction, modular >>> exponentiation, and lots of nice number theory stuff that supports a >>> variety of protocols and algorithms. A frequent need is to represent a >>> message by a number. Base 36 provides a way to represent all 26 letters >>> in a semi-standard way, and simple string transformations can >>> efficiently make zeros into spaces or vice versa. >> >> Why base 36 rather than, say, base 64 or even base 80? > > Base 85 is the most efficient base to format IPv6 addresses! > > http://tools.ietf.org/html/rfc1924 > > And Python doesn't provide builtin function for this base! > > Victor Oops, I answered to the wrong mailing list.
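Absent a built-in base-36 presentation type, the prototyping need Jeffrey describes is only a few lines of Python; `to_base36` below is an illustrative helper, not a stdlib or proposed API (note that `int(s, 36)` already handles the decoding direction):

```python
import string

DIGITS = string.digits + string.ascii_lowercase  # "0"-"9" then "a"-"z": 36 symbols

def to_base36(n):
    """Format an integer in base 36; the inverse of int(s, 36)."""
    if n < 0:
        return "-" + to_base36(-n)
    if n < 36:
        return DIGITS[n]
    return to_base36(n // 36) + DIGITS[n % 36]

print(to_base36(35))              # -> "z"
print(to_base36(2 ** 16))         # -> "1ekg"
print(int(to_base36(12345), 36))  # round-trips -> 12345
```

A `"z"` presentation type for `str.format` would essentially bake this loop into the formatting machinery.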
Victor From p.f.moore at gmail.com Sun Oct 9 13:54:32 2011 From: p.f.moore at gmail.com (Paul Moore) Date: Sun, 9 Oct 2011 12:54:32 +0100 Subject: [Python-Dev] Packaging and binary distributions for Python 3.3 In-Reply-To: <4E914A09.50209@netwok.org> References: <4E914A09.50209@netwok.org> Message-ID: On 9 October 2011 08:15, Éric Araujo wrote: >> But one thing struck me - the "Installing Python Projects" document >> talks about source distributions, but not much about binary >> distributions. > This is inherited from distutils docs, not a deliberate choice. We just > haven't thought much, if at all, about binary distributions. > >> On Windows, binary distributions are significantly more important than >> on Unix, because not all users have easy access to a compiler, and >> more importantly, C library dependencies can be difficult to build, >> hard to set up, and generally a pain to deal with. > Are there that many distributions with extension modules? sdists should > work well even on Windows for pure Python projects. To be honest, I'm not that sure. I tend to find that many of the ones I want to use have binary dependencies, but maybe I'm atypical :-) Looking at my installations, I see:

- database drivers (cx_Oracle, in my case)
- lxml
- pywin32
- pyQT
- pyzmq (that's just for playing a bit with IPython, so doesn't really count...)
- I've also used in the past PIL, mod_python (mod_wsgi more recently) and wxPython, but don't these days because they either aren't available for Python 3, or no binaries are available and building them is a pain.

I've hit others in the past, but mainly just in idle hacking, so I don't depend on them as such (and can't really remember which). >> I don't know what format packaging favours. > As a direct distutils descendant, can create bdist_wininst and > bdist_msi. For installing, I was not aware of the problem you reported > ("does not interact well with pysetup"); can you give more info?
I'm > guessing it boils down to the fact that Windows binary installers are > meant to be clicked by users, not managed with command-line tools. Precisely that (and nothing really more). The pysetup features for uninstalling packages aren't going to work with bdist_wininst/bdist_msi (that's an assumption, I haven't tried them but I can't see how they would, and it'd certainly be a lot of marginally-useful effort to do even if it were possible). The virtual environment stuff also wouldn't work that well with the installers, because they wouldn't have any way of finding which environments existed to ask where to install to. The same problem exists with virtualenv. (Again this is speculation backed by a small amount of playing with virtualenv, so I may be wrong here). > IIRC the current behavior in pysetup is to favor source distributions, > but bdists should probably be favored for non-pure distributions on Windows. > >> So there will be a need for a pysetup-friendly binary format. >> I assume that the egg format will fill this role - or is that not the >> case? What is the current thinking on binary distribution formats for >> Python 3.3? > First, we don't want to include wholesale support for setuptools eggs in > packaging. We have options to support egg/egg-info metadata in the PEP > 376 implementation (packaging.database), because we need that to provide > a useful tool for users and help them switch, but eggs are another > matter. After all, if setuptools and then pkg_resources were turned > down for inclusion in Python 2.5, it's not now that we have packaging > that we'll change our mind and just bless eggs. What we can do however > is to see what bdist_egg does and define a new bdist command inspired by > it, but without zipping, pkg_resource calls, etc. It may be that the bdist_dumb format would be OK. I haven't checked it out (to be honest, I don't think it's ever been used much).
I could have a play with that and see if it did the job (or could be made to). Like you say, eggs have a lot of extra infrastructure that wouldn't be needed here. >> The main reason I am asking is that I would like to write an article >> (or maybe a series of articles) for Python Insider, introducing the >> new packaging facilities from the point of view of an end user with >> straightforward needs (whether a package user just looking to manage a >> set of installed packages, or a module author who just wants to >> publish his code in a form that satisfies as many people as possible). > That's excellent. I too thought about writing something about packaging > for that blog, but an outside end-user viewpoint like yours would best > match the readership. I can write a shorter piece just for packaging > tool developers (i.e. how to use packaging as a library), or you can > write that one too and act as a tester for our doc and API. Let's see how things go. My goal is to evangelise packaging so that people writing packages I use will create binary builds to save me the effort (:-)) - but anything else I can do, if I have the time, I'm happy to chip in with. >> What I'd hope to do is, as well as showing people all the nice things >> they can expect to see in Python 3.3, to also start package authors >> thinking about what they need to do to support their users under the >> new system. > Yes! We need feedback to provide a much better tool than distutils, > before the API is locked by backward compatibility rules. Always the chicken and egg problem :-) > I actually wanted to talk about that, so let me take the opportunity. > What if we released packaging in Python 3.3 (and distutils2 1.0 on PyPI) > as a not-quite-final release? (Something like Python 3.0, which was not > considered a real version and not supported as much as the other ones.)
> The goal would be to expose it to a large range of users to get bug > reports and feature requests, but without locking us forever into one > API or design, which was the death of distutils a year ago. The idea is > not to scare people with warnings that we'll break APIs on a whim, but > that we keep the option to change parts of packaging and release a 2.0 > with Python 3.4, with documented changes from 3.3. Opinions? My immediate thought is that it would actually put people off using packaging for 3.3; they'd wait until "it is stable". What is the status of distutils2? Is that (still?) intended to be effectively a backport of packaging to earlier Python versions? If so, then I'd suggest getting a distutils2 release available, and promoted, as the "early adopter" version of packaging. Maybe even with an option to install it as "packaging" so that people can use it in 3.2 and earlier and expect to need no changes when 3.3 is released. That would have the usual "nobody bothers testing stuff that hasn't been released yet" problems, but it might at least get some take-up. But maybe that's what distutils2 already is, and it just needs more promotion? A python-announce article "Python 3.3 new packaging features - early adopter release" publicising it would be what I'm thinking of... Paul. From fuzzyman at voidspace.org.uk Sun Oct 9 17:35:25 2011 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Sun, 09 Oct 2011 16:35:25 +0100 Subject: [Python-Dev] Bring new features to older python versions In-Reply-To: References: <4E9097D9.7010900@v.loewis.de> Message-ID: <4E91BF3D.8050001@voidspace.org.uk> On 08/10/2011 20:38, Maciej Fijalkowski wrote: > On Sat, Oct 8, 2011 at 8:35 PM, "Martin v. Löwis" wrote: >>> The first one is about licensing. >>> What I would be doing is basically copy & paste pieces of the python >>> stdlib modules (including tests) and, where needed, adjust them so >>> that they work with older python versions. >>> Would this represent problem?
>> You have a "nonexclusive, royalty-free, world-wide license to ..." >> "prepare derivative works, distribute, and otherwise use Python alone or in >> any derivative version," so: no, this is no problem ... >> >> "provided, however, that PSF's License Agreement and PSF's notice of >> copyright, i.e., "Copyright (c) 2001, 2002, 2003, 2004, 2005, 2006, 2007, >> 2008, 2009, 2010, 2011 Python Software Foundation; All Rights Reserved" are >> retained in Python alone or in any derivative version prepared by Licensee." >> >>> My second doubt is about morality. >>> Although this might be useful to those people who are forced to use >>> older python versions, on the other hand it might represent an >>> incentive for not upgrading (and there will be python 3.X features as >>> well). >> Don't worry about that. I'm not sure how many people would be interested >> in your approach in the first place - if I have to support old versions >> of Python, I personally just don't use newer features, and don't even >> have the desire to do so. If I want to use newer features, I decide to >> drop support for older versions. That I get both with a hack as such >> a module is just something that I *personally* would never consider >> (there are other reasons for me to consider hacks like this, such as when >> supporting multiple versions is just not feasible, but I wouldn't >> use a hack for convenience reasons). >> >> People that do feel the same way as you have probably started their >> own emulation layers already, so by publishing your emulation layer, >> it's not getting worse. >> >> Regards, >> Martin > Most programs I know have it's own imperfect version of such thing, so > I would definitely use it. Not everyone can drop support for older > versions of python at will. Ditto. 
unittest2 and the mock test suite both have a subset of this in for some of the newer Python standard library features they use (plus putting back into Python 3 some of the things that disappeared like callable and apply). All the best, Michael Foord > Cheers, > fijal > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/fuzzyman%40voidspace.org.uk > -- http://www.voidspace.org.uk/ May you do good and not evil May you find forgiveness for yourself and forgive others May you share freely, never taking more than you give. -- the sqlite blessing http://www.sqlite.org/different.html From pje at telecommunity.com Sun Oct 9 21:31:23 2011 From: pje at telecommunity.com (PJ Eby) Date: Sun, 9 Oct 2011 15:31:23 -0400 Subject: [Python-Dev] Packaging and binary distributions for Python 3.3 In-Reply-To: <4E914A09.50209@netwok.org> References: <4E914A09.50209@netwok.org> Message-ID: On Sun, Oct 9, 2011 at 3:15 AM, Éric Araujo wrote: > After all, if setuptools and then pkg_resources were turned > down for inclusion in Python 2.5, it's not now that we have packaging that we'll change our mind and just bless eggs. Actually, that's not what happened. I withdrew the approved-by-Guido, announced-at-PyCon, and already-in-progress implementation, both because of the lack of package management features, and because of support concerns raised by Fredrik Lundh. (At that time, the EggFormats doc didn't exist, and there were not as many people familiar with the design or code as there are now.) For the full statement, see: http://mail.python.org/pipermail/python-dev/2006-April/064145.html (The withdrawal is after a lot of background on the history of setuptools and what it was designed for.)
In any case, it definitely wasn't the case that eggs or setuptools were rejected for 2.5; they were withdrawn for reasons that didn't have anything to do with the format itself. (And, ironically enough, AFAIK the new packaging module uses code that's actually based on the bits of setuptools Fredrik was worried about supporting... but at least there now are more people providing that support.) What we can do however > is to see what bdist_egg does and define a new bdist command inspired by > it, but without zipping, pkg_resource calls, etc. > Why? If you just want a dumb bdist format, there's already bdist_dumb. Conversely, if you want a smarter format, why reinvent wheels? -------------- next part -------------- An HTML attachment was scrubbed... URL: From ziade.tarek at gmail.com Sun Oct 9 21:47:08 2011 From: ziade.tarek at gmail.com (=?ISO-8859-1?Q?Tarek_Ziad=E9?=) Date: Sun, 9 Oct 2011 21:47:08 +0200 Subject: [Python-Dev] Packaging and binary distributions for Python 3.3 In-Reply-To: References: <4E914A09.50209@netwok.org> Message-ID: On Sun, Oct 9, 2011 at 9:31 PM, PJ Eby wrote: ... >> What we can do however >> is to see what bdist_egg does and define a new bdist command inspired by >> it, but without zipping, pkg_resource calls, etc. > > Why? If you just want a dumb bdist format, there's already bdist_dumb. > Conversely, if you want a smarter format, why reinvent wheels? Just to make sure we're on the same page here. PEP 376 provides the installation format for the 'future' -- http://www.python.org/dev/peps/pep-0376/ Introducing back another *installation* format would be against the goal we've initially had with PEP 376: have a single installation format all tools out there would support, for the sake of standardization and interoperability. (and for consumers in other communities) So, while 'eggs' are interesting as plugins for a given application (that was the initial use case, right?), please do not consider them as an installation format for Python.
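For context, the single installation format Tarek points to (PEP 376) records every installed distribution in a `dist-info` directory alongside the code; roughly:

```
site-packages/
    example/                  # the installed package itself
    example-1.0.dist-info/
        METADATA      # distribution metadata (PEP 345)
        RECORD        # list of installed files, with hashes and sizes
        INSTALLER     # name of the tool that performed the install
        REQUESTED     # present only if installed by explicit user
                      # request rather than as a dependency
```

Any binary format would be expected to produce this same on-disk state after installation, which is why it can stay orthogonal to the archive format itself.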
Now for a binary archive, that would get installed à la PEP 376, why not? I'd just be curious to have someone list the advantage of having a project released that way besides the "importable as-is" feature. Cheers Tarek -- Tarek Ziadé | http://ziade.org From p.f.moore at gmail.com Sun Oct 9 22:14:01 2011 From: p.f.moore at gmail.com (Paul Moore) Date: Sun, 9 Oct 2011 21:14:01 +0100 Subject: [Python-Dev] Packaging and binary distributions for Python 3.3 In-Reply-To: References: <4E914A09.50209@netwok.org> Message-ID: On 9 October 2011 20:47, Tarek Ziadé wrote: > On Sun, Oct 9, 2011 at 9:31 PM, PJ Eby wrote: > ... >>> What we can do however >>> is to see what bdist_egg does and define a new bdist command inspired by >>> it, but without zipping, pkg_resource calls, etc. >> >> Why? If you just want a dumb bdist format, there's already bdist_dumb. >> Conversely, if you want a smarter format, why reinvent wheels? > > Just to make sure we're on the same page here. > > PEP 376 provide the installation format for the 'future' -- > http://www.python.org/dev/peps/pep-0376/ [...] > Now for a binary archive, that would get installed à la PEP 376, why > not? I'd just be curious to have someone list the advantage of having > a project released that way besides the "importable as-is" feature. Agreed. I'm not looking at a new binary installed format - PEP 376 covers this fine. What I am looking at is how/if users without a compiler can get a file that contains all the bits they need to install a distribution. My expectation would be that the user would type pysetup install some_binary_format_file.zip and have that file unpacked and all the "bits" put in the appropriate place. Basically just like installing from a source archive - pysetup install project-1.0.tar.gz - but skipping the compile steps because the compiler output files are present.
That may need some extra intelligence in pysetup if it doesn't have this feature already (I sort of assumed it would, but that's likely because of my interest in binary formats) but if not it shouldn't be hard to add - just unzip the bits into the right place, or something similar. As regards the format, bdist_dumb is about the right level - but having just checked it has some problems (which if I recall, have been known for some time, and are why bdist_dumb doesn't get used). Specifically, bdist_dumb puts the location of site-packages ON THE BUILD SYSTEM into the archive, making it useless for direct unzipping on a target system which has Python installed somewhere else. See the following for an example:

PS D:\Data\python-sample> unzip -l .\dist\PackageName-1.0.win32.zip
Archive: ./dist/PackageName-1.0.win32.zip
 Length    EAs   ACLs     Date   Time   Name
--------    ---   ----     ----   ----   ----
    6656      0      0  09/10/11 20:56  Apps/Python32/Lib/site-packages/hello.pyd
     208      0      0  09/10/11 20:56  Apps/Python32/Lib/site-packages/PackageName-1.0-py3.2.egg-info
--------  -----  -----   -------
    6864      0      0   2 files

It should be simple enough to fix this in bdist_dumb, although a new name might be needed if backward compatibility of the old broken format matters... If pysetup doesn't have support for binary archives at all, then I'm happy to take a look at what might be involved in adding this. But I don't know the code at all, and I have little time, so I'm rather hoping I won't need to... Paul. PS The problem for me is that if pysetup only handles source builds, it's STILL annoyingly incomplete for my requirements (and possibly many Windows users'). So I feel this is a hole that needs to be filled before 3.3 is released, or pysetup won't be suitable as "the way to do all packaging in Python".
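One way to repair such an archive is to store (or rewrite) member names relative to site-packages, so the installing interpreter can prepend its own paths; a minimal sketch of that idea (`make_relative` is a hypothetical helper, not part of distutils or packaging):

```python
def make_relative(member, anchor="site-packages"):
    """Strip everything up to and including the anchor directory from an
    archive member name, leaving a path the installer can relocate."""
    parts = member.split("/")
    if anchor in parts:
        return "/".join(parts[parts.index(anchor) + 1:])
    return member  # no anchor found: assume the name is already relative

# The build-machine paths from the listing above become portable:
print(make_relative("Apps/Python32/Lib/site-packages/hello.pyd"))
# -> "hello.pyd"
```

A fixed bdist_dumb would apply something like this while writing the archive; the installer would then join each relative name onto the target system's own site-packages (or a virtual environment's equivalent).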
From fuzzyman at voidspace.org.uk Mon Oct 10 01:34:17 2011 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Mon, 10 Oct 2011 00:34:17 +0100 Subject: [Python-Dev] Packaging and binary distributions for Python 3.3 In-Reply-To: References: <4E914A09.50209@netwok.org> Message-ID: <51511CC6-EE62-49C2-BC76-F572B0EC03C1@voidspace.org.uk> On 9 Oct 2011, at 21:14, Paul Moore wrote: > On 9 October 2011 20:47, Tarek Ziadé wrote: >> On Sun, Oct 9, 2011 at 9:31 PM, PJ Eby wrote: >> ... >>>> What we can do however >>>> is to see what bdist_egg does and define a new bdist command inspired by >>>> it, but without zipping, pkg_resource calls, etc. >>> >>> Why? If you just want a dumb bdist format, there's already bdist_dumb. >>> Conversely, if you want a smarter format, why reinvent wheels? >> >> Just to make sure we're on the same page here. >> >> PEP 376 provide the installation format for the 'future' -- >> http://www.python.org/dev/peps/pep-0376/ > [...] >> Now for a binary archive, that would get installed à la PEP 376, why >> not? I'd just be curious to have someone list the advantage of having >> a project released that way besides the "importable as-is" feature. > > Agreed. I'm not looking at a new binary installed format - PEP 376 > covers this fine. What I am looking at is how/if users without a > compiler can get a file that contains all the bits they need to > install a distribution. Just to agree with Paul, a typical Windows Python user will not be able to install a non-binary version of a distribution that includes C code. Even on the Mac it is common to distribute binaries. Michael > > My expectation would be that the user would type pysetup install > some_binary_format_file.zip and have that file unpacked and all the > "bits" put in the appropriate place. Basically just like installing > from a source archive - pysetup install project-1.0.tar.gz - but > skipping the compile steps because the compiler output files are > present.
That may need some extra intelligence in pysetup if it > doesn't have this feature already (I sort of assumed it would, but > that's likely because of my interest in binary formats) but if not it > shouldn't be hard to add - just unzip the bits into the right place, > or something similar. > > As regards the format, bdist_dumb is about the right level - but > having just checked it has some problems (which if I recall, have been > known for some time, and are why bdist_dumb doesn't get used). > Specifically, bdist_dumb puts the location of site-packages ON THE > BUILD SYSTEM into the archive, making it useless for direct unzipping > on a target system which has Python installed somewhere else. > > See the following for an example: > > PS D:\Data\python-sample> unzip -l .\dist\PackageName-1.0.win32.zip > Archive: ./dist/PackageName-1.0.win32.zip > Length EAs ACLs Date Time Name > -------- --- ---- ---- ---- ---- > 6656 0 0 09/10/11 20:56 > Apps/Python32/Lib/site-packages/hello.pyd > 208 0 0 09/10/11 20:56 > Apps/Python32/Lib/site-packages/PackageName-1.0-py3.2.egg-info > -------- ----- ----- ------- > 6864 0 0 2 files > > It should be simple enough to fix this in bdist_dumb, although a new > name might be needed if backward compatibility of the old broken > format matters... > > If pysetup doesn't have support for binary archives at all, then I'm > happy to take a look at what might be involved in adding this. But I > don't know the code at all, and I have little time, so I'm rather > hoping I won't need to... > > Paul. > > PS The problem for me is that if pysetup only handles source builds, > it's STILL annoyingly incomplete for my requirements (and possibly > many Windows users') So I feel this is a hole that needs to be filled > before 3.3 is released, or pysetup won't be suitable as "the way to do > all packaging in Python". 
> _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/fuzzyman%40voidspace.org.uk > -- http://www.voidspace.org.uk/ May you do good and not evil May you find forgiveness for yourself and forgive others May you share freely, never taking more than you give. -- the sqlite blessing http://www.sqlite.org/different.html From pje at telecommunity.com Mon Oct 10 05:11:41 2011 From: pje at telecommunity.com (PJ Eby) Date: Sun, 9 Oct 2011 23:11:41 -0400 Subject: [Python-Dev] Packaging and binary distributions for Python 3.3 In-Reply-To: References: <4E914A09.50209@netwok.org> Message-ID: On Sun, Oct 9, 2011 at 4:14 PM, Paul Moore wrote: > As regards the format, bdist_dumb is about the right level - but > having just checked it has some problems (which if I recall, have been > known for some time, and are why bdist_dumb doesn't get used). > Specifically, bdist_dumb puts the location of site-packages ON THE > BUILD SYSTEM into the archive, making it useless for direct unzipping > on a target system which has Python installed somewhere else. > I don't know about the case for packaging/distutils2, but I know that in original distutils, you can work around this by making bdist_dumb call the install commands with different arguments. That is, it's a relatively shallow flaw in bdist_dumb. bdist_wininst, for example, is basically a zipped bdist_dumb with altered install arguments and an .exe header tacked on the front. (Along with a little extra data crammed in between the two.) 
From p.f.moore at gmail.com Mon Oct 10 11:49:56 2011 From: p.f.moore at gmail.com (Paul Moore) Date: Mon, 10 Oct 2011 10:49:56 +0100 Subject: [Python-Dev] Packaging and binary distributions for Python 3.3 In-Reply-To: References: <4E914A09.50209@netwok.org> Message-ID: 2011/10/10 PJ Eby : > On Sun, Oct 9, 2011 at 4:14 PM, Paul Moore wrote: >> >> As regards the format, bdist_dumb is about the right level - but >> having just checked it has some problems (which if I recall, have been >> known for some time, and are why bdist_dumb doesn't get used). >> Specifically, bdist_dumb puts the location of site-packages ON THE >> BUILD SYSTEM into the archive, making it useless for direct unzipping >> on a target system which has Python installed somewhere else. > > I don't know about the case for packaging/distutils2, but I know that in > original distutils, you can work around this by making bdist_dumb call the > install commands with different arguments. That is, it's a relatively > shallow flaw in bdist_dumb. Agreed. > bdist_wininst, for example, is basically a zipped bdist_dumb with altered > install arguments and an .exe header tacked on the front. (Along with a > little extra data crammed in between the two.) I'd propose that the install arguments used in bdist_wininst be transferred to bdist_dumb (or a new command bdist_binary created based on the same), because the bdist_wininst zip format has the following advantages: 1. Proven format, so it should deal with any edge cases like header files reasonably. And the code already exists. 2. Easily recognisable directory names (PLATLIB, PURELIB, etc) making detection easy without needing extra metadata. 3. At a pinch, a bdist_wininst installer could be treated as a dumb distribution without change (assuming the stdlib zip handling correctly ignores prepended data like the exe header). 
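Point 3 is indeed how the stdlib behaves: zipfile locates the archive by scanning for the end-of-central-directory record from the end of the file, so bytes prepended to a valid zip (such as an .exe stub) are ignored when reading. A quick sketch, where the fake stub bytes are purely illustrative:

```python
import io
import zipfile

# Build a normal zip archive in memory.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("PURELIB/pkg/__init__.py", "# package\n")

# Simulate a bdist_wininst-style .exe stub by prepending arbitrary
# bytes (128 bytes starting with the "MZ" executable magic).
stub = b"MZ" + b"\x00" * 126
combined = io.BytesIO(stub + buf.getvalue())

# zipfile finds the central directory from the end of the file and
# compensates for the prepended data, so the stub is simply skipped.
with zipfile.ZipFile(combined) as zf:
    print(zf.namelist())  # -> ['PURELIB/pkg/__init__.py']
```

This is the same property that lets self-extracting archives work, and it is what would allow a bdist_wininst installer to double as a dumb archive.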
Then pysetup could be enhanced to recognise and install the binary format in pretty much the same way as it does source formats (add another install_method to _run_install_from_dir that copies the files to the target locations along with appropriate checking and/or metadata handling). There might be a small amount of extra work to do, to check binary version compatibility, but that shouldn't be hard. If this is useful, I could look at creating a patch. (Once I get my build environment fixed so I can get 3.3 up and running - it looks like Python 3.3 can't be built with Visual C++ Express these days, the IDE can't convert the solution files because Express Edition doesn't support 64-bit. I'll have to fish out the full version and install that...) Paul. From ncoghlan at gmail.com Mon Oct 10 13:47:26 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 10 Oct 2011 21:47:26 +1000 Subject: [Python-Dev] Packaging and binary distributions for Python 3.3 In-Reply-To: References: <4E914A09.50209@netwok.org> Message-ID: On Mon, Oct 10, 2011 at 7:49 PM, Paul Moore wrote: > I'd propose that the install arguments used in bdist_wininst be > transferred to bdist_dumb (or a new command bdist_binary created based > on the same) bdist_zip, bdist_archive, bdist_simple would all work (bdist_binary is redundant, given what the 'b' stands for). The 'bdist_dumb' name has always irritated me, since the connotations more strongly favour 'stupid' than they do 'simple' (of course, a legitimate argument can be made that the default behaviour of bdist_dumb *is* pretty stupid). > If this is useful, I could look at creating a patch. (Once I get my > build environment fixed so I can get 3.3 up and running - it looks > like Python 3.3 can't be built with Visual C++ Express these days, the > IDE can't convert the solution files because Express Edition doesn't > support 64-bit. I'll have to fish out the full version and install > that...) 
IIRC, even the Express edition should still work once the 64 bit Platform SDK is installed. Regardless, the intent is that it should be possible to build Python with only the free tools from MS. If they broke the Express editions such that extra tools are needed, suggested updates to the devguide [1] would be appreciated. [1] http://docs.python.org/devguide/setup.html#windows Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From p.f.moore at gmail.com Mon Oct 10 13:58:07 2011 From: p.f.moore at gmail.com (Paul Moore) Date: Mon, 10 Oct 2011 12:58:07 +0100 Subject: [Python-Dev] Packaging and binary distributions for Python 3.3 In-Reply-To: References: <4E914A09.50209@netwok.org> Message-ID: On 10 October 2011 12:47, Nick Coghlan wrote: > IIRC, even the Express edition should still work once the 64 bit > Platform SDK is installed. Regardless, the intent is that it should be > possible to build Python with only the free tools from MS. If they > broke the Express editions such that extra tools are needed, suggested > updates to the devguide [1] would be appreciated. > > [1] http://docs.python.org/devguide/setup.html#windows It's the "once the 64 bit platform SDK is installed" bit that's the pain. To just build a test setup for a 32-bit PC it's a shame that you need to install a load of 64-bit tools. Hmm, looking at the devguide it says you should use VS 2008. If that's still the case, ignore me - I was using VS 2010 as that's what's (easily) downloadable. I've now installed VS Pro 2010. We'll see how that goes. I'd rather avoid downgrading to VS2008 (or having both at once) just for personal builds. But will if I have to. Nothing to see here (other than me not reading the docs), move right along... :-) Paul. 
From mail at timgolden.me.uk Mon Oct 10 14:22:31 2011 From: mail at timgolden.me.uk (Tim Golden) Date: Mon, 10 Oct 2011 13:22:31 +0100 Subject: [Python-Dev] Packaging and binary distributions for Python 3.3 In-Reply-To: References: <4E914A09.50209@netwok.org> Message-ID: <4E92E387.1040705@timgolden.me.uk> On 10/10/2011 12:58, Paul Moore wrote: > I've now installed VS Pro 2010. We'll see how that goes. I'd rather > avoid downgrading to VS2008 (or having both at once) just for personal > builds. But will if I have to. Fairly sure VS2010 won't work, Paul. At least it didn't when I was in the same situation a few months ago. I've had close to zero time to spend on Python lately so I could be wrong but I remember that I had to "downgrade" to VS2008 for that reason. TJG From vinay_sajip at yahoo.co.uk Mon Oct 10 15:47:36 2011 From: vinay_sajip at yahoo.co.uk (Vinay Sajip) Date: Mon, 10 Oct 2011 13:47:36 +0000 (UTC) Subject: [Python-Dev] Bring new features to older python versions References: Message-ID: Giampaolo Rodolà gmail.com> writes: > This let us use different stdlib features which appeared in latest > Python versions (including 3.2) throughout all our code base. > Now, what I have in mind is to release this as a public module so that > everyone who cannot upgrade to a recent python version can benefit of > newer features. There's also the logutils project, which aims to bring logging features added in recent Pythons to older Pythons - especially dictionary-based configuration and queue handlers to facilitate working with multiprocessing, ZeroMQ etc. http://code.google.com/p/logutils/ Regards, Vinay Sajip From ncoghlan at gmail.com Mon Oct 10 16:15:57 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 11 Oct 2011 00:15:57 +1000 Subject: [Python-Dev] Bring new features to older python versions In-Reply-To: References: Message-ID: On Mon, Oct 10, 2011 at 11:47 PM, Vinay Sajip wrote: > Giampaolo Rodolà 
gmail.com> writes: > >> This let us use different stdlib features which appeared in latest >> Python versions (including 3.2) throughout all our code base. >> Now, what I have in mind is to release this as a public module so that >> everyone who cannot upgrade to a recent python version can benefit of >> newer features. > > There's also the logutils project, which aims to bring logging features added in > recent Pythons to older Pythons - especially dictionary-based configuration and > queue handlers to facilitate working with multiprocessing, ZeroMQ etc. > > http://code.google.com/p/logutils/ Should we create an informational PEP or other resource to point people towards some of these forwards compatibility options? Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From vinay_sajip at yahoo.co.uk Mon Oct 10 18:12:52 2011 From: vinay_sajip at yahoo.co.uk (Vinay Sajip) Date: Mon, 10 Oct 2011 16:12:52 +0000 (UTC) Subject: [Python-Dev] Packaging and binary distributions for Python 3.3 References: <4E914A09.50209@netwok.org> Message-ID: Paul Moore gmail.com> writes: > I'd propose that the install arguments used in bdist_wininst be > transferred to bdist_dumb (or a new command bdist_binary created based > on the same), because the bdist_wininst zip format has the following > advantages: > > 1. Proven format, so it should deal with any edge cases like header > files reasonably. And the code already exists. > 2. Easily recognisable directory names (PLATLIB, PURELIB, etc) making > detection easy without needing extra metadata. > 3. At a pinch, a bdist_wininst installer could be treated as a dumb > distribution without change (assuming the stdlib zip handling > correctly ignores prepended data like the exe header). 
> > Then pysetup could be enhanced to recognise and install the binary > format in pretty much the same way as it does source formats (add > another install_method to _run_install_from_dir that copies the files > to the target locations along with appropriate checking and/or > metadata handling). A simple change to packaging will allow an archive containing a setup.cfg-based directory to be installed in the same way as a source directory. AFAICT this gives a more useful result than bdist_wininst (as you typically want to install in more places than PURELIB and PLATLIB, and the setup.cfg scheme allows for writing files to locations such as Powershell script directories for a user). > There might be a small amount of extra work to do, to check binary > version compatibility, but that shouldn't be hard. > > If this is useful, I could look at creating a patch. (Once I get my > build environment fixed so I can get 3.3 up and running - it looks > like Python 3.3 can't be built with Visual C++ Express these days, the > IDE can't convert the solution files because Express Edition doesn't > support 64-bit. I'll have to fish out the full version and install > that...) There's one thing that you touched on in an earlier post, which hasn't been further discussed: support for virtual environments. The executable installer format covers two things: packaging of version specific/compiled code, and the simplicity of point-and-click installation. This latter convenience is worth having, but the current installer stub (wininst-x.y.exe) does not know anything about virtual environments. If we care about virtual environment support (and I think we should), wininst.exe could be enhanced to provide a "Browse..." button to allow a user to select a virtual environment to install into, in addition to the detection of installed Pythons from the registry. 
If this is coupled with the ability to invoke a setup.cfg-based installation when the appended archive is for a setup.cfg-based directory tree, won't this pretty much tick all the boxes? Regards, Vinay Sajip From vinay_sajip at yahoo.co.uk Mon Oct 10 18:20:45 2011 From: vinay_sajip at yahoo.co.uk (Vinay Sajip) Date: Mon, 10 Oct 2011 16:20:45 +0000 (UTC) Subject: [Python-Dev] Bring new features to older python versions References: Message-ID: Nick Coghlan gmail.com> writes: > Should we create an informational PEP or other resource to point > people towards some of these forwards compatibility options? Or perhaps a page on www.python.org which is referenced by e.g. a footnote in PEP 387 (Backwards Compatibility Policy)? Regards, Vinay Sajip From p.f.moore at gmail.com Mon Oct 10 20:29:26 2011 From: p.f.moore at gmail.com (Paul Moore) Date: Mon, 10 Oct 2011 19:29:26 +0100 Subject: [Python-Dev] Packaging and binary distributions for Python 3.3 In-Reply-To: References: <4E914A09.50209@netwok.org> Message-ID: On 10 October 2011 17:12, Vinay Sajip wrote: > Paul Moore gmail.com> writes: > >> I'd propose that the install arguments used in bdist_wininst be >> transferred to bdist_dumb (or a new command bdist_binary created based >> on the same), because the bdist_wininst zip format has the following >> advantages: >> >> 1. Proven format, so it should deal with any edge cases like header >> files reasonably. And the code already exists. >> 2. Easily recognisable directory names (PLATLIB, PURELIB, etc) making >> detection easy without needing extra metadata. >> 3. At a pinch, a bdist_wininst installer could be treated as a dumb >> distribution without change (assuming the stdlib zip handling >> correctly ignores prepended data like the exe header). 
>> >> Then pysetup could be enhanced to recognise and install the binary >> format in pretty much the same way as it does source formats (add >> another install_method to _run_install_from_dir that copies the files >> to the target locations along with appropriate checking and/or >> metadata handling). > > A simple change to packaging will allow an archive containing a setup.cfg-based > directory to be installed in the same way as a source directory. AFAICT this > gives a more useful result than bdist_wininst (as you typically want to install > in more places than PURELIB and PLATLIB, and the setup.cfg scheme allows for > writing files to locations such as Powershell script directories for a user). I'm not sure what you mean by a "setup.cfg-based directory". Could you clarify, and maybe explain how you'd expect to create such an archive? We may be talking at cross-purposes here. >> There might be a small amount of extra work to do, to check binary >> version compatibility, but that shouldn't be hard. >> >> If this is useful, I could look at creating a patch. (Once I get my >> build environment fixed so I can get 3.3 up and running - it looks >> like Python 3.3 can't be built with Visual C++ Express these days, the >> IDE can't convert the solution files because Express Edition doesn't >> support 64-bit. I'll have to fish out the full version and install >> that...) > > There's one thing that you touched on in an earlier post, which hasn't been > further discussed: support for virtual environments. The executable installer > format covers two things: packaging of version specific/compiled code, and the > simplicity of point-and-click installation. This latter convenience is worth > having, but the current installer stub (wininst-x.y.exe) does not know anything > about virtual environments. If we care about virtual environment support (and I > think we should), wininst.exe could be enhanced to provide a "Browse..." 
button > to allow a user to select a virtual environment to install into, in addition to > the detection of installed Pythons from the registry. If this is coupled with > the ability to invoke a setup.cfg-based installation when the appended archive > is for a setup.cfg-based directory tree, won't this pretty much tick all the > boxes? Agreed - but I'm looking at a pysetup install approach to work for source and binary packages, essentially replacing the use of bdist_wininst and bdist_msi with sdist/bdist_simple archives. That's a change of heart for me, as I used to argue for wininst/msi over setuptools and similar - but pysetup includes all the listing and uninstalling features I wanted, so the "one unified approach" has won me over in preference to the platform integration. Ideally bdist_wininst and bdist_msi would also integrate with pysetup and with virtual environments, but I imagine that could be pretty hard to make work cleanly, as Windows doesn't really support multiple installations of a software package... (Plus, I've no real idea about how bdist_wininst works, so while you may be right, I wouldn't know how to do anything with your suggestion :-)) Paul. From g.rodola at gmail.com Mon Oct 10 22:21:58 2011 From: g.rodola at gmail.com (=?ISO-8859-1?Q?Giampaolo_Rodol=E0?=) Date: Mon, 10 Oct 2011 22:21:58 +0200 Subject: [Python-Dev] Bring new features to older python versions In-Reply-To: References: Message-ID: Thanks everybody for your feedback. I created a gcode project here: http://code.google.com/p/pycompat/ 2011/10/8 Antoine Pitrou : > There's also some stuff there that is coded in C, or that will rely on > some functionality of the core interpreter that is not easily > emulated on previous versions. But I suppose you'll find that out by > yourself. Yep, I'm still not sure what to do about this. I guess I'll just ignore that stuff in all those cases where rewriting it in python is too much effort. 
Toshio Kuratomi wrote: > I have a need to support a small amount of code as far back as python-2.3 > I don't suppose you're interested in that as well? ;-) I'm still not sure; 2.3 is way too old a version (it doesn't even have decorators). It might make sense only in case the lib gets widely used, which I doubt. Personally, at some point I deliberately dropped support for 2.3 from all of my code/libs, mainly because I couldn't use decorators, so I don't have a real need to do this. 2011/10/9 Éric Araujo : > The issues I foresee with your lib are more technical: First, it looks > like a big bag of backported modules, classes and functions without > defined criterion for inclusion ("cool new stuff"?). I'd say the criterion for inclusion is putting in everything which can be (re)written in python 2.4, such as any, all, collections.defaultdict and property setters/deleters (2.6). Pretty much all the stuff written in C would be left out, maybe with the exception of the functools module, which is important (for me at least), in which case I might try to rewrite it in pure Python. I'm sharing your doubts though. Maybe this isn't worth the effort in the first place. I'll try to write some more code and see whether this is a good candidate for a "public module". If not, I'll just get back to using it as an internal "private" module. 2011/10/9 Éric Araujo : > keep on lumping new things until Python 3.4? 3.8? Won't that become > unmanageable (boring/huge/hard)? I don't think it makes sense to go beyond the 3.2 version. Folks who are forced to use python 2.4 are already avoiding 2.6 and 2.7 features, let alone 3.X-only features. Plus, python 3.2 was already the latest 3.X version which still had something in common with 2.7. 
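The kind of pure-Python fallback involved is straightforward. As an illustrative sketch (not pycompat's actual code), here is how any() and all() - builtins added in 2.5 - could be supplied to Python 2.4 without shadowing them on newer versions:

```python
# Fallback definitions of the kind a compat module could ship for
# Python 2.4, which predates the any()/all() builtins (added in 2.5).
# On newer Pythons the try block succeeds and the builtins are kept.
try:
    any, all
except NameError:
    def any(iterable):
        """Return True if any element of the iterable is true."""
        for element in iterable:
            if element:
                return True
        return False

    def all(iterable):
        """Return True if all elements of the iterable are true."""
        for element in iterable:
            if not element:
                return False
        return True

print(any([0, 0, 1]), all([1, 2]), any([]))  # -> True True False
```

A compat package would typically expose such names from a single module (e.g. `from pycompat import any, all` - module layout hypothetical) so client code imports one place regardless of interpreter version.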
--- Giampaolo http://code.google.com/p/pyftpdlib/ http://code.google.com/p/psutil/ From vinay_sajip at yahoo.co.uk Mon Oct 10 22:38:32 2011 From: vinay_sajip at yahoo.co.uk (Vinay Sajip) Date: Mon, 10 Oct 2011 21:38:32 +0100 (BST) Subject: [Python-Dev] Packaging and binary distributions for Python 3.3 References: <4E914A09.50209@netwok.org> Message-ID: <1318279112.10373.YahooMailNeo@web25808.mail.ukl.yahoo.com> > I'm not sure what you mean by a "setup.cfg-based directory". Could > you > clarify, and maybe explain how you'd expect to create such an archive? > We may be talking at cross-purposes here. Here's how I see it: at present, you can install a project by specifying pysetup3 install path-to-directory where the named directory contains a setup.cfg (replacing setup.py) and a bunch of things to install. Exactly what to install where is specified in the setup.cfg: it covers not only python packages and modules but also arbitrary binary files. The setup.cfg format is extensible enough to allow specifying where files are installed not only declaratively (as static paths in setup.cfg) but also according to criteria computed at installation time (e.g., write some PowerShell scripts to the installing user's PowerShell profile location). Of course, since you can install arbitrary data (and record what was installed where, to allow uninstallation to work), you can certainly install DLLs too (modulo the user having write permissions for the installation location, but that's true for data files, too). In theory, therefore, a binary distribution could be installed from a directory containing a setup.cfg, some DLLs, Python source files, and other text and binary data files. Moreover, it would be just as easy to zip that whole directory up (using any zipping tools), and pass it around as a .zip file; at installation time, the packaging code would unpack the directory in a temporary location and install from there. 
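For illustration, a minimal setup.cfg for such a directory might look like the sketch below. The [metadata] and [files] section names follow the distutils2/packaging documentation, but the exact resource-mapping syntax and the category names in braces are an approximation from memory, and the paths are invented for the example:

```ini
[metadata]
name = PackageName
version = 1.0

[files]
packages = hello
resources =
    scripts/profile.ps1 = {scripts}
    data/hello.conf = {confdir}
```

The point is that arbitrary files (PowerShell scripts, config files, DLLs) get declarative destinations resolved at install time on the target machine, rather than absolute paths baked in at build time.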
The zip archive can, of course, be appended to an executable which does the relevant unpacking and calls to packaging code to actually do the installation. The scheme is conceptually the same as the wininst-x.y.exe does - only the details differ. This gives a one (double-)click installer. > Agreed - but I'm looking at a pysetup install approach to work for > source and binary packages, essentially replacing the use of > bdist_wininst and bdist_msi with sdist/bdist_simple archives. That's a > change of heart for me, as I used to argue for wininst/msi over > setuptools and similar - but pysetup includes all the listing and > uninstalling features I wanted, so the "one unified approach" has won > me over in preference to the platform integration. Right, but AFAICT pysetup3 will work now with a binary distribution, other than it does not contain mechanisms for checking Python version and platform compatibilities. Being a command line script, it will even support virtual environments without too much trouble - I've been working on this in the pythonv branch with some success. What's missing from things is a .exe installer; even though you might be happy without one, not having it may be seen by some as a retrograde step. > Ideally bdist_wininst and bdist_msi would also integrate with pysetup > and with virtual environments, but I imagine that could be pretty hard > to make work cleanly, as Windows doesn't really support multiple > installations of a software package... I don't think Windows itself cares in general, it's more about the specifics of what's being installed. Obviously some things like COM components would need to be managed centrally, but I would imagine that if you had two projects with separate versions of e.g. C-based extensions, you could install the relevant DLLs in separate virtual environments and not necessarily have problems with them coexisting. 
> (Plus, I've no real idea about how bdist_wininst works, so while you > may be right, I wouldn't know how to do anything with your suggestion > :-)) Though I can't claim to have looked at the codebase in detail, the overall scheme would appear to be this: bdist_wininst creates an executable from wininst-x.y.exe (part of Python, in a distutils directory), appends some metadata (used in the UI of wininst-x.y.exe - things like the package name, icon etc.) and appends to that an archive containing all the stuff to install. When the executable is run, the UI is presented incorporating relevant metadata, user input solicited and the archive contents installed according to that input. However, the installation locations are determined from the registry information on installed Pythons only, with no nod to the possibility of users having installed multiple virtual environments from those installed Pythons. Regards, Vinay Sajip From p.f.moore at gmail.com Mon Oct 10 23:20:25 2011 From: p.f.moore at gmail.com (Paul Moore) Date: Mon, 10 Oct 2011 22:20:25 +0100 Subject: [Python-Dev] Packaging and binary distributions for Python 3.3 In-Reply-To: <1318279112.10373.YahooMailNeo@web25808.mail.ukl.yahoo.com> References: <4E914A09.50209@netwok.org> <1318279112.10373.YahooMailNeo@web25808.mail.ukl.yahoo.com> Message-ID: On 10 October 2011 21:38, Vinay Sajip wrote: >> I'm not sure what you mean by a "setup.cfg-based directory". Could > >> you >> clarify, and maybe explain how you'd expect to create such an archive? >> We may be talking at cross-purposes here. > > Here's how I see it: at present, you can install a project by specifying > > pysetup3 install path-to-directory > > where the named directory contains a setup.cfg (replacing setup.py) and a bunch of things to install. Exactly what to install where is specified in the setup.cfg: > it covers not only python packages and modules but also arbitrary binary files. 
The setup.cfg format is extensible enough to allow specifying where files are > installed not only declaratively (as static paths in setup.cfg) but also according to criteria computed at installation time (e.g., write some PowerShell scripts to > the installing user's PowerShell profile location). > > Of course, since you can install arbitrary data (and record what was installed where, to allow uninstallation to work), you can certainly install DLLs too (modulo > the user having write permissions for the installation location, but that's true for data files, too). > > In theory, therefore, a binary distribution could be installed from a directory containing a setup.cfg, some DLLs, Python source files, and other text and binary > data files. Moreover, it would be just as easy to zip that whole directory up (using any zipping tools), and pass it around as a .zip file; at installation time, the > packaging code would unpack the directory in a temporary location and install from there. > > The zip archive can, of course, be appended to an executable which does the relevant unpacking and calls to packaging code to actually do the installation. > The scheme is conceptually the same as the wininst-x.y.exe does - only the details differ. This gives a one (double-)click installer. > Ah, I see what you are saying now. I hadn't realised that the setup.cfg format was that flexible. I'll look into it a bit more - you're right that it would be better to reuse the existing technology than to extend it if that's not needed. >> Agreed - but I'm looking at a pysetup install approach to work for >> source and binary packages, essentially replacing the use of >> bdist_wininst and bdist_msi with sdist/bdist_simple archives. 
That's a >> change of heart for me, as I used to argue for wininst/msi over >> setuptools and similar - but pysetup includes all the listing and >> uninstalling features I wanted, so the "one unified approach" has won >> me over in preference to the platform integration. > > Right, but AFAICT pysetup3 will work now with a binary distribution, other than it does not contain mechanisms for checking Python version and platform compatibilities. > Being a command line script, it will even support virtual environments without too much trouble - I've been working on this in the pythonv branch with some success. > What's missing from things is a .exe installer; even though you might be happy without one, not having it may be seen by some as a retrograde step. > Now I understand what you mean. I agree that an exe installer should be available. And given that it can be used like a zipfile as well if it follows the exe stub plus zipfile approach of bdist_wininst) then that sounds ideal. >> Ideally bdist_wininst and bdist_msi would also integrate with pysetup >> and with virtual environments, but I imagine that could be pretty hard >> to make work cleanly, as Windows doesn't really support multiple >> installations of a software package... > > I don't think Windows itself cares in general, it's more about the specifics of what's being installed. Obviously some things like COM components would need to be > managed centrally, but I would imagine that if you had two projects with separate versions of e.g. C-based extensions, you could install the relevant DLLs in separate > virtual environments and not necessarily have problems with them coexisting. Agreed, it's more common Windows conventions than windows itself. Plus having to invent distinct names for each entry to go into add/remove programs, which could get to be a pain with multiple venvs. 
>> (Plus, I've no real idea about how bdist_wininst works, so while you >> may be right, I wouldn't know how to do anything with your suggestion >> :-)) > > Though I can't claim to have looked at the codebase in detail, the overall scheme would appear to be this: bdist_wininst creates an executable from wininst-x.y.exe > (part of Python, in a distutils directory), appends some metadata (used in the UI of wininst-x.y.exe - things like the package name, icon etc.) and appends to that > an archive containing all the stuff to install. When the executable is run, the UI is presented incorporating relevant metadata, user input solicited and the archive > contents installed according to that input. However, the installation locations are determined from the registry information on installed Pythons only, with no nod to > the possibility of users having installed multiple virtual environments from those installed Pythons. Sorry, to be clear, yes I'm aware that's the general scheme. But I don't know much about writing installers in general, or the code of wininst-x.y.exe in particular, which is why I added the proviso. And I don't have the free time to work on a C-based installer to replace wininst-x.y.exe, which is why my focus is on pysetup. To summarise, then: 1. By using setup.cfg technology, it would be easy enough to zip up a binary build in a way that pysetup could unpack and install. 1a. A packaging command to build such an archive would be worth providing. 2. A GUI installer would still be valuable for many people 2a. Having the GUI work by doing a pysetup install passing the installer exe (which would have a zipfile as noted in 1 above appended) could make sense to avoid duplicating work. 2b. The GUI could do the extra needed to integrate with the OS, which pysetup wouldn't do 2c. There's a question over a GUI install followed by a pysetup uninstall, which wouldn't remove the add/remove entry... 3. 
Ideally, the GUI should co-operate with venvs, by offering some form of browse facility. The command line does this automatically. I'll do some research into setup.cfg capabilities and do some proof of concept work to see how all this would work. Does the above make sense? Paul. From ncoghlan at gmail.com Tue Oct 11 03:29:41 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 10 Oct 2011 21:29:41 -0400 Subject: [Python-Dev] Packaging and binary distributions for Python 3.3 In-Reply-To: References: <4E914A09.50209@netwok.org> Message-ID: On Mon, Oct 10, 2011 at 2:29 PM, Paul Moore wrote: > Ideally bdist_wininst and bdist_msi would also integrate with pysetup > and with virtual environments, but I imagine that could be pretty hard > to make work cleanly, as Windows doesn't really support multiple > installations of a software package... That's OK, the package managers get bypassed by pysetup on POSIX systems as well - that's kind of the point of language level virtual environments (they're an intermediate step between system installs and chroot installs, which in turn are an interim step on the road to full virtualised machines). There are hard to build packages on POSIX (e.g. PIL) that would also benefit from a good, cross-platform approach to binary installation. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From tjreedy at udel.edu Tue Oct 11 06:22:12 2011 From: tjreedy at udel.edu (Terry Reedy) Date: Tue, 11 Oct 2011 00:22:12 -0400 Subject: [Python-Dev] Bring new features to older python versions In-Reply-To: References: Message-ID: On 10/10/2011 4:21 PM, Giampaolo Rodolà wrote: > Thanks everybody for your feedback. > I created a gcode project here: > http://code.google.com/p/pycompat/ This project will be easier if the test suite for a particular function/class/module is up to par. If you find any gaping holes, you might file an issue on the tracker.
-- Terry Jan Reedy From vinay_sajip at yahoo.co.uk Tue Oct 11 09:59:45 2011 From: vinay_sajip at yahoo.co.uk (Vinay Sajip) Date: Tue, 11 Oct 2011 07:59:45 +0000 (UTC) Subject: [Python-Dev] Packaging and binary distributions for Python 3.3 References: <4E914A09.50209@netwok.org> <1318279112.10373.YahooMailNeo@web25808.mail.ukl.yahoo.com> Message-ID: Paul Moore gmail.com> writes: > To summarise, then: > > 1. By using setup.cfg technology, it would be easy enough to zip up a > binary build in a way that pysetup could unpack and install. > 1a. A packaging command to build such an archive would be worth providing. > 2. A GUI installer would still be valuable for many people > 2a. Having the GUI work by doing a pysetup install passing the > installer exe (which would have a zipfile as noted in 1 above > appended) could make sense to avoid duplicating work. > 2b. The GUI could do the extra needed to integrate with the OS, > which pysetup wouldn't do > 2c. There's a question over a GUI install followed by a pysetup > uninstall, which wouldn't remove the add/remove entry... > 3. Ideally, the GUI should co-operate with venvs, by offering some > form of browse facility. The command line does this automatically. > > I'll do some research into setup.cfg capabilities and do some proof of > concept work to see how all this would work. > > Does the above make sense? To me it does, and it would be useful to have some validation from the packaging folks. I looked at the dialog resources for wininst-x.y.exe and noticed that there is a "Find other ..." button which is hidden, and its handler (in PC\bdist_wininst\install.c) is commented out. However, the code called by the handler - GetOtherPythonVersion - is still there. Does anyone here know why the button has been made unavailable? 
Regards, Vinay Sajip From vinay_sajip at yahoo.co.uk Tue Oct 11 10:14:05 2011 From: vinay_sajip at yahoo.co.uk (Vinay Sajip) Date: Tue, 11 Oct 2011 08:14:05 +0000 (UTC) Subject: [Python-Dev] Test failures on Windows 7 Message-ID: I just cloned and built CPython default on Windows 7 32-bit (in a VM). The build was successful, but I get crashes when running the regression tests: test test_capi failed -- Traceback (most recent call last): File "C:\Users\Vinay\Projects\cpython\lib\test\test_capi.py", line 51, in test_no_FatalError_infinite_loop b'Fatal Python error:' AssertionError: b"Fatal Python error: PyThreadState_Get: no current thread\n\r\nThis application has requested the Runtime to terminate it in an unusual way.\nPlease contact the application's support team for more information." != b'Fatal Python error: PyThreadState_Get: no current thread' test test_faulthandler failed -- Traceback (most recent call last): File "C:\Users\Vinay\Projects\cpython\lib\test\test_faulthandler.py", line 175, in test_fatal_error 'xyz') File "C:\Users\Vinay\Projects\cpython\lib\test\test_faulthandler.py", line 105, in check_fatal_error self.assertRegex(output, regex) AssertionError: Regex didn't match: '^Fatal Python error: xyz\n\nCurrent\\ thread\\ XXX:\n File "", line 2 in $' not found in 'Fatal Python error: xyz\n\nCurrent thread XXX:\n File "", line 2 in \n\nThis application has requested the Runtime to terminate it in an unusual way.\nPlease contact the application\'s support team for more information.' It's been a few weeks since I built and tested on Windows 7, so I'm not sure what to make of these. I notice that at least some of the Windows 7 buildbots are green, so can someone advise whether there is any special configuring I need to do? I've just built from the solution file (using Visual Studio 2008 SP1).
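In both failures, the extra text is the Microsoft C runtime's abort() banner appended after Python's own fatal-error output. A hypothetical helper (not part of the actual test suite) that would make such comparisons tolerant of the banner:

```python
CRT_ABORT_BANNER = (
    b"This application has requested the Runtime to terminate it "
    b"in an unusual way."
)

def strip_crt_abort_banner(output):
    """Drop the Windows CRT abort() banner and trailing newlines."""
    idx = output.find(CRT_ABORT_BANNER)
    if idx != -1:
        output = output[:idx]
    return output.rstrip(b"\r\n")

# The captured stderr from the test_capi failure above:
captured = (
    b"Fatal Python error: PyThreadState_Get: no current thread\n\r\n"
    b"This application has requested the Runtime to terminate it "
    b"in an unusual way.\nPlease contact the application's support "
    b"team for more information."
)
print(strip_crt_abort_banner(captured))
# b'Fatal Python error: PyThreadState_Get: no current thread'
```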
Regards, Vinay Sajip From stefan at bytereef.org Tue Oct 11 10:23:46 2011 From: stefan at bytereef.org (Stefan Krah) Date: Tue, 11 Oct 2011 10:23:46 +0200 Subject: [Python-Dev] Test failures on Windows 7 In-Reply-To: References: Message-ID: <20111011082346.GA16523@sleipnir.bytereef.org> Vinay Sajip wrote: > test test_capi failed -- Traceback (most recent call last): > test test_faulthandler failed -- Traceback (most recent call last): The tests call abort(), and the handling on Windows is slightly peculiar. See: http://bugs.python.org/issue9116 http://bugs.python.org/issue11732 Stefan Krah From amauryfa at gmail.com Tue Oct 11 10:41:56 2011 From: amauryfa at gmail.com (Amaury Forgeot d'Arc) Date: Tue, 11 Oct 2011 10:41:56 +0200 Subject: [Python-Dev] Test failures on Windows 7 In-Reply-To: References: Message-ID: Hi, 2011/10/11 Vinay Sajip : > AssertionError: b"Fatal Python error: PyThreadState_Get: no current thread\n\r\n > This application has requested the Runtime to terminate it in an unusual way.\nP > lease contact the application's support team for more information." != b'Fatal P > ython error: PyThreadState_Get: no current thread' Can these additional lines "This application has requested the Runtime to terminate..." be the equivalent of the infamous popups we had sometimes? I know that buildbots modified a specific registry key, I don't remember which one though :-( -- Amaury Forgeot d'Arc From vinay_sajip at yahoo.co.uk Tue Oct 11 12:08:10 2011 From: vinay_sajip at yahoo.co.uk (Vinay Sajip) Date: Tue, 11 Oct 2011 10:08:10 +0000 (UTC) Subject: [Python-Dev] Where are the build files for recent wininst-x.y.exe programs in packaging? Message-ID: The packaging/command folder contains wininst executables with the following suffixes: -6.0.exe, -7.1.exe, -8.0.exe, -9.0.exe, -9.0.amd64.exe, -10.0.exe, and -10.0-amd64.exe. However, the build files in PC\bdist_wininst only seem to cover building up to -8.0.exe; there are no build files for -9.0 and -10.0 versions. 
Where can these be found? Thanks, Vinay Sajip From vinay_sajip at yahoo.co.uk Tue Oct 11 12:13:52 2011 From: vinay_sajip at yahoo.co.uk (Vinay Sajip) Date: Tue, 11 Oct 2011 10:13:52 +0000 (UTC) Subject: [Python-Dev] =?utf-8?q?Where_are_the_build_files_for_recent_winin?= =?utf-8?q?st-x=2Ey=2Eexe=09programs_in_packaging=3F?= References: Message-ID: Vinay Sajip yahoo.co.uk> writes: Never mind, found the answer from closed issue 9818 - bdist_wininst.vcproj. From hrvoje.niksic at avl.com Tue Oct 11 14:36:56 2011 From: hrvoje.niksic at avl.com (Hrvoje Niksic) Date: Tue, 11 Oct 2011 14:36:56 +0200 Subject: [Python-Dev] Identifier API In-Reply-To: <4E90640E.2040301@v.loewis.de> References: <4E90640E.2040301@v.loewis.de> Message-ID: <4E943868.6070204@avl.com> On 10/08/2011 04:54 PM, "Martin v. Löwis" wrote: > tmp = PyObject_CallMethod(result, "update", "O", other); > > would be replaced with > > PyObject *tmp; > Py_identifier(update); > ... > tmp = PyObject_CallMethodId(result,&PyId_update, "O", other); An alternative I am fond of is to avoid introducing a new type, and simply initialize a PyObject * and register its address. For example: PyObject *tmp; static PyObject *s_update; // pick a naming convention PY_IDENTIFIER_INIT(update); tmp = PyObject_CallMethodObj(result, s_update, "O", other); (but also PyObject_GetAttr(o, s_update), etc.) PY_IDENTIFIER_INIT(update) might expand to something like: if (!s_update) { s_update = PyUnicode_InternFromString("update"); _Py_IdentifierAdd(&s_update); } _Py_IdentifierAdd adds the address of the variable to a global set of C variables that need to be decreffed and zeroed-out at interpreter shutdown. The benefits of this approach are: * you don't need special "identifier" versions of functions such as PyObject_CallMethod. In my example I invented a PyObject_CallMethodObj, but adding that might be useful anyway. * a lot of Python/C code implements similar caching, often leaking strings.
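A rough Python-level sketch of the same create-once-then-reuse idea (the helper name `intern_identifier` and its cache are invented for illustration; the actual proposal is C-API-only):

```python
import sys

# Analogous to the global set of registered C variables: maps an
# identifier to the single interned string object created for it.
_identifier_cache = {}

def intern_identifier(name):
    """Create an interned string on first use, then reuse the same object."""
    obj = _identifier_cache.get(name)
    if obj is None:
        obj = sys.intern(name)
        _identifier_cache[name] = obj
    return obj

d = {}
# Repeated calls hand back the identical object, so no new string is
# created (or leaked) per call site.
method = getattr(d, intern_identifier("update"))
assert intern_identifier("update") is intern_identifier("update")
method({"a": 1})
print(d)  # {'a': 1}
```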
Hrvoje From amauryfa at gmail.com Tue Oct 11 14:45:46 2011 From: amauryfa at gmail.com (Amaury Forgeot d'Arc) Date: Tue, 11 Oct 2011 14:45:46 +0200 Subject: [Python-Dev] Identifier API In-Reply-To: <4E943868.6070204@avl.com> References: <4E90640E.2040301@v.loewis.de> <4E943868.6070204@avl.com> Message-ID: 2011/10/11 Hrvoje Niksic > An alternative I am fond of is to avoid introducing a new type, and > simply initialize a PyObject * and register its address. For example: > > PyObject *tmp; > static PyObject *s_update; // pick a naming convention > > PY_IDENTIFIER_INIT(update); > tmp = PyObject_CallMethodObj(result, s_update, "O", other); > > (but also PyObject_GetAttr(o, s_update), etc.) > > PY_IDENTIFIER_INIT(update) might expand to something like: > > if (!s_update) { > s_update = PyUnicode_InternFromString("update"); > _Py_IdentifierAdd(&s_update); > } > It should also check for errors; in this case the initialization is a bit more verbose: if (PY_IDENTIFIER_INIT(update) < 0) ; -- Amaury Forgeot d'Arc -------------- next part -------------- An HTML attachment was scrubbed... URL: From fuzzyman at voidspace.org.uk Tue Oct 11 14:49:43 2011 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Tue, 11 Oct 2011 13:49:43 +0100 Subject: [Python-Dev] Bring new features to older python versions In-Reply-To: References: Message-ID: <4E943B67.3070901@voidspace.org.uk> On 10/10/2011 21:21, Giampaolo Rodolà wrote: > Thanks everybody for your feedback. > I created a gcode project here: > http://code.google.com/p/pycompat/ > > 2011/10/8 Antoine Pitrou: >> There's also some stuff there that is coded in C, or that will rely on >> some functionality of the core interpreter that is not easily >> emulated on previous versions. But I suppose you'll find that out by >> yourself. > Yep, I'm still not sure what to do about this. > I guess I'll just ignore that stuff in all those cases where rewriting > it in python is too much effort.
> > Toshio Kuratomi wrote: >> I have a need to support a small amount of code as far back as python-2.3 >> I don't suppose you're interested in that as well? ;-) > I'm still not sure; 2.3 version is way too old (it doesn't even have > decorators). > It might make sense only in case the lib gets widely used, which I doubt. > Personally, at some point I deliberately dropped support for 2.3 from > all of my code/lib, mainly because I couldn't use decorators. So I > don't have a real need to do this. Yes, rewriting code from Python 2.7 to support Python 2.3 (pre-decorators) is a real nuisance. In my projects I'm currently supporting Python 2.4+. I'll probably drop support for Python 2.4 soon which will allow for the use of the with statement. > > 2011/10/9 Éric Araujo: >> The issues I foresee with your lib are more technical: First, it looks >> like a big bag of backported modules, classes and functions without >> defined criterion for inclusion ("cool new stuff"?). > I'd say the criterion for inclusion is putting in everything which can > be (re)written in python 2.4, such as any, all, > collections.defaultdict and property setters/deleters (2.6). > Pretty much all the stuff written in C would be left out, maybe with > the exception of functools module which is important (for me at > least), in which case I might try to rewrite it in pure Python. > I'm sharing your same doubts though. > Maybe this isn't worth the effort in the first place. > I'll try to write some more code and see whether this is a good > candidate for a "public module". > If not I'll just get back to use it as an internal "private" module. > > 2011/10/9 Éric Araujo: >> keep on lumping new things until Python 3.4? 3.8? Won't that become >> unmanageable (boring/huge/hard)? > I don't think it makes sense to go beyond the 3.2 version. > Folks which are forced to use python 2.4 are already avoiding to use 2.6 > and 2.7 features, let alone 3.X only features.
> Plus, python 3.2 was already the latest 3.X version which still had > something in common with 2.7. However, if you can include Python 3.2+ features then projects that also support Python 3 can still use new features without having to worry about compatibility (it solves the same problem). All the best, Michael Foord > > > --- Giampaolo > http://code.google.com/p/pyftpdlib/ > http://code.google.com/p/psutil/ > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/fuzzyman%40voidspace.org.uk > -- http://www.voidspace.org.uk/ May you do good and not evil May you find forgiveness for yourself and forgive others May you share freely, never taking more than you give. -- the sqlite blessing http://www.sqlite.org/different.html From barry at python.org Tue Oct 11 15:19:43 2011 From: barry at python.org (Barry Warsaw) Date: Tue, 11 Oct 2011 09:19:43 -0400 Subject: [Python-Dev] Identifier API In-Reply-To: <4E943868.6070204@avl.com> References: <4E90640E.2040301@v.loewis.de> <4E943868.6070204@avl.com> Message-ID: <20111011091943.4160b217@resist.wooz.org> On Oct 11, 2011, at 02:36 PM, Hrvoje Niksic wrote: >On 10/08/2011 04:54 PM, "Martin v. Löwis" wrote: >> tmp = PyObject_CallMethod(result, "update", "O", other); >> >> would be replaced with >> >> PyObject *tmp; >> Py_identifier(update); >> ... >> tmp = PyObject_CallMethodId(result,&PyId_update, "O", other); > >An alternative I am fond of is to avoid introducing a new type, and simply >initialize a PyObject * and register its address. For example: > > PyObject *tmp; > static PyObject *s_update; // pick a naming convention > > PY_IDENTIFIER_INIT(update); > tmp = PyObject_CallMethodObj(result, s_update, "O", other); > > (but also PyObject_GetAttr(o, s_update), etc.) I like this better too because of the all-caps macro name.
Something about seeing "Py_identifier" look like a function call and having it add the magical PyId_update local bugs me. It just looks wrong, whereas the all-caps is more of a cultural clue that something else is going on. -Barry From hrvoje.niksic at avl.com Tue Oct 11 16:24:01 2011 From: hrvoje.niksic at avl.com (Hrvoje Niksic) Date: Tue, 11 Oct 2011 16:24:01 +0200 Subject: [Python-Dev] Identifier API In-Reply-To: References: <4E90640E.2040301@v.loewis.de> <4E943868.6070204@avl.com> Message-ID: <4E945181.801@avl.com> On 10/11/2011 02:45 PM, Amaury Forgeot d'Arc wrote: > It should also check for errors; in this case the initialization is a > bit more verbose: > if (PY_IDENTIFIER_INIT(update) < 0) > ; Error checking is somewhat more controversial because behavior in case of error differs between situations and coding patterns. I think it should be up to the calling code to check for s_update remaining NULL. In my example, I would expect PyObject_CallMethodObj and similar to raise InternalError when passed a NULL pointer. Since their return values are already checked, this should be enough to cover the unlikely case of identifier creation failing. Hrvoje From solipsis at pitrou.net Tue Oct 11 16:24:30 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 11 Oct 2011 16:24:30 +0200 Subject: [Python-Dev] Identifier API References: <4E90640E.2040301@v.loewis.de> <4E943868.6070204@avl.com> <20111011091943.4160b217@resist.wooz.org> Message-ID: <20111011162430.72ecaa6f@pitrou.net> On Tue, 11 Oct 2011 09:19:43 -0400 Barry Warsaw wrote: > On Oct 11, 2011, at 02:36 PM, Hrvoje Niksic wrote: > > >On 10/08/2011 04:54 PM, "Martin v. Löwis" wrote: > >> tmp = PyObject_CallMethod(result, "update", "O", other); > >> > >> would be replaced with > >> > >> PyObject *tmp; > >> Py_identifier(update); > >> ...
> >> tmp = PyObject_CallMethodId(result,&PyId_update, "O", other); > > > >An alternative I am fond of is to avoid introducing a new type, and simply > >initialize a PyObject * and register its address. For example: > > > > PyObject *tmp; > > static PyObject *s_update; // pick a naming convention > > > > PY_IDENTIFIER_INIT(update); > > tmp = PyObject_CallMethodObj(result, s_update, "O", other); > > > > (but also PyObject_GetAttr(o, s_update), etc.) > > I like this better too because of the all-caps macro name. Something about > seeing "Py_identifier" look like a function call and having it add the magical > PyId_update local bugs me. It just looks wrong, whereas the all-caps is more > of a cultural clue that something else is going on. +1 for something more recognizable. I think "const string" is more accurate than "identifier" as well. Regards Antoine. From benjamin at python.org Tue Oct 11 16:29:49 2011 From: benjamin at python.org (Benjamin Peterson) Date: Tue, 11 Oct 2011 10:29:49 -0400 Subject: [Python-Dev] Identifier API In-Reply-To: <20111011162430.72ecaa6f@pitrou.net> References: <4E90640E.2040301@v.loewis.de> <4E943868.6070204@avl.com> <20111011091943.4160b217@resist.wooz.org> <20111011162430.72ecaa6f@pitrou.net> Message-ID: 2011/10/11 Antoine Pitrou : > +1 for something more recognizable. > I think "const string" is more accurate than "identifier" as well. It should only really be used for identifiers, though, because the result is interned. -- Regards, Benjamin From stefan_ml at behnel.de Tue Oct 11 17:06:17 2011 From: stefan_ml at behnel.de (Stefan Behnel) Date: Tue, 11 Oct 2011 17:06:17 +0200 Subject: [Python-Dev] Identifier API In-Reply-To: <4E90640E.2040301@v.loewis.de> References: <4E90640E.2040301@v.loewis.de> Message-ID: "Martin v. Löwis", 08.10.2011 16:54: > In benchmarking PEP 393, I noticed that many UTF-8 decode > calls originate from C code with static strings, in particular > PyObject_CallMethod.
Many of such calls already have been optimized > to cache a string object, however, PyObject_CallMethod remains > unoptimized since it requires a char*. Yes, I noticed that in Cython, too. We often use PyObject_CallMethod() as a fallback for optimistically optimised method calls when the expected fast path does not hit, and it always bugged me that this needs to generate a Python string on each call in order to look up the method. > I propose to add an explicit API to deal with such identifiers. > With this API, > > tmp = PyObject_CallMethod(result, "update", "O", other); > > would be replaced with > > PyObject *tmp; > Py_identifier(update); > ... > tmp = PyObject_CallMethodId(result, &PyId_update, "O", other); > > Py_identifier expands to a struct > > typedef struct Py_Identifier { > struct Py_Identifier *next; > const char* string; > PyObject *object; > } Py_Identifier; > > string will be initialized by the compiler, next and object on > first use. As I understand it, the macro expands to both the ID variable declaration and the init-at-first-call code, right? This is problematic when more than one identifier is used, as some C compilers strictly require declarations to be written *before* any other code. I'm not sure how often users will need more than one identifier in a function, but if it's not too hard to come up with a way that avoids this problem altogether, it would be better to do so right from the start. Also note that existing code needs to be changed in order to take advantage of this. It might be possible to optimise PyObject_CallMethod() internally by making the lookup either reuse a number of cached Python strings, or by supporting a lookup of char* values in a dict *somehow*. However, this appears to be substantially more involved than just moving a smaller burden on the users.
Stefan From a.badger at gmail.com Tue Oct 11 17:39:34 2011 From: a.badger at gmail.com (Toshio Kuratomi) Date: Tue, 11 Oct 2011 08:39:34 -0700 Subject: [Python-Dev] Bring new features to older python versions In-Reply-To: References: Message-ID: <20111011153934.GA16100@unaka.lan> On Tue, Oct 11, 2011 at 12:22:12AM -0400, Terry Reedy wrote: > On 10/10/2011 4:21 PM, Giampaolo Rodolà wrote: > >Thanks everybody for your feedback. > >I created a gcode project here: > >http://code.google.com/p/pycompat/ > > This project will be easier if the test suite for a particular > function/class/module is up to par. If you find any gaping holes, you > might file an issue on the tracker. > About testsuites... one issue that you'll run into is that while some stdlib modules are written with backporting to older versions in mind, their testsuites are not. For instance, subprocess from python-2.7 runs fine on python-2.3+. The testsuite for subprocess in python-2.7 makes use of the with statement, though, so it has to be ported. -Toshio -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 198 bytes Desc: not available URL: From fuzzyman at voidspace.org.uk Tue Oct 11 17:41:33 2011 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Tue, 11 Oct 2011 16:41:33 +0100 Subject: [Python-Dev] Bring new features to older python versions In-Reply-To: <20111011153934.GA16100@unaka.lan> References: <20111011153934.GA16100@unaka.lan> Message-ID: <4E9463AD.5000807@voidspace.org.uk> On 11/10/2011 16:39, Toshio Kuratomi wrote: > On Tue, Oct 11, 2011 at 12:22:12AM -0400, Terry Reedy wrote: >> On 10/10/2011 4:21 PM, Giampaolo Rodolà wrote: >>> Thanks everybody for your feedback. >>> I created a gcode project here: >>> http://code.google.com/p/pycompat/ >> This project will be easier if the test suite for a particular >> function/class/module is up to par.
If you find any gaping holes, you >> might file an issue on the tracker. >> > About testsuites... one issue that you'll run into is that while some stdlib > modules are written with backporting to older versions in mind, their > testsuites are not. For instance, subprocess from python-2.7 runs fine on > python-2.3+. The testsuite for subprocess in python-2.7 makes use of the > with statement, though, so it has to be ported. Some of the tests will use newer features of unittest as well. These can of course be run with unittest2 which has been backported to Python 2.4 (although use of the with statements in the tests themselves will have to be changed still). All the best, Michael Foord > > -Toshio > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/fuzzyman%40voidspace.org.uk -- http://www.voidspace.org.uk/ May you do good and not evil May you find forgiveness for yourself and forgive others May you share freely, never taking more than you give. -- the sqlite blessing http://www.sqlite.org/different.html -------------- next part -------------- An HTML attachment was scrubbed... URL: From a.badger at gmail.com Tue Oct 11 18:14:12 2011 From: a.badger at gmail.com (Toshio Kuratomi) Date: Tue, 11 Oct 2011 09:14:12 -0700 Subject: [Python-Dev] Bring new features to older python versions In-Reply-To: <4E943B67.3070901@voidspace.org.uk> References: <4E943B67.3070901@voidspace.org.uk> Message-ID: <20111011161412.GB16100@unaka.lan> On Tue, Oct 11, 2011 at 01:49:43PM +0100, Michael Foord wrote: > On 10/10/2011 21:21, Giampaolo Rodolà wrote: > > > >Toshio Kuratomi wrote: > >>I have a need to support a small amount of code as far back as python-2.3 > >>I don't suppose you're interested in that as well? ;-) > >I'm still not sure; 2.3 version is way too old (it doesn't even have > >decorators).
> >It might make sense only in case the lib gets widely used, which I doubt. > >Personally, at some point I deliberately dropped support for 2.3 from > >all of my code/lib, mainly because I couldn't use decorators. So I > >don't have a real need to do this. > > Yes, rewriting code from Python 2.7 to support Python 2.3 > (pre-decorators) is a real nuisance. In my projects I'm currently > supporting Python 2.4+. I'll probably drop support for Python 2.4 > soon which will allow for the use of the with statement. > So actually, decorators aren't a big deal when thinking about porting a limited set of code to python-2.3. Decorators are simply syntactic sugar after all, so it's only a one-line change::

    @cache()
    def function(arg):
        # do_expensive_something
        return result

becomes::

    def function(arg):
        # do_expensive_something
        return result
    function = cache(function)

This may not be the preferred manner to write decorators but it's fairly straightforward and easy to remember compared to, say, porting away from the with statement. That said, this was in the nature of hopeful finger-crossing, not really expecting that I'd get someone else to commit to this as a limitation rather than a "this is not worthwhile unless you go back to python-2.3". I only have to bear this burden until February 29 and believe me, I'm anxiously awaiting that day :-) -Toshio -------------- next part -------------- A non-text attachment was scrubbed...
Name: not available Type: application/pgp-signature Size: 198 bytes Desc: not available URL: From pje at telecommunity.com Tue Oct 11 18:43:56 2011 From: pje at telecommunity.com (PJ Eby) Date: Tue, 11 Oct 2011 12:43:56 -0400 Subject: [Python-Dev] Bring new features to older python versions In-Reply-To: <20111011161412.GB16100@unaka.lan> References: <4E943B67.3070901@voidspace.org.uk> <20111011161412.GB16100@unaka.lan> Message-ID: On Tue, Oct 11, 2011 at 12:14 PM, Toshio Kuratomi wrote: > This may not be the preferred manner to write decorators but it's fairly > straightforward and easy to remember compared to, say, porting away from > the > with statement. > You can emulate 'with' using decorators, actually, if you don't mind a nested function. Some code from my Contextual library (minus the tests):

    def call_with(ctxmgr):
        """Emulate the PEP 343 "with" statement for Python versions <2.5

        The following examples do the same thing at runtime::

            Python 2.5+          Python 2.4
            ------------         -------------
            with x as y:         @call_with(x)
                print y          def do_it(y):
                                     print y

        ``call_with(foo)`` returns a decorator that immediately invokes
        the function it decorates, passing in the same value that would
        be bound by the ``as`` clause of the ``with`` statement.  Thus,
        by decorating a nested function, you can get most of the
        benefits of "with", at a cost of being slightly slower and
        perhaps a bit more obscure than the 2.5 syntax.

        Note: because of the way decorators work, the return value (if
        any) of the ``do_it()`` function above will be bound to the name
        ``do_it``.  So, this example prints "42"::

            @call_with(x)
            def do_it(y):
                return 42
            print do_it

        This is rather ugly, so you may prefer to do it this way
        instead, which more explicitly calls the function and gets back
        a value::

            def do_it(y):
                return 42
            print with_(x, do_it)
        """
        return with_.__get__(ctxmgr, type(ctxmgr))

    def with_(ctx, func):
        """Perform PEP 343 "with" logic for Python versions <2.5

        The following examples do the same thing at runtime::

            Python 2.5+          Python 2.3/2.4
            ------------         --------------
            with x as y:         z = with_(x,f)
                z = f(y)

        This function is used to implement the ``call_with()``
        decorator, but can also be used directly.  It's faster and more
        compact in the case where the function ``f`` already exists.
        """
        inp = ctx.__enter__()
        try:
            retval = func(inp)
        except:
            if not ctx.__exit__(*sys.exc_info()):
                raise
        else:
            ctx.__exit__(None, None, None)
        return retval

This version doesn't handle the multi-context syntax of newer pythons, but could probably be extended readily enough. -------------- next part -------------- An HTML attachment was scrubbed... URL: From mal at egenix.com Tue Oct 11 22:25:37 2011 From: mal at egenix.com (M.-A. Lemburg) Date: Tue, 11 Oct 2011 22:25:37 +0200 Subject: [Python-Dev] PEP 393 close to pronouncement In-Reply-To: <201109290227.48340.victor.stinner@haypocalc.com> References: <4E834EE7.4050706@egenix.com> <201109290227.48340.victor.stinner@haypocalc.com> Message-ID: <4E94A641.5000109@egenix.com> Victor Stinner wrote: >> Given that I've been working on and maintaining the Python Unicode >> implementation actively or by providing assistance for almost >> 12 years now, I've also thought about whether it's still worth >> the effort. > > Thanks for your huge work on Unicode, Marc-Andre! Thanks.
I enjoyed working on it, but priorities are different now, and new projects are waiting :-) >> My interests have shifted somewhat into other directions and >> I feel that helping Python reach world domination in other ways >> makes me happier than fighting over Unicode standards, implementations, >> special cases that aren't special enough, and all those other >> nitty-gritty details that cause long discussions :-) > > Someone said that we still need to define what a character is! By the way, what > is a code point? I'll leave that as an exercise for the interested reader to find out :-) (Hint: Google should find enough hits where I've explained those things on various mailing lists and in talks I gave.) >> So I feel that the PEP 393 change is a good time to draw a line >> and leave Unicode maintenance to Ezio, Victor, Martin, and >> all the others that have helped over the years. I know it's >> in good hands. > > I don't understand why you would like to stop contribution to Unicode, but I only have limited time available for these things and am nowadays more interested in getting others to recognize just how great Python is, than actually sitting down and writing patches for it. Unicode was my baby for quite a few years, but I now have two kids which need more love and attention :-) > well, as you want. We will try to continue your work. Thanks. Cheers, -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Oct 11 2011) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try our new mxODBC.Connect Python Database Interface for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math.
Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From barry at python.org Wed Oct 12 00:22:43 2011 From: barry at python.org (Barry Warsaw) Date: Tue, 11 Oct 2011 18:22:43 -0400 Subject: [Python-Dev] PEP 3151 accepted Message-ID: <20111011182243.6f4a62b7@resist.wooz.org> As the BDFOP for PEP 3151, I hereby accept it for inclusion into Python 3.3. Congratulations to Antoine for producing a great PEP that has broad acceptance in the Python development community, with buy-in from all the major implementations of Python. Antoine's branch is ready to go and it should now be merged into the default branch. PEP 3151 will bring some much needed sanity to this part of the standard exception hierarchy, and I for one look forward to being able to write code directly using it, one day finally eliminating most of my `import errno`s! Cheers, -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: not available URL: From solipsis at pitrou.net Wed Oct 12 01:14:26 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 12 Oct 2011 01:14:26 +0200 Subject: [Python-Dev] PEP 3151 accepted References: <20111011182243.6f4a62b7@resist.wooz.org> Message-ID: <20111012011426.32f6045c@pitrou.net> On Tue, 11 Oct 2011 18:22:43 -0400 Barry Warsaw wrote: > As the BDFOP for PEP 3151, I hereby accept it for inclusion into Python 3.3. > > Congratulations to Antoine for producing a great PEP that has broad acceptance > in the Python development community, with buy-in from all the major > implementations of Python. Antoine's branch is ready to go and it should now > be merged into the default branch. > > PEP 3151 will bring some much needed sanity to this part of the standard > exception hierarchy, and I for one look forward to being able to write code > directly using it, one day finally eliminating most of my `import errno`s! 
Thanks Barry! I expect to merge the PEP 3151 branch into default soon (it's basically ready). cheers Antoine. From solipsis at pitrou.net Wed Oct 12 01:57:25 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 12 Oct 2011 01:57:25 +0200 Subject: [Python-Dev] cpython: Backed out changeset 952d91a7d376 References: Message-ID: <20111012015725.446e1491@pitrou.net> On Wed, 12 Oct 2011 00:53:52 +0200 victor.stinner wrote: > http://hg.python.org/cpython/rev/2abd48a47f3b > changeset: 72878:2abd48a47f3b > user: Victor Stinner > date: Wed Oct 12 00:54:35 2011 +0200 > summary: > Backed out changeset 952d91a7d376 > > If maxchar == PyUnicode_MAX_CHAR_VALUE(unicode), we do a useless copy. Ah, that was the purpose of this assert. Fair enough :) Regards Antoine. From g.rodola at gmail.com Wed Oct 12 11:59:39 2011 From: g.rodola at gmail.com (=?ISO-8859-1?Q?Giampaolo_Rodol=E0?=) Date: Wed, 12 Oct 2011 11:59:39 +0200 Subject: [Python-Dev] PEP 3151 accepted In-Reply-To: <20111012011426.32f6045c@pitrou.net> References: <20111011182243.6f4a62b7@resist.wooz.org> <20111012011426.32f6045c@pitrou.net> Message-ID: 2011/10/12 Antoine Pitrou : > On Tue, 11 Oct 2011 18:22:43 -0400 > Barry Warsaw wrote: >> As the BDFOP for PEP 3151, I hereby accept it for inclusion into Python 3.3. >> >> Congratulations to Antoine for producing a great PEP that has broad acceptance >> in the Python development community, with buy-in from all the major >> implementations of Python. Antoine's branch is ready to go and it should now >> be merged into the default branch. >> >> PEP 3151 will bring some much needed sanity to this part of the standard >> exception hierarchy, and I for one look forward to being able to write code >> directly using it, one day finally eliminating most of my `import errno`s! > > Thanks Barry! > I expect to merge the PEP 3151 branch into default soon (it's basically ready). > > cheers > > Antoine.
> > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/g.rodola%40gmail.com > Thank you for having worked on this; it was a pretty huge amount of work. We'll probably have to wait a long time before seeing libs/apps freely depending on this change without caring about backward compatibility constraints, but with this change Python is a better language now. --- Giampaolo http://code.google.com/p/pyftpdlib/ http://code.google.com/p/psutil/ From solipsis at pitrou.net Wed Oct 12 16:17:55 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 12 Oct 2011 16:17:55 +0200 Subject: [Python-Dev] Documentation strategy for PEP 3151 Message-ID: <20111012161755.1d96e744@pitrou.net> Hello, I'd like some advice on what the best path is in cases such as: A :exc:`socket.error` is raised for errors from the call to :func:`inet_ntop`. Should I replace "socket.error" with "OSError" (knowing that the former is now an alias of the latter), or leave "socket.error" so that people have less surprises when running their code with a previous Python version? Regards Antoine. From benjamin at python.org Wed Oct 12 16:24:59 2011 From: benjamin at python.org (Benjamin Peterson) Date: Wed, 12 Oct 2011 10:24:59 -0400 Subject: [Python-Dev] Documentation strategy for PEP 3151 In-Reply-To: <20111012161755.1d96e744@pitrou.net> References: <20111012161755.1d96e744@pitrou.net> Message-ID: 2011/10/12 Antoine Pitrou : > > Hello, > > I'd like some advice on what the best path is in cases such as: > > A :exc:`socket.error` is raised for errors from the call > to :func:`inet_ntop`. > > Should I replace "socket.error" with "OSError" (knowing that the > former is now an alias of the latter), or leave "socket.error" so that > people have less surprises when running their code with a previous > Python version?
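The alias relationship the question refers to can be checked directly on 3.3+ (a quick sketch of the observable behavior):

```python
import socket

# Since Python 3.3 (PEP 3151), socket.error is just another name for the
# built-in OSError, so documenting OSError is technically accurate.
assert socket.error is OSError

# The new hierarchy also folds connection errors into OSError subclasses.
assert issubclass(ConnectionResetError, socket.error)
```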
I think you should say OSError but leave a historical note with a versionchanged on it. -- Regards, Benjamin From brian.curtin at gmail.com Wed Oct 12 16:30:25 2011 From: brian.curtin at gmail.com (Brian Curtin) Date: Wed, 12 Oct 2011 09:30:25 -0500 Subject: [Python-Dev] Documentation strategy for PEP 3151 In-Reply-To: <20111012161755.1d96e744@pitrou.net> References: <20111012161755.1d96e744@pitrou.net> Message-ID: On Wed, Oct 12, 2011 at 09:17, Antoine Pitrou wrote: > > Hello, > > I'd like some advice on what the best path is in cases such as: > > A :exc:`socket.error` is raised for errors from the call > to :func:`inet_ntop`. > > Should I replace "socket.error" with "OSError" (knowing that the > former is now an alias of the latter), or leave "socket.error" so that > people have less surprises when running their code with a previous > Python version? I would expect the 3.3 documentation shows the best way to write 3.3 code, so I'd prefer to see OSError there. A good "What's New" entry as well as explanation/example of how the hierarchy has changed in library/exceptions.rst should cover anyone questioning the departure from socket.error. From barry at python.org Wed Oct 12 16:58:29 2011 From: barry at python.org (Barry Warsaw) Date: Wed, 12 Oct 2011 10:58:29 -0400 Subject: [Python-Dev] Documentation strategy for PEP 3151 In-Reply-To: References: <20111012161755.1d96e744@pitrou.net> Message-ID: <20111012105829.530f7592@limelight.wooz.org> On Oct 12, 2011, at 10:24 AM, Benjamin Peterson wrote: >2011/10/12 Antoine Pitrou : >> I'd like some advice on what the best path is in cases such as: >> >> A :exc:`socket.error` is raised for errors from the call >> to :func:`inet_ntop`.
> >I think you should say OSError but leave a historical note with a >versionchanged on it. +1 -Barry From merwok at netwok.org Wed Oct 12 17:08:19 2011 From: merwok at netwok.org (=?UTF-8?B?w4lyaWMgQXJhdWpv?=) Date: Wed, 12 Oct 2011 17:08:19 +0200 Subject: [Python-Dev] PEP 3151 accepted In-Reply-To: <20111011182243.6f4a62b7@resist.wooz.org> References: <20111011182243.6f4a62b7@resist.wooz.org> Message-ID: <4E95AD63.7020601@netwok.org> On 12/10/2011 00:22, Barry Warsaw wrote: > As the BDFOP for PEP 3151, I hereby accept it for inclusion into Python 3.3. Congratulations Antoine, and thanks! Cheers From merwok at netwok.org Wed Oct 12 18:09:19 2011 From: merwok at netwok.org (=?UTF-8?B?w4lyaWMgQXJhdWpv?=) Date: Wed, 12 Oct 2011 18:09:19 +0200 Subject: [Python-Dev] Packaging and binary distributions for Python 3.3 In-Reply-To: References: <4E914A09.50209@netwok.org> <1318279112.10373.YahooMailNeo@web25808.mail.ukl.yahoo.com> Message-ID: <4E95BBAF.3020402@netwok.org> On 11/10/2011 09:59, Vinay Sajip wrote: > To me it does, and it would be useful to have some validation from > the packaging folks. I'm trying to catch up, but the wi-fi here is horrible and there are so many messages! I'll compose a reply for tomorrow.
Regards From victor.stinner at haypocalc.com Wed Oct 12 21:19:09 2011 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Wed, 12 Oct 2011 21:19:09 +0200 Subject: [Python-Dev] [Python-checkins] cpython (2.7): Issue #13156: revert changeset f6feed6ec3f9, which was only relevant for native In-Reply-To: References: Message-ID: <201110122119.09398.victor.stinner@haypocalc.com> On Wednesday 12 October 2011 at 21:07:33, charles-francois.natali wrote: > changeset: 72897:ee4fe16d9b48 > branch: 2.7 > parent: 69635:f6feed6ec3f9 > user: Charles-François Natali > date: Wed Oct 12 21:07:54 2011 +0200 > summary: > Issue #13156: revert changeset f6feed6ec3f9, which was only relevant for > native TLS implementations, and fails with the ad-hoc TLS implementation > when a thread doesn't have an auto thread state (e.g. a thread created > outside of Python calling into a subinterpreter). > > --- a/Misc/NEWS > +++ b/Misc/NEWS > @@ -61,10 +61,6 @@ > Library > ------- > > -- Issue #10517: After fork(), reinitialize the TLS used by the > PyGILState_* - APIs, to avoid a crash with the pthread implementation in > RHEL 5. Patch - by Charles-François Natali. You should restore this NEWS entry and add a new one to say that the patch has been reverted. Victor From tjreedy at udel.edu Wed Oct 12 22:58:15 2011 From: tjreedy at udel.edu (Terry Reedy) Date: Wed, 12 Oct 2011 16:58:15 -0400 Subject: [Python-Dev] Documentation strategy for PEP 3151 In-Reply-To: <20111012105829.530f7592@limelight.wooz.org> References: <20111012161755.1d96e744@pitrou.net> <20111012105829.530f7592@limelight.wooz.org> Message-ID: On 10/12/2011 10:58 AM, Barry Warsaw wrote: > On Oct 12, 2011, at 10:24 AM, Benjamin Peterson wrote: > >> 2011/10/12 Antoine Pitrou: >>> I'd like some advice on what the best path is in cases such as: >>> >>> A :exc:`socket.error` is raised for errors from the call >>> to :func:`inet_ntop`.
>>> Should I replace "socket.error" with "OSError" (knowing that the >>> former is now an alias of the latter), or leave "socket.error" so that >>> people have less surprises when running their code with a previous >>> Python version? >> >> I think you should say OSError but leave a historical note with a >> versionchanged on it. > > +1 Given that tracebacks for uncaught socket errors will end with OSError, the doc should say that is what is raised. The edits and notes I have seen so far today look fine. I also liked the What's New example. The new version looks *much* better, and not just because of the deleted import, but because the changes allow a much clearer structure that is more pleasant to read. So my thanks also for carrying out this project. -- Terry Jan Reedy From victor.stinner at haypocalc.com Thu Oct 13 00:44:33 2011 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Thu, 13 Oct 2011 00:44:33 +0200 Subject: [Python-Dev] Identifier API In-Reply-To: <4E90640E.2040301@v.loewis.de> References: <4E90640E.2040301@v.loewis.de> Message-ID: <201110130044.33601.victor.stinner@haypocalc.com> On Saturday 8 October 2011 at 16:54:06, Martin v. Löwis wrote: > In benchmarking PEP 393, I noticed that many UTF-8 decode > calls originate from C code with static strings, in particular > PyObject_CallMethod. Many of such calls already have been optimized > to cache a string object, however, PyObject_CallMethod remains > unoptimized since it requires a char*. Because all identifiers are ASCII (in the C code base), another idea is to use a structure similar to PyASCIIObject but with an additional pointer to the constant char* string:

typedef struct {
    PyASCIIObject _base;
    const char *ascii;
} PyConstASCIIObject;

Characters don't have to be copied, just the pointer, but you still have to allocate a structure. Because the size of the structure is also constant, we can have an efficient free list.
Pseudo-code to create such an object:

PyObject *
create_const_ascii(const char *str)
{
    PyConstASCIIObject *obj;
    /* ensure maybe that str is ASCII only? */
    obj = get_from_freelist();  /* reset the object (e.g. hash) */
    if (!obj) {
        obj = allocate_new_const_ascii();
        if (!obj)
            return NULL;
    }
    obj->ascii = str;
    return obj;
}

Except for PyUnicode_DATA, such a structure should be fully compatible with the other PEP 393 structures. We would need a new format for Py_BuildValue, e.g. 'a' for ASCII string. Later we can add new functions like _PyDict_GetASCII(). Victor From solipsis at pitrou.net Thu Oct 13 01:27:32 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 13 Oct 2011 01:27:32 +0200 Subject: [Python-Dev] cpython: Optimize findchar() for PyUnicode_1BYTE_KIND: use memchr and memrchr References: Message-ID: <20111013012732.0d9724c9@pitrou.net> On Thu, 13 Oct 2011 01:17:29 +0200 victor.stinner wrote: > http://hg.python.org/cpython/rev/e5bd48b43a58 > changeset: 72903:e5bd48b43a58 > user: Victor Stinner > date: Thu Oct 13 00:18:12 2011 +0200 > summary: > Optimize findchar() for PyUnicode_1BYTE_KIND: use memchr and memrchr Can't we simply reuse the stringlib here? From benjamin at python.org Thu Oct 13 02:10:22 2011 From: benjamin at python.org (Benjamin Peterson) Date: Wed, 12 Oct 2011 20:10:22 -0400 Subject: [Python-Dev] [Python-checkins] peps: Mark PEP accepted. In-Reply-To: References: Message-ID: Isn't it now final? 2011/10/12 antoine.pitrou : > http://hg.python.org/peps/rev/f50f0e14c774 > changeset: 3962:f50f0e14c774 > parent: 3959:2e1f0462a847 > user: Antoine Pitrou > date: Thu Oct 13 02:01:21 2011 +0200 > summary: > Mark PEP accepted.
> > files: > pep-3151.txt | 10 +++++----- > 1 files changed, 5 insertions(+), 5 deletions(-) > > > diff --git a/pep-3151.txt b/pep-3151.txt > --- a/pep-3151.txt > +++ b/pep-3151.txt > @@ -3,13 +3,12 @@ > Version: $Revision$ > Last-Modified: $Date$ > Author: Antoine Pitrou > -Status: Draft > +Status: Accepted > Type: Standards Track > Content-Type: text/x-rst > Created: 2010-07-21 > Python-Version: 3.3 > Post-History: > -Resolution: TBD > > > Abstract > @@ -507,9 +506,10 @@ > Implementation > ============== > > -A reference implementation is available in > -http://hg.python.org/features/pep-3151/ in branch ``pep-3151``. It is > -also tracked on the bug tracker at http://bugs.python.org/issue12555. > +The reference implementation has been integrated into Python 3.3. > +It was formerly developed in http://hg.python.org/features/pep-3151/ in > +branch ``pep-3151``, and also tracked on the bug tracker at > +http://bugs.python.org/issue12555. > It has been successfully tested on a variety of systems: Linux, Windows, > OpenIndiana and FreeBSD buildbots.
> > > -- > Repository URL: http://hg.python.org/peps > > _______________________________________________ > Python-checkins mailing list > Python-checkins at python.org > http://mail.python.org/mailman/listinfo/python-checkins > > -- Regards, Benjamin From victor.stinner at haypocalc.com Thu Oct 13 03:07:20 2011 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Thu, 13 Oct 2011 03:07:20 +0200 Subject: [Python-Dev] cpython: Optimize findchar() for PyUnicode_1BYTE_KIND: use memchr and memrchr In-Reply-To: <20111013012732.0d9724c9@pitrou.net> References: <20111013012732.0d9724c9@pitrou.net> Message-ID: <201110130307.20522.victor.stinner@haypocalc.com> On Thursday 13 October 2011 at 01:27:32, Antoine Pitrou wrote: > On Thu, 13 Oct 2011 01:17:29 +0200 > > victor.stinner wrote: > > http://hg.python.org/cpython/rev/e5bd48b43a58 > > changeset: 72903:e5bd48b43a58 > > user: Victor Stinner > > date: Thu Oct 13 00:18:12 2011 +0200 > > > > summary: > > Optimize findchar() for PyUnicode_1BYTE_KIND: use memchr and memrchr > > Can't we simply reuse the stringlib here? Hum, maybe, but not easily: functions have different prototypes and manipulate different types. Victor From victor.stinner at haypocalc.com Thu Oct 13 03:34:00 2011 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Thu, 13 Oct 2011 03:34:00 +0200 Subject: [Python-Dev] Identifier API In-Reply-To: <201110130044.33601.victor.stinner@haypocalc.com> References: <4E90640E.2040301@v.loewis.de> <201110130044.33601.victor.stinner@haypocalc.com> Message-ID: <201110130334.00711.victor.stinner@haypocalc.com> On Thursday 13 October 2011 at 00:44:33, Victor Stinner wrote: > On Saturday 8 October 2011 at 16:54:06, Martin v. Löwis wrote: > > In benchmarking PEP 393, I noticed that many UTF-8 decode > > calls originate from C code with static strings, in particular > > PyObject_CallMethod.
Many of such calls already have been optimized > > to cache a string object, however, PyObject_CallMethod remains > > unoptimized since it requires a char*. > > Because all identifiers are ASCII (in the C code base), another idea is to > use a structure similar to PyASCIIObject but with an additional pointer to > the constant char* string: Oh, I realized that Martin has already committed his PyIdentifier API; maybe it's too late :-) > We would need a new format for Py_BuildValue, e.g. 'a' for ASCII string. > Later we can add new functions like _PyDict_GetASCII(). The main difference between my new "const ASCII" string idea and PyIdentifier is the lifetime of objects. Using "const ASCII" strings, the strings are destroyed quickly (to not waste memory), whereas PyIdentifiers are interned strings and so they are only destroyed at Python exit. I don't know if "const ASCII" strings solve a real issue. I implemented my idea. I will do some benchmarks to check if it's useful or not :-) Victor From martin at v.loewis.de Thu Oct 13 14:00:38 2011 From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=) Date: Thu, 13 Oct 2011 14:00:38 +0200 Subject: [Python-Dev] Identifier API In-Reply-To: <20111011091943.4160b217@resist.wooz.org> References: <4E90640E.2040301@v.loewis.de> <4E943868.6070204@avl.com> <20111011091943.4160b217@resist.wooz.org> Message-ID: <4E96D2E6.60207@v.loewis.de> > I like this better too because of the all-caps macro name. Something about > seeing "Py_identifier" look like a function call and having it add the magical > PyId_update local bugs me. It just looks wrong, whereas the all-caps is more > of a cultural clue that something else is going on. If people think the macro should be all upper-case, I can go through and replace them (but only once). Let me know what the exact spelling should be. Originally, I meant to make the variable name equal the string (e.g. then having a variable named __init__ point to the "__init__" string).
However, I quickly gave up on that idea, since the strings conflict too often with other identifiers in C. In particular, you couldn't use that approach for calling the "fileno", "read" or "write" methods. So I think it needs a prefix. If you don't like PyId_, let me know what the prefix should be instead. If there is no fixed prefix (i.e. if you have to specify variable name and string value separately), and if there is no error checking, there is IMO too little gain to make this usable. I'm particularly worried about the error checking: the tricky part in C is to keep track of all the code paths. This API hides this by putting the initialization into the callee (PyObject_GetAttrId and friends), hence the (unlikely) failure to initialize the string is reported in the same code path as the (very plausible) error that the actual attribute access failed. Regards, Martin From martin at v.loewis.de Thu Oct 13 13:51:58 2011 From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=) Date: Thu, 13 Oct 2011 13:51:58 +0200 Subject: [Python-Dev] Identifier API In-Reply-To: <4E943868.6070204@avl.com> References: <4E90640E.2040301@v.loewis.de> <4E943868.6070204@avl.com> Message-ID: <4E96D0DE.9010304@v.loewis.de> > An alternative I am fond of is to avoid introducing a new type, and > simply initialize a PyObject * and register its address. -1 on that, because of the lack of error checking. Regards, Martin From solipsis at pitrou.net Thu Oct 13 15:23:23 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 13 Oct 2011 15:23:23 +0200 Subject: [Python-Dev] Identifier API References: <4E90640E.2040301@v.loewis.de> <4E943868.6070204@avl.com> <20111011091943.4160b217@resist.wooz.org> <4E96D2E6.60207@v.loewis.de> Message-ID: <20111013152323.622b6bc6@pitrou.net> On Thu, 13 Oct 2011 14:00:38 +0200 "Martin v. Löwis" wrote: > > I like this better too because of the all-caps macro name.
Something about > > seeing "Py_identifier" look like a function call and having it add the magical > > PyId_update local bugs me. It just looks wrong, whereas the all-caps is more > > of a cultural clue that something else is going on. > > If people think the macro should be all upper-case, I can go through and > replace them (but only once). Let me know what the exact spelling > should be. Py_CONST_STRING or Py_IDENTIFIER would be fine with me. Given that everything else uses "Id" in their name, Py_IDENTIFIER is probably better? > Originally, I meant to make the variable name equal the string (e.g. > then having a variable named __init__ point to the "__init__" string). > However, I quickly gave up on that idea, since the strings conflict > too often with other identifiers in C. In particular, you couldn't > use that approach for calling the "fileno", "read" or "write" methods. > > So I think it needs a prefix. If you don't like PyId_, let me know > what the prefix should be instead. I agree with that. Regards Antoine. From barry at python.org Thu Oct 13 15:42:24 2011 From: barry at python.org (Barry Warsaw) Date: Thu, 13 Oct 2011 09:42:24 -0400 Subject: [Python-Dev] Identifier API In-Reply-To: <20111013152323.622b6bc6@pitrou.net> References: <4E90640E.2040301@v.loewis.de> <4E943868.6070204@avl.com> <20111011091943.4160b217@resist.wooz.org> <4E96D2E6.60207@v.loewis.de> <20111013152323.622b6bc6@pitrou.net> Message-ID: <20111013094224.46e4e7c8@limelight.wooz.org> On Oct 13, 2011, at 03:23 PM, Antoine Pitrou wrote: >Py_CONST_STRING or Py_IDENTIFIER would be fine with me. Given that >everything else uses "Id" in their name, Py_IDENTIFIER is probably better? I agree that either is fine, with a slight preference for Py_IDENTIFIER for the same reasons. >> Originally, I meant to make the variable name equal the string (e.g. >> then having a variable named __init__ point to the "__init__" string). 
>> However, I quickly gave up on that idea, since the strings conflict >> too often with other identifiers in C. In particular, you couldn't >> use that approach for calling the "fileno", "read" or "write" methods. >> >> So I think it needs a prefix. If you don't like PyId_, let me know >> what the prefix should be instead. > >I agree with that. I'm fine with that too, as long as it's all well-documented in the C API guide. -Barry From merwok at netwok.org Thu Oct 13 18:18:56 2011 From: merwok at netwok.org (=?UTF-8?B?w4lyaWMgQXJhdWpv?=) Date: Thu, 13 Oct 2011 18:18:56 +0200 Subject: [Python-Dev] Packaging and binary distributions for Python 3.3 In-Reply-To: References: <4E914A09.50209@netwok.org> Message-ID: <4E970F70.70105@netwok.org> On 09/10/2011 13:54, Paul Moore wrote: > On 9 October 2011 08:15, Éric Araujo wrote: >> Are there that many distributions with extension modules? sdists should >> work well even on Windows for pure Python projects. > [...] > Looking at my installations, I see: > - database drivers (cx_Oracle, in my case) > - lxml > - pywin32 > - pyQT > - pyzmq (that's just for playing a bit with IPython, so doesn't really count...) > - I've also used in the past PIL, mod_python (mod_wsgi more recently) > and wxPython, These are good examples. Even if the number is not high, they are widely used, so support for binary distributions really seems needed. (When pip switches to distutils2 as underlying lib, they'll get bdist support for free! IOW, proper bdist support would be another argument to make the world switch to the new standards.) > The pysetup features for uninstalling packages aren't going to work with > bdist_wininst/bdist_msi (that's an assumption, I haven't tried them > but I can't see how they would, and it'd certainly be a lot of > marginally-useful effort to do even if it were possible).
> The virtual environment stuff also wouldn't work that well with the > installers, because they wouldn't have any way of finding which > environments existed to ask where to install to. The same problem > exists with virtualenv. (Again this is speculation backed by a small > amount of playing with virtualenv, so I may be wrong here). wininst and msi bdists can continue to be used as previously, for people who want clicky installation to their system Python. With built-in package management and virtual environments in 3.3+, we can just recommend that people publish bdist_simple instead of wininst or eggs. > It may be that the bdist_dumb format would be OK. I haven't checked it > out (to be honest, I don't think it's ever been used much). It may or may not be. bdist_dumb just makes a tarball or zipfile of Python modules and built extension modules and is supposed to be unpacked under sys.prefix or sys.exec_prefix. However, that won't play nice with install options (--home, --user, --install-lib or redefined --prefix) or sysconfig categories (i.e. I may want config files under /usr/local/etc, scripts in /usr/local/bin, etc.). I think we could revamp bdist_dumb so that it's really "sdist with compiled files" and then we let pysetup install things to the right places. >> Yes! We need feedback to provide a much better tool than distutils, >> before the API is locked by backward compatibility rules. > Always the chicken and egg problem :-) I'd rather say it's "code in the stdlib has one foot in the grave" + "stdlib code without active maintainer is effectively frozen" (hi asyncore changes in 2.6!). >> I actually wanted to talk about that, so let me take the opportunity. >> What if we released packaging in Python 3.3 (and distutils2 1.0 on PyPI) >> as a not-quite-final release? [...] > My immediate thought is that it would actually put people off using > packaging for 3.3, they'd wait until "it is stable". OK. Too bad.
I'll probably post that question again in its own message to get more feedback. > What is the status of distutils2? Is that (still?) intended to be > effectively a backport of packaging to earlier Python versions? Yes. It works with 2.4-3.3. I maintain it synchronized with packaging in 3.3. There are a small number of test failures which need fixing before I release distutils2 1.0a4 on PyPI. > I'd suggest getting a distutils2 release available, and promoted, > as the "early adopter" version of packaging. We may do that, but I fear we're going to lack time for that. As part of the stdlib, the packaging module API will be frozen in June, for the first 3.3 beta. We still have a lot to do: defining __all__ in all public modules, changing some important internals (Tarek wants to kill the subcommands system for example), fixing a number of bugs which may imply incompatible API changes, etc. The pace of development has slowed much these last months, so I'm not sure we'll reach 1.0 status months before June. > Maybe even with an option to install it as "packaging" so that people > can use it in 3.2 and earlier and expect to need no changes when 3.3 > is released. Not gonna happen! d2 changed name on purpose when entering the stdlib, so that in the future code can choose to use packaging (in the stdlib, for example 1.0) or distutils2 (external backport, possibly of 2.0). Code wanting to use "packaging if available otherwise distutils2" will use the same import try/except as what's done with unittest/unittest2, json/simplejson and similar. > A python-announce article "Python 3.3 new packaging features - early > adopter release" publicising it, would be what I'm thinking of...
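The fallback import try/except described above would look like this, sketched here with the json/simplejson pair since those module names exist on any Python:

```python
# The same try/except import dance used for unittest2 and simplejson:
# prefer one spelling of the module, fall back to the other.
try:
    import simplejson as json  # external package, if installed
except ImportError:
    import json  # stdlib fallback

data = json.loads('{"pep": 3151}')
```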
Here's a plan: 1) make the docs usable (I'm on it) 2) fix the three test failures we currently have in d2 3) test on Windows and Mac 4) release 1.0a4 (I'll do it) 5) announce and request feedback 6) work frantically before Python 3.3b1 to improve stuff and limit the public API so as not to lock ourselves Regards From merwok at netwok.org Thu Oct 13 18:23:57 2011 From: merwok at netwok.org (=?UTF-8?B?w4lyaWMgQXJhdWpv?=) Date: Thu, 13 Oct 2011 18:23:57 +0200 Subject: [Python-Dev] Packaging and binary distributions for Python 3.3 In-Reply-To: References: <4E914A09.50209@netwok.org> Message-ID: <4E97109D.2060306@netwok.org> Hi Philip, > [...] In any case, it definitely wasn't the case that eggs or setuptools were > rejected for 2.5; they were withdrawn for reasons that didn't have anything > to do with the format itself.
We?re not sure bdist_dumb is what we?re after?see my other messages. > Conversely, if you want a smarter format, why reinvent wheels? Recent packaging PEPs and distutils2 are all about reinventing wheels! Or rather standardizing best practices for wheels. Some ideas are taken near-identical from setuptools, other see great changes. In this case, we have to define our requirements, and if bdist_egg can work (as a distribution format, not an installation format!), then we may just take it. If it does not, we?ll have to make a new wheel. Regards From merwok at netwok.org Thu Oct 13 18:25:30 2011 From: merwok at netwok.org (=?UTF-8?B?w4lyaWMgQXJhdWpv?=) Date: Thu, 13 Oct 2011 18:25:30 +0200 Subject: [Python-Dev] Packaging and binary distributions for Python 3.3 In-Reply-To: References: <4E914A09.50209@netwok.org> Message-ID: <4E9710FA.3010003@netwok.org> Le 09/10/2011 22:14, Paul Moore a ?crit : > On 9 October 2011 20:47, Tarek Ziad? wrote: >> PEP 376 provide the installation format for the 'future' -- >> http://www.python.org/dev/peps/pep-0376/ > [...] >> Now for a binary archive, that would get installed ala PEP 376, why >> not ? I'd just be curious to have someone list the advantage of having >> a project released that way besides the "importable as-is" feature. I think I don?t understand you here. IMO, bdists are just intermediary formats that are supposed to be consumed by installation tools. Users are not expected to import from bdists. > My expectation would be that the user would type pysetup install > some_binary_format_file.zip and have that file unpacked and all the > "bits" put in the appropriate place. Basically just like installing > from a source archive - pysetup install project-1.0.tar.gz - but > skipping the compile steps because the compiler output files are > present. Yep. > That may need some extra intelligence in pysetup if it doesn't have > this feature already [...] just unzip the bits into the right place, > or something similar. Yes. 
The bdist can be just like an sdist, but it contains compiled files instead of C source files (maybe setuptools bdist_egg is just that), then pysetup uses the setup.cfg file to find files and install them at the right places. Alternatively, the bdist format could put build files into a few top-level directories, using sysconfig names: config, appdata, doc, purelib, platlib, etc. pysetup would then move files to the right target directory. > As regards the format, bdist_dumb is about the right level - but > having just checked it has some problems (which if I recall, have been > known for some time, and are why bdist_dumb doesn't get used). Or maybe because it?s just useless: Windows users want to click many times, Mac OS users want to drag-and-drop things, free OS users are fine with sdists. > Specifically, bdist_dumb puts the location of site-packages ON THE > BUILD SYSTEM into the archive, making it useless for direct unzipping > on a target system which has Python installed somewhere else. pysetup would not just unzip it though, so maybe this limitation is not really one. > a new name might be needed if backward compatibility of the old > broken format matters... The point of forking distutils under a new name is that we can break compat. > I don't know the code at all, and I have little time Your time is best used with giving user expectations and feedback. I hope to get a Windows VM soon-ish, so I should not even pester people for testing my patches :) > PS The problem for me is that if pysetup only handles source builds, > it's STILL annoyingly incomplete for my requirements (and possibly > many Windows users') Agreed. packaging does not want to exclude Windows users. [Nick] > bdist_zip, bdist_archive, bdist_simple would all work (bdist_binary is > redundant, given what the 'b' stands for). Isn?t calling zipfiles ?archives? an abuse? I like bdist_simple. On a related topic, I?m not sure the bdist comand is useful. 
Its only role is to translate "bdist --formats zip,targz,wininst" into calls to the other bdist_* commands. > The 'bdist_dumb' name has always irritated me, since the connotations > more strongly favour 'stupid' than they do 'simple' I've also recently learned that people with mental illness can be hurt by derogatory uses of "dumb", so if we change the name to reflect the change in behavior, it'd have the nice side-effect of being nicer. [Vinay] > A simple change to packaging will allow an archive containing a setup.cfg-based > directory to be installed in the same way as a source directory. Isn't that already supported, as long as the tarball or zipfile contains source files? In any case, it was intended to be, and there's still support code around. > the current installer stub (wininst-x.y.exe) does not know anything > about virtual environments. If we care about virtual environment support (and I > think we should), wininst.exe could be enhanced to provide a "Browse..." button > to allow a user to select a virtual environment to install into, Personally, I'll focus on sdist and bdist_simple support. When pysetup is run from a virtualenv, projects will be installed into the venv. You are free to work on patches for bdist_wininst, but I'm not sure it will be needed if we make the pysetup user experience smooth enough. That said, even if Paul were convinced to forsake clicky installers, maybe some Windows users will absolutely refuse to use the command line. [Paul] > To summarise, then: > > 1. By using setup.cfg technology, it would be easy enough to zip up a > binary build in a way that pysetup could unpack and install. Correct. I'm still pondering whether I find the idea of registering built files in setup.cfg elegant or hacky :) We also have the other ideas I wrote to choose from. > 1a. A packaging command to build such an archive would be worth providing. Definitely. Maybe we could also decide on one of wininst or msi? > 2.
A GUI installer would still be valuable for many people > 2a. Having the GUI work by doing a pysetup install passing the > installer exe (which would have a zipfile as noted in 1 above > appended) could make sense to avoid duplicating work. Yes. > 2b. The GUI could do the extra needed to integrate with the OS, > which pysetup wouldn't do Nice property. > 2c. There's a question over a GUI install followed by a pysetup > uninstall, which wouldn't remove the add/remove entry... I think we could require that a project installed with a clicky bdist_wininst has to be removed via the Add/Remove GUI. (There is support for that in PEP 376: the INSTALLER file.) > 3. Ideally, the GUI should co-operate with venvs, by offering some > form of browse facility. The command line does this automatically. Will Windows users want a GUI to create venvs too? [Vinay] > I looked at the dialog resources for wininst-x.y.exe and noticed that there is a > "Find other ..." button which is hidden, and its handler (in > PC\bdist_wininst\install.c) is commented out. However, the code called by the > handler - GetOtherPythonVersion - is still there. Does anyone here know why the > button has been made unavailable? hg blame shows that this has been commented out since the initial commit to Subversion. Maybe the previous CVS history will yield more info.
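For what it's worth, the INSTALLER mechanism mentioned above is simple enough to sketch. The function and tool names below are hypothetical, not the actual packaging API; this only illustrates the PEP 376 idea of a one-line INSTALLER file inside the dist-info directory:

```python
import os

def may_uninstall(dist_info_dir, tool_name="pysetup"):
    """Honor the PEP 376 INSTALLER file: a one-line record of the tool
    that installed the distribution. A different tool should refuse to
    remove it. (Sketch only; names are hypothetical.)"""
    installer_file = os.path.join(dist_info_dir, "INSTALLER")
    if not os.path.exists(installer_file):
        return True  # nothing recorded, so we cannot tell: allow removal
    with open(installer_file, encoding="utf-8") as f:
        installer = f.read().strip()
    return installer == tool_name
```

So a dist-info directory whose INSTALLER file contains "wininst" would make a hypothetical `may_uninstall(path, "pysetup")` return False, which is exactly the blocking behavior discussed here.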
Regards From merwok at netwok.org Thu Oct 13 18:35:06 2011 From: merwok at netwok.org (=?UTF-8?B?w4lyaWMgQXJhdWpv?=) Date: Thu, 13 Oct 2011 18:35:06 +0200 Subject: [Python-Dev] Packaging and binary distributions for Python 3.3 In-Reply-To: References: <4E914A09.50209@netwok.org> Message-ID: <4E97133A.8060604@netwok.org> Hi Nick, On 11/10/2011 03:29, Nick Coghlan wrote: > On Mon, Oct 10, 2011 at 2:29 PM, Paul Moore wrote: >> Ideally bdist_wininst and bdist_msi would also integrate with pysetup >> and with virtual environments, but I imagine that could be pretty hard >> to make work cleanly, as Windows doesn't really support multiple >> installations of a software package... > That's OK, the package managers get bypassed by pysetup on POSIX > systems as well - that's kind of the point of language level virtual > environments I'm not sure I follow you. wininst and msi installers are supposed to work with the Windows programs manager; it is not bypassed at all IIUC. That's the difficulty: how to make the Add/Remove program aware of many Pythons and venvs? > There are hard to build packages on POSIX (e.g. PIL) that would also > benefit from a good, cross-platform approach to binary installation. We haven't talked about cross-platform binary installers. The current gist of the discussion seems to point to a world where people continue to release an sdist for capable OSes and also publish one bdist_simple with .pyd for one Windows version and one bdist_simple with .so for one Mac OS X version (but possibly many arches). (And given that it's possible to have one setup.cfg and one setup.py coexisting, maybe eggs will be offered for a while too.)
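As a concrete illustration of the "one bdist per platform" idea, the platform tag such archives would need is already available from the standard library. The naming scheme below is made up for illustration only; it is not a format anyone in this thread agreed on:

```python
import sysconfig

def bdist_archive_name(name, version):
    """Build a platform-tagged archive name, e.g. something like
    'lxml-2.3.win-amd64.zip' or 'lxml-2.3.linux-x86_64.zip'.
    The scheme is illustrative, not an agreed format."""
    return "%s-%s.%s.zip" % (name, version, sysconfig.get_platform())
```

Distinct Windows and Mac OS X builds of the same release would then carry distinct, self-describing file names.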
Regards From merwok at netwok.org Thu Oct 13 18:44:18 2011 From: merwok at netwok.org (=?UTF-8?B?w4lyaWMgQXJhdWpv?=) Date: Thu, 13 Oct 2011 18:44:18 +0200 Subject: [Python-Dev] Packaging and binary distributions for Python 3.3 In-Reply-To: <4E9710FA.3010003@netwok.org> References: <4E914A09.50209@netwok.org> <4E9710FA.3010003@netwok.org> Message-ID: <4E971562.1060401@netwok.org> On 13/10/2011 18:25, Éric Araujo wrote: >> 2c. There's a question over a GUI install followed by a pysetup >> uninstall, which wouldn't remove the add/remove entry... > I think we could require that a project installed with a clicky > bdist_wininst has to be removed via the Add/Remove GUI. (There is > support for that in PEP 376: the INSTALLER file.) In case this wasn't very clear: PEP 376 defines that the name of the installer be recorded in a file in the dist-info dir, and when another installer tries to remove the distribution it will be blocked. So if INSTALLER contains "wininst", then "pysetup remove" won't work, so we won't get imperfect uninstalls. Cheers From p.f.moore at gmail.com Thu Oct 13 19:47:36 2011 From: p.f.moore at gmail.com (Paul Moore) Date: Thu, 13 Oct 2011 18:47:36 +0100 Subject: [Python-Dev] Packaging and binary distributions for Python 3.3 In-Reply-To: <4E97133A.8060604@netwok.org> References: <4E914A09.50209@netwok.org> <4E97133A.8060604@netwok.org> Message-ID: On 13 October 2011 17:35, Éric Araujo wrote: > On 11/10/2011 03:29, Nick Coghlan wrote: >> On Mon, Oct 10, 2011 at 2:29 PM, Paul Moore wrote: >>> Ideally bdist_wininst and bdist_msi would also integrate with pysetup >>> and with virtual environments, but I imagine that could be pretty hard >>> to make work cleanly, as Windows doesn't really support multiple >>> installations of a software package... >> That's OK, the package managers get bypassed by pysetup on POSIX >> systems as well - that's kind of the point of language level virtual >> environments > I'm not sure I follow you.
wininst and msi installers are supposed to > work with the Windows programs manager, it is not bypassed at all IIUC. > That's the difficulty: How to make the Add/Remove program aware of many > Pythons and venvs? There are 2 separate things here: 1. Native installers, bdist_msi and bdist_wininst. These (currently) integrate with the Add/Remove feature and have a standard platform look & feel. The downside is that they don't integrate to the same level with certain Python features like venvs and pysetup, or with non-system Python installations. 2. pysetup, which is Python's "native" package manager and as such can be assumed to integrate well with any other Python features. The Unix equivalent of (1) would be RPM installers for Python packages. It's not so much about the GUI-ness as the native integration (which on Windows does mean GUI, but less so on other platforms). It's the classic trade-off between integration with the platform vs integration with the language environment. You can't usually have both unless the language is single-platform. >> There are hard to build packages on POSIX (e.g. PIL) that would also >> benefit from a good, cross-platform approach to binary installation. > We haven't talked about cross-platform binary installers. The current > gist of the discussion seems to point to a world where people continue > to release an sdist for capable OSes and also publish one bdist_simple > with .pyd for one Windows version and one bdist_simple with .so for one > Mac OS X version (but possibly many arches). (And given that it's > possible to have one setup.cfg and one setup.py coexisting, maybe eggs > will be offered for a while too.) Nick is talking about a cross-platform *approach* - not a single installer that runs on multiple platforms, but rather a common set of instructions ("run pysetup run bdist_simple; pysetup upload") which will generate a binary package that can be installed in a common way (pysetup install) on all platforms.
The actual bdist_simple file will be version, architecture and platform dependent, just like binary eggs and bdist_wininst installers today, but it can be used by people without access to a compiler, or to development packages for supporting libraries, or anything other than Python. This is a good thing for most users (even the RPM for, say, lxml includes binary .so files, and doesn't build from source on the target system), and nigh-on essential for most Windows users (where setting up a compiler is a lot more complex than yum install gcc). It may even be useful enough to persuade Windows users to move away from GUI installers (after all, that's what has happened with setuptools and binary eggs). Paul. From martin at v.loewis.de Thu Oct 13 19:30:32 2011 From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=) Date: Thu, 13 Oct 2011 19:30:32 +0200 Subject: [Python-Dev] Packaging and binary distributions for Python 3.3 In-Reply-To: <4E970F70.70105@netwok.org> References: <4E914A09.50209@netwok.org> <4E970F70.70105@netwok.org> Message-ID: <4E972038.3050000@v.loewis.de> > wininst and msi bdists can continue to be used as previously, for people > who want clicky installation to their system Python. With built-in > package management and virtual environments in 3.3+, we can just > recommend that people publish bdist_simple instead of wininst or eggs. Pardon me for jumping in - but I fail to see why those missing features can't be provided by bdist_wininst and bdist_msi in a straight-forward manner. Giving people even more choice is bad, IMO, as it will confuse users. There should be one obvious way. In particular wrt. virtual environments: I see no need to actually *install* files multiple times. It's rather sufficient that the distributions to be installed are *available* in the virtual env after installation, and unavailable after being removed. Actually copying them into the virtual environment might not be necessary or useful.
So I envision a setup where the MSI file puts the binaries into a place on disk where pysetup (or whatever tool) finds them, and links them wherever they need to go (using whatever linking mechanism might work). For MSI in particular, there could be some interaction with pysetup, e.g. to register all virtualenvs that have linked the installation, and warn the user that the file is still in use in certain locations. Likewise, automated download might pick an MSI file, and tell it not to place itself into the actual Python installation, but instead into a location where pysetup will find it. Regards, Martin From solipsis at pitrou.net Thu Oct 13 20:05:08 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 13 Oct 2011 20:05:08 +0200 Subject: [Python-Dev] cpython: Use identifier API for PyObject_GetAttrString. References: <4E972852.5060906@v.loewis.de> Message-ID: <20111013200508.7ee9d45a@pitrou.net> On Thu, 13 Oct 2011 20:05:06 +0200 "Martin v. Löwis" wrote: > > - In Modules/_json.c, line 1126, _Py_identifier(strict) is > > declared but not used, and there are 5 other possible replacements. > > Antoine reverted this in 8ed6a627a834. I think I started doing them, > then noticed that this is an initializer, so it's likely not called > that often. That's what I thought too. There didn't seem to be much point in optimizing that code. Regards Antoine. From martin at v.loewis.de Thu Oct 13 20:08:32 2011 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Thu, 13 Oct 2011 20:08:32 +0200 Subject: [Python-Dev] Identifier API In-Reply-To: <20111013094224.46e4e7c8@limelight.wooz.org> References: <4E90640E.2040301@v.loewis.de> <4E943868.6070204@avl.com> <20111011091943.4160b217@resist.wooz.org> <4E96D2E6.60207@v.loewis.de> <20111013152323.622b6bc6@pitrou.net> <20111013094224.46e4e7c8@limelight.wooz.org> Message-ID: <4E972920.8030300@v.loewis.de> >> Py_CONST_STRING or Py_IDENTIFIER would be fine with me.
Given that >> everything else uses "Id" in their name, Py_IDENTIFIER is probably better? > > I agree that either is fine, with a slight preference for Py_IDENTIFIER for > the same reasons. Ok, so it's Py_IDENTIFIER. >>> So I think it needs a prefix. If you don't like PyId_, let me know >>> what the prefix should be instead. >> >> I agree with that. > > I'm fine with that too, as long as it's all well-documented in the C API > guide. Hmm. People voted that this should be an internal API, so I'm not sure it should be documented at all outside of the header file, or if, in what document. Currently, this very point is documented in the header file. Regards, Martin From p.f.moore at gmail.com Thu Oct 13 20:30:01 2011 From: p.f.moore at gmail.com (Paul Moore) Date: Thu, 13 Oct 2011 19:30:01 +0100 Subject: [Python-Dev] Packaging and binary distributions for Python 3.3 In-Reply-To: <4E9710FA.3010003@netwok.org> References: <4E914A09.50209@netwok.org> <4E9710FA.3010003@netwok.org> Message-ID: On 13 October 2011 17:25, Éric Araujo wrote: >> 1. By using setup.cfg technology, it would be easy enough to zip up a >> binary build in a way that pysetup could unpack and install. > Correct. I'm still pondering whether I find the idea of registering > built files in setup.cfg as elegant or hacky :) We also have the other > ideas I wrote to choose from. To be honest, I think I prefer the idea of taking the bdist_wininst code which creates a zipped distribution archive with "special" root directories like PLATLIB, and use it to create a bdist_simple (basically, by removing some of the code to prepend the EXE stub). Then teach pysetup install to install that file format (likely by just plugging in a new function into install_methods). The benefit is that bdist_wininst installers can be consumed unaltered by this install method.
[1] Vinay's suggestion of registering the built files in setup.cfg sounds attractive, as the code is already there, but it seems like it'd just move the complexity from the install code to the process of building the bdist_simple archive. >> 1a. A packaging command to build such an archive would be worth providing. > Definitely. Maybe we could also decide on one of wininst or msi? I was thinking of a new bdist_simple archive format, which is platform-agnostic. bdist_wininst is a compatible superset restricted to the Windows environment, so promoting bdist_wininst over bdist_msi for people who prefer GUIs and platform integration would make sense. >> 2. A GUI installer would still be valuable for many people >> 2a. Having the GUI work by doing a pysetup install passing the >> installer exe (which would have a zipfile as noted in 1 above >> appended) could make sense to avoid duplicating work. > Yes. Or having bdist_wininst usable by pysetup install directly (because it's a bdist_simple compatible format)... >> 2b. The GUI could do the extra needed to integrate with the OS, >> which pysetup wouldn't do > Nice property. And already present with bdist_wininst. >> 2c. There's a question over a GUI install followed by a pysetup >> uninstall, which wouldn't remove the add/remove entry... > I think we could require that a project installed with a clicky > bdist_wininst has to be removed via the Add/Remove GUI. (There is > support for that in PEP 376: the INSTALLER file.) Agreed - and if pysetup install works with bdist_wininst files, the user can choose whether to use a pysetup or a GUI install (and consequently which approach to management/uninstall they prefer). >> 3. Ideally, the GUI should co-operate with venvs, by offering some >> form of browse facility. The command line does this automatically. > Will Windows users want a GUI to create venvs too? Quite possibly some will. Personally, I don't.
And given that virtualenv has managed OK without a GUI interface, I'd say let's assume it's YAGNI for now. [1] Actually, based on the above, I think the pysetup install method that consumes bdist_wininst files as if they were just bdist_simple archives (i.e., assuming the bdist_simple format takes that layout) would be useful even if bdist_simple is never implemented. The bdist_simple version makes the same facilities available for non-Windows users, if they want it. I think I'll look at coding that option, and see where it takes me. Paul. From p.f.moore at gmail.com Thu Oct 13 20:36:11 2011 From: p.f.moore at gmail.com (Paul Moore) Date: Thu, 13 Oct 2011 19:36:11 +0100 Subject: [Python-Dev] Packaging and binary distributions for Python 3.3 In-Reply-To: <4E972038.3050000@v.loewis.de> References: <4E914A09.50209@netwok.org> <4E970F70.70105@netwok.org> <4E972038.3050000@v.loewis.de> Message-ID: On 13 October 2011 18:30, "Martin v. Löwis" wrote: >> wininst and msi bdists can continue to be used as previously, for people >> who want clicky installation to their system Python. With built-in >> package management and virtual environments in 3.3+, we can just >> recommend that people publish bdist_simple instead of wininst or eggs. > > Pardon me for jumping in - but I fail to see why those missing features > can't be provided by bdist_wininst and bdist_msi in a straight-forward > manner. Giving people even more choice is bad, IMO, as it will confuse > users. There should be one obvious way. I don't particularly disagree - although I would point out that the two formats bdist_wininst and bdist_msi already offer more than one obvious way... My contention is that there *are* two distinct use cases - platform integrated installation (on Windows that implies a GUI to most people), and Python's native installation process (pysetup). This isn't new, before packaging the "python-native" form was setuptools/eggs, for better or worse.
Ideally, both forms should have full capabilities, making the decision a style/preference choice rather than a functionality choice. But this particular choice is always with us, and people are familiar with it. (Native vs cross-platform GUIs, cygwin vs mingw, etc, etc). So we need two obvious ways, one for each case. (It would be nice if one way could cover both cases, of course - having pysetup consume bdist_wininst files is my attempt to achieve that). I don't really understand the benefits of bdist_msi over bdist_wininst, and I certainly don't understand the MSI technology well enough to comment on what it's capable of, so I'm going to stick to bdist_wininst in the following. My apologies for anything I miss as a result. But it does strike me that the existence of both MSI and wininst is where the confusing duplication exists, rather than having GUI and command line alternatives. The GUI and platform integration aspects of the bdist_wininst format are all part of the executable "bit". I haven't looked at that code at all, but I am certain it can be modified to provide whatever user experience is desired. The only real problem here is how many people have the knowledge and/or inclination to work on that code. When it comes to installing the actual package, I don't know how the bdist_wininst code does it - the data is there in zip format, and I suspect that the code simply unzips the data in the expected directories. But the zipped up data in bdist_wininst could be consumed by the packaging module, just by writing a new install method. This would reuse all of the various packaging support routines and infrastructure. The bdist_wininst executable code *could* be modified to invoke that packaging method - whether that's worthwhile isn't clear to me (I don't know how extensive the changes would be to get the benefit reusing the same implementation). As MSI format is a specialised format, I don't believe this option is open for bdist_msi. > In particular wrt. 
virtual environments: I see no need to actually > *install* files multiple times. It's rather sufficient that the > distributions to be installed are *available* in the virtual env after > installation, and unavailable after being removed. Actually copying > them into the virtual environment might not be necessary or useful. > > So I envision a setup where the MSI file puts the binaries into a place > on disk where pysetup (or whatever tool) finds them, and links them > wherever they need to go (using whatever linking mechanism might work). > For MSI in particular, there could be some interaction with pysetup, > e.g. to register all virtualenvs that have linked the installation, > and warn the user that the file is still in use in certain locations. > Likewise, automated download might pick an MSI file, and tell it not > to place itself into the actual Python installation, but instead into > a location where pysetup will find it. I can't really comment on this. I agree in principle with what you're saying, but I know little about the MSI format so I can't say much more. It feels to me like you're suggesting that the MSI file encapsulate the file layout logic that already has to exist in pysetup, though, which sounds like duplication of effort. Can MSI call out to pysetup to actually install the files and save this duplication? Paul. From barry at python.org Thu Oct 13 20:38:15 2011 From: barry at python.org (Barry Warsaw) Date: Thu, 13 Oct 2011 14:38:15 -0400 Subject: [Python-Dev] Identifier API In-Reply-To: <4E972920.8030300@v.loewis.de> References: <4E90640E.2040301@v.loewis.de> <4E943868.6070204@avl.com> <20111011091943.4160b217@resist.wooz.org> <4E96D2E6.60207@v.loewis.de> <20111013152323.622b6bc6@pitrou.net> <20111013094224.46e4e7c8@limelight.wooz.org> <4E972920.8030300@v.loewis.de> Message-ID: <20111013143815.023d1a8e@limelight.wooz.org> On Oct 13, 2011, at 08:08 PM, Martin v. Löwis wrote: >>> Py_CONST_STRING or Py_IDENTIFIER would be fine with me.
Given that >>> everything else uses "Id" in their name, Py_IDENTIFIER is probably better? >> >> I agree that either is fine, with a slight preference for Py_IDENTIFIER for >> the same reasons. > >Ok, so it's Py_IDENTIFIER. Given below, shouldn't that be _Py_IDENTIFIER? >>>> So I think it needs a prefix. If you don't like PyId_, let me know >>>> what the prefix should be instead. >>> >>> I agree with that. >> >> I'm fine with that too, as long as it's all well-documented in the C API >> guide. > >Hmm. People voted that this should be an internal API, so I'm not sure it should be documented at all outside of the header file, or if, in >what document. > >Currently, this very point is documented in the header file. That's fine, if the macro is prefixed with an underscore. -Barry From mail at timgolden.me.uk Thu Oct 13 21:28:40 2011 From: mail at timgolden.me.uk (Tim Golden) Date: Thu, 13 Oct 2011 20:28:40 +0100 Subject: [Python-Dev] Packaging and binary distributions for Python 3.3 In-Reply-To: References: <4E914A09.50209@netwok.org> <4E970F70.70105@netwok.org> <4E972038.3050000@v.loewis.de> Message-ID: <4E973BE8.3030105@timgolden.me.uk> On 13/10/2011 19:36, Paul Moore wrote: > I don't really understand the benefits of bdist_msi over > bdist_wininst Just commenting on this particular issue: in essence, the .MSI format is the Microsoft standard, something which is especially important for corporate rollouts. We're not particularly bureaucratic, but I recently had to bundle a small number of common extensions as .msi packages so they could be deployed easily onto our baseline machines.
I'm not saying that Python *must* have .msi support for this reason: if it didn't already, you could argue that it could be provided by corporates who needed this, or by 3rd party service providers, if only by providing light .msi wrappers round standard installers. I'm completely overloaded at the moment, so I'm only following this thread at a distance but I did want to chime in in agreement with the points Paul's already made: Windows users expect executable binary installers; it's much harder to compile libraries on Windows even if you have a compiler; the integration with the OS package manager (Add/Remove Programs) is a benefit although not a sine qua non. TJG From p.f.moore at gmail.com Thu Oct 13 21:35:10 2011 From: p.f.moore at gmail.com (Paul Moore) Date: Thu, 13 Oct 2011 20:35:10 +0100 Subject: [Python-Dev] PEP 376 - contents of RECORD file Message-ID: Looking at a RECORD file installed by pysetup (on 3.3 trunk, on Windows) all of the filenames seem to be absolute, even though the package is pure-Python and so everything is under site-packages. Looking at PEP 376, it looks like the paths should be relative to site-packages. Two questions: 1. Am I reading this right? Is it a bug in pysetup? 2. Does it matter? Are relative paths needed, or is it just nice to have? Oh, and a third question - where is the best place to ask these questions? Now that packaging is in core, is python-dev OK? Or should I be asking on the distutils SIG or the packaging developers list? Thanks, Paul. 
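For reference, PEP 376 specifies RECORD as a CSV file with one row per installed file, holding (path, hash, size), where the last two fields may be empty. Checking for the absolute-path problem described above therefore takes only a few lines; this is a sketch, not the actual packaging implementation:

```python
import csv
import os

def absolute_entries(record_path):
    """Return the RECORD rows whose path field is absolute.
    Per PEP 376 the paths should be relative (to site-packages for
    purelib files), so a non-empty result suggests the bug above."""
    bad = []
    with open(record_path, newline="", encoding="utf-8") as f:
        for row in csv.reader(f):
            # each row is (path, hash, size); hash and size may be empty
            if row and os.path.isabs(row[0]):
                bad.append(row[0])
    return bad
```

Running such a check against a dist-info directory would confirm whether the absolute filenames Paul observed really violate the PEP.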
From p.f.moore at gmail.com Thu Oct 13 21:42:13 2011 From: p.f.moore at gmail.com (Paul Moore) Date: Thu, 13 Oct 2011 20:42:13 +0100 Subject: [Python-Dev] Packaging and binary distributions for Python 3.3 In-Reply-To: <4E973BE8.3030105@timgolden.me.uk> References: <4E914A09.50209@netwok.org> <4E970F70.70105@netwok.org> <4E972038.3050000@v.loewis.de> <4E973BE8.3030105@timgolden.me.uk> Message-ID: On 13 October 2011 20:28, Tim Golden wrote: > On 13/10/2011 19:36, Paul Moore wrote: >> >> I don't really understand the benefits of bdist_msi over >> bdist_wininst > > Just commenting on this particular issue: in essence, the .MSI > format is the Microsoft standard, something which is especially > important for corporate rollouts. We're not particularly bureaucratic, > but I recently had to bundle a small number of common extensions as > .msi packages so they could be deployed easily onto our baseline > machines. > > I'm not saying that Python *must* have .msi support for this reason: > if it didn't already, you could argue that it could be provided by > corporates who needed this, or by 3rd party service providers, if > only by providing light .msi wrappers round standard installers. Thanks for the clarification. I can see why this would be important. But maintaining 3 different interfaces to do essentially the same thing (collect some data from the user, then based on that data put the same set of files in the same places) seems a waste of effort, and a recipe for discrepancies in capabilities. Maybe the wininst and MSI installers should ultimately become simple UIs around a zipfile and an invocation of the packaging APIs? Not that I'm offering to do that work, I'm afraid... Paul. 
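One detail that makes the "simple UI around a zipfile" idea plausible: a bdist_wininst installer is an EXE stub with a zip archive appended, and since the zip central directory sits at the end of the file, the zipfile module can generally open the .exe directly. A sketch (the helper name is hypothetical):

```python
import zipfile

def archive_roots(installer_path):
    """List the top-level directories of the zip data appended to an
    installer stub. bdist_wininst uses "special" roots such as PURELIB,
    PLATLIB, SCRIPTS, DATA and HEADERS to mark install locations."""
    with zipfile.ZipFile(installer_path) as zf:
        # zipfile locates the central directory from the end of the
        # file, so the prepended EXE stub does not get in the way
        return sorted({name.split("/", 1)[0] for name in zf.namelist()})
```

An install method built on this would map each root to the matching sysconfig path and copy files accordingly, which is essentially the bdist_simple proposal in this thread.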
From jeremy.kloth at gmail.com Thu Oct 13 23:35:01 2011 From: jeremy.kloth at gmail.com (Jeremy Kloth) Date: Thu, 13 Oct 2011 15:35:01 -0600 Subject: [Python-Dev] Packaging and binary distributions for Python 3.3 In-Reply-To: References: Message-ID: <201110131535.01164.jeremy.kloth@gmail.com> On Tuesday, October 11, 2011 01:59:45 AM Vinay Sajip wrote: > I looked at the dialog resources for wininst-x.y.exe and noticed that there > is a "Find other ..." button which is hidden, and its handler (in > PC\bdist_wininst\install.c) is commented out. However, the code called by > the handler - GetOtherPythonVersion - is still there. Does anyone here > know why the button has been made unavailable? This "feature" has never been active. It has been commented out since before Distutils was imported into Python proper. -- Jeremy Kloth From jeremy.kloth at gmail.com Fri Oct 14 00:02:27 2011 From: jeremy.kloth at gmail.com (Jeremy Kloth) Date: Thu, 13 Oct 2011 16:02:27 -0600 Subject: [Python-Dev] Packaging and binary distributions for Python 3.3 In-Reply-To: References: <4E973BE8.3030105@timgolden.me.uk> Message-ID: <201110131602.27792.jeremy.kloth@gmail.com> On Thursday, October 13, 2011 01:42:13 PM Paul Moore wrote: > Maybe the wininst and MSI installers should ultimately become simple > UIs around a zipfile and an invocation of the packaging APIs? Not that > I'm offering to do that work, I'm afraid... The bdist_wininst/_msi installers cannot use any of the packaging.* code at *runtime*, as packaging (or distutils2) isn't necessarily installed on the target machine. I would think that having it as a prerequisite to actually running the installers is a bad thing. Including the required support files within the installers may be doable but could add too much complexity and possibly lead to stale code issues (for the support files).
That said, I have been working on a drop-in replacement for the current bdist_wininst executable stub with the following features: - install to 32- or 64-bit Python installations from a single installer; currently one installer for each architecture is required - install to any Python from version 2.4 to the latest; currently one installer is needed for each major version - updated look and feel (Wizard97) with the new (as of Python 2.5!) logo; for some screen shots see: http://www.flickr.com/photos/67460826@N04/sets/72157627653603530/ - unicode metadata support (name, summary, description) - runs on Win95 through Win7 (that is, all supported platforms for the supported Python versions for packaging) - per-user installs (as in, setup.py install --user); currently only system-wide or per-user based on permissions and how Python itself was installed Planned Features: - multi-version extension module support; one installer that can install the precompiled extensions to different Python versions - prefix installs (as in, setup.py install --prefix) for virtual environments or other non-standard locations. Current thinking is to *not* track these installations in the Add/Remove programs. -- Jeremy Kloth From greg.ewing at canterbury.ac.nz Fri Oct 14 00:06:21 2011 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 14 Oct 2011 11:06:21 +1300 Subject: [Python-Dev] Identifier API In-Reply-To: <4E96D2E6.60207@v.loewis.de> References: <4E90640E.2040301@v.loewis.de> <4E943868.6070204@avl.com> <20111011091943.4160b217@resist.wooz.org> <4E96D2E6.60207@v.loewis.de> Message-ID: <4E9760DD.90206@canterbury.ac.nz> Martin v. Löwis wrote: > So I think it needs a prefix. If you don't like PyId_, let me know > what the prefix should be instead. Instead of an explicit prefix, how about a macro, such as Py_ID(__string__)?
-- Greg From victor.stinner at haypocalc.com Fri Oct 14 00:30:51 2011 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Fri, 14 Oct 2011 00:30:51 +0200 Subject: [Python-Dev] Identifier API In-Reply-To: <201110130334.00711.victor.stinner@haypocalc.com> References: <4E90640E.2040301@v.loewis.de> <201110130044.33601.victor.stinner@haypocalc.com> <201110130334.00711.victor.stinner@haypocalc.com> Message-ID: <201110140030.51518.victor.stinner@haypocalc.com> On Thursday 13 October 2011 03:34:00, Victor Stinner wrote: > > We would need a new format for Py_BuildValue, e.g. 'a' for ASCII string. > > Later we can add new functions like _PyDict_GetASCII(). > > The main difference between my new "const ASCII" string idea and > PyIdentifier is the lifetime of objects. Using "const ASCII" strings, the > strings are destroyed quickly (to not waste memory), whereas PyIdentifiers > are interned strings and so they are only destroyed at Python exit. > > I don't know if "const ASCII" strings solve a real issue. I implemented my > idea. I will do some benchmarks to check if it's useful or not :-) Ok, I did some tests: it is slower with my PyConstASCIIObject. I don't understand why, but it means that the idea is not interesting because the code is not faster. It is also difficult to ensure that the string is "constant" (test the scope of the string). At least, I found a nice GCC function: __builtin_constant_p(str) can be used to ensure that the string is constant (e.g. "abc" vs char*).
Victor From jeremy.kloth at gmail.com Fri Oct 14 03:01:52 2011 From: jeremy.kloth at gmail.com (Jeremy Kloth) Date: Thu, 13 Oct 2011 19:01:52 -0600 Subject: [Python-Dev] Packaging and binary distributions for Python 3.3 In-Reply-To: <201110131602.27792.jeremy.kloth@gmail.com> References: <201110131602.27792.jeremy.kloth@gmail.com> Message-ID: <201110131901.52510.jeremy.kloth@gmail.com> On Thursday, October 13, 2011 04:02:27 PM Jeremy Kloth wrote: > That said, I have been working on a drop-in replacement for the current > bdist_wininst executable stub with the following features: > - install to 32- or 64-bit Python installations from a single installer; > currently one installer for each architecture is required > - install to any Python from version 2.4 to the latest; > currently one installer is needed for each major version > - updated look and feel (Wizard97) with the new (as of Python 2.5!) logo; > for some screen shots see: > http://www.flickr.com/photos/67460826 at N04/sets/72157627653603530/ > - unicode metadata support (name, summary, description) > - runs on Win95 through Win7 (that is, all supported platforms for the > supported Python versions for packaging) > - per-user installs (as in, setup.py install --user); > currently only system-wide or per-user based on permissions and how > Python itself was installed I missed a few additional features: - UAC (Vista, Win7) is handled at the install phase depending on the selected Python target (system or user installed Python or user site-packages); currently just at the running of the installer - pre-/post- install and remove script support with the scripts no longer needing to be installed with the distribution - MSVCRT agnostic; built completely with the Windows API meaning only one stub EXE required; currently there is one stub per MSVCRT version -- Jeremy Kloth From g.brandl at gmx.net Fri Oct 14 07:44:41 2011 From: g.brandl at gmx.net (Georg Brandl) Date: Fri, 14 Oct 2011 07:44:41 +0200 Subject: 
[Python-Dev] Identifier API In-Reply-To: <201110140030.51518.victor.stinner@haypocalc.com> References: <4E90640E.2040301@v.loewis.de> <201110130044.33601.victor.stinner@haypocalc.com> <201110130334.00711.victor.stinner@haypocalc.com> <201110140030.51518.victor.stinner@haypocalc.com> Message-ID: Am 14.10.2011 00:30, schrieb Victor Stinner: > Le jeudi 13 octobre 2011 03:34:00, Victor Stinner a écrit : >> > We would need a new format for Py_BuildValue, e.g. 'a' for ASCII string. >> > Later we can add new functions like _PyDict_GetASCII(). >> >> The main difference between my new "const ASCII" string idea and >> PyIdentifier is the lifetime of objects. Using "const ASCII" strings, the >> strings are destroyed quickly (to not waste memory), whereas PyIdentifiers ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ >> are intern strings and so they are only destroyed at Python exit. >> >> I don't know if "const ASCII" strings solve a real issue. I implemented my >> idea. I will do some benchmarks to check if it's useful or not :-) > > Ok, I did some tests: it is slower with my PyConstASCIIObject. I don't > understand why, but it means that the idea is not interesting because the code > is not faster. I think you've already given the answer above... Georg From victor.stinner at haypocalc.com Fri Oct 14 10:46:44 2011 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Fri, 14 Oct 2011 10:46:44 +0200 Subject: [Python-Dev] Identifier API In-Reply-To: References: <4E90640E.2040301@v.loewis.de> <201110130044.33601.victor.stinner@haypocalc.com> <201110130334.00711.victor.stinner@haypocalc.com> <201110140030.51518.victor.stinner@haypocalc.com> Message-ID: <4E97F6F4.7080701@haypocalc.com> Le 14/10/2011 07:44, Georg Brandl a écrit : > Am 14.10.2011 00:30, schrieb Victor Stinner: >> Le jeudi 13 octobre 2011 03:34:00, Victor Stinner a écrit : >>>> We would need a new format for Py_BuildValue, e.g. 'a' for ASCII string. >>>> Later we can add new functions like _PyDict_GetASCII(). 
>>> >>> The main difference between my new "const ASCII" string idea and >>> PyIdentifier is the lifetime of objects. Using "const ASCII" strings, the >>> strings are destroyed quickly (to not waste memory), whereas PyIdentifiers > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ > >>> are intern strings and so they are only destroyed at Python exit. >>> >>> I don't know if "const ASCII" strings solve a real issue. I implemented my >>> idea. I will do some benchmarks to check if it's useful or not :-) >> >> Ok, I did some tests: it is slower with my PyConstASCIIObject. I don't >> understand why, but it means that the idea is not interesting because the code >> is not faster. > > I think you've already given the answer above... I tried with and without interned strings. It doesn't change anything. Victor From martin at v.loewis.de Fri Oct 14 16:07:54 2011 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Fri, 14 Oct 2011 16:07:54 +0200 Subject: [Python-Dev] Packaging and binary distributions for Python 3.3 In-Reply-To: References: <4E914A09.50209@netwok.org> <4E970F70.70105@netwok.org> <4E972038.3050000@v.loewis.de> Message-ID: <4E98423A.90408@v.loewis.de> > I can't really comment on this. I agree in principle with what you're > saying, but I know little about the MSI format so I can't say much > more. It feels to me like you're suggesting that the MSI file > encapsulate the file layout logic that already has to exist in > pysetup, though, which sounds like duplication of effort. Can MSI call > out to pysetup to actually install the files and save this > duplication? I'm not sure what exactly it is that pysetup does, so I can't say whether there is any duplication. It's possible to have post-install actions in MSI, so if the files get put into (some) place at first, it would be possible to copy/link them to whatever layout is needed. What I'd like to avoid is that people need to create too many different packages on Windows, for different use cases. 
It would be better if the author/packager could create one Windows distribution, and have that work in all use cases. As for MSI: its primary advantage over bdist_wininst is the higher flexibility of integration into systems maintenance infrastructures. UI-less installation, installation through Active Directory, nested installation (as part of some bundle of installers) are all supported by MSI out of the box. IMO, the primary reason to keep bdist_wininst (besides popularity) is that you need to run the packaging on Windows to create an MSI file, whereas bdist_wininst can be created cross-platform (as long as there are no binary extension modules). In addition, bdist_wininst is better wrt. repeated installations. I'd prefer a setup though where the same package can work in multiple installations without requiring physical copies. Regards, Martin From martin at v.loewis.de Fri Oct 14 16:08:40 2011 From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=) Date: Fri, 14 Oct 2011 16:08:40 +0200 Subject: [Python-Dev] Identifier API In-Reply-To: <20111013143815.023d1a8e@limelight.wooz.org> References: <4E90640E.2040301@v.loewis.de> <4E943868.6070204@avl.com> <20111011091943.4160b217@resist.wooz.org> <4E96D2E6.60207@v.loewis.de> <20111013152323.622b6bc6@pitrou.net> <20111013094224.46e4e7c8@limelight.wooz.org> <4E972920.8030300@v.loewis.de> <20111013143815.023d1a8e@limelight.wooz.org> Message-ID: <4E984268.5020109@v.loewis.de> Am 13.10.11 20:38, schrieb Barry Warsaw: > On Oct 13, 2011, at 08:08 PM, Martin v. Löwis wrote: > >>>> Py_CONST_STRING or Py_IDENTIFIER would be fine with me. Given that >>>> everything else uses "Id" in their name, Py_IDENTIFIER is probably better? >>> >>> I agree that either is fine, with a slight preference for Py_IDENTIFIER for >>> the same reasons. >> >> Ok, so it's Py_IDENTIFIER. > > Given below, shouldn't that be _Py_IDENTIFIER? It actually is _Py_IDENTIFIER (and was _Py_identifier). 
Regards, Martin From martin at v.loewis.de Fri Oct 14 16:13:43 2011 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Fri, 14 Oct 2011 16:13:43 +0200 Subject: [Python-Dev] Packaging and binary distributions for Python 3.3 In-Reply-To: References: <4E914A09.50209@netwok.org> <4E970F70.70105@netwok.org> <4E972038.3050000@v.loewis.de> <4E973BE8.3030105@timgolden.me.uk> Message-ID: <4E984397.4060906@v.loewis.de> > Thanks for the clarification. I can see why this would be important. > But maintaining 3 different interfaces to do essentially the same > thing (collect some data from the user, then based on that data put > the same set of files in the same places) seems a waste of effort, and > a recipe for discrepancies in capabilities. > > Maybe the wininst and MSI installers should ultimately become simple > UIs around a zipfile and an invocation of the packaging APIs? Not that > I'm offering to do that work, I'm afraid... I think you are mixing issues: even if they were simple wrappers, they would still be 3 different interfaces (presented to the user). I.e. the user doesn't really care whether there is a zip file inside (as in bdist_wininst) or a cab file (as in bdist_msi), and whether or not the packaging APIs are invoked during installation. So if you want to get rid of interfaces, you really have to drop one of the formats. Making maintenance of the interfaces simpler and more homogeneous by having them call into installed Python code is certainly worthwhile. The challenge here is that the installers also work on older Python versions (unless there are extension modules in there), so they could only use the packaging API when installing into 3.3 or newer. 
Regards, Martin From barry at python.org Fri Oct 14 16:28:12 2011 From: barry at python.org (Barry Warsaw) Date: Fri, 14 Oct 2011 10:28:12 -0400 Subject: [Python-Dev] Identifier API In-Reply-To: <4E984268.5020109@v.loewis.de> References: <4E90640E.2040301@v.loewis.de> <4E943868.6070204@avl.com> <20111011091943.4160b217@resist.wooz.org> <4E96D2E6.60207@v.loewis.de> <20111013152323.622b6bc6@pitrou.net> <20111013094224.46e4e7c8@limelight.wooz.org> <4E972920.8030300@v.loewis.de> <20111013143815.023d1a8e@limelight.wooz.org> <4E984268.5020109@v.loewis.de> Message-ID: <20111014102812.39ec0b57@resist.wooz.org> On Oct 14, 2011, at 04:08 PM, Martin v. Löwis wrote: >It actually is _Py_IDENTIFIER (and was _Py_identifier). Yep, I saw your commit to make the change. Thanks! -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: not available URL: From p.f.moore at gmail.com Fri Oct 14 16:40:16 2011 From: p.f.moore at gmail.com (Paul Moore) Date: Fri, 14 Oct 2011 15:40:16 +0100 Subject: [Python-Dev] Packaging and binary distributions for Python 3.3 In-Reply-To: <4E98423A.90408@v.loewis.de> References: <4E914A09.50209@netwok.org> <4E970F70.70105@netwok.org> <4E972038.3050000@v.loewis.de> <4E98423A.90408@v.loewis.de> Message-ID: On 14 October 2011 15:07, "Martin v. Löwis" wrote: >> I can't really comment on this. I agree in principle with what you're >> saying, but I know little about the MSI format so I can't say much >> more. It feels to me like you're suggesting that the MSI file >> encapsulate the file layout logic that already has to exist in >> pysetup, though, which sounds like duplication of effort. Can MSI call >> out to pysetup to actually install the files and save this >> duplication? > > I'm not sure what exactly it is that pysetup does, so I can't say > whether there is any duplication. 
It's possible to have post-install > actions in MSI, so if the files get put into (some) place at first, > it would be possible to copy/link them to whatever layout is needed. OK, that might be a useful approach. > What I'd like to avoid is that people need to create too many different > packages on Windows, for different use cases. It would be better if > the author/packager could create one Windows distribution, and have > that work in all use cases. Absolutely. That is crucial if we're to avoid the sort of fragmentation we've seen in the past (with bdist_wininst, bdist_msi and binary eggs). I don't know if we'll ever get down to one format, but that would be the ideal. > As for MSI: its primary advantage over bdist_wininst is the > higher flexibility of integration into systems maintenance > infrastructures. UI-less installation, installation through > Active Directory, nested installation (as part of some bundle > of installers) are all supported by MSI out of the box. IMO, > the primary reason to keep bdist_wininst (besides popularity) > is that you need to run the packaging on Windows to create an > MSI file, whereas bdist_wininst can be created cross-platform > (as long as there are no binary extension modules). In addition, > bdist_wininst is better wrt. repeated installations. I'd > prefer a setup though where the same package can work in > multiple installations without requiring physical copies. One other aspect is that MSI format is essentially opaque (correct me if I'm wrong here). With bdist_wininst, if I want to get the compiled binaries out for some reason (maybe to install them in a virtual environment or some type of other custom build) I just unzip the file - the exe header gets ignored. With bdist_msi, I have no idea if there's any way of doing that. Also, there are fewer people with expertise in MSI format. I suspect that even a Unix developer could have a go at modifying the C code in bdist_msi, it's not too MS-specific. 
I don't know if that's possible for bdist_msi. Speaking personally, the msilib documentation is pretty unreadable, as I don't know anything about the MSI format. Whenever I've tried reading the MS documentation in the past, I've found it pretty impenetrable (a link to a simple tutorial, and some examples of use, in the msilib documentation might help). Paul. From merwok at netwok.org Fri Oct 14 17:07:12 2011 From: merwok at netwok.org (=?UTF-8?B?w4lyaWMgQXJhdWpv?=) Date: Fri, 14 Oct 2011 17:07:12 +0200 Subject: [Python-Dev] [Python-checkins] cpython: Add a comment explaining this heuristic. In-Reply-To: References: Message-ID: <4E985020.203@netwok.org> Hi Antoine, > changeset: 701b2e0e6f3f > user: Antoine Pitrou > date: Thu Oct 13 18:07:37 2011 +0200 > summary: > Add a comment explaining this heuristic. > > diff --git a/Objects/stringlib/fastsearch.h b/Objects/stringlib/fastsearch.h > --- a/Objects/stringlib/fastsearch.h > +++ b/Objects/stringlib/fastsearch.h > @@ -115,6 +115,9 @@ > unsigned char needle; > needle = p[0] & 0xff; > #if STRINGLIB_SIZEOF_CHAR > 1 > + /* If looking for a multiple of 256, we'd have two > + many false positives looking for the '\0' byte in UCS2 > + and UCS4 representations. */ I guess this should read “too many”, not “two”. Cheers From martin at v.loewis.de Fri Oct 14 17:11:18 2011 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Fri, 14 Oct 2011 17:11:18 +0200 Subject: [Python-Dev] Packaging and binary distributions for Python 3.3 In-Reply-To: References: <4E914A09.50209@netwok.org> <4E970F70.70105@netwok.org> <4E972038.3050000@v.loewis.de> <4E98423A.90408@v.loewis.de> Message-ID: <4E985116.2060209@v.loewis.de> > One other aspect is that MSI format is essentially opaque (correct me > if I'm wrong here). 
You are wrong: msiexec /a unpacks an MSI, i.e. extracts the files from the MSI (documented as "administrative installation", meaning that the result of it can again be installed, as it will also produce a stripped MSI file with just the installation procedure). > With bdist_wininst, if I want to get the compiled > binaries out for some reason (maybe to install them in a virtual > environment or some type of other custom build) I just unzip the file > - the exe header gets ignored. With bdist_msi, I have no idea if > there's any way of doing that. It's little known, but was always well supported. See also http://www.python.org/download/releases/2.4/msi/ > Also, there are fewer people with expertise in MSI format. That's certainly true. > I suspect > that even a Unix developer could have a go at modifying the C code in > bdist_msi, it's not too MS-specific. s/bdist_msi/bdist_wininst/ > I don't know if that's possible for bdist_msi. No need to modify C code - it's all pure Python :-) However, I agree that's beside the point: you do need to understand MSI fairly well for modifying bdist_msi. I'm skeptical of your assertion that a Unix developer could contribute to bdist_wininst though without a Windows installation - you have to test this stuff or else it will break. > Speaking personally, the msilib documentation is pretty > unreadable, as I don't know anything about the MSI format. Whenever > I've tried reading the MS documentation in the past, I've found it > pretty impenetrable (a link to a simple tutorial, and some examples of > use, in the msilib documentation might help). If somebody would volunteer to write a tutorial, I could provide input. I'm clearly unqualified to write such a document, both for language barrier reasons, and because I continue to fail guessing what precisely it is that people don't understand. 
Regards, Martin From p.f.moore at gmail.com Fri Oct 14 17:26:34 2011 From: p.f.moore at gmail.com (Paul Moore) Date: Fri, 14 Oct 2011 16:26:34 +0100 Subject: [Python-Dev] Packaging and binary distributions for Python 3.3 In-Reply-To: <4E984397.4060906@v.loewis.de> References: <4E914A09.50209@netwok.org> <4E970F70.70105@netwok.org> <4E972038.3050000@v.loewis.de> <4E973BE8.3030105@timgolden.me.uk> <4E984397.4060906@v.loewis.de> Message-ID: On 14 October 2011 15:13, "Martin v. Löwis" wrote: >> Thanks for the clarification. I can see why this would be important. >> But maintaining 3 different interfaces to do essentially the same >> thing (collect some data from the user, then based on that data put >> the same set of files in the same places) seems a waste of effort, and >> a recipe for discrepancies in capabilities. >> >> Maybe the wininst and MSI installers should ultimately become simple >> UIs around a zipfile and an invocation of the packaging APIs? Not that >> I'm offering to do that work, I'm afraid... > > I think you are mixing issues: even if they were simple wrappers, they > would still be 3 different interfaces (presented to the user). I.e. > the user doesn't really care whether there is a zip file inside > (as in bdist_wininst) or a cab file (as in bdist_msi), and whether or > not the packaging APIs are invoked during installation. > > So if you want to get rid of interfaces, you really have to drop > one of the formats. > > Making maintenance of the interfaces simpler and more homogeneous > by having them call into installed Python code is certainly worthwhile. > The challenge here is that the installers also work on older Python > versions (unless there are extension modules in there), so they > could only use the packaging API when installing into 3.3 or newer. You are right that I'm mixing issues somewhat. I think the two issues are multiple interfaces and multiple formats. - On interfaces, I personally don't mind the existence of multiple choices. 
Some people like a GUI, others like command lines. Some like platform integration, others like integration with the language environment, or a consistent cross-platform experience. Trying to mandate one interface will always upset someone. So I don't see this as an important goal in itself. (But see below for a proviso [1]). - On formats, I strongly believe that having multiple formats is a problem. But I need to be clear here - an installer (MSI, wininst) is a bundle containing executable code (which drives the interface), plus a chunk of data that is the objects to be installed. (I am oversimplifying here, but bear with me). That's necessary because the package must be a single download, but it means that the delivery format combines code and data. Format to me refers to that data (and how accessible it is). The problem is that the packager has two roles: 1. He does the build, which is an essential service as end users don't necessarily have the means to compile. 2. He chooses the distribution format (and hence the user experience), which is not essential per se. The second role is the problem, as the packager's preferences may not match those of the end user. I am proposing decoupling those 2 roles. In the purest sense, I'd like to see a distribution format that contains *purely* the binaries, with no UI intelligence. That's what the proposed bdist_simple format is about. This format can be consumed by external tools that just read it as data (that's what pysetup, and packaging, does). For a standard user experience on Windows, at least, there needs to be a means of wrapping that data into an installer. MSI is the obvious choice, because it's the MS standard, but there's a problem here in that there's no obvious way to retrieve the raw bdist_simple file from the MSI once you've bundled it. The wininst format is better here, as you can retrieve the original format (just by removing, or ignoring, the EXE header). 
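The "exe header gets ignored" trick works because the zip format is parsed from the end of the file, where the central directory lives. A toy demonstration (with an invented two-byte stub standing in for a real wininst executable):

```python
import io
import zipfile

# Build an ordinary zip archive in memory.
archive = io.BytesIO()
with zipfile.ZipFile(archive, "w") as zf:
    zf.writestr("PLATLIB/example.py", "print('hello')\n")

# Prepend arbitrary bytes, playing the role of the EXE stub.  zipfile
# locates the end-of-central-directory record by scanning from the END
# of the file and adjusts all offsets, so the prefix is simply skipped.
installer = b"MZ" + b"\x00" * 62 + archive.getvalue()

with zipfile.ZipFile(io.BytesIO(installer)) as zf:
    names = zf.namelist()
print(names)  # ['PLATLIB/example.py']
```

This is the same property that makes self-extracting zip archives readable by ordinary zip tools.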
I have no opinions on how the MSI/wininst installers should present options to the user, nor do I have any strong views on how they should do the installation (my instinct says that they should reuse the packaging core code, but as you say that causes compatibility problems, so may not be feasible). My only strong views are: - *if* there are multiple formats, they should be easily convertible - pysetup should be able to consume one of the formats (trivially if a bdist_simple format exists, less so if MSI becomes the only format...) The second point is because pysetup offers capabilities that MSI/wininst do not - at least right now, and probably always, as I expect the core packaging code to support new features like venvs sooner than Windows-specific formats. If only because Unix developers can hack on packaging. Conversion between bdist_wininst and the proposed bdist_simple format is easy as they are both fundamentally zipfiles. The MSI format is the odd one out here, as I don't know how to build an MSI from binary data (without using bdist_msi - or to put it another way, the layout of the files that bdist_msi expects to build the installer from isn't documented) and I don't know how to get binaries out of an MSI (without installing). If these two things were possible, we'd have the means to convert between formats at will (and writing a tool that non-packagers could use to convert formats would be possible). I'm focusing on writing code to allow packaging to install from a binary zipfile for a very simple reason - it's easy! The code is in Python, most of the support routines are there, and it's just a matter of putting the bits together. I had a mostly-working prototype done in a couple of hours last night. In contrast, I wouldn't even know where to start to modify bdist_wininst or bdist_msi to conform to the new package metadata standards, let alone to cope with non-system Pythons (which pysetup does out of the box). 
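The core of "install from a binary zipfile" can be sketched in a few lines. To be clear, this is a hypothetical illustration, not the prototype mentioned above: the function name is invented, and it assumes members are prefixed with a scheme key (PURELIB/..., PLATLIB/..., SCRIPTS/...) as in wininst archives:

```python
import os
import zipfile

def install_binary_zip(archive_path, paths):
    """Extract each member under the directory mapped to its scheme key.

    paths maps scheme keys like 'PURELIB' or 'SCRIPTS' to target dirs.
    """
    with zipfile.ZipFile(archive_path) as zf:
        for name in zf.namelist():
            scheme, _, rel = name.partition("/")
            target = paths.get(scheme.upper())
            if target is None or not rel or name.endswith("/"):
                continue  # unknown prefix or directory entry; real code would warn
            dest = os.path.join(target, *rel.split("/"))
            os.makedirs(os.path.dirname(dest), exist_ok=True)
            with open(dest, "wb") as out:
                out.write(zf.read(name))
```

A real implementation would additionally record the installed files (for later uninstall) and handle byte-compilation, which is where most of the remaining work lies.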
So by developing a new format, I can at least offer code to help, rather than just flooding the mailing lists with suggestions (although I seem to be doing that as well :-)) Paul. [1] The one sticking point is that users generally want all the features that exist, so if bdist_msi doesn't support installing into "non-system" Python builds, and pysetup does, that will drive people to either switch, or more likely complain about the "limitations" of MSI :-) From status at bugs.python.org Fri Oct 14 18:07:29 2011 From: status at bugs.python.org (Python tracker) Date: Fri, 14 Oct 2011 18:07:29 +0200 (CEST) Subject: [Python-Dev] Summary of Python tracker Issues Message-ID: <20111014160729.8B0251DDCC@psf.upfronthosting.co.za> ACTIVITY SUMMARY (2011-10-07 - 2011-10-14) Python tracker at http://bugs.python.org/ To view or respond to any of the issues listed below, click on the issue. Do NOT respond to this message. Issues counts and deltas: open 3077 (+25) closed 21884 (+31) total 24961 (+56) Open issues with patches: 1315 Issues opened (42) ================== #11085: expose _abcoll as collections.abc http://bugs.python.org/issue11085 reopened by georg.brandl #12967: IDLE RPC Proxy for standard IO streams lacks 'errors' attribut http://bugs.python.org/issue12967 reopened by ned.deily #13124: Add "Running a Build Slave" page to the devguide http://bugs.python.org/issue13124 opened by eric.snow #13125: test_all_project_files() expected failure http://bugs.python.org/issue13125 opened by barry #13126: find() slower than rfind() http://bugs.python.org/issue13126 opened by pitrou #13127: xml.dom.Attr.name is not labeled as read-only http://bugs.python.org/issue13127 opened by dillona #13128: httplib debuglevel on CONNECT doesn't print response headers http://bugs.python.org/issue13128 opened by Matt.Spear #13131: FD leak in urllib2 http://bugs.python.org/issue13131 opened by Valery.Khamenya #13132: distutils sends non-RFC compliant HTTP request http://bugs.python.org/issue13132 
opened by mitchellh #13133: FD leaks in ZipFile.read(), ZipFile.extract() and also using e http://bugs.python.org/issue13133 opened by Valery.Khamenya #13139: multiprocessing.map skips finally blocks http://bugs.python.org/issue13139 opened by illicitonion #13140: ThreadingMixIn.daemon_threads is not honored when parent is da http://bugs.python.org/issue13140 opened by flox #13141: get rid of old threading API in the examples http://bugs.python.org/issue13141 opened by flox #13143: os.path.islink documentation is ambiguous http://bugs.python.org/issue13143 opened by Garen #13144: Global Module Index link in the offline documentation is incor http://bugs.python.org/issue13144 opened by graemeglass #13146: Writing a pyc file is not atomic http://bugs.python.org/issue13146 opened by pitrou #13147: Multiprocessing Pool.map_async() does not have an error_callba http://bugs.python.org/issue13147 opened by Jakub.Gedeon #13149: optimization for append-only StringIO http://bugs.python.org/issue13149 opened by pitrou #13150: Most of Python's startup time is sysconfig http://bugs.python.org/issue13150 opened by pitrou #13151: pysetup3 run bdist_wininst fails http://bugs.python.org/issue13151 opened by vinay.sajip #13152: textwrap: support custom tabsize http://bugs.python.org/issue13152 opened by jfeuerstein #13153: IDLE crash with unicode bigger than 0xFFFF http://bugs.python.org/issue13153 opened by JBernardo #13156: _PyGILState_Reinit assumes auto thread state will always exist http://bugs.python.org/issue13156 opened by grahamd #13157: Build Python outside the source directory http://bugs.python.org/issue13157 opened by haypo #13160: Rename install_dist to install http://bugs.python.org/issue13160 opened by eric.araujo #13161: problems with help() documentation of __i*__ operators http://bugs.python.org/issue13161 opened by eli.bendersky #13163: `port` and `host` are confused in `_get_socket http://bugs.python.org/issue13163 opened by cool-RR #13164: importing rlcompleter 
module writes a control sequence in stdo http://bugs.python.org/issue13164 opened by valva #13165: Integrate stringbench in the Tools directory http://bugs.python.org/issue13165 opened by pitrou #13166: Implement packaging.database.Distribution.__str__ http://bugs.python.org/issue13166 opened by eric.araujo #13167: Add get_metadata to packaging http://bugs.python.org/issue13167 opened by eric.araujo #13168: Python 2.6 having trouble finding modules when invoked via a s http://bugs.python.org/issue13168 opened by RandyGalbraith #13169: Regular expressions with 0 to 65536 repetitions raises Overflo http://bugs.python.org/issue13169 opened by techmaurice #13170: distutils2 test failures http://bugs.python.org/issue13170 opened by eric.araujo #13171: Bug in file.read(), can access unknown data. http://bugs.python.org/issue13171 opened by Alexander.Steppke #13172: pysetup run --list-commands fails with a traceback http://bugs.python.org/issue13172 opened by pmoore #13173: Default values for string.Template http://bugs.python.org/issue13173 opened by nitupho #13174: test_os failures on Fedora 15: listxattr() returns ['security. 
http://bugs.python.org/issue13174  opened by haypo

#13175: packaging uses wrong line endings in RECORD files on Windows
http://bugs.python.org/issue13175  opened by pmoore

#13177: Avoid chained exceptions in lru_cache
http://bugs.python.org/issue13177  opened by ezio.melotti

#13178: Need tests for Unicode handling in install_distinfo and instal
http://bugs.python.org/issue13178  opened by eric.araujo

#13179: IDLE uses common tkinter variables across all editor windows
http://bugs.python.org/issue13179  opened by serwy


Most recent 15 issues with no replies (15)
==========================================

#13179: IDLE uses common tkinter variables across all editor windows
http://bugs.python.org/issue13179

#13178: Need tests for Unicode handling in install_distinfo and instal
http://bugs.python.org/issue13178

#13170: distutils2 test failures
http://bugs.python.org/issue13170

#13168: Python 2.6 having trouble finding modules when invoked via a s
http://bugs.python.org/issue13168

#13166: Implement packaging.database.Distribution.__str__
http://bugs.python.org/issue13166

#13164: importing rlcompleter module writes a control sequence in stdo
http://bugs.python.org/issue13164

#13161: problems with help() documentation of __i*__ operators
http://bugs.python.org/issue13161

#13160: Rename install_dist to install
http://bugs.python.org/issue13160

#13152: textwrap: support custom tabsize
http://bugs.python.org/issue13152

#13147: Multiprocessing Pool.map_async() does not have an error_callba
http://bugs.python.org/issue13147

#13144: Global Module Index link in the offline documentation is incor
http://bugs.python.org/issue13144

#13141: get rid of old threading API in the examples
http://bugs.python.org/issue13141

#13140: ThreadingMixIn.daemon_threads is not honored when parent is da
http://bugs.python.org/issue13140

#13128: httplib debuglevel on CONNECT doesn't print response headers
http://bugs.python.org/issue13128

#13126: find() slower than rfind()
http://bugs.python.org/issue13126


Most recent 15 issues waiting for review (15)
=============================================

#13179: IDLE uses common tkinter variables across all editor windows
http://bugs.python.org/issue13179

#13177: Avoid chained exceptions in lru_cache
http://bugs.python.org/issue13177

#13174: test_os failures on Fedora 15: listxattr() returns ['security.
http://bugs.python.org/issue13174

#13167: Add get_metadata to packaging
http://bugs.python.org/issue13167

#13163: `port` and `host` are confused in `_get_socket
http://bugs.python.org/issue13163

#13157: Build Python outside the source directory
http://bugs.python.org/issue13157

#13156: _PyGILState_Reinit assumes auto thread state will always exist
http://bugs.python.org/issue13156

#13152: textwrap: support custom tabsize
http://bugs.python.org/issue13152

#13150: Most of Python's startup time is sysconfig
http://bugs.python.org/issue13150

#13149: optimization for append-only StringIO
http://bugs.python.org/issue13149

#13146: Writing a pyc file is not atomic
http://bugs.python.org/issue13146

#13133: FD leaks in ZipFile.read(), ZipFile.extract() and also using e
http://bugs.python.org/issue13133

#13132: distutils sends non-RFC compliant HTTP request
http://bugs.python.org/issue13132

#13131: FD leak in urllib2
http://bugs.python.org/issue13131

#13128: httplib debuglevel on CONNECT doesn't print response headers
http://bugs.python.org/issue13128


Top 10 most discussed issues (10)
=================================

#13150: Most of Python's startup time is sysconfig
http://bugs.python.org/issue13150  14 msgs

#12602: Missing cross-references in Doc/using
http://bugs.python.org/issue12602  12 msgs

#12436: Missing items in installation/setup instructions
http://bugs.python.org/issue12436  11 msgs

#13156: _PyGILState_Reinit assumes auto thread state will always exist
http://bugs.python.org/issue13156  10 msgs

#13146: Writing a pyc file is not atomic
http://bugs.python.org/issue13146  9 msgs

#6715: xz compressor support
http://bugs.python.org/issue6715  8 msgs

#7833: bdist_wininst installers fail to load extensions built with Is
http://bugs.python.org/issue7833  8 msgs

#3902: Packages containing only extension modules have to contain __i
http://bugs.python.org/issue3902  7 msgs

#6164: [AIX] Patch to correct the AIX C/C++ linker argument used for
http://bugs.python.org/issue6164  7 msgs

#1673007: urllib2 requests history + HEAD support
http://bugs.python.org/issue1673007  7 msgs


Issues closed (27)
==================

#9100: test_sysconfig fails (test_user_similar)
http://bugs.python.org/issue9100  closed by eric.araujo

#10653: test_time test_strptime fails on windows
http://bugs.python.org/issue10653  closed by haypo

#11254: distutils doesn't byte-compile .py files to __pycache__ during
http://bugs.python.org/issue11254  closed by eric.araujo

#12192: Doc that collection mutation methods return item or None
http://bugs.python.org/issue12192  closed by python-dev

#12386: packaging fails in install_distinfo when writing RESOURCES
http://bugs.python.org/issue12386  closed by eric.araujo

#12427: packaging register fails because "POST data should be bytes"
http://bugs.python.org/issue12427  closed by eric.araujo

#12555: PEP 3151 implementation
http://bugs.python.org/issue12555  closed by pitrou

#13025: mimetypes should read the rule file using UTF-8, not the local
http://bugs.python.org/issue13025  closed by haypo

#13029: test_strptime fails on Windows 7 french
http://bugs.python.org/issue13029  closed by haypo

#13075: PEP-0001 contains dead links
http://bugs.python.org/issue13075  closed by ezio.melotti

#13114: check -r fails with non-ASCII unicode long_description
http://bugs.python.org/issue13114  closed by eric.araujo

#13129: bad argument exceptions observed in AST
http://bugs.python.org/issue13129  closed by rmtew

#13130: test_gdb: attempt to dereference a generic pointer
http://bugs.python.org/issue13130  closed by pitrou

#13134: speed up finding of one-character strings
http://bugs.python.org/issue13134  closed by pitrou

#13135: Using type() as a constructor doesn't support new class keywor
http://bugs.python.org/issue13135  closed by benjamin.peterson

#13136: speed-up conversion between unicode widths
http://bugs.python.org/issue13136  closed by pitrou

#13137: from __future__ import division breaks ad hoc numeric types
http://bugs.python.org/issue13137  closed by benjamin.peterson

#13138: ElementTree's Element.iter() lacks versionadded
http://bugs.python.org/issue13138  closed by ezio.melotti

#13142: Add support for other HTTP methods in urllib.request
http://bugs.python.org/issue13142  closed by ezio.melotti

#13145: Documentation of PyNumber_ToBase() wrong
http://bugs.python.org/issue13145  closed by mark.dickinson

#13148: simple bug in mmap size check
http://bugs.python.org/issue13148  closed by pitrou

#13154: pep-0000.txt doesn't build anymore
http://bugs.python.org/issue13154  closed by orsenthil

#13155: Optimize finding the max character width
http://bugs.python.org/issue13155  closed by pitrou

#13158: tarfile.TarFile.getmembers misses some entries
http://bugs.python.org/issue13158  closed by lars.gustaebel

#13159: _io.FileIO uses a quadratic-time buffer growth algorithm
http://bugs.python.org/issue13159  closed by nadeem.vawda

#13162: Trying to install a binary extension as a resource file causes
http://bugs.python.org/issue13162  closed by eric.araujo

#13176: Broken link in bugs.rst
http://bugs.python.org/issue13176  closed by eric.araujo

From martin at v.loewis.de  Fri Oct 14 18:46:50 2011
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Fri, 14 Oct 2011 18:46:50 +0200
Subject: [Python-Dev] Packaging and binary distributions for Python 3.3
In-Reply-To: 
References: <4E914A09.50209@netwok.org> <4E970F70.70105@netwok.org>
	<4E972038.3050000@v.loewis.de> <4E973BE8.3030105@timgolden.me.uk>
	<4E984397.4060906@v.loewis.de>
Message-ID: <4E98677A.30301@v.loewis.de>

> - On formats, I strongly believe that having
multiple formats is a > problem. But I need to be clear here - an installer (MSI, wininst) is > a bundle containing executable code (which drives the interface), plus > a chunk of data that is the objects to be installed. (I am > oversimplifying here, but bear with me). Beyond oversimplifying, I think this is actually wrong: MSI deliberately is *not* an executable format, but just a "dumb" database, to be interpreted by the installation routines that are already on the system. In that sense, it is very similar to pysetup and bdist_simple. Regards, Martin From p.f.moore at gmail.com Fri Oct 14 20:42:12 2011 From: p.f.moore at gmail.com (Paul Moore) Date: Fri, 14 Oct 2011 19:42:12 +0100 Subject: [Python-Dev] Packaging and binary distributions for Python 3.3 In-Reply-To: <4E98677A.30301@v.loewis.de> References: <4E914A09.50209@netwok.org> <4E970F70.70105@netwok.org> <4E972038.3050000@v.loewis.de> <4E973BE8.3030105@timgolden.me.uk> <4E984397.4060906@v.loewis.de> <4E98677A.30301@v.loewis.de> Message-ID: On 14 October 2011 17:46, "Martin v. L?wis" wrote: > >> - On formats, I strongly believe that having multiple formats is a >> problem. But I need to be clear here - an installer (MSI, wininst) is >> a bundle containing executable code (which drives the interface), plus >> a chunk of data that is the objects to be installed. (I am >> oversimplifying here, but bear with me). > > Beyond oversimplifying, I think this is actually wrong: MSI deliberately > is *not* an executable format, but just a "dumb" database, to be interpreted > by the installation routines that are already on the system. > In that sense, it is very similar to pysetup and bdist_simple. Ah. Sorry, I had misunderstood that, then. So in theory, packaging could be taught to extract the distribution files from an MSI file (using msilib, presumably) and install them much like it could with a zip file. 
That would imply that the only barrier to using MSI as the default format is the fact that the files can only be manipulated on a Windows platform (which is only a problem if Unix users can build binaries for Windows - they would then be able to build but not package them). I wish I felt more comfortable with MSI as a format (as opposed to an opaque clickable installer). I'd be interested to know what others think. Paul. From martin at v.loewis.de Fri Oct 14 20:05:16 2011 From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=) Date: Fri, 14 Oct 2011 20:05:16 +0200 Subject: [Python-Dev] Identifier API In-Reply-To: <4E9760DD.90206@canterbury.ac.nz> References: <4E90640E.2040301@v.loewis.de> <4E943868.6070204@avl.com> <20111011091943.4160b217@resist.wooz.org> <4E96D2E6.60207@v.loewis.de> <4E9760DD.90206@canterbury.ac.nz> Message-ID: <4E9879DC.8040500@v.loewis.de> > Instead of an explicit prefix, how about a macro, such as > Py_ID(__string__)? That wouldn't be instead, but in addition - you need the variable name, anyway. Not sure whether there is actually a gain in readability - people not familiar with this would assume that it's a function call of some kind, which it would not be. Regards, Martin From greg.ewing at canterbury.ac.nz Sat Oct 15 01:32:50 2011 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 15 Oct 2011 12:32:50 +1300 Subject: [Python-Dev] Identifier API In-Reply-To: <4E9879DC.8040500@v.loewis.de> References: <4E90640E.2040301@v.loewis.de> <4E943868.6070204@avl.com> <20111011091943.4160b217@resist.wooz.org> <4E96D2E6.60207@v.loewis.de> <4E9760DD.90206@canterbury.ac.nz> <4E9879DC.8040500@v.loewis.de> Message-ID: <4E98C6A2.4010503@canterbury.ac.nz> Martin v. L?wis wrote: > That wouldn't be instead, but in addition - you need the > variable name, anyway. But the details of exactly how the name is constructed could be kept as an implementation detail. 
> Not sure whether there is actually > a gain in readability - people not familiar with this would > assume that it's a function call of some kind, which it would > not be. To me the benefit would be that the name you write as the argument would be *exactly* the identifier it represents. If you have to manually add a prefix, there's room for a bit of confusion, especially if the prefix itself ends with an underscore. E.g. if the identifier is "__init__" and the prefix is "PyID_", do you write "PyID__init__" (two underscores) or "PyID___init__" (three underscores?) And can you easily spot the difference in your editor? -- Greg From g.brandl at gmx.net Sat Oct 15 06:51:28 2011 From: g.brandl at gmx.net (Georg Brandl) Date: Sat, 15 Oct 2011 06:51:28 +0200 Subject: [Python-Dev] Identifier API In-Reply-To: <4E98C6A2.4010503@canterbury.ac.nz> References: <4E90640E.2040301@v.loewis.de> <4E943868.6070204@avl.com> <20111011091943.4160b217@resist.wooz.org> <4E96D2E6.60207@v.loewis.de> <4E9760DD.90206@canterbury.ac.nz> <4E9879DC.8040500@v.loewis.de> <4E98C6A2.4010503@canterbury.ac.nz> Message-ID: Am 15.10.2011 01:32, schrieb Greg Ewing: > Martin v. L?wis wrote: >> That wouldn't be instead, but in addition - you need the >> variable name, anyway. > > But the details of exactly how the name is constructed > could be kept as an implementation detail. Is there a use case for keeping that detail hidden? >> Not sure whether there is actually >> a gain in readability - people not familiar with this would >> assume that it's a function call of some kind, which it would >> not be. > > To me the benefit would be that the name you write as > the argument would be *exactly* the identifier it > represents. > > If you have to manually add a prefix, there's room for > a bit of confusion, especially if the prefix itself > ends with an underscore. E.g. 
if the identifier is
> "__init__" and the prefix is "PyID_", do you write
> "PyID__init__" (two underscores) or "PyID___init__"
> (three underscores?) And can you easily spot the
> difference in your editor?

The compiler can, very easily.

Georg

From ncoghlan at gmail.com  Sat Oct 15 10:04:05 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 15 Oct 2011 18:04:05 +1000
Subject: [Python-Dev] Packaging and binary distributions for Python 3.3
In-Reply-To: 
References: <4E914A09.50209@netwok.org> <4E970F70.70105@netwok.org>
	<4E972038.3050000@v.loewis.de> <4E973BE8.3030105@timgolden.me.uk>
	<4E984397.4060906@v.loewis.de> <4E98677A.30301@v.loewis.de>
Message-ID: 

On Sat, Oct 15, 2011 at 4:42 AM, Paul Moore wrote:
> I wish I felt more comfortable with MSI as a format (as opposed to an
> opaque clickable installer). I'd be interested to know what others
> think.

Compilation can be a problem on Linux systems as well, so a platform
neutral format is a better idea. Just have a mechanism that allows
pysetup to create a bdist_msi from a bdist_simple. Similarly, bdist_rpm
and bdist_deb plugins could be taught to interpret bdist_simple.

However, you do get into architecture problems (x86 vs x86_64 vs ARM)
if you go that route.

Cheers,
Nick.

-- 
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From p.f.moore at gmail.com  Sat Oct 15 10:57:08 2011
From: p.f.moore at gmail.com (Paul Moore)
Date: Sat, 15 Oct 2011 09:57:08 +0100
Subject: [Python-Dev] Packaging and binary distributions for Python 3.3
In-Reply-To: 
References: <4E914A09.50209@netwok.org> <4E970F70.70105@netwok.org>
	<4E972038.3050000@v.loewis.de> <4E973BE8.3030105@timgolden.me.uk>
	<4E984397.4060906@v.loewis.de> <4E98677A.30301@v.loewis.de>
Message-ID: 

On 15 October 2011 09:04, Nick Coghlan wrote:
> On Sat, Oct 15, 2011 at 4:42 AM, Paul Moore wrote:
>> I wish I felt more comfortable with MSI as a format (as opposed to an
>> opaque clickable installer). I'd be interested to know what others
>> think.
> > Compilation can be a problem on Linux systems as well, so a platform > neutral format is a better idea. Just have a mechanism that allows > pysetup to create a bdist_msi from a bdist_simple. Similar, bdist_rpm > and bdist_deb plugins could be taught to interpret bdist_simple. > > However, you do get into architecture problems (x86 vs x86_64 vs ARM) > if you go that route. Architecture problems are an issue for any binary format, surely? It's the content (the binaries themselves) that are architecture dependent, not the format itself. Paul. From martin at v.loewis.de Sat Oct 15 13:45:26 2011 From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=) Date: Sat, 15 Oct 2011 13:45:26 +0200 Subject: [Python-Dev] Identifier API In-Reply-To: References: <4E90640E.2040301@v.loewis.de> Message-ID: <4E997256.3090401@v.loewis.de> >> PyObject *tmp; >> Py_identifier(update); > As I understand it, the macro expands to both the ID variable > declaration and the init-at-first-call code, right? No. The variable will only get static initialization with the char*. The initialization on first call (of the PyObject*) happens in the helper functions, such as PyObject_GetAttrId. > I'm not sure how often users will need more than one identifier in a function That's actually fairly common. > Also note that existing code needs to be changed in order to take > advantage of this. It might be possible to optimise > PyObject_CallMethod() internally by making the lookup either reuse a > number of cached Python strings, or by supporting a lookup of char* > values in a dict *somehow*. However, this appears to be substantially > more involved than just moving a smaller burden on the users. I think this would have to hash the string in any case, since keying by char* pointer value cannot work (there might be a different string at the same memory the next time). 
So even if this could side-step many of the steps, you'd keep at least
one iteration over the characters; if this is hashing, you actually need
two iterations (the second one to determine whether it's the right
string).

The Py_IDENTIFIER API can do the lookup in constant time for all but the
first call.

Regards,
Martin

From martin at v.loewis.de  Sat Oct 15 13:47:50 2011
From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=)
Date: Sat, 15 Oct 2011 13:47:50 +0200
Subject: [Python-Dev] [Python-checkins] cpython (2.7): Issue #13156: revert changeset f6feed6ec3f9, which was only relevant for native
In-Reply-To: <201110122119.09398.victor.stinner@haypocalc.com>
References: <201110122119.09398.victor.stinner@haypocalc.com>
Message-ID: <4E9972E6.3060805@v.loewis.de>

>> -- Issue #10517: After fork(), reinitialize the TLS used by the
>> PyGILState_* - APIs, to avoid a crash with the pthread implementation in
>> RHEL 5. Patch - by Charles-François Natali.
>
> You should restore this NEWS entry and add a new one to say that the
> patch has been reverted.

This may be a done deal, but: no. If a patch is reverted, the NEWS entry
that got in with it gets out again on reversal. The NEWS file is for
users of the release; there is no point telling them that a change was
made first, and then got undone.

Regards,
Martin

From tjreedy at udel.edu  Sat Oct 15 22:10:02 2011
From: tjreedy at udel.edu (Terry Reedy)
Date: Sat, 15 Oct 2011 16:10:02 -0400
Subject: [Python-Dev] [Python-checkins] cpython (2.7): Issue #13156: revert changeset f6feed6ec3f9, which was only relevant for native
In-Reply-To: <4E9972E6.3060805@v.loewis.de>
References: <201110122119.09398.victor.stinner@haypocalc.com>
	<4E9972E6.3060805@v.loewis.de>
Message-ID: 

On 10/15/2011 7:47 AM, "Martin v. Löwis" wrote:
>>> -- Issue #10517: After fork(), reinitialize the TLS used by the
>>> PyGILState_* - APIs, to avoid a crash with the pthread implementation in
>> >> You should restore this NEWS entry and add a new one to say that the >> patch has >> been reverted. > > This may be a done deal, but: no. If a patch is reverted, the NEWS entry > that got in with it gets out again on reversal. The NEWS file > is for users of the release; there is no point telling them that a > change was made first, and than got undone. I was going to say the same thing, but ... If a change is released in x.y.z and reverted for release x.y.(z+k), then I think both notices should be present in their respective sections. I checked the date on the original patch and it was before 3.2.1, so perhaps it *was* released. -- Terry Jan Reedy From scott+python-dev at scottdial.com Sun Oct 16 06:12:04 2011 From: scott+python-dev at scottdial.com (Scott Dial) Date: Sun, 16 Oct 2011 00:12:04 -0400 Subject: [Python-Dev] check for PyUnicode_READY look backwards In-Reply-To: References: <4E8E2990.9060806@v.loewis.de> <4E8EB0C3.80100@haypocalc.com> <4E8EFCE0.7060005@v.loewis.de> Message-ID: <4E9A5994.4030107@scottdial.com> On 10/7/2011 7:13 PM, Terry Reedy wrote: > On 10/7/2011 10:06 AM, Nick Coghlan wrote: >> FWIW, I don't mind whether it's "< 0" or "== -1", so long as there's a >> comparison there to kick my brain out of Python boolean logic mode. > > Is there any speed difference (on common x86/64 processors and > compilers)? I would expect that '< 0' should be optimized to just check > the sign bit and 'if n < 0' to 'load n; jump-non-negative'. > There are several different ways to express those operators depending on the context. If "n" is worth moving into a register, then "<0" will get to use a "test" and it's fewer instruction bytes than a "cmp", but otherwise, it is no better. So, there is a very special case where "<0" is better, but I think you'd be hard-pressed to measure it against the noise. 
-- Scott Dial scott at scottdial.com From ncoghlan at gmail.com Sun Oct 16 09:19:55 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 16 Oct 2011 17:19:55 +1000 Subject: [Python-Dev] [Python-checkins] cpython (2.7): Issue #13156: revert changeset f6feed6ec3f9, which was only relevant for native In-Reply-To: References: <201110122119.09398.victor.stinner@haypocalc.com> <4E9972E6.3060805@v.loewis.de> Message-ID: On Sun, Oct 16, 2011 at 6:10 AM, Terry Reedy wrote: > On 10/15/2011 7:47 AM, "Martin v. L?wis" wrote: >> >> This may be a done deal, but: no. If a patch is reverted, the NEWS entry >> that got in with it gets out again on reversal. The NEWS file >> is for users of the release; there is no point telling them that a >> change was made first, and than got undone. > > I was going to say the same thing, but ... > > If a change is released in x.y.z and reverted for release x.y.(z+k), then I > think both notices should be present in their respective sections. > > I checked the date on the original patch and it was before 3.2.1, so perhaps > it *was* released. Indeed, "was it released?" is the gating criteria for whether the old NEWS entry is removed or whether a new one is made. No release should ever remove a NEWS entry from an earlier release, *unless* the NEWS entry itself was a mistake (i.e. it refers to a change that wasn't actually part of a release). Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? 
Brisbane, Australia From p.f.moore at gmail.com Sun Oct 16 16:40:58 2011 From: p.f.moore at gmail.com (Paul Moore) Date: Sun, 16 Oct 2011 15:40:58 +0100 Subject: [Python-Dev] Packaging and binary distributions for Python 3.3 In-Reply-To: <4E9710FA.3010003@netwok.org> References: <4E914A09.50209@netwok.org> <4E9710FA.3010003@netwok.org> Message-ID: On 13 October 2011 17:25, ?ric Araujo wrote: >> My expectation would be that the user would type pysetup install >> some_binary_format_file.zip and have that file unpacked and all the >> "bits" put in the appropriate place. Basically just like installing >> from a source archive - pysetup install project-1.0.tar.gz - but >> skipping the compile steps because the compiler output files are >> present. > Yep. > >> That may need some extra intelligence in pysetup if it doesn't have >> this feature already [...] just unzip the bits into the right place, >> or something similar. > Yes. ?The bdist can be just like an sdist, but it contains compiled > files instead of C source files (maybe setuptools bdist_egg is just > that), then pysetup uses the setup.cfg file to find files and install > them at the right places. I have uploaded an initial (working, just needs test, docs and some testing and possibly tidying up) version of bdist_simple in the tracker (http://bugs.python.org/issue13189). Taking note of Martin's comments, it would be nice to at a minimum have a converter to take a bdist_simple and build bdist_msi/bdist_wininst from it (and possibly vice versa). I'm not sure how to go about that at this stage, but I'll work on it. Paul. 
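The bdist_simple round trip discussed here — build by archiving a project tree, install by unpacking it — can be sketched in a few lines. All names below (project layout, archive name, the `installed` variable) are illustrative, not packaging's actual behaviour:

```python
import os
import tempfile
import zipfile

# "Build": zip a project tree whose root holds setup.cfg.
# "Install": unpack the archive, then an installer would read setup.cfg.
with tempfile.TemporaryDirectory() as tmp:
    proj = os.path.join(tmp, "project-1.0")
    os.makedirs(os.path.join(proj, "spam"))
    with open(os.path.join(proj, "setup.cfg"), "w") as f:
        f.write("[metadata]\nname = spam\nversion = 1.0\n")
    open(os.path.join(proj, "spam", "__init__.py"), "w").close()

    # Build step: archive the tree with setup.cfg at the root of the entries.
    archive = os.path.join(tmp, "spam-1.0.zip")
    with zipfile.ZipFile(archive, "w") as zf:
        for root, dirs, files in os.walk(proj):
            for name in files:
                path = os.path.join(root, name)
                zf.write(path, os.path.relpath(path, proj))

    # Install step: unpack; dispatching on setup.cfg would happen next.
    dest = os.path.join(tmp, "unpacked")
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(dest)
    installed = sorted(os.listdir(dest))

print(installed)  # ['setup.cfg', 'spam']
```

Nothing here is Windows-specific, which is the point: the same archive works wherever the contained binaries do.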
From vinay_sajip at yahoo.co.uk Sun Oct 16 22:24:58 2011 From: vinay_sajip at yahoo.co.uk (Vinay Sajip) Date: Sun, 16 Oct 2011 20:24:58 +0000 (UTC) Subject: [Python-Dev] Packaging and binary distributions for Python 3.3 References: <4E973BE8.3030105@timgolden.me.uk> <201110131602.27792.jeremy.kloth@gmail.com> Message-ID: Jeremy Kloth gmail.com> writes: > That said, I have been working on a drop-in replacement for the current > bdist_wininst executable stub with the following features: [snip] > http://www.flickr.com/photos/67460826 N04/sets/72157627653603530/ [snip] Sounds interesting, but your flickr link did not work for me, even after I tried replacing "" with "@". Care to post a shortened link? Regards, Vinay Sajip From vinay_sajip at yahoo.co.uk Sun Oct 16 22:38:11 2011 From: vinay_sajip at yahoo.co.uk (Vinay Sajip) Date: Sun, 16 Oct 2011 20:38:11 +0000 (UTC) Subject: [Python-Dev] Packaging and binary distributions for Python 3.3 References: <4E914A09.50209@netwok.org> <4E970F70.70105@netwok.org> <4E972038.3050000@v.loewis.de> Message-ID: Martin v. L?wis v.loewis.de> writes: > In particular wrt. virtual environments: I see no need to actually > *install* files multiple times. It's rather sufficient that the > distributions to be installed are *available* in the virtual env after > installation, and unavailable after being removed. Actually copying > them into the virtual environment might not be necessary or useful. > > So I envision a setup where the MSI file puts the binaries into a place > on disk where pysetup (or whatever tool) finds them, and links them > whereever they need to go (using whatever linking mechanism might work). > For MSI in particular, there could be some interaction with pysetup, > e.g. to register all virtualenvs that have linked the installation, > and warn the user that the file is still in use in certain locations. 
> Likewise, automated download might pick an MSI file, and tell it not > to place itself into the actual Python installation, but instead into > a location where pysetup will find it. While it seems a little inelegant, copying might actually be simpler and less error-prone. Firstly, AFAIK you can't do true symlinks in relatively recent, widely-used versions of Windows like XP. Also, while some people use virtualenvs in a central location (such as virtualenvwrapper users), others have their envs under a project folder. I don't know that the complication of a centralised registry of virtual envs is necessarily a good thing. Regards, Vinay Sajip From vinay_sajip at yahoo.co.uk Sun Oct 16 22:49:41 2011 From: vinay_sajip at yahoo.co.uk (Vinay Sajip) Date: Sun, 16 Oct 2011 20:49:41 +0000 (UTC) Subject: [Python-Dev] Packaging and binary distributions for Python 3.3 References: <4E914A09.50209@netwok.org> <4E970F70.70105@netwok.org> <4E972038.3050000@v.loewis.de> <4E973BE8.3030105@timgolden.me.uk> <4E984397.4060906@v.loewis.de> <4E98677A.30301@v.loewis.de> Message-ID: Nick Coghlan gmail.com> writes: > Compilation can be a problem on Linux systems as well, so a platform > neutral format is a better idea. Just have a mechanism that allows > pysetup to create a bdist_msi from a bdist_simple. Similar, bdist_rpm > and bdist_deb plugins could be taught to interpret bdist_simple. I agree that a platform-neutral format is a good idea, but there might be other complications with binary formats, which I'm not sure we've considered. For example, if we're just bundling self-contained C extensions which just link to libc/msvcrt, that's one thing. But what if those extensions link to particular versions of other libraries? Are those referenced binaries supposed to be bundled in the archive, too? I don't know that the dependency language supported by packaging extends to these kinds of dependencies on external, non-Python components. 
If we leave it to the packager to include all relevant binary
dependencies, I'm not sure how satisfactory that'll be - possibly, not
very.

Regards,

Vinay Sajip

From vinay_sajip at yahoo.co.uk  Sun Oct 16 23:13:55 2011
From: vinay_sajip at yahoo.co.uk (Vinay Sajip)
Date: Sun, 16 Oct 2011 21:13:55 +0000 (UTC)
Subject: [Python-Dev] Packaging and binary distributions for Python 3.3
References: <4E914A09.50209@netwok.org> <4E9710FA.3010003@netwok.org>
Message-ID: 

Éric Araujo netwok.org> writes:

> [Vinay]
> > A simple change to packaging will allow an archive containing a
> > setup.cfg-based directory to be installed in the same way as a
> > source directory.
> Isn't that already supported, as long as the tarball or zipfile contains
> source files? In any case, it was intended to be, and there's still
> support code around.

No, by which I mean - if you have a simple zip of a project directory#
containing a setup.cfg, and run pysetup3 install , it fails to work in
the same way as pysetup3 install where the is a recursive zip of .
However, a two-line change enables this: http://goo.gl/pd51J

> Correct. I'm still pondering whether I find the idea of registering
> built files in setup.cfg as elegant or hacky :) We also have the other
> ideas I wrote to choose from.

On Linux, if we're building from source, of course we use the build_ext
step to capture the built artifacts. However, how else could you do it
on Windows, when you're not actually building? The built files could be
named in the [extension:] section rather than the [files] section - the
former means that you have to add code to deal with it, the latter is
less elegant but would require less work to make it happen.

> > 3. Ideally, the GUI should co-operate with venvs, by offering some
> > form of browse facility. The command line does this automatically.
> Will Windows users want a GUI to create venvs too?

I don't think this is needed for venv creation, but having a "Find Other..."
to locate an alternative Python in a virtual env doesn't seem too onerous for the user. Regards, Vinay Sajip From vinay_sajip at yahoo.co.uk Sun Oct 16 23:32:34 2011 From: vinay_sajip at yahoo.co.uk (Vinay Sajip) Date: Sun, 16 Oct 2011 21:32:34 +0000 (UTC) Subject: [Python-Dev] Packaging and binary distributions for Python 3.3 References: <4E914A09.50209@netwok.org> <4E9710FA.3010003@netwok.org> Message-ID: Paul Moore gmail.com> writes: > > On 13 October 2011 17:25, ?ric Araujo netwok.org> wrote: > >> My expectation would be that the user would type pysetup install > >> some_binary_format_file.zip and have that file unpacked and all the > >> "bits" put in the appropriate place. Basically just like installing > >> from a source archive - pysetup install project-1.0.tar.gz - but > >> skipping the compile steps because the compiler output files are > >> present. > > Yep. > > > >> That may need some extra intelligence in pysetup if it doesn't have > >> this feature already [...] just unzip the bits into the right place, > >> or something similar. > > Yes. ?The bdist can be just like an sdist, but it contains compiled > > files instead of C source files (maybe setuptools bdist_egg is just > > that), then pysetup uses the setup.cfg file to find files and install > > them at the right places. > > I have uploaded an initial (working, just needs test, docs and some > testing and possibly tidying up) version of bdist_simple in the > tracker (http://bugs.python.org/issue13189). There's one area of pysetup3 functionality which I don't think has been discussed in this thread, though it's pertinent to Windows users. Namely, a completely declarative approach to installation locations will not satisfy all requirements. For example, if you want to install PowerShell scripts, the right way of doing that is to make Shell API calls at installation time to determine the correct location for the installing user. 
I have this working at present for a project, using pysetup's hook functionality; but any move to a completely passive archive format would lose this kind of flexibility. So, I think whatever archive format we end up with should provide exactly the same level of flexibility we currently get with pysetup3 for pure-Python projects, but extended to include binary deliverables. The simplest way of doing this is to register those binaries in setup.cfg, and to have hook code check for the correct dependencies in the run-time environment before actually installing (e.g. x86/x64/ARM/Python version dependencies). While it's not the slickest solution imaginable, it does allow for just about all use cases. This flexibility is more important under Windows than under Posix, because the installation locations under Posix conform much more closely to a standard (FHS) than anything you find in the Windows environment. Distribution build step = get all source, binary code and data defined in setup.cfg and in a single directory tree (containing the setup.cfg in the root of the tree), then zip that tree. Distribution installation step = unzip aforementioned zip, and run pysetup3 install . BTW, I believe that for virtual env installations, there's no need to provide integration with Add/Remove Programs. I find that the Add/Remove Programs dialog takes long enough to populate as it is :-( Regards, Vinay Sajip From p.f.moore at gmail.com Mon Oct 17 00:05:43 2011 From: p.f.moore at gmail.com (Paul Moore) Date: Sun, 16 Oct 2011 23:05:43 +0100 Subject: [Python-Dev] Packaging and binary distributions for Python 3.3 In-Reply-To: References: <4E914A09.50209@netwok.org> <4E9710FA.3010003@netwok.org> Message-ID: On 16 October 2011 22:32, Vinay Sajip wrote: > There's one area of pysetup3 functionality which I don't think has been > discussed in this thread, though it's pertinent to Windows users. 
Namely, a > completely declarative approach to installation locations will not satisfy all > requirements. For example, if you want to install PowerShell scripts, the right > way of doing that is to make Shell API calls at installation time to determine > the correct location for the installing user. Interesting. That's not a use case that I have encountered, but I can see it could be an issue. I have been working on the basis that a bdist_simple format that matches the functionality of bdist_wininst is at least good enough for those projects that currently use bdist_wininst. The only potential issue right now is with postinstall scripts, which bdist_simple doesn't support. It would be easy enough to add, and indeed it may be possible to use existing packaging functionality already (I haven't looked into this area). > I have this working at present for > a project, using pysetup's hook functionality; but any move to a completely > passive archive format would lose this kind of flexibility. So, I think whatever > archive format we end up with should provide exactly the same level of > flexibility we currently get with pysetup3 for pure-Python projects, but > extended to include binary deliverables. I don't disagree, but I'm struggling to see how that would be done. > The simplest way of doing this is to > register those binaries in setup.cfg, and to have hook code check for the > correct dependencies in the run-time environment before actually installing > (e.g. x86/x64/ARM/Python version dependencies). While it's not the slickest > solution imaginable, it does allow for just about all use cases. Can you give an example of a setup.cfg, that would work like this? Suppose I have two files, foo.py and bar.pyd, which are a pure-python module and a compiled C extension. How would I write a setup.cfg and lay out the directory structure to allow pysetup install to do the right thing? I tried to do this myself, but couldn't get it to work the way I expected. 
It may be I was just hitting bugs, but it felt to me like I was going at the problem the wrong way. > This > flexibility is more important under Windows than under Posix, because the > installation locations under Posix conform much more closely to a standard (FHS) > than anything you find in the Windows environment. Agreed in principle, although in practice, most projects I've encountered have ended up working within bdist_wininst limitations, probably just as a matter of practicality. > Distribution build step = get all source, binary code and data defined in > setup.cfg and in a single directory tree (containing the setup.cfg in the root > of the tree), then zip that tree. > > Distribution installation step = unzip aforementioned zip, and run pysetup3 > install . Works for me. I'd like to see a bdist_xxx command to do the build step as you describe, if only to make it trivially simple for developers to produce binary distributions. Having to package stuff up manually is bound to put at least some developers off. If you can give me the example I mentioned above, I could work on modifying the bdist_simple code I posted to the tracker today to produce that format rather than my custom format based on bdist_wininst. For the installation step, you shouldn't even need to unzip, as pysetup3 can do the unpacking for you. > BTW, I believe that for virtual env installations, there's no need to provide > integration with Add/Remove Programs. I find that the Add/Remove Programs dialog > takes long enough to populate as it is :-( Agreed. Personally, as I've said, I'm happy not to use Add/Remove even for system installations - pysetup list and pysetup remove do what I need without slowing down the Add/Remove list. But I accept that's not likely to be the view of many Windows users. Anyone using virtual envs, though, is probably by definition comfortable enough with command line tools to be willing to use pysetup3. Paul. 
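For the foo.py/bar.pyd case Paul raises, one possible shape for such a setup.cfg is sketched below. This is purely hypothetical — the syntax for registering prebuilt binaries was exactly the open question in this thread, so the section and category names here are invented for illustration:

```ini
; Hypothetical layout: setup.cfg, foo.py and a prebuilt bar.pyd side by side.
[metadata]
name = foo
version = 1.0

[files]
modules = foo
; Registering the prebuilt extension as a plain resource is the
; "less elegant" option discussed above; an [extension: bar] section
; would be the declarative alternative.
resources =
    bar.pyd = {platlib}
```

Either way, an install-time hook would still be needed to refuse the archive on a mismatched architecture or Python version.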
From jeremy.kloth at gmail.com Mon Oct 17 00:37:54 2011 From: jeremy.kloth at gmail.com (Jeremy Kloth) Date: Sun, 16 Oct 2011 16:37:54 -0600 Subject: [Python-Dev] Packaging and binary distributions for Python 3.3 In-Reply-To: References: <201110131602.27792.jeremy.kloth@gmail.com> Message-ID: <201110161637.54744.jeremy.kloth@gmail.com> On Sunday, October 16, 2011 02:24:58 PM Vinay Sajip wrote: > Jeremy Kloth gmail.com> writes: > > That said, I have been working on a drop-in replacement for the current > > > bdist_wininst executable stub with the following features: > [snip] > > > http://www.flickr.com/photos/67460826 N04/sets/72157627653603530/ > > [snip] > > Sounds interesting, but your flickr link did not work for me, even after I > tried replacing "" with "@". Care to post a shortened link? Hmm, clicking the link in the email works here. but just to be safe: http://goo.gl/pC48e Thanks -- Jeremy Kloth From victor.stinner at haypocalc.com Mon Oct 17 01:16:36 2011 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Mon, 17 Oct 2011 01:16:36 +0200 Subject: [Python-Dev] Modules of plat-* directories Message-ID: <201110170116.36678.victor.stinner@haypocalc.com> Hi, I don't understand why we kept modules of the plat-* directories (e.g. Lib/plat-linux/CDROM.py). It looks like these modules are not used, except maybe some DL constants used by PyKDE4. Can't we move used constants to classic Python modules (e.g. the os module) and drop unused modules? These modules are not regenerated when Python is compiled, so I don't understand how values can be correct. For example, IN.INT_MAX is 2147483647, whereas it should be 9223372036854775807 on my 64-bit Linux. These values don't look reliable. I'm looking at these modules because Arfrever asked me to review a patch for h2py.py, a script to regenerate these modules: http://bugs.python.org/issue13032 He also suggested somewhere to regenerate these modules when Python is built. 
--

Python has builtin modules generated from C header files:

- CDIO: sunos5
- CDROM: linux
- DLFCN: linux, sunos5
- _emx_link: os2emx
- grp: os2emx
- IN: aix4, darwin, freebsd[45678], linux, netbsd1, os2emx, sunos5, unixware7
- pwd: os2emx
- SOCKET: os2emx
- STROPTS: sunos5, unixware7
- TYPES: linux, sunos5

CDROM/CDIO can be used to control the low-level CDROM API. CDROM is used by the cdsuite project: http://offog.org/code/cdsuite.html

DLFCN is used by PyKDE4: sys.setdlopenflags(DLFCN.RTLD_NOW|DLFCN.RTLD_GLOBAL). I didn't know this sys function :-)

(OS/2 platform is deprecated, see PEP 11.)

The IN module is used by policykit to get INT_MAX (used to compute "MAX_DBUS_TIMEOUT = INT_MAX / 1000.0"). It was also used in SGI video demos of Python 1.4.

STROPTS is not used.

TYPES is used by other plat-* modules (IN, DLFCN, STROPTS).

These modules contain non-working functions:

def __STRING(x): return #x
def __constant_le32_to_cpu(x): return ((__u32)(__le32)(x))

__STRING(x) returns None, because "#x" is a comment in Python (in C, the preprocessor converts x to a string). Calling __constant_le32_to_cpu() fails because __le32 is not defined. Even if __le32 were defined, the code doesn't look valid.

Victor

From victor.stinner at haypocalc.com Mon Oct 17 02:04:38 2011 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Mon, 17 Oct 2011 02:04:38 +0200 Subject: [Python-Dev] Modules of plat-* directories In-Reply-To: <201110170116.36678.victor.stinner@haypocalc.com> References: <201110170116.36678.victor.stinner@haypocalc.com> Message-ID: <201110170204.38591.victor.stinner@haypocalc.com>

On Monday 17 October 2011 01:16:36, Victor Stinner wrote:
> For example, IN.INT_MAX is 2147483647, whereas it should
> be 9223372036854775807 on my 64-bit Linux.

Oops, wrong example: INT_MAX is also 2147483647 on 64 bits. I mean IN.LONG_MAX.

IN.LONG_MAX is always 9223372036854775807 on Linux, on 32- and 64-bit systems.
> [Arfrever] also suggested somewhere to regenerate these modules
> when Python is built.

"Somewhere" is here: http://bugs.python.org/issue12619

> DLFCN is used by PyKDE4:
> sys.setdlopenflags(DLFCN.RTLD_NOW|DLFCN.RTLD_GLOBAL). I didn't know this
> sys function :-)

Because Python has a sys.setdlopenflags(), we should provide these constants in a regular module (sys or posix). I'm quite sure that PyKDE4 would accept adding a try/except ImportError: DLFCN is only used in one file, to get 2 constants.

Victor

From vinay_sajip at yahoo.co.uk Mon Oct 17 11:15:55 2011 From: vinay_sajip at yahoo.co.uk (Vinay Sajip) Date: Mon, 17 Oct 2011 09:15:55 +0000 (UTC) Subject: [Python-Dev] Packaging and binary distributions for Python 3.3 References: <4E914A09.50209@netwok.org> <4E9710FA.3010003@netwok.org> Message-ID:

Paul Moore <p.f.moore at gmail.com> writes:

> Interesting. That's not a use case that I have encountered, but I can
> see it could be an issue. I have been working on the basis that a
> bdist_simple format that matches the functionality of bdist_wininst is
> at least good enough for those projects that currently use
> bdist_wininst. The only potential issue right now is with postinstall
> scripts, which bdist_simple doesn't support. It would be easy enough
> to add, and indeed it may be possible to use existing packaging
> functionality already (I haven't looked into this area).

The packaging machinery contains a reasonably good set of hooks which a setup.cfg can plug into, which is IMO more flexible than just a post-installation script (e.g. sometimes you need your code to run before installation, to determine where to install things).

> I don't disagree, but I'm struggling to see how that would be done.

See example below.

> Can you give an example of a setup.cfg, that would work like this?
> Suppose I have two files, foo.py and bar.pyd, which are a pure-python
> module and a compiled C extension.
How would I write a setup.cfg and
> lay out the directory structure to allow pysetup install to do the
> right thing? I tried to do this myself, but couldn't get it to work
> the way I expected. It may be I was just hitting bugs, but it felt to
> me like I was going at the problem the wrong way.

It may not work for you, because in the default branch, packaging is currently missing some functionality or has bugs (I've raised about a dozen issues since trying to get packaging working with virtual environments). In the pythonv branch (which is pretty up to date with default), I've added the missing functionality/fixed some of the issues. Here's an example:

I've created an empty virtual environment. Here are the contents of it at the moment:

C:\TEMP\VENV
|   env.cfg
|
+---Include
+---Lib
|   +---site-packages
+---Scripts
        activate.bat
        deactivate.bat
        pysetup3-script.py
        pysetup3.exe
        [various stock Python binaries omitted]

Here's an example setup.cfg:

[global]
setup_hooks = hooks.setup

[install_data]
pre-hook.win32 = hooks.pre_install_data

[metadata]
name = dory
version = 0.1
requires_python = >= 3.3
[other metadata omitted]

[files]
modules = dory
packages = apackage
           apackage.asubpackage
scripts = dory
extra_files = hooks.py
resources = data/data.bin = {data}
            compiled/spam.pyd = {compiled}

#[extension: spam]
#language = c
#sources = spammodule.c

The extension section is commented out because we're not building the extension at installation time. This setup.cfg works hand-in-hand with some hooks, in file hooks.py below:

def setup(config):
    # Can do various checks here. For example, platform
    # compatibility checks, or set up different binaries
    # to install for x86 vs. x64, etc.
    # Do this by setting up config['files']['resources']
    # appropriately based on installation time environment.
    pass

def pre_install_data(cmd):
    # assumes os.name == 'nt' for simplicity of this example
    cmd.categories['compiled'] = '%s/Lib/site-packages' % cmd.install_dir

Notice how in setup.cfg, file 'spam.pyd' in 'compiled' is expected to be copied to category 'compiled', whose path is set in hooks.pre_install_data. Here's the project directory:

C:\USERS\VINAY\PROJECTS\DORY
|   dory
|   dory.py
|   hooks.py
|   setup.cfg
|
+---apackage
|   |   __init__.py
|   |
|   +---asubpackage
|           __init__.py
|
+---compiled
|       spam.pyd
|
+---data
        data.bin

At the moment, the "spam" module is of course not installed:

(venv) c:\Users\Vinay\Projects\dory>python -c "import spam"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
ImportError: No module named 'spam'

Now, we install the project:

(venv) c:\Users\Vinay\Projects\dory>pysetup3 install .
Installing from source directory: c:\Users\Vinay\Projects\dory
running install_dist
running build
running build_py
running build_scripts
copying and adjusting dory -> build\scripts-3.3
running install_lib
creating C:\temp\venv\Lib\site-packages\apackage
byte-compiling C:\temp\venv\Lib\site-packages\apackage\__init__.py to __init__.cpython-33.pyc
byte-compiling C:\temp\venv\Lib\site-packages\dory.py to dory.cpython-33.pyc
running install_scripts
running pre_hook hooks.pre_install_data for command install_data
running install_data
running install_distinfo
creating C:\temp\venv\Lib\site-packages\dory-0.1.dist-info
creating C:\temp\venv\Lib\site-packages\dory-0.1.dist-info\METADATA
creating C:\temp\venv\Lib\site-packages\dory-0.1.dist-info\INSTALLER
creating C:\temp\venv\Lib\site-packages\dory-0.1.dist-info\REQUESTED
creating C:\temp\venv\Lib\site-packages\dory-0.1.dist-info\RESOURCES
creating C:\temp\venv\Lib\site-packages\dory-0.1.dist-info\RECORD

Now, the virtualenv looks like this:

C:\TEMP\VENV
|   env.cfg
|
+---data
|       data.bin
|
+---Include
+---Lib
|   +---site-packages
|       |   dory.py
|       |   spam.pyd
|       |
|       +---apackage
|       |   |   __init__.py
|       |   |
|       |   +---asubpackage
|       |   |   |   __init__.py
|       |   |   |
|       |   |   +---__pycache__
|       |   |           __init__.cpython-33.pyc
|       |   |
|       |   +---__pycache__
|       |           __init__.cpython-33.pyc
|       |
|       +---dory-0.1.dist-info
|       |       INSTALLER
|       |       METADATA
|       |       RECORD
|       |       REQUESTED
|       |       RESOURCES
|       |
|       +---__pycache__
|               dory.cpython-33.pyc
|
+---Scripts
        activate.bat
        deactivate.bat
        dory-script.py
        dory.exe
        pysetup3-script.py
        pysetup3.exe
        [various stock Python binaries omitted]

Now we can import "spam" and run something from it:

(venv) c:\Users\Vinay\Projects\dory>python -c "import spam; spam.system('cd')"
c:\Users\Vinay\Projects\dory

> I'd like to see a bdist_xxx command to do the build step
> as you describe, if only to make it trivially simple for developers to
> produce binary distributions. Having to package stuff up manually is
> bound to put at least some developers off. If you can give me the
> example I mentioned above, I could work on modifying the bdist_simple
> code I posted to the tracker today to produce that format rather than
> my custom format based on bdist_wininst.

Example as above, though you may need to use the pythonv branch to actually get it working. I can zip up the directory and send it to you, but at the moment there's missing functionality in pythonv in terms of the link step when building the extension. (I overcame this by linking manually.) If you want, I can zip all the project files up and send them to you.

In the more general case, one might want an arrangement with a directory structure like compiled/x86/..., compiled/x64/... in the built zip, with the hooks.py code setting up the resources appropriately based on the target environment as determined at installation time.

> For the installation step, you shouldn't even need to unzip, as
> pysetup3 can do the unpacking for you.

Indeed, with pythonv I could just zip the whole "dory" project directory and install with e.g. "pysetup3 install dory-1.0-win32-py3.3.zip".

> Agreed.
Personally, as I've said, I'm happy not to use Add/Remove even
> for system installations - pysetup list and pysetup remove do what I
> need without slowing down the Add/Remove list. But I accept that's not
> likely to be the view of many Windows users. Anyone using virtual
> envs, though, is probably by definition comfortable enough with
> command line tools to be willing to use pysetup3.

A fair subset of those who must have ARP integration will probably also want to install using MSI, so that would be taken care of by having a good bdist_simple -> bdist_msi conversion.

Regards,

Vinay Sajip

From sam.partington at gmail.com Mon Oct 17 12:10:58 2011 From: sam.partington at gmail.com (Sam Partington) Date: Mon, 17 Oct 2011 11:10:58 +0100 Subject: [Python-Dev] PEP397 no command line options to python? Message-ID:

Hello all,

I was surprised to see that the excellent pylauncher doesn't do the magic shebang processing if you give it any Python command line options. e.g. Given

#!/usr/bin/env python2.6
import sys
print(sys.executable)

C:\>py test.py
C:\Python26\python.exe
C:\>py -utt test.py
C:\Python27\python.exe

It is spelled out that it shouldn't do so in the PEP:

"Only the first command-line argument will be checked for a shebang line
and only if that argument does not start with a '-'."

But I can't really see why that should be the case. What is the rationale behind this? It is very surprising to the user that adding a simple option like -tt should change the way the launcher behaves.

The PEP also states that the launcher should show the Python help if '-h' is specified:

"If the only command-line argument is "-h" or "--help", the launcher will
print a small banner and command-line usage, then pass the argument to
the default Python. This will cause help for the launcher being printed
followed by help for Python itself. The output from the launcher will
clearly indicate the extended help information is coming from the
launcher and not Python."
To me that would suggest to end users that they can use any of the command line options with the launcher, and they should behave as if you had called Python directly.

I am directing this to python-dev because pylauncher is merely implementing the PEP as it currently stands, so the PEP would ideally need to be modified before the launcher. I would change that first paragraph of the PEP to read something like:

"The first command-line argument not beginning with a '-' will be checked for a shebang line, but only if:
* That command-line argument is not preceded by a '-c' or '-m', and providing
* There is no explicit version specifier, e.g. -2.7 as documented later in this PEP."

Incidentally, whilst implementing this I also noticed a bug in pylauncher whereby the py launcher would incorrectly treat "py t3" as a request for Python version as if '-3' had been specified. I have a small patch that fixes this and implements the above for pylauncher, with extra tests, if there is interest.

Sam

PS I have been lurking for a while, hello everyone.

From p.f.moore at gmail.com Mon Oct 17 12:58:32 2011 From: p.f.moore at gmail.com (Paul Moore) Date: Mon, 17 Oct 2011 11:58:32 +0100 Subject: [Python-Dev] Packaging and binary distributions for Python 3.3 In-Reply-To: References: <4E914A09.50209@netwok.org> <4E9710FA.3010003@netwok.org> Message-ID:

On 17 October 2011 10:15, Vinay Sajip wrote:
> It may not work for you, because in the default branch, packaging is currently
> missing some functionality or has bugs (I've raised about a dozen issues since
> trying to get packaging working with virtual environments).

Ah. That might be part of the issue I've been having. I do recall hitting some bugs. The other part is that I wasn't trying anything nearly as sophisticated as this :-)

> In the pythonv branch (which is pretty up to date with default), I've added the
> missing functionality/fixed some of the issues. Here's an example:
[...]

Nice!
I see what you are getting at now. >> I'd like to see a bdist_xxx command to do the build step >> as you describe, if only to make it trivially simple for developers to >> produce binary distributions. Having to package stuff up manually is >> bound to put at least some developers off. If you can give me the >> example I mentioned above, I could work on modifying the bdist_simple >> code I posted to the tracker today to produce that format rather than >> my custom format based on bdist_wininst. > > Example as above, though you may need to use the pythonv branch to actually get > it working. I can zip up the directory and send it to you, but at the moment > there's missing functionality in pythonv in terms of the link step when > building the extension. (I overcame this by linking manually .) If you want, I > can zip all the project files up and send them to you. No need, you've given me enough to investigate myself. But thanks for the offer. > In the more general case, one might want an arrangement with a directory > structure like compiled/x86/..., compiled/x64/... in the built zip, with the > hooks.py code setting up the resources appropriately based on the target > environment as determined at installation time. Correct me if I'm wrong, but if we standardised on a particular structure, the hooks.py contents could actually be integrated into the core, if we wanted? People could still write hooks for more complex cases, but the basic binary build case could work out of the box that way. >> Agreed. Personally, as I've said, I'm happy not to use Add/Remove even >> for system installations - pysetup list and pysetup remove do what I >> need without slowing down the Add/Remove list. But I accept that's not >> likely to be the view of many Windows users. Anyone using virtual >> envs, though, is probably by definition comfortable enough with >> command line tools to be willing to use pysetup3. 
> > A fair subset of those who must have ARP integration will probably also want to
> > install using MSI, so that would be taken care of by having a good bdist_simple
> > -> bdist_msi conversion.

Yes, that would be good.

Paul.

From skippy.hammond at gmail.com Mon Oct 17 14:23:11 2011 From: skippy.hammond at gmail.com (Mark Hammond) Date: Mon, 17 Oct 2011 23:23:11 +1100 Subject: [Python-Dev] PEP397 no command line options to python? In-Reply-To: References: Message-ID: <4E9C1E2F.3040604@gmail.com>

On 17/10/2011 9:10 PM, Sam Partington wrote:
> Hello all,
>
> I was surprised to see that the excellent pylauncher doesn't do the
> magic shebang processing if you give it any Python command line
> options. e.g. Given
>
> #!/usr/bin/env python2.6
> import sys
> print(sys.executable)
>
> C:\>py test.py
> C:\Python26\python.exe
> C:\>py -utt test.py
> C:\Python27\python.exe
>
> It is spelled out that it shouldn't do so in the PEP:
>
> "Only the first command-line argument will be checked for a shebang line
> and only if that argument does not start with a '-'."
>
> But I can't really see why that should be the case. What is the
> rationale behind this?

It really is a combination of 2 things:

* The key use-case for the launcher is to be executed implicitly - ie, the user types just "foo.py". In that scenario there is no opportunity for the user to specify any args between the name of the executable and of the script. IOW, the expectation is that people will not type "py foo.py", but instead just type "foo.py".

* A desire to avoid command-line parsing in the launcher or to make some options "more equal" than others. Eg, you mention later that -c and -m should be special, but what about -w or -Q? What about new options in future versions?

> It is very surprising to the user that adding a
> simple option like -tt should change the way the launcher behaves.
> The PEP also states that the launcher should show the Python help if
> '-h' is specified:
>
> "If the only command-line argument is "-h" or "--help", the launcher will
> print a small banner and command-line usage, then pass the argument to
> the default Python. This will cause help for the launcher being printed
> followed by help for Python itself. The output from the launcher will
> clearly indicate the extended help information is coming from the
> launcher and not Python."
>
> To me that would suggest to end users that they can use any of the
> command line options with the launcher, and they should behave as if
> you had called Python directly.

I think the language is fairly clear - only the help options are special and no other options will work.

...

> Incidentally whilst implementing this I also noticed a bug in the
> pylauncher whereby the py launcher would incorrectly treat "py t3" as
> a request for Python version as if '-3' had been specified. I have a
> small patch that fixes this and implements the above for pylauncher
> with extra tests if there is interest.

That certainly sounds like a bug and a patch sent to https://bitbucket.org/vinay.sajip/pylauncher will be appreciated!

> PS I have been lurking for a while, hello everyone.

Hi and cheers! :)

Mark

From sam.partington at gmail.com Mon Oct 17 14:55:27 2011 From: sam.partington at gmail.com (Sam Partington) Date: Mon, 17 Oct 2011 13:55:27 +0100 Subject: [Python-Dev] PEP397 no command line options to python? In-Reply-To: <4E9C1E2F.3040604@gmail.com> References: <4E9C1E2F.3040604@gmail.com> Message-ID:

On 17 October 2011 13:23, Mark Hammond wrote:
> On 17/10/2011 9:10 PM, Sam Partington wrote:
>>
>> "Only the first command-line argument will be checked for a shebang line
>> and only if that argument does not start with a '-'."
>>
>> But I can't really see why that should be the case. What is the
>> rationale behind this?
> > It really is a combination of 2 things:
> >
> > * The key use-case for the launcher is to be executed implicitly - ie, the
> > user types just "foo.py". In that scenario there is no opportunity for the
> > user to specify any args between the name of the executable and of the
> > script. IOW, the expectation is that people will not type "py foo.py", but
> > instead just type "foo.py".

That sounds like an explanation of why it hasn't been implemented before, not an explanation of why it should continue that way.

In any case, I think that expectation is not complete. In my case it was my editor that inserted the '-u' on my behalf. Or why might I not set the default action for .py files to be "py -tt %1", or "py -3 %1"? Why deny any of the arguments to a pylauncher user?

> > * A desire to avoid command-line parsing in the launcher or to make some
> > options "more equal" than others. Eg, you mention later that -c and -m
> > should be special, but what about -w or -Q? What about new options in
> > future versions?

Yes, it is a bit annoying to have to treat those specially, but other than -c/-m it does not need to understand Python's args, just check that the arg is not an explicit version specifier. -q/-Q etc have no impact on how to treat the file.

In fact there's no real need to treat -c differently, as it's extremely unlikely that there is a file that might match. But for -m you can come up with a situation where it gets it wrong, e.g. 'module' and 'module.py' in the cwd.

I would suggest that it is also unlikely that there will be any future options that would need any special consideration.

>> It is very surprising to the user that adding a
>> simple option like -tt should change the way the launcher behaves.
>> The PEP also states that the launcher should show the Python help if
>> '-h' is specified:
>>
>> "If the only command-line argument is "-h" or "--help", the launcher
>> will print a small banner and command-line usage, then pass the argument to
the default Python. This will cause help for the launcher being printed
>> followed by help for Python itself. The output from the launcher will
>> clearly indicate the extended help information is coming from the
>> launcher and not Python."
>>
>> To me that would suggest to end users that they can use any of the
>> command line options with the launcher, and they should behave as if
>> you had called Python directly.
>
> I think the language is fairly clear - only the help options are special and
> no other options will work.

The PEP is clear, yes, but the help output for the launcher displays all of the Python switches, so I expected them to be available in a py.exe call.

>> Incidentally whilst implementing this I also noticed a bug in the
>> pylauncher whereby the py launcher would incorrectly treat "py t3" as
>> a request for Python version as if '-3' had been specified. I have a
>> small patch that fixes this and implements the above for pylauncher
>> with extra tests if there is interest.
>
> That certainly sounds like a bug and a patch sent to
> https://bitbucket.org/vinay.sajip/pylauncher will be appreciated!

The patch does both the bug fix and the arg skipping at present, but I'll happily separate them if needs be.

Sam

From michael at python.org Mon Oct 17 15:16:25 2011 From: michael at python.org (Michael Foord) Date: Mon, 17 Oct 2011 14:16:25 +0100 Subject: [Python-Dev] Fwd: Issue with the link to python modules documentation In-Reply-To: <4E9B4244.6010309@ohmytux.com> References: <4E9B4244.6010309@ohmytux.com> Message-ID: <4E9C2AA9.6000206@python.org>

Hey folks,

The title of the "Global Module Index" for 3.2 documentation is "Python 3.1.3 documentation".

http://docs.python.org/py3k/modindex.html

See the report below (attached screenshot removed).
All the best, Michael Foord -------- Original Message -------- Subject: Issue with the link to python modules documentation Date: Sun, 16 Oct 2011 22:44:52 +0200 From: Carl Chenet Reply-To: chaica at ohmytux.com To: webmaster at python.org Hi, Browsing http://www.python.org/doc/ , I click on the link to Python 3.x Module Index linking to http://docs.python.org/3.2/modindex.html and I'm redirected to http://docs.python.org/py3k/modindex.html to the Python module list documentation but for the version 3.1.3. I'm using Chromium 12. I tried several times and cleared the cache before retrying but the issue remains. I'm joining a screenshot showing the finale page with the url http://docs.python.org/py3k/modindex.html which should be the Python Module list for current 3.x version, which is I guess 3.2. Bye, Carl Chenet -------------- next part -------------- An HTML attachment was scrubbed... URL: From vinay_sajip at yahoo.co.uk Mon Oct 17 15:39:53 2011 From: vinay_sajip at yahoo.co.uk (Vinay Sajip) Date: Mon, 17 Oct 2011 13:39:53 +0000 (UTC) Subject: [Python-Dev] Packaging and binary distributions for Python 3.3 References: <4E914A09.50209@netwok.org> <4E9710FA.3010003@netwok.org> Message-ID: Paul Moore gmail.com> writes: > Correct me if I'm wrong, but if we standardised on a particular > structure, the hooks.py contents could actually be integrated into the > core, if we wanted? People could still write hooks for more complex > cases, but the basic binary build case could work out of the box that > way. Well, the hooks.py is there to allow user-defined setups which are outside the scope of what should be provided in the stdlib - for instance, my earlier example about PowerShell scripts is IMO out-of-scope for the stdlib itself, but perfectly fine for the documentation, say in a set of example recipes in a packaging HOWTO. The hooks aren't needed at all for conventional deployments - only when you need something out of the ordinary. 
We could certainly extend the setup.cfg scheme to have specific support for pre-compiled binaries, which are currently "out of the ordinary" (which of course is why this thread is here :-)). Life could be made easier for distribution authors by initially having well-documented examples or recipes, and later, if the ubiquity of certain patterns is established, better support might be provided in the stdlib for those patterns. But there are other changes we could make now - for example, the list of categories does not include a library location (necessitating my use of a "compiled" category), but perhaps a "lib" category could be built in now.

Regards,

Vinay Sajip

From vinay_sajip at yahoo.co.uk Mon Oct 17 16:20:27 2011 From: vinay_sajip at yahoo.co.uk (Vinay Sajip) Date: Mon, 17 Oct 2011 14:20:27 +0000 (UTC) Subject: [Python-Dev] PEP397 no command line options to python? References: <4E9C1E2F.3040604@gmail.com> Message-ID:

Sam Partington <sam.partington at gmail.com> writes:

> That sounds like an explanation of why it hasn't been implemented
> before, not an explanation of why it should continue that way.

From a desire to keep the launcher as simple as possible, and to minimise the need to synchronise the launcher with command-line parameter changes in future versions of Python.

> In any case, I think that expectation is not complete. In my case it
> was my editor that inserted the '-u' on my behalf.
>
> Or why might I not set the default action for .py files to be "py -tt
> %1", or "py -3 %1".
>
> Why deny any of the arguments to a pylauncher user?

Don't forget that customised commands allow Python to be invoked from shebang lines with fair flexibility.

> >> Incidentally whilst implementing this I also noticed a bug in the
> >> pylauncher whereby the py launcher would incorrectly treat "py t3" as
> >> a request for Python version as if '-3' had been specified.
I have a
> >> small patch that fixes this and implements the above for pylauncher
> >> with extra tests if there is interest.
> >
> > That certainly sounds like a bug and a patch sent to
> > https://bitbucket.org/vinay.sajip/pylauncher will be appreciated!
>
> The patch does both the bug fix and the arg skipping at present, but
> I'll happily separate them if needs be.

Don't worry about separating them for now, assuming that it's fairly easy to figure out which bit is which :-)

Thanks & regards,

Vinay Sajip

From pje at telecommunity.com Mon Oct 17 18:24:21 2011 From: pje at telecommunity.com (PJ Eby) Date: Mon, 17 Oct 2011 12:24:21 -0400 Subject: [Python-Dev] PEP397 no command line options to python? In-Reply-To: References: <4E9C1E2F.3040604@gmail.com> Message-ID:

On Mon, Oct 17, 2011 at 8:55 AM, Sam Partington wrote:
> Yes it is a bit annoying to have to treat those specially, but other
> than -c/-m it does not need to understand Python's args, just check
> that the arg is not an explicit version specifier. -q/-Q etc have no
> impact on how to treat the file.
>
> In fact there's no real need to treat -c differently as it's extremely
> unlikely that there is a file that might match. But for -m you can
> come up with a situation where it gets it wrong, e.g. 'module'
> and 'module.py' in the cwd.
>
> I would suggest that it is also unlikely that there will be any future
> options that would need any special consideration.

What about -S (no site.py) and -E (no environment)? These are needed for secure setuid scripts on *nix; I don't know how often they'd be used in practice on Windows. (Basically, they let you isolate a script's effective sys.path; there may be some use case overlap with virtual envs.)

-------------- next part --------------
An HTML attachment was scrubbed...
URL: From vinay_sajip at yahoo.co.uk Mon Oct 17 19:07:16 2011 From: vinay_sajip at yahoo.co.uk (Vinay Sajip) Date: Mon, 17 Oct 2011 18:07:16 +0100 (BST) Subject: [Python-Dev] Packaging and binary distributions for Python 3.3 In-Reply-To: <201110161637.54744.jeremy.kloth@gmail.com> References: <201110131602.27792.jeremy.kloth@gmail.com> <201110161637.54744.jeremy.kloth@gmail.com> Message-ID: <1318871236.97607.YahooMailNeo@web25801.mail.ukl.yahoo.com> > Hmm, clicking the link in the email works here. but just to be safe: > > http://goo.gl/pC48e > Thanks - looks nice! What is the license which applies to the code? Is it available in a public repository? Regards, Vinay Sajip From tjreedy at udel.edu Mon Oct 17 19:11:33 2011 From: tjreedy at udel.edu (Terry Reedy) Date: Mon, 17 Oct 2011 13:11:33 -0400 Subject: [Python-Dev] Fwd: Issue with the link to python modules documentation In-Reply-To: <4E9C2AA9.6000206@python.org> References: <4E9B4244.6010309@ohmytux.com> <4E9C2AA9.6000206@python.org> Message-ID: On 10/17/2011 9:16 AM, Michael Foord wrote: > Hey folks, > > The title of the "Global Module Index" for 3.2 documentation is "Python > 3.1.3 documentation". > > http://docs.python.org/py3k/modindex.html Verified. Clicking [index] in upper right goes to http://docs.python.org/py3k/genindex.html 3.2.2 version. -- Terry Jan Reedy From g.brandl at gmx.net Mon Oct 17 20:34:11 2011 From: g.brandl at gmx.net (Georg Brandl) Date: Mon, 17 Oct 2011 20:34:11 +0200 Subject: [Python-Dev] Fwd: Issue with the link to python modules documentation In-Reply-To: <4E9C2AA9.6000206@python.org> References: <4E9B4244.6010309@ohmytux.com> <4E9C2AA9.6000206@python.org> Message-ID: It's just an outdated link; fixed now. Georg Am 17.10.2011 15:16, schrieb Michael Foord: > Hey folks, > > The title of the "Global Module Index" for 3.2 documentation is "Python 3.1.3 > documentation". > > http://docs.python.org/py3k/modindex.html > > See the report below (attached screenshot removed). 
> > All the best, > > Michael Foord > -------- Original Message -------- > Subject: Issue with the link to python modules documentation > Date: Sun, 16 Oct 2011 22:44:52 +0200 > From: Carl Chenet > Reply-To: chaica at ohmytux.com > To: webmaster at python.org > > > > Hi, > > Browsing http://www.python.org/doc/ , I click on the link to Python 3.x > Module Index linking to http://docs.python.org/3.2/modindex.html and I'm > redirected to http://docs.python.org/py3k/modindex.html to the Python > module list documentation but for the version 3.1.3. > > I'm using Chromium 12. I tried several times and cleared the cache > before retrying but the issue remains. > > I'm attaching a screenshot showing the final page with the url > http://docs.python.org/py3k/modindex.html which should be the Python > Module list for current 3.x version, which is I guess 3.2. > > Bye, > Carl Chenet > > > > From solipsis at pitrou.net Mon Oct 17 23:27:09 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 17 Oct 2011 23:27:09 +0200 Subject: [Python-Dev] Modules of plat-* directories References: <201110170116.36678.victor.stinner@haypocalc.com> <201110170204.38591.victor.stinner@haypocalc.com> Message-ID: <20111017232709.355fcd8f@pitrou.net> On Mon, 17 Oct 2011 02:04:38 +0200 Victor Stinner wrote: > On Monday 17 October 2011 01:16:36, Victor Stinner wrote: > > For example, IN.INT_MAX is 2147483647, whereas it should > > be 9223372036854775807 on my 64-bit Linux. > > Oops, wrong example: INT_MAX is also 2147483647 on 64 bits. I mean > IN.LONG_MAX. > > IN.LONG_MAX is always 9223372036854775807 on Linux, on 32 and 64 bits systems. Given the issues you are mentioning, and given they were never reported in years before, it seems unlikely anybody is using these files. +1 to remove them, as they don't seem documented either. Regards Antoine.
From sam.partington at gmail.com Mon Oct 17 23:52:06 2011 From: sam.partington at gmail.com (Sam Partington) Date: Mon, 17 Oct 2011 22:52:06 +0100 Subject: [Python-Dev] PEP397 no command line options to python? In-Reply-To: References: <4E9C1E2F.3040604@gmail.com> Message-ID: On 17 October 2011 17:24, PJ Eby wrote: > What about -S (no site.py) and -E (no environment)? These are needed for > secure setuid scripts on *nix; I don't know how often they'd be used in > practice on Windows. (Basically, they let you isolate a script's effective > sys.path; there may be some use case overlap with virtual envs. At the moment py -S test.py would mean that test.py would not be scanned for a shebang, and would be executed with the latest python. The change that I am suggesting would mean that it would be scanned for a shebang to select the python, and then that python would be called with -S. Do you think it would be necessary to have -S disable reading of the .ini files (in the users application data dir and in \windows)? Sam PS Sorry, I initially replied off-list by accident. From skippy.hammond at gmail.com Tue Oct 18 02:00:27 2011 From: skippy.hammond at gmail.com (Mark Hammond) Date: Tue, 18 Oct 2011 11:00:27 +1100 Subject: [Python-Dev] PEP397 no command line options to python? In-Reply-To: References: <4E9C1E2F.3040604@gmail.com> Message-ID: <4E9CC19B.10309@gmail.com> On 18/10/2011 3:24 AM, PJ Eby wrote: > > > On Mon, Oct 17, 2011 at 8:55 AM, Sam Partington > > wrote: > > Yes it is a bit annoying to have to treat those specially, but other > than -c/-m it does not need to understand pythons args, just check > that the arg is not an explicit version specifier. -q/-Q etc have no > impact on how to treat the file. > > In fact there's no real need to treat -c differently as it's extremely > unlikely that there is a file that might match. But for -m you can > come up with a situation where if you it gets it wrong. e.g. 'module' > and 'module.py' in the cwd. 
> > I would suggest that it is also unlikely that there will be any future > options would need any special consideration. > > > What about -S (no site.py) and -E (no environment)? These are needed > for secure setuid scripts on *nix; I don't know how often they'd be used > in practice on Windows. (Basically, they let you isolate a script's > effective sys.path; there may be some use case overlap with virtual envs. It is worth pointing out that options can be specified directly in the shebang line - eg, a line like "#! /usr/bin/python -S" in a foo.py works as expected. What doesn't work is explicitly using a command like: % py -E foo.py Using the foo.py above as an example, that would need to result in spawning Python with both -E and -S options. For my money, that doesn't seem worthwhile - eg, someone else may have the expectation that specifying the args to py.exe should override the args on the shebang line. Then someone else will have different expectations about the specific order of the args that should be used (eg, compare using "python -m somemodule -v" versus "python -v -m somemodule". etc. 
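The shebang-option behaviour Mark describes can be sketched roughly as follows. This is a hypothetical illustration of how a launcher might separate a version specifier from interpreter options in a shebang line; the function name and the regex are made up for this example and are not the actual py.exe parsing code:

```python
import re

def parse_shebang(line):
    """Split a shebang like '#! /usr/bin/python2.6 -u -tt' into a
    (version, options) pair.  Returns None when the line is not a
    shebang at all.  A sketch only -- the real launcher logic differs."""
    m = re.match(r'#!\s*(\S*python)(\d(?:\.\d)?)?\s*(.*)$', line.strip())
    if not m:
        return None
    version = m.group(2)          # None means "use the default Python"
    options = m.group(3).split()  # e.g. ['-S'] embedded in the shebang
    return version, options

# Options embedded directly in the shebang line, as described above.
print(parse_shebang("#! /usr/bin/python -S"))        # (None, ['-S'])
print(parse_shebang("#!/usr/bin/python2.6 -u -tt"))  # ('2.6', ['-u', '-tt'])
```

Under this scheme the per-script options live in the file itself, which is why combining them with extra options given on the py.exe command line raises the ordering questions discussed below.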
For these reasons I'm still advocating we don't support such command-lines, but as usual I'll go with the consensus :) Cheers, Mark From victor.stinner at haypocalc.com Tue Oct 18 02:20:39 2011 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Tue, 18 Oct 2011 02:20:39 +0200 Subject: [Python-Dev] Modules of plat-* directories In-Reply-To: <20111017232709.355fcd8f@pitrou.net> References: <201110170116.36678.victor.stinner@haypocalc.com> <201110170204.38591.victor.stinner@haypocalc.com> <20111017232709.355fcd8f@pitrou.net> Message-ID: <201110180220.39956.victor.stinner@haypocalc.com> On Monday 17 October 2011 23:27:09, Antoine Pitrou wrote: > On Mon, 17 Oct 2011 02:04:38 +0200 > > Victor Stinner wrote: > > On Monday 17 October 2011 01:16:36, Victor Stinner wrote: > > > For example, IN.INT_MAX is 2147483647, whereas it should > > > be 9223372036854775807 on my 64-bit Linux. > > > > Oops, wrong example: INT_MAX is also 2147483647 on 64 bits. I mean > > IN.LONG_MAX. > > > > IN.LONG_MAX is always 9223372036854775807 on Linux, on 32 and 64 bits > > systems. > > Given the issues you are mentioning, and given they were never > reported in years before, it seems unlikely anybody is using these > files. > > +1 to remove them, as they don't seem documented either. Oh, there are other (new?) problems listed in last comments of the issue #12619. The Mac OS X issue is funny. Extracts: "What do you do for platforms like OS X where we support one set of binary files that contain multi-architecture C-files that can run as Intel-64, Intel-32 or PPC-32 on the same machine at user option at run time? (...) The static IN.py currently shipped in plat-darwin is misleading at best." "-1 on auto-building. The header needed may not be available on the build platform, (...)" "There is no reason to keep plat-xxx files if cannot be managed properly."
Victor From pje at telecommunity.com Tue Oct 18 02:36:13 2011 From: pje at telecommunity.com (PJ Eby) Date: Mon, 17 Oct 2011 20:36:13 -0400 Subject: [Python-Dev] PEP397 no command line options to python? In-Reply-To: <4E9CC19B.10309@gmail.com> References: <4E9C1E2F.3040604@gmail.com> <4E9CC19B.10309@gmail.com> Message-ID: On Mon, Oct 17, 2011 at 8:00 PM, Mark Hammond wrote: > On 18/10/2011 3:24 AM, PJ Eby wrote: > >> What about -S (no site.py) and -E (no environment)? These are needed >> for secure setuid scripts on *nix; I don't know how often they'd be used >> in practice on Windows. (Basically, they let you isolate a script's >> effective sys.path; there may be some use case overlap with virtual envs. >> > > It is worth pointing out that options can be specified directly in the > shebang line - eg, a line like "#! /usr/bin/python -S" in a foo.py works as > expected. Ah, ok. Never mind, then. -------------- next part -------------- An HTML attachment was scrubbed... URL: From sam.partington at gmail.com Tue Oct 18 08:20:00 2011 From: sam.partington at gmail.com (Sam Partington) Date: Tue, 18 Oct 2011 07:20:00 +0100 Subject: [Python-Dev] PEP397 no command line options to python? In-Reply-To: References: <4E9C1E2F.3040604@gmail.com> Message-ID: On 17 October 2011 15:20, Vinay Sajip wrote: > Sam Partington gmail.com> writes: > >> That sounds like an explanation of why it hasn't been implemented >> before, not an explanation of why it should continue that way. > > From a desire to keep the launcher as simple as possible, and to minimise the > need to synchronise the launcher with command line parameter changes in future > versions of Python. As simple as possible yes... but no simpler. I think having pylauncher behave so differently in the two cases of : py -u test.py py test.py Is very unexpected. 
And to do so silently, without warning will cause real headaches_ for users, *especially* since py -h lists -u as one of the options, it does not say 'here are the python options but you must call PythonXX/python.exe to use them'. [headaches : it did for me as I ended up with a broken build of my app due to different parts of my app built for different pythons.] To fix this the launcher doesn't need to track all python command line options, only those that take two args. I don't really see that it will be such a maintenance burden to have the launcher track any new ones. Python has only two such options after 20 years of development. As for complexity it's less than 10 lines of C. >> In any case, I think that expectation is not complete. In my case it >> was my editor that inserted the '-u' on my behalf. >> >> Or why might I not set the default action for .py files to be "py -tt >> %1", or "py -3 %1". >> >> Why deny any of the arguments to a pylauncher user? > > Don't forget that customised commands allow Python to be invoked from shebang > lines with fair flexibility. That's a cool feature which I'd not really read up on, but that requires a global configuration file change, it's not doable on a per usage basis. > Don't worry about separating them for now, assuming that it's fairly easy to > figure out which bit is which :-) It wasn't hard to do and I see you've applied the patch already, thanks for the fast turn around! Sam PS Sorry, I replied off-list again! From vinay_sajip at yahoo.co.uk Tue Oct 18 09:10:17 2011 From: vinay_sajip at yahoo.co.uk (Vinay Sajip) Date: Tue, 18 Oct 2011 08:10:17 +0100 (BST) Subject: [Python-Dev] PEP397 no command line options to python? 
In-Reply-To: References: <4E9C1E2F.3040604@gmail.com> Message-ID: <1318921817.49702.YahooMailNeo@web25803.mail.ukl.yahoo.com> >> From a desire to keep the launcher as simple as possible, and to minimise > the >> need to synchronise the launcher with command line parameter changes in > future >> versions of Python. > > As simple as possible yes... but no simpler. I think having > pylauncher behave so differently in the two cases of : > > py -u test.py > py test.py > > Is very unexpected. And to do so silently, without warning will cause It's only unexpected if you don't read the PEP. From there: "The launcher may offer some conveniences for Python developers working interactively - for example, starting the launcher with no command-line arguments will launch the default Python with no command-line arguments. Further, command-line arguments will be supported to allow a specific Python version to be launched interactively - however, these conveniences must not detract from the primary purpose of launching scripts and must be easy to avoid if desired." > real headaches_ for users, *especially* since py -h lists -u as one of > the options, it does not say 'here are the python options but you must > call PythonXX/python.exe to use them'. Well, it's easy enough to make that clearer in the help output of py.exe :-) > [headaches : it did for me as I ended up with a broken build of my app > due to different parts of my app built for different pythons.] Why does the need arise to invoke py.exe in a build system? Why not just reference the Python executable you want directly? > To fix this the launcher doesn't need to track all python command line > options, only those that take two args. I don't really see that it > will be such a maintenance burden to have the launcher track any new > ones. Python has only two such options after 20 years of development. > > As for complexity it's less than 10 lines of C. Plus tests, presumably ... let's see 'em :-)
> That's a cool feature which I'd not really read up on, but that > requires a global configuration file change, it's not doable on a per > usage basis. Per usage = interactively, which is more of a "by-the-by" feature for the launcher, the main purpose being to bring shebang-line functionality to Windows. Regards, Vinay Sajip From john at arbash-meinel.com Tue Oct 18 09:47:15 2011 From: john at arbash-meinel.com (John Arbash Meinel) Date: Tue, 18 Oct 2011 09:47:15 +0200 Subject: [Python-Dev] Disabling cyclic GC in timeit module In-Reply-To: References: <20111008014728.7f0916ea@pitrou.net> <1318033100.3697.10.camel@localhost.localdomain> Message-ID: <4E9D2F03.9040109@arbash-meinel.com> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 ... >> If you are only measuring json encoding of a few select pieces of >> data then it's a microbenchmark. If you are measuring the whole >> application (or a significant part of it) then I'm not sure >> timeit is the right tool for that. >> >> Regards >> >> Antoine. >> > > When you're measuring how much time it takes to encode json, this > is a microbenchmark and yet the time that timeit gives you is > misleading, because it'll take different amount of time in your > application. I guess my proposition would be to not disable gc by > default and disable it when requested, but well, I guess I'll give > up given the strong push against it. > > Cheers, fijal True, but it is also heavily dependent on how much other data your application has in memory at the time. If your application has 1M objects in memory and then goes to encode/decode a JSON string when the gc kicks in, it will take a lot longer because of all the stuff that isn't JSON related. I don't think it can be suggested that timeit should grow a flag for "put garbage into memory, and then run this microbenchmark with gc enabled.". If you really want to know how fast something is in your application, you sort of have to do the timing in your application, at scale and at load. 
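John's point about timeit and the collector can be made concrete with the documented timeit idiom: the module disables the cyclic GC around the timed loop by default, and putting gc.enable() in the setup string re-enables it. The workload statement below is purely illustrative, and as John notes, the measured difference still depends on how much other GC-tracked data the real process holds:

```python
import timeit

# A statement that allocates plenty of GC-tracked container objects.
stmt = "[{'k': i} for i in range(1000)]"

# By default, timeit disables the cyclic GC for the duration of the run.
t_gc_off = timeit.timeit(stmt, number=200)

# The documented way to time with the collector active: re-enable it
# in the setup string, which runs inside the timed function.
t_gc_on = timeit.timeit(stmt, setup="import gc; gc.enable()", number=200)

print("gc disabled: %.4fs" % t_gc_off)
print("gc enabled:  %.4fs" % t_gc_on)
```

Neither number is wrong; they simply answer different questions, which is why measuring inside the real application remains the only reliable option.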
John =:-> -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.9 (Cygwin) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAk6dLwMACgkQJdeBCYSNAAOzzACfXpP16589Mu7W8ls9KddacF+g ozwAnRz5ciPg950qcV2uzyTKl1R21+6t =hGgf -----END PGP SIGNATURE----- From sam.partington at gmail.com Tue Oct 18 11:59:26 2011 From: sam.partington at gmail.com (Sam Partington) Date: Tue, 18 Oct 2011 10:59:26 +0100 Subject: [Python-Dev] PEP397 no command line options to python? In-Reply-To: <1318921817.49702.YahooMailNeo@web25803.mail.ukl.yahoo.com> References: <4E9C1E2F.3040604@gmail.com> <1318921817.49702.YahooMailNeo@web25803.mail.ukl.yahoo.com> Message-ID: On 18 October 2011 08:10, Vinay Sajip wrote: >> Is very unexpected. And to do so silently, without warning will cause > > It's only unexpected if you don't read the PEP. From there: > > "The launcher may offer some conveniences for Python developers working > interactively - for example, starting the launcher with no command-line > arguments will launch the default Python with no command-line arguments. > Further, command-line arguments will be supported to allow a specific > Python version to be launched interactively - however, these conveniences > must not detract from the primary purpose of launching scripts and must > be easy to avoid if desired." I read the PEP, but didn't spot that subtlety. I wonder how many other people will read the PEP, or just think "Oh, I can just replace python with py" and have it figure out the python to call. >> real headaches_ for users, *especially* since py -h lists -u as one of >> the options, it does not say 'here are the python options but you must >> call PythonXX/python.exe to use them'. > > Well, it's easy enough to make that clearer in the help output of py.exe :-) Indeed, I would say that if nothing else then that should be done. >> [headaches : it did for me as I ended up with a broken build of my app >> due to different parts of my app built for different pythons.]
> > Why does the need arise to invoke py.exe in a build system? Why not just reference the Python executable you want directly? That's rather OT here, but briefly as I can. We have transitioned our devel branch to py 2.7. Our stable branches are to remain py 2.6. The build system (also written in python) starts lots of sub build commands, (various SCons, make, bash and python). I added shebangs to all files as appropriate for devel/stable branch, and initially I changed the python build targets from "python -utt build.py" to "./build.py" and I lost the -utt functionality which I could live with. But on some of the windows machines the default action of python files was to open an editor with the build.py. So I changed it to "py -utt build.py". Everything seemed fine initially until tests started to fail which ensued some head scratching. I actually didn't figure out what was going on until I noticed that SCiTe was also calling the wrong python because it also had "py -utt" to run python scripts. Incidentally, one of my colleagues also discovered the same issue in his eclipse pydev setup. I also notice that Editra also does "python -u" by default, and I can imagine lots of users swapping "python" with "py". I am well aware that is is by no means a perfect system and I am working at making it more bulletproof, but as usual there are time constraints etc. We will also go through the whole thing again when wxPython supports python 3. Hopefully I will have solved all these issues by then :-) >> To fix this the launcher doesn't need to track all python command line >> options, only those that take two args.? I don't really see that it >> will be such a maintenance burden to have the launcher track any new >> ones.? Python has only two such options after 20 years of development. >> >> As for complexity it's less than 10 lines of C. > > Plus tests, presumably ... 
let's see 'em :-) > >> That's a cool feature which I'd not really read up on, but that >> requires a global configuration file change, it's not doable on a per >> usage basis. > > Per usage = interactively, which is more of a "by-the-by" feature for the launcher, the main purpose being to bring shebang-line functionality to Windows. Fair enough. I can see that I am asking more of pylauncher than the unix shebang parser does. But it seems to so nearly do it correctly that I was surprised that it didn't do what I had assumed it did do. I find this usage of it so useful in fact that irrespective of whether the PEP takes on my suggestions I will be using the patched one, and I will be writing a unix pylauncher to do the same. Would it not be an idea to have new installations of python actually install pylauncher as 'python' which then forwards onto the correct 'pythonX.X'. It would possibly help resolve the whole question of "Does python invoke python2 or python3" issue. The patch should be attached. It is of course 20% C and 80% python tests :-) Sam -------------- next part -------------- A non-text attachment was scrubbed... Name: add_arg_skipping.patch Type: application/octet-stream Size: 10160 bytes Desc: not available URL: From merwok at netwok.org Tue Oct 18 17:23:07 2011 From: merwok at netwok.org (=?UTF-8?B?w4lyaWMgQXJhdWpv?=) Date: Tue, 18 Oct 2011 17:23:07 +0200 Subject: [Python-Dev] [Python-checkins] cpython (2.7): Fixes #10860: Handle empty port after port delimiter in httplib In-Reply-To: References: Message-ID: <4E9D99DB.3020105@netwok.org> Hi, > diff --git a/Misc/NEWS b/Misc/NEWS > --- a/Misc/NEWS > +++ b/Misc/NEWS > @@ -54,6 +54,9 @@ > the following case: sys.stdin.read() stopped with CTRL+d (end of file), > raw_input() interrupted by CTRL+c. > > +- Issue #10860: httplib now correctly handles an empty port after port > + delimiter in URLs. > + > - dict_proxy objects now display their contents rather than just the class > name. 
Looks like your entry went into the Interpreter Core section instead of Library. BTW, I don?t understand ?3.x version will come as a separate patch? in your commit message; isn?t that the case for all patches? They?re changesets with no relationship at all from Mercurial?s viewpoint, and often their contents are also different. Cheers From senthil at uthcode.com Tue Oct 18 17:27:12 2011 From: senthil at uthcode.com (Senthil Kumaran) Date: Tue, 18 Oct 2011 23:27:12 +0800 Subject: [Python-Dev] [Python-checkins] cpython (2.7): Fixes #10860: Handle empty port after port delimiter in httplib In-Reply-To: <4E9D99DB.3020105@netwok.org> References: <4E9D99DB.3020105@netwok.org> Message-ID: On Tue, Oct 18, 2011 at 11:23 PM, ?ric Araujo wrote: > > Looks like your entry went into the Interpreter Core section instead of > Library. That should be corrected. > > BTW, I don?t understand ?3.x version will come as a separate patch? in I think, he meant in a separate commit. :) Senthil From skippy.hammond at gmail.com Wed Oct 19 01:18:32 2011 From: skippy.hammond at gmail.com (Mark Hammond) Date: Wed, 19 Oct 2011 10:18:32 +1100 Subject: [Python-Dev] PEP397 no command line options to python? In-Reply-To: References: <4E9C1E2F.3040604@gmail.com> <1318921817.49702.YahooMailNeo@web25803.mail.ukl.yahoo.com> Message-ID: <4E9E0948.9030203@gmail.com> On 18/10/2011 8:59 PM, Sam Partington wrote: ... > I added shebangs to > all files as appropriate for devel/stable branch, and initially I > changed the python build targets from "python -utt build.py" to > "./build.py" and I lost the -utt functionality which I could live > with. Can't you just put the -utt options in the shebang line of build.py? > But on some of the windows machines the default action of python > files was to open an editor with the build.py. So I changed it to "py > -utt build.py". Everything seemed fine initially until tests started > to fail which ensued some head scratching. 
I actually didn't figure > out what was going on until I noticed that SCiTe was also calling the > wrong python because it also had "py -utt" to run python scripts. > Incidentally, one of my colleagues also discovered the same issue in > his eclipse pydev setup. I also notice that Editra also does "python > -u" by default, and I can imagine lots of users swapping "python" with > "py". Why would users choose to do that? Using "python" presumably already works for them, so what benefit do they get? If the main advantage is they can now use shebang lines, then the specific options the script wants can be expressed in that line. I wonder if we just need to make it clear that py.exe is not designed to simply be a replacement for python.exe - a simple replacement adds no value. It is designed to bring shebang processing to Python on Windows and the shebang line is where these args should live. If you want finer control over things, just keep using python.exe. Also, seeing it is much easier to add a feature later than to remove it, we should err on the side of not adding the feature until it is clear that many people want it and ensure we aren't creating other inconsistencies or issues when we do. If it turns out to be true that even with clear documentation people come to the same conclusion as you (ie, that py.exe supports arguments the same way as python.exe) we still have the option of adding it. Cheers, Mark From p.f.moore at gmail.com Wed Oct 19 09:05:25 2011 From: p.f.moore at gmail.com (Paul Moore) Date: Wed, 19 Oct 2011 08:05:25 +0100 Subject: [Python-Dev] PEP397 no command line options to python? In-Reply-To: <4E9E0948.9030203@gmail.com> References: <4E9C1E2F.3040604@gmail.com> <1318921817.49702.YahooMailNeo@web25803.mail.ukl.yahoo.com> <4E9E0948.9030203@gmail.com> Message-ID: On 19 October 2011 00:18, Mark Hammond wrote: > On 18/10/2011 8:59 PM, Sam Partington wrote: >> ... and I can imagine lots of users swapping "python" with >> "py". 
> > Why would users choose to do that? ?Using "python" presumably already works > for them, so what benefit do they get? ?If the main advantage is they can > now use shebang lines, then the specific options the script wants can be > expressed in that line. I use "py" interactively rather than "python" because I have 2.7 and 3.2 installed, and py -2 or py -3 gives me the explicit version I want without PATH hacking. If 2.7 and 3.x provided python2 and python3 executables explicitly I might not do this (I'm not at my PC right now so I can't recall if 3.x provides python3.exe as well as python.exe, there was talk of this certainly, but 2.7 definitely doesn't include python2.exe). Having said that, I don't use other command line options much, so the limitation doesn't bother me much (py -3 -m xxx would be the most likely usage I'd miss...) If the extra options really mattered to me, I could probably hack up a Powershell alias easily enough, but py -V is available and does 99% of what I need. Paul. From sam.partington at gmail.com Wed Oct 19 14:17:47 2011 From: sam.partington at gmail.com (Sam Partington) Date: Wed, 19 Oct 2011 13:17:47 +0100 Subject: [Python-Dev] PEP397 no command line options to python? In-Reply-To: <4E9E0948.9030203@gmail.com> References: <4E9C1E2F.3040604@gmail.com> <1318921817.49702.YahooMailNeo@web25803.mail.ukl.yahoo.com> <4E9E0948.9030203@gmail.com> Message-ID: On 19 October 2011 00:18, Mark Hammond wrote: > On 18/10/2011 8:59 PM, Sam Partington wrote: ... >> I added shebangs to >> all files as appropriate for devel/stable branch, and initially I >> changed the python build targets from "python -utt build.py" to >> "./build.py" and I lost the -utt functionality which I could live >> with. > > Can't you just put the -utt options in the shebang line of build.py? Yes I can but I didn't. There are many ways to skin this cat, but I chose what seemed the most straightforward way. It went wrong and I didn't expect it to. 
Adding -tt to the shebang line makes sense, the use of -tt depends on the content of the script (although it would require me to add -tt to thousands of shebang lines). But the addition of '-u' depends on the environment in which I want to run it. i.e. when run from a console it's unnecessary and a performance penalty, whilst when run as part of a sub-process (either from an editor or as part of a larger collection of scripts) then it's nice to avoid having the output arrive in lumps. >> But on some of the windows machines the default action of python >> files was to open an editor with the build.py. So I changed it to "py >> -utt build.py". Everything seemed fine initially until tests started >> to fail which ensued some head scratching. ?I actually didn't figure >> out what was going on until I noticed that SCiTe was also calling the >> wrong python because it also had "py -utt" to run python scripts. >> Incidentally, one of my colleagues also discovered the same issue in >> his eclipse pydev setup. I also notice that Editra also does "python >> -u" by default, and I can imagine lots of users swapping "python" with >> "py". > > Why would users choose to do that? ?Using "python" presumably already works > for them, so what benefit do they get? ?If the main advantage is they can > now use shebang lines, then the specific options the script wants can be > expressed in that line. But "python" does NOT work. If you have multiple pythons installed and scripts that are designed to run with various versions then 'python' will use the wrong one, I had thought that this was precisely what pylauncher was meant to solve, i.e. the ability to select the correct python version based on the contents of that file. Nearly all python editors have the ability to run the current script with python. None of the editors that I know of have the ability (out of the box) to parse the shebang line and dispatch to the correct python. 
There is however already a tool that can do that, called pylauncher. Swap "python" with "py" and it would work in all editors. The only problem is that it does not support unbuffered output which nearly all editors will need in order to display timely output. If you don't want to support all args, how about just supporting the -u argument, which is the one I really need? My issue is currently to select the right python from 2.4, 2.5, 2.6, 2.7. But the need for a tool like this is even stronger when encouraging users to move to py 3 - on all platforms. In the absence of .py3 file extensions the shebang seems the way to go. "Why would users choose to do that?" : Because they are in a project using mixed python versions and they want an easy way to use the correct python from within their editor. > I wonder if we just need to make it clear that py.exe is not designed to > simply be a replacement for python.exe - a simple replacement adds no value. > ?It is designed to bring shebang processing to Python on Windows and the > shebang line is where these args should live. ?If you want finer control > over things, just keep using python.exe. There I have to disagree. Yes, a simple replacement would not add value. But a replacement that detects the correct python but otherwise works like python adds lots of value and what is more it has value on all platforms. Shouldn't that argument have applied also for the version specifier args, because to my mind that is a less elegant feature with more baggage. (oddity of "py -2 -3" for example). "If you want finer control just keep using c:\PythonXX\python.exe". Incidentally it would be really really nice if the windows installer installed "pythonX[.Y].exe" then we could just put all the python dirs in PATH and it would just work. no need for the -X.Y flags at all. 
> Also, seeing it is much easier to add a feature later than to remove it, we > should err on the side of not adding the feature until it is clear that many > people want it and ensure we aren't creating other inconsistencies or issues > when we do. ?If it turns out to be true that even with clear documentation > people come to the same conclusion as you (ie, that py.exe supports > arguments the same way as python.exe) we still have the option of adding it. Ok ok, I give up. Apparently I am the only one who wants to be able to run different versions of python based on the shebang line AND add occasional arguments to the python command line. Thanks for listening. Sam From vinay_sajip at yahoo.co.uk Wed Oct 19 14:40:30 2011 From: vinay_sajip at yahoo.co.uk (Vinay Sajip) Date: Wed, 19 Oct 2011 12:40:30 +0000 (UTC) Subject: [Python-Dev] PEP397 no command line options to python? References: <4E9C1E2F.3040604@gmail.com> <1318921817.49702.YahooMailNeo@web25803.mail.ukl.yahoo.com> <4E9E0948.9030203@gmail.com> Message-ID: Sam Partington gmail.com> writes: > Ok ok, I give up. Apparently I am the only one who wants to be able > to run different versions of python based on the shebang line AND add > occasional arguments to the python command line. It sounds as if adding arguments to the shebang line will work for all cases other than the invoke-from-editor case. (Even if you need to change thousands of shebang lines, you might be able to automate this with Python, so perhaps it's not as bad as it sounds.) For the invoke-from-editor case, that's most useful when you have a reasonably long edit-run-edit-run-... session for the script(s) being edited. If that's the case, just append -u to the shebang line at the beginning of your session, and remove it at the end, and keep using "py" as your editor's Python; won't that do the trick? 
Regards, Vinay Sajip From merwok at netwok.org Wed Oct 19 20:32:22 2011 From: merwok at netwok.org (=?UTF-8?B?w4lyaWMgQXJhdWpv?=) Date: Wed, 19 Oct 2011 20:32:22 +0200 Subject: [Python-Dev] Buildbots with rpm installed Message-ID: <4E9F17B6.2020900@netwok.org> Hi, Do we have buildbots with the rpm programs installed? There is a patch I want to commit to fix a bug in distutils? bdist_rpm; it was tested by the patch author, but I cannot verify it on my machine, so I would feel safer if our buildbot fleet would cover that. Thanks From stefan at bytereef.org Wed Oct 19 21:01:18 2011 From: stefan at bytereef.org (Stefan Krah) Date: Wed, 19 Oct 2011 21:01:18 +0200 Subject: [Python-Dev] Buildbots with rpm installed In-Reply-To: <4E9F17B6.2020900@netwok.org> References: <4E9F17B6.2020900@netwok.org> Message-ID: <20111019190118.GA28506@sleipnir.bytereef.org> ?ric Araujo wrote: > Do we have buildbots with the rpm programs installed? There is a patch > I want to commit to fix a bug in distutils? bdist_rpm; it was tested by > the patch author, but I cannot verify it on my machine, so I would feel > safer if our buildbot fleet would cover that. Yes, the Fedora bot currently fails the bdist_rpm tests: http://www.python.org/dev/buildbot/all/builders/AMD64%20Fedora%20without%20threads%203.x/builds/873/steps/test/logs/stdio Stefan Krah From techtonik at gmail.com Wed Oct 19 21:17:52 2011 From: techtonik at gmail.com (anatoly techtonik) Date: Wed, 19 Oct 2011 22:17:52 +0300 Subject: [Python-Dev] Define Tracker workflow Message-ID: Does everybody feel comfortable with 'stage' and 'resultion' fields in tracker? I understand that 'stage' defines workflow and 'resolution' is status indicator, but the question is - do we really need to separate them? For example, right now when a ticket's 'status' is closed (all right - there is also a 'status' field), we mark 'stage' as 'committed/rejected'. 
I see 'stage' as a workflow state, and the 'committed/rejected' value is confusing because further steps actually depend on whether the state is 'committed' or 'rejected'. stage: patch review -> committed/rejected When I see a patch was rejected, I need to analyse why and propose a better one. To analyse I need to look at the 'resolution' field: duplicate, fixed, invalid, later, out of date, postponed, rejected, remind, wont fix, works for me. The resolution will likely be 'fixed', which doesn't say whether the patch was actually committed or not. You need to know that there is a 'rejected' status, so if the status 'is not rejected' then the patch was likely committed. Note that resolution is also a state, but for a closed issue. Let me recall the values for the state of an open issue (recorded in the 'stage' field): test needed, needs patch, patch review, commit review, committed/rejected. There is clear duplication between stage:'committed/rejected', resolution:'fixed,rejected' and status:'closed'. Now `status` can be one of: open, languishing, pending, closed. For me the only things in `status` that matter are open and closed. Everything else is a more descriptive 'state' of the issue. So I'd merge all our descriptive fields into a single 'state' field that accepts the following values depending on the master 'status':

open:
  languishing
  pending
  needs test
  needs patch
  patch review
  commit review

closed:
  committed
  duplicate
  fixed
  invalid
  out of date
  rejected
  wont fix
  works for me

I renamed 'test needed' -> 'needs test'. Workflow states like 'later', 'postponed' and 'remind' are too vague, so I removed them; these are better suited to user tags (custom keywords) like 'easy' etc. Implementing this change will 1. define a clear workflow to pave the road for automation and future enhancements (commit/review queue, etc.), 2. de-clutter the tracker UI to free space for more useful components, and 3. reduce categorization overhead. -- anatoly t.
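To make the proposal concrete, the merged scheme could be modeled as a small table plus a validator. The names below follow the proposal in this message; this is only an illustrative sketch, not the actual bugs.python.org schema:

```python
# Hypothetical model of the proposed single 'state' field: which states
# are legal under each master 'status'. Illustrative only.
VALID_STATES = {
    "open": {"languishing", "pending", "needs test", "needs patch",
             "patch review", "commit review"},
    "closed": {"committed", "duplicate", "fixed", "invalid",
               "out of date", "rejected", "wont fix", "works for me"},
}

def is_valid(status, state):
    """Return True if `state` may be set while the issue has `status`."""
    return state in VALID_STATES.get(status, set())
```

A tracker-side check like this would make the automation goal above mechanical: any transition that leaves an issue in an illegal (status, state) pair is rejected up front.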
From brian.curtin at gmail.com Wed Oct 19 21:26:19 2011 From: brian.curtin at gmail.com (Brian Curtin) Date: Wed, 19 Oct 2011 14:26:19 -0500 Subject: [Python-Dev] Define Tracker workflow In-Reply-To: References: Message-ID: On Wed, Oct 19, 2011 at 14:17, anatoly techtonik wrote: > The resolution will likely be 'fixed' which doesn't give any info > about if the patch was actually committed or not. If there's no commit update in the messages on the issue, you should assume it was not committed. At that point, either it wasn't actually committed (fixed), or the person forgot to tag the issue in the commit message, which they should remedy by just posting a message with the changeset ID. From ncoghlan at gmail.com Wed Oct 19 23:17:10 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 20 Oct 2011 07:17:10 +1000 Subject: [Python-Dev] [Python-checkins] cpython (2.7): Fix closes Issue12529 - cgi.parse_header failure on double quotes and In-Reply-To: References: Message-ID: On Thu, Oct 20, 2011 at 2:53 AM, senthil.kumaran wrote:
> http://hg.python.org/cpython/rev/489237756488
> changeset:   72988:489237756488
> branch:      2.7
> parent:      72984:86e3943d0d5b
> user:        Senthil Kumaran
> date:        Thu Oct 20 00:52:24 2011 +0800
> summary:
>   Fix closes Issue12529 - cgi.parse_header failure on double quotes and
> semicolons. Patch by Ben Darnell and Petri Lehtinen.
>
> files:
>   Lib/cgi.py           |  2 +-
>   Lib/test/test_cgi.py |  3 +++
>   2 files changed, 4 insertions(+), 1 deletions(-)

NEWS entry? (same question for the later _sre fix) Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com |
Brisbane, Australia From ron3200 at gmail.com Thu Oct 20 04:42:30 2011 From: ron3200 at gmail.com (Ron Adam) Date: Wed, 19 Oct 2011 21:42:30 -0500 Subject: [Python-Dev] Define Tracker workflow In-Reply-To: References: Message-ID: <1319078550.8756.106.camel@Gutsy> On Wed, 2011-10-19 at 22:17 +0300, anatoly techtonik wrote: > Does everybody feel comfortable with 'stage' and 'resultion' fields in tracker? > > I understand that 'stage' defines workflow and 'resolution' is status > indicator, but the question is - do we really need to separate them? > For example, right now when a ticket's 'status' is closed (all right - > there is also a 'status' field), we mark 'stage' as > 'committed/rejected'. I see the 'stage' as a workflow state and > 'committed/rejected' value is confusing because further steps are > actually depend on if the state is actually 'committed' or 'rejected'. > > stage: patch review -> committed/rejected > > When I see a patch was rejected, I need to analyse why and propose a > better one. To analyse I need to look at 'resolution' field: > > duplicate > fixed > invalid > later > out of date > postponed > rejected > remind > wont fix > works for me > > The resolution will likely be 'fixed' which doesn't give any info > about if the patch was actually committed or not. You need to know > that there is 'rejected' status, so if the status 'is not rejected' > then the patch was likely committed. Note that resolution is also a > state, but for a closed issue. It's somewhat confusing to me also. > For me the only things in `status` that matter are - open and closed. > Everything else is more descriptive 'state' of the issue. 
So I'd merge > all our descriptive fields into single 'state' field that will accept > the following values depending on master 'status': > open: languishing, pending, needs test, needs patch, patch review, commit review > closed: committed, duplicate, fixed, invalid, out of date, rejected, wont fix, works for me

I like the idea. But it's not clear what should be set at what times. While trying not to change too much, how about the following?

Status:
  Open
  Closed

Stage:
  In progress:
    needs fix (More specific than the term 'patch'.)
    needs test
    needs docs
    needs patch (Needs a combined fix/test/docs .diff file.)
    needs patch review (To Accepted if OK.)
    languishing (To "Rejected:out_of_date" if no action soon.)
    pending
  Accepted:
    commit review
    committed (And Close)
  Rejected: (Pick one and Close.)
    duplicate
    invalid
    out of date
    won't fix
    cannot reproduce (instead of 'works for me')

This combines the stage and resolution fields together. Currently the stage is up in the upper left of a tracker page, while the status and resolution are further down. They should at least be moved near each other.

+------------------+------------------+-----------------------+
| status:          | stage:           | resolution:           |
+------------------+------------------+-----------------------+

But it would be better if it was just...

+----------------+--------------------------------------------+
| status:        | stage:                                     |
+----------------+--------------------------------------------+

And just list the stages like...

status: Open    stage: In progress -> needs docs
status: Open    stage: In progress -> needs patch review
status: Open    stage: Accepted -> commit review
status: Closed  stage: Accepted -> committed
status: Closed  stage: Rejected -> invalid

It's not entirely consistent because while it's open, the stage refers to what is needed, but once it's closed, it refers to the last item done. But I think it would be fine that way. As for more detailed info, you pretty much have to read the discussion.
Cheers, Ron From senthil at uthcode.com Thu Oct 20 18:46:28 2011 From: senthil at uthcode.com (Senthil Kumaran) Date: Fri, 21 Oct 2011 00:46:28 +0800 Subject: [Python-Dev] [Python-checkins] cpython (2.7): Fix closes Issue12529 - cgi.parse_header failure on double quotes and In-Reply-To: References: Message-ID: <20111020164628.GB1964@mathmagic> On Thu, Oct 20, 2011 at 07:17:10AM +1000, Nick Coghlan wrote: > NEWS entry? (same question for the later _sre fix) Added. Thanks for catching this. For some reason, I had a slight doubt whether those issues were NEWS-worthy items. IIRC, the devguide recommends that a NEWS entry be added for all fixes made, but I still suspected that we might be adding too many entries to NEWS. Thanks, Senthil From asif.jamadar at rezayat.net Thu Oct 20 20:51:05 2011 From: asif.jamadar at rezayat.net (Asif Jamadar) Date: Thu, 20 Oct 2011 18:51:05 +0000 Subject: [Python-Dev] Generate Dynamic lists Message-ID: So I'm trying to generate dynamic choices for a Django form. Here I'm using the formset concept (code is mentioned below). Suppose I have a list called criteria_list = ['education', 'know how', 'managerial', 'interpersonal'] and now I need to generate choices as follows:

list1 = [('education', 1), ('education', 2), ('education', 3), ('education', 4), ('know how', 1), ('know how', 2), ('know how', 3), ('know how', 4)]

list2 = [('education', 1), ('education', 2), ('education', 3), ('education', 4), ('managerial', 1), ('managerial', 2), ('managerial', 3), ('managerial', 4)]

list3 = [('education', 1), ('education', 2), ('education', 3), ('education', 4), ('interpersonal', 1), ('interpersonal', 2), ('interpersonal', 3), ('interpersonal', 4)]

list4 = [('know how', 1), ('know how', 2), ('know how', 3), ('know how', 4), ('managerial', 1), ('managerial', 2), ('managerial', 3), ('managerial', 4)]

list5 = [('know how', 1), ('know how', 2), ('know how', 3), ('know how', 4), ('interpersonal', 1), ('interpersonal', 2), ('interpersonal', 3), ('interpersonal', 
4)]

list6 = [('managerial', 1), ('managerial', 2), ('managerial', 3), ('managerial', 4), ('interpersonal', 1), ('interpersonal', 2), ('interpersonal', 3), ('interpersonal', 4)]

How can I achieve this in Python? Each list above becomes the choices for one form. Suppose I have a formset of 6 forms; how can I assign the dynamically generated lists above to the choice field of each form? I tried the following code but had no luck.

views.py

def evaluation(request):
    evaluation_formset = formset_factory(EvaluationForm, formset=BaseEvaluationFormSet, extra=6)
    if request.POST:
        formset = evaluation_formset(request.POST)
        ## validation and save
    else:
        formset = evaluation_formset()
    render_to_response(formset)

forms.py

class EvaluationForm(forms.Form):
    value = forms.ChoiceField(widget=forms.RadioSelect(renderer=HorizontalRadioRenderer))

class BaseEvaluationFormSet(BaseFormSet):
    def __init__(self, *args, **kwargs):
        super(BaseEvaluationFormSet, self).__init__(*args, **kwargs)
        for form_index, form in enumerate(self.forms):
            form.fields["value"].choices = self.choice_method(form_index)

    def choice_method(self, form_index):
        list = []
        item_list = []
        criteria_list = []
        criteria_length = len(sub_criterias) - 1
        for criteria_index in range(criteria_length):
            counter = 1
            if criteria_index == form_index:
                for j in range(criteria_length - counter):
                    x = 1
                    for i in range(6):
                        criteria_list.append((sub_criterias[criteria_index], sub_criterias[criteria_index]))
                        item_list.append((sub_criterias[criteria_index + 1], sub_criterias[criteria_index + 1]))
                        list = criteria_list + item_list
                        counter = counter + 1
                        if x != criteria_length:
                            x = x + 1
        return list

-------------- next part -------------- An HTML attachment was scrubbed... 
URL: From stefan_ml at behnel.de Thu Oct 20 21:12:22 2011 From: stefan_ml at behnel.de (Stefan Behnel) Date: Thu, 20 Oct 2011 21:12:22 +0200 Subject: [Python-Dev] Generate Dynamic lists In-Reply-To: References: Message-ID: Asif Jamadar, 20.10.2011 20:51: > So I'm trying to generate dynamic choices for django form.[...] Note that this is the CPython core developers mailing list. The right list for general Python programming related questions is either python-list at python.org, or the newsgroup comp.lang.python. Stefan From jacek.pliszka at gmail.com Thu Oct 20 21:13:37 2011 From: jacek.pliszka at gmail.com (Jacek Pliszka) Date: Thu, 20 Oct 2011 21:13:37 +0200 Subject: [Python-Dev] Generate Dynamic lists In-Reply-To: References: Message-ID: I believe this is not the correct forum for this, as it does not concern development of the Python language itself - you should post to comp.lang.python. However, the solution is relatively simple; try this (note the values run from 1 to 4, hence range(1, 5)):

import itertools

list1, list2, list3, list4, list5, list6 = [
    list(itertools.product(i, range(1, 5)))
    for i in itertools.combinations(criteria_list, 2)
]

Best Regards, Jacek Pliszka 2011/10/20 Asif Jamadar : > So I'm trying to generate dynamic choices for django form.
[...]
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> http://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: http://mail.python.org/mailman/options/python-dev/jacek.pliszka%40gmail.com

From phd at phdru.name Thu Oct 20 21:07:17 2011 From: phd at phdru.name (Oleg Broytman) Date: Thu, 20 Oct 2011 23:07:17 +0400 Subject: [Python-Dev] Generate Dynamic lists In-Reply-To: References: Message-ID: <20111020190717.GA22043@iskra.aviel.ru> Hello. We are sorry but we cannot help you. 
This mailing list is to work on developing Python (adding new features to Python itself and fixing bugs); if you're having problems learning, understanding or using Python, please find another forum. Probably python-list/comp.lang.python mailing list/news group is the best place; there are Python developers who participate in it; you may get a faster, and probably more complete, answer there. See http://www.python.org/community/ for other lists/news groups/fora. Thank you for understanding. On Thu, Oct 20, 2011 at 06:51:05PM +0000, Asif Jamadar wrote: > So I'm trying to generate dynamic choices for django form. Here i'm usig formset concept (CODE is mentioned below)
[...]
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> http://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: http://mail.python.org/mailman/options/python-dev/phd%40phdru.name

Oleg. -- Oleg Broytman http://phdru.name/ phd at phdru.name Programmers don't die, they just GOSUB without RETURN. 
From richismyname at me.com Thu Oct 20 20:15:23 2011 From: richismyname at me.com (Richard Saunders) Date: Thu, 20 Oct 2011 18:15:23 +0000 (GMT) Subject: [Python-Dev] memcmp performance Message-ID: <554f488d-acc4-40c9-afd9-867f42186ebc@me.com> Hi, This is my first time on Python-dev, so I apologize for my newbie-ness. I have been doing some performance experiments with memcmp, and I was surprised at how slow memcmp-based comparisons were in Python. I did a whole, long analysis and came up with some very simple results. Before I put in a tracker bug report, I wanted to present my findings and make sure they were repeatable by others (isn't that the nature of science? ;) as well as offer discussion. The analysis is a pdf and is here: http://www.picklingtools.com/study.pdf The testcases are a tarball here: http://www.picklingtools.com/PickTest5.tar.gz I have three basic recommendations in the study: I am curious what other people think. Gooday, Richie -------------- next part -------------- An HTML attachment was scrubbed... URL: From solipsis at pitrou.net Thu Oct 20 23:08:48 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 20 Oct 2011 23:08:48 +0200 Subject: [Python-Dev] memcmp performance References: <554f488d-acc4-40c9-afd9-867f42186ebc@me.com> Message-ID: <20111020230848.799c28a1@pitrou.net> Hello, > I have been doing some performance experiments with memcmp, and I was > surprised at how slow memcmp-based comparisons were in Python. I did a whole, > long analysis and came up with some very simple results. > > Before I put in a tracker bug report, I wanted to present my findings > and make sure they were repeatable by others (isn't that the nature > of science? ;) as well as offer discussion. Thanks for the analysis. Non-bugfix work now happens on Python 3, where the str type is Python 2's unicode type. Your recommendations would have to be revisited in that light. Have you reported gcc's "outdated optimization" issue to them? 
Or is it already solved in newer gcc versions? Under glibc-based systems, it seems we can't go wrong with the system memcmp function. If gcc doesn't get in the way, that is. Regards Antoine. From scott+python-dev at scottdial.com Thu Oct 20 23:38:16 2011 From: scott+python-dev at scottdial.com (Scott Dial) Date: Thu, 20 Oct 2011 17:38:16 -0400 Subject: [Python-Dev] memcmp performance In-Reply-To: <20111020230848.799c28a1@pitrou.net> References: <554f488d-acc4-40c9-afd9-867f42186ebc@me.com> <20111020230848.799c28a1@pitrou.net> Message-ID: <4EA094C8.2000107@scottdial.com> On 10/20/2011 5:08 PM, Antoine Pitrou wrote: > Have you reported gcc's "outdated optimization" issue to them? Or is it > already solved in newer gcc versions? I checked this on gcc 4.6, and it still optimizes memcmp/strcmp into a "repz cmpsb" instruction on x86. This has been known to be a problem since at least 2002[1][2]. There are also some alternative implementations available on their mailing list. It seems the main objection to removing the optimization was that gcc isn't always compiling against an optimized libc, so they didn't want to drop the optimization. Beyond that, I think nobody was willing to put in the effort to change the optimization itself. [1] http://gcc.gnu.org/ml/gcc/2002-10/msg01616.html [2] http://gcc.gnu.org/ml/gcc/2003-04/msg00166.html -- Scott Dial scott at scottdial.com From ncoghlan at gmail.com Thu Oct 20 23:57:25 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 21 Oct 2011 07:57:25 +1000 Subject: [Python-Dev] [Python-checkins] cpython (2.7): Fix closes Issue12529 - cgi.parse_header failure on double quotes and In-Reply-To: <20111020164628.GB1964@mathmagic> References: <20111020164628.GB1964@mathmagic> Message-ID: My take is that a further fix or tweak to something that already has a NEWS entry for the current release doesn't get a new entry, but everything else does. "What's New" is the place to get selective. 
-- Nick Coghlan (via Gmail on Android, so likely to be more terse than usual) On Oct 21, 2011 2:46 AM, "Senthil Kumaran" wrote: > On Thu, Oct 20, 2011 at 07:17:10AM +1000, Nick Coghlan wrote: > > NEWS entry? (same question for the later _sre fix) > > Added. Thanks for catching this. > > For some reason, I had a slight doubt whether those issues were NEWS-worthy > items. IIRC, the devguide recommends that a NEWS entry be added for all fixes > made, but I still suspected that we might be adding too many > entries to NEWS. > > Thanks, > Senthil > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From richismyname at me.com Thu Oct 20 23:23:30 2011 From: richismyname at me.com (Richard Saunders) Date: Thu, 20 Oct 2011 21:23:30 +0000 (GMT) Subject: [Python-Dev] memcmp performance Message-ID: <98ddfb2f-364a-61f3-bbc4-fed21f765ee4@me.com> Hey, > I have been doing some performance experiments with memcmp, and I was > surprised at how slow memcmp-based comparisons were in Python. I did a whole, > long analysis and came up with some very simple results. Paul Svensson suggested I post as much as I can as text, as people would be more likely to read it. So, here are the basic ideas:

(1) memcmp is surprisingly slow on some Intel gcc platforms (Linux).
    On several Linux/Intel platforms, memcmp was 2-3x slower than
    a simple, portable C function (with some optimizations).

(2) The problem: if you compile C programs with gcc with any optimization on,
    it will replace all memcmp calls with an assembly language stub (rep cmpsb)
    instead of the memcmp call.

(3) rep cmpsb seems like it would be faster, but it really isn't:
    this completely bypasses the memcmp.S, memcmp_sse3.S
    and memcmp_sse4.S in glibc, which are typically faster.

(4) The basic conclusion is that the Python baseline on
The numbers are all in the paper: I will endeavor to try to generate a text form of all the tables so it's easier to read. ?This is much first in the Python dev arena, so I went a little overboard with my paper below. ;) ? Gooday, ? Richie > Before I put in a tracker bug report, I wanted to present my findings > and make sure they were repeatable to others (isn't that the nature > of science? ;) ? as well as offer discussion. > > The analysis is a pdf and is here:? > ? ? http://www.picklingtools.com/study.pdf > The testcases are a tarball here: > ? ?http://www.picklingtools.com/PickTest5.tar.gz > > I have three basic recommendations in the study: I am > curious what other people think. -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefan_ml at behnel.de Fri Oct 21 08:24:44 2011 From: stefan_ml at behnel.de (Stefan Behnel) Date: Fri, 21 Oct 2011 08:24:44 +0200 Subject: [Python-Dev] memcmp performance In-Reply-To: <20111020230848.799c28a1@pitrou.net> References: <554f488d-acc4-40c9-afd9-867f42186ebc@me.com> <20111020230848.799c28a1@pitrou.net> Message-ID: Antoine Pitrou, 20.10.2011 23:08: >> I have been doing some performance experiments with memcmp, and I was >> surprised that memcmp wasn't faster than it was in Python. I did a whole, >> long analysis and came up with some very simple results. > > Thanks for the analysis. Non-bugfix work now happens on Python 3, where > the str type is Python 2's unicode type. Your recommendations would > have to be revisited under that light. Well, Py3 is quite a bit different now that PEP393 is in. It appears to use memcmp() or strcmp() a lot less than before, but I think unicode_compare() should actually receive an optimisation to use a fast memcmp() if both string kinds are equal, at least when their character unit size is less than 4 (i.e. especially for ASCII strings). Funny enough, tailmatch() has such an optimisation. 
Stefan From eric at trueblade.com Fri Oct 21 17:45:20 2011 From: eric at trueblade.com (Eric V. Smith) Date: Fri, 21 Oct 2011 11:45:20 -0400 Subject: [Python-Dev] [Python-checkins] cpython (3.2): adjust braces a bit In-Reply-To: References: Message-ID: <4EA19390.9080604@trueblade.com> What's the logic for adding some braces, but removing others? On 10/19/2011 4:58 PM, benjamin.peterson wrote:
> http://hg.python.org/cpython/rev/9c79a25f4a8b
> changeset:   73010:9c79a25f4a8b
> branch:      3.2
> parent:      72998:99a9f0251924
> user:        Benjamin Peterson
> date:        Wed Oct 19 16:57:40 2011 -0400
> summary:
> adjust braces a bit
>
> files:
>   Objects/genobject.c |  9 ++++-----
>   1 files changed, 4 insertions(+), 5 deletions(-)
>
> diff --git a/Objects/genobject.c b/Objects/genobject.c
> --- a/Objects/genobject.c
> +++ b/Objects/genobject.c
> @@ -232,8 +232,9 @@
>
>      /* First, check the traceback argument, replacing None with NULL. */
> -    if (tb == Py_None)
> +    if (tb == Py_None) {
>          tb = NULL;
> +    }
>      else if (tb != NULL && !PyTraceBack_Check(tb)) {
>          PyErr_SetString(PyExc_TypeError,
>              "throw() third argument must be a traceback object");
> @@ -244,9 +245,8 @@
>      Py_XINCREF(val);
>      Py_XINCREF(tb);
>
> -    if (PyExceptionClass_Check(typ)) {
> +    if (PyExceptionClass_Check(typ))
>          PyErr_NormalizeException(&typ, &val, &tb);
> -    }
>
>      else if (PyExceptionInstance_Check(typ)) {
>          /* Raising an instance.  The value should be a dummy. */
> @@ -262,10 +262,9 @@
>          typ = PyExceptionInstance_Class(typ);
>          Py_INCREF(typ);
>
> -        if (tb == NULL) {
> +        if (tb == NULL)
>              /* Returns NULL if there's no traceback */
>              tb = PyException_GetTraceback(val);
> -        }
>      }
>  }
>  else {
>
> _______________________________________________
> Python-checkins mailing list
> Python-checkins at python.org
> http://mail.python.org/mailman/listinfo/python-checkins

From solipsis at pitrou.net Fri Oct 21 18:01:45 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 21 Oct 2011 18:01:45 +0200 Subject: [Python-Dev] memcmp performance References: <554f488d-acc4-40c9-afd9-867f42186ebc@me.com> <20111020230848.799c28a1@pitrou.net> Message-ID: <20111021180145.153ead21@pitrou.net> On Fri, 21 Oct 2011 08:24:44 +0200 Stefan Behnel wrote: > Antoine Pitrou, 20.10.2011 23:08: > >> I have been doing some performance experiments with memcmp, and I was > >> surprised at how slow memcmp-based comparisons were in Python. I did a whole, > >> long analysis and came up with some very simple results. > > > > Thanks for the analysis. Non-bugfix work now happens on Python 3, where > > the str type is Python 2's unicode type. Your recommendations would > > have to be revisited in that light. > > Well, Py3 is quite a bit different now that PEP 393 is in. It appears to use > memcmp() or strcmp() a lot less than before, but I think unicode_compare() > should actually receive an optimisation to use a fast memcmp() if both > string kinds are equal, at least when their character unit size is less > than 4 (i.e. especially for ASCII strings). Funny enough, tailmatch() has > such an optimisation. Yes, unicode_compare() probably deserves optimizing. Patches welcome, by the way :) Regards Antoine. 
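For illustration, the fast path discussed in this thread can be modeled at the Python level, with bytes objects standing in for the PEP 393 internal buffers (bytes comparison in CPython is itself memcmp-based). One caveat: raw memcmp ordering only matches code-point ordering for the 1-byte kind; for the wider kinds on little-endian machines it is safe for equality checks only. The function name and buffer arguments are made up for this sketch, not the C API:

```python
# Sketch of a memcmp-style compare of two same-kind (1-byte) buffers,
# as a model of the proposed unicode_compare() fast path. Returns
# -1, 0 or 1: compare the common prefix bytewise, then tie-break on length.

def compare_ucs1(buf_a, buf_b):
    """memcmp-like three-way compare of two UCS1 buffers."""
    n = min(len(buf_a), len(buf_b))
    if buf_a[:n] != buf_b[:n]:        # bytewise mismatch: order by prefix
        return -1 if buf_a[:n] < buf_b[:n] else 1
    if len(buf_a) == len(buf_b):
        return 0
    return -1 if len(buf_a) < len(buf_b) else 1
```

In C this would correspond to a memcmp(data_a, data_b, min_len) over the raw representations followed by a length tie-break, taken only when both operands report the same kind.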
From status at bugs.python.org Fri Oct 21 18:07:29 2011 From: status at bugs.python.org (Python tracker) Date: Fri, 21 Oct 2011 18:07:29 +0200 (CEST) Subject: [Python-Dev] Summary of Python tracker Issues Message-ID: <20111021160729.AD4C11CC0B@psf.upfronthosting.co.za> ACTIVITY SUMMARY (2011-10-14 - 2011-10-21) Python tracker at http://bugs.python.org/ To view or respond to any of the issues listed below, click on the issue. Do NOT respond to this message. Issues counts and deltas: open 3095 (+18) closed 21927 (+43) total 25022 (+61) Open issues with patches: 1319 Issues opened (47) ================== #7322: Socket timeout can cause file-like readline() method to lose d http://bugs.python.org/issue7322 reopened by r.david.murray #11254: distutils doesn't byte-compile .py files to __pycache__ during http://bugs.python.org/issue11254 reopened by eric.araujo #11637: Add cwd to sys.path for hooks http://bugs.python.org/issue11637 reopened by eric.araujo #13180: pysetup silently ignores invalid entries in setup.cfg http://bugs.python.org/issue13180 opened by pmoore #13183: pdb skips frames after hitting a breakpoint and running step http://bugs.python.org/issue13183 opened by xdegaye #13184: Multi-layered symlinks to python cause runtime error. 
sys.pat http://bugs.python.org/issue13184 opened by Jason.Howlett #13189: New bdist_simple binary distribution format for packaging http://bugs.python.org/issue13189 opened by pmoore #13190: ConfigParser uses wrong newline on Windows http://bugs.python.org/issue13190 opened by noam.el #13191: Typo in argparse documentation http://bugs.python.org/issue13191 opened by mikehoy #13192: ImportError silences low-level OS errors http://bugs.python.org/issue13192 opened by pitrou #13193: test_packaging and test_distutils failures http://bugs.python.org/issue13193 opened by pitrou #13195: subprocess: args with shell=True is not documented on Windows http://bugs.python.org/issue13195 opened by techtonik #13196: subprocess: undocumented if shell=True is necessary to find ex http://bugs.python.org/issue13196 opened by techtonik #13197: subprocess: move shell arguments to a separate keyword param http://bugs.python.org/issue13197 opened by techtonik #13198: Remove duplicate definition of write_record_file http://bugs.python.org/issue13198 opened by eric.araujo #13200: Add start, stop and step attributes to range objects http://bugs.python.org/issue13200 opened by smarnach #13201: Implement comparison operators for range objects http://bugs.python.org/issue13201 opened by smarnach #13203: Doc: say id() is only useful for existing objects http://bugs.python.org/issue13203 opened by terry.reedy #13204: sys.flags.__new__ crashes http://bugs.python.org/issue13204 opened by Trundle #13207: os.path.expanduser breaks when using unicode character in the http://bugs.python.org/issue13207 opened by mandel #13208: Problems with urllib on windows http://bugs.python.org/issue13208 opened by Deepak.Dodo #13209: Refactor code using unicode_encode_call_errorhandler() in unic http://bugs.python.org/issue13209 opened by haypo #13210: Support Visual Studio 2010 http://bugs.python.org/issue13210 opened by sable #13211: urllib2.HTTPError does not have 'reason' attribute. 
http://bugs.python.org/issue13211 opened by jason.coombs #13212: json library is decoding/encoding when it should not http://bugs.python.org/issue13212 opened by thinred #13213: generator.throw() behavior http://bugs.python.org/issue13213 opened by petri.lehtinen #13214: Cmd: list available completions from the cmd.Cmd subclass and http://bugs.python.org/issue13214 opened by yaneurabeya #13215: multiprocessing Manager.connect() aggressively retries refused http://bugs.python.org/issue13215 opened by bgilbert #13216: Add cp65001 codec http://bugs.python.org/issue13216 opened by haypo #13217: Missing header dependencies in Makefile http://bugs.python.org/issue13217 opened by jcon #13218: test_ssl failures on Ubuntu 11.10 http://bugs.python.org/issue13218 opened by nadeem.vawda #13220: print function unable while multiprocessing.Process is being r http://bugs.python.org/issue13220 opened by Ben.thelen #13223: pydoc removes 'self' in HTML for method docstrings with exampl http://bugs.python.org/issue13223 opened by Cameron.Hayne #13224: Change str(class) to return only the class name http://bugs.python.org/issue13224 opened by eric.araujo #13225: Failing packaging hooks should not stop operation http://bugs.python.org/issue13225 opened by eric.araujo #13226: Expose RTLD_* constants in the posix module http://bugs.python.org/issue13226 opened by haypo #13228: Add "Quick Start" section to the devguide index http://bugs.python.org/issue13228 opened by ezio.melotti #13229: Add shutil.filter_walk http://bugs.python.org/issue13229 opened by ncoghlan #13231: sys.settrace - document 'some other code blocks' for 'call' ev http://bugs.python.org/issue13231 opened by techtonik #13232: Logging: Unicode Error http://bugs.python.org/issue13232 opened by guettli #13234: os.listdir breaks with literal paths http://bugs.python.org/issue13234 opened by mandel #13236: unittest needs more flush calls http://bugs.python.org/issue13236 opened by petere #13237: subprocess docs should 
emphasise convenience functions http://bugs.python.org/issue13237 opened by ncoghlan #13238: Add shell command helpers to shutil module http://bugs.python.org/issue13238 opened by ncoghlan #13239: Remove <> operator from Grammar/Grammar http://bugs.python.org/issue13239 opened by eli.bendersky #13240: sysconfig gives misleading results for USE_COMPUTED_GOTOS http://bugs.python.org/issue13240 opened by flox #415492: Compiler generates relative filenames http://bugs.python.org/issue415492 reopened by ncoghlan Most recent 15 issues with no replies (15) ========================================== #13234: os.listdir breaks with literal paths http://bugs.python.org/issue13234 #13231: sys.settrace - document 'some other code blocks' for 'call' ev http://bugs.python.org/issue13231 #13229: Add shutil.filter_walk http://bugs.python.org/issue13229 #13220: print function unable while multiprocessing.Process is being r http://bugs.python.org/issue13220 #13217: Missing header dependencies in Makefile http://bugs.python.org/issue13217 #13215: multiprocessing Manager.connect() aggressively retries refused http://bugs.python.org/issue13215 #13213: generator.throw() behavior http://bugs.python.org/issue13213 #13211: urllib2.HTTPError does not have 'reason' attribute. 
http://bugs.python.org/issue13211 #13204: sys.flags.__new__ crashes http://bugs.python.org/issue13204 #13198: Remove duplicate definition of write_record_file http://bugs.python.org/issue13198 #13191: Typo in argparse documentation http://bugs.python.org/issue13191 #13190: ConfigParser uses wrong newline on Windows http://bugs.python.org/issue13190 #13178: Need tests for Unicode handling in install_distinfo and instal http://bugs.python.org/issue13178 #13166: Implement packaging.database.Distribution.__str__ http://bugs.python.org/issue13166 #13161: problems with help() documentation of __i*__ operators http://bugs.python.org/issue13161 Most recent 15 issues waiting for review (15) ============================================= #13234: os.listdir breaks with literal paths http://bugs.python.org/issue13234 #13228: Add "Quick Start" section to the devguide index http://bugs.python.org/issue13228 #13226: Expose RTLD_* constants in the posix module http://bugs.python.org/issue13226 #13224: Change str(class) to return only the class name http://bugs.python.org/issue13224 #13223: pydoc removes 'self' in HTML for method docstrings with exampl http://bugs.python.org/issue13223 #13218: test_ssl failures on Ubuntu 11.10 http://bugs.python.org/issue13218 #13217: Missing header dependencies in Makefile http://bugs.python.org/issue13217 #13214: Cmd: list available completions from the cmd.Cmd subclass and http://bugs.python.org/issue13214 #13204: sys.flags.__new__ crashes http://bugs.python.org/issue13204 #13201: Implement comparison operators for range objects http://bugs.python.org/issue13201 #13200: Add start, stop and step attributes to range objects http://bugs.python.org/issue13200 #13192: ImportError silences low-level OS errors http://bugs.python.org/issue13192 #13191: Typo in argparse documentation http://bugs.python.org/issue13191 #13189: New bdist_simple binary distribution format for packaging http://bugs.python.org/issue13189 #13184: Multi-layered symlinks to python 
cause runtime error. sys.pat http://bugs.python.org/issue13184 Top 10 most discussed issues (10) ================================= #12619: Automatically regenerate platform-specific modules http://bugs.python.org/issue12619 17 msgs #12405: packaging does not record/remove directories it creates http://bugs.python.org/issue12405 12 msgs #13218: test_ssl failures on Ubuntu 11.10 http://bugs.python.org/issue13218 11 msgs #7475: codecs missing: base64 bz2 hex zlib hex_codec ... http://bugs.python.org/issue7475 10 msgs #13153: IDLE crash with unicode bigger than 0xFFFF http://bugs.python.org/issue13153 10 msgs #3067: setlocale error message is confusing http://bugs.python.org/issue3067 9 msgs #13175: packaging uses wrong line endings in RECORD files on Windows http://bugs.python.org/issue13175 8 msgs #13173: Default values for string.Template http://bugs.python.org/issue13173 7 msgs #13210: Support Visual Studio 2010 http://bugs.python.org/issue13210 7 msgs #12296: Minor clarification in devguide http://bugs.python.org/issue12296 6 msgs Issues closed (44) ================== #3902: Packages containing only extension modules have to contain __i http://bugs.python.org/issue3902 closed by eric.araujo #6090: zipfile: Bad error message when zipping a file with timestamp http://bugs.python.org/issue6090 closed by python-dev #9168: setuid in smtp.py sheds privileges before binding port http://bugs.python.org/issue9168 closed by flox #11552: Confusing error message when hook module cannot be loaded http://bugs.python.org/issue11552 closed by eric.araujo #11931: Regular expression documentation patch http://bugs.python.org/issue11931 closed by rhettinger #12170: index() and count() methods of bytes and bytearray should acce http://bugs.python.org/issue12170 closed by pitrou #12281: bytes.decode('mbcs', 'ignore') does replace undecodable bytes http://bugs.python.org/issue12281 closed by haypo #12448: smtplib's __main__ doesn't flush when prompting http://bugs.python.org/issue12448 
closed by ezio.melotti #12451: open: avoid the locale encoding when possible http://bugs.python.org/issue12451 closed by haypo #12454: mailbox: use ASCII to read/write .mh_sequences files http://bugs.python.org/issue12454 closed by python-dev #12517: Large file support on Windows: sizeof(off_t) is 32 bits http://bugs.python.org/issue12517 closed by haypo #12527: assertRaisesRegex doc'd with 'msg' arg, but it's not implement http://bugs.python.org/issue12527 closed by ezio.melotti #12529: cgi.parse_header fails on double quotes and semicolons http://bugs.python.org/issue12529 closed by python-dev #12604: VTRACE macro in _sre.c should use do {} while (0) http://bugs.python.org/issue12604 closed by python-dev #12668: 3.2 What's New: it's integer->string, not the opposite http://bugs.python.org/issue12668 closed by rhettinger #12997: sqlite3: PRAGMA foreign_keys = ON doesn't work http://bugs.python.org/issue12997 closed by ned.deily #13088: Add Py_hexdigits constant: use one unique constant to format a http://bugs.python.org/issue13088 closed by haypo #13121: collections.Counter's += copies the entire object http://bugs.python.org/issue13121 closed by rhettinger #13144: Global Module Index link in the offline documentation is incor http://bugs.python.org/issue13144 closed by georg.brandl #13146: Writing a pyc file is not atomic http://bugs.python.org/issue13146 closed by pitrou #13150: Most of Python's startup time is sysconfig http://bugs.python.org/issue13150 closed by pitrou #13174: test_os failures on Fedora 15: listxattr() returns ['security. 
http://bugs.python.org/issue13174 closed by python-dev #13177: Avoid chained exceptions in lru_cache http://bugs.python.org/issue13177 closed by rhettinger #13181: pysetup install creates .pyc files but pysetup remove doesn't http://bugs.python.org/issue13181 closed by eric.araujo #13182: pysetup run bdist_wininst does not work (tries to use "install http://bugs.python.org/issue13182 closed by vinay.sajip #13185: Why does Python interpreter care about curvy quotes in comment http://bugs.python.org/issue13185 closed by loewis #13186: instance_ass_item() broken in classobject.c (Py2.7) http://bugs.python.org/issue13186 closed by python-dev #13187: relative imports don't work when circular http://bugs.python.org/issue13187 closed by ncoghlan #13188: generator.throw() ignores __traceback__ of exception http://bugs.python.org/issue13188 closed by pitrou #13194: zlib (de)compressobj copy() method missing on Windows http://bugs.python.org/issue13194 closed by nadeem.vawda #13199: slice_richcompare() might leak an object in rare cases http://bugs.python.org/issue13199 closed by python-dev #13202: subprocess __exit__ attribute missing http://bugs.python.org/issue13202 closed by eric.araujo #13205: NameErrors in generated setup.py (codecs, split_multiline) http://bugs.python.org/issue13205 closed by eric.araujo #13206: while loop vs for loop test http://bugs.python.org/issue13206 closed by mark.dickinson #13219: re module doc has minor inaccuracy in character sets http://bugs.python.org/issue13219 closed by ezio.melotti #13221: No "edit with IDLE" in right click context menu http://bugs.python.org/issue13221 closed by loewis #13222: Erroneous unclosed file warning http://bugs.python.org/issue13222 closed by pitrou #13227: Option to make the lru_cache type specific http://bugs.python.org/issue13227 closed by rhettinger #13230: test_resources fails http://bugs.python.org/issue13230 closed by nadeem.vawda #13233: os.acces documentation error http://bugs.python.org/issue13233 
closed by ezio.melotti #13235: logging.warn() is not documented http://bugs.python.org/issue13235 closed by vinay.sajip #12185: Decimal documentation lists "first" and "second" arguments, sh http://bugs.python.org/issue12185 closed by rhettinger #868845: Need unit tests for <...> reprs http://bugs.python.org/issue868845 closed by gvanrossum #1673007: urllib2 requests history + HEAD support http://bugs.python.org/issue1673007 closed by python-dev From benjamin at python.org Fri Oct 21 18:31:04 2011 From: benjamin at python.org (Benjamin Peterson) Date: Fri, 21 Oct 2011 12:31:04 -0400 Subject: [Python-Dev] [Python-checkins] cpython (3.2): adjust braces a bit In-Reply-To: <4EA19390.9080604@trueblade.com> References: <4EA19390.9080604@trueblade.com> Message-ID: 2011/10/21 Eric V. Smith : > What's the logic for adding some braces, but removing others? No braces if everything is a one-liner, otherwise braces everywhere. -- Regards, Benjamin From tseaver at palladion.com Fri Oct 21 18:40:47 2011 From: tseaver at palladion.com (Tres Seaver) Date: Fri, 21 Oct 2011 12:40:47 -0400 Subject: [Python-Dev] cpython (3.2): adjust braces a bit In-Reply-To: References: <4EA19390.9080604@trueblade.com> Message-ID: -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 10/21/2011 12:31 PM, Benjamin Peterson wrote: > 2011/10/21 Eric V. Smith : >> What's the logic for adding some braces, but removing others? > > No braces if everything is a one-liner, otherwise braces > everywhere. Hmm, PEP 7 doesn't show any example of the one-liner exception. Given that it tends to promote errors, particularly among indentation-conditioned Python programmers (adding another statement at the same indentation level), why not just have braces everywhere? Tres. 
-- =================================================================== Tres Seaver +1 540-429-0999 tseaver at palladion.com Palladion Software "Excellence by Design" http://palladion.com From barry at python.org Fri Oct 21 18:45:19 2011 From: barry at python.org (Barry Warsaw) Date: Fri, 21 Oct 2011 12:45:19 -0400 Subject: [Python-Dev] cpython (3.2): adjust braces a bit In-Reply-To: References: <4EA19390.9080604@trueblade.com> Message-ID: <20111021124519.1e997fa6@limelight.wooz.org> On Oct 21, 2011, at 12:40 PM, Tres Seaver wrote: >Hmm, PEP 7 doesn't show any example of the one-liner exception. Given >that it tends to promote errors, particularly among >indentation-conditioned Python programmers (adding another statement >at the same indentation level), why not just have braces everywhere?
+1 -Barry From ethan at stoneleaf.us Fri Oct 21 18:50:04 2011 From: ethan at stoneleaf.us (Ethan Furman) Date: Fri, 21 Oct 2011 09:50:04 -0700 Subject: [Python-Dev] cpython (3.2): adjust braces a bit In-Reply-To: References: <4EA19390.9080604@trueblade.com> Message-ID: <4EA1A2BC.7060908@stoneleaf.us> Tres Seaver wrote: > On 10/21/2011 12:31 PM, Benjamin Peterson wrote: >> 2011/10/21 Eric V. Smith : >>> >>> What's the logic for adding some braces, but removing others? >> >> No braces if everything is a one-liner, otherwise braces >> everywhere. > > Hmm, PEP 7 doesn't show any example of the one-liner exception. Given > that it tends to promote errors, particularly among > indentation-conditioned Python programmers (adding another statement > at the same indentation level), why not just have braces everywhere?
+1 ~Ethan~ From benjamin at python.org Fri Oct 21 19:16:59 2011 From: benjamin at python.org (Benjamin Peterson) Date: Fri, 21 Oct 2011 13:16:59 -0400 Subject: [Python-Dev] cpython (3.2): adjust braces a bit In-Reply-To: References: <4EA19390.9080604@trueblade.com> Message-ID: 2011/10/21 Tres Seaver : > On 10/21/2011 12:31 PM, Benjamin Peterson wrote: >> 2011/10/21 Eric V. Smith : >>> What's the logic for adding some braces, but removing others? >> >> No braces if everything is a one-liner, otherwise braces >> everywhere. > > Hmm, PEP 7 doesn't show any example of the one-liner exception. Given > that it tends to promote errors, particularly among > indentation-conditioned Python programmers (adding another statement > at the same indentation level), why not just have braces everywhere? It certainly doesn't say so explicitly, but:

    if (type->tp_dictoffset != 0 && base->tp_dictoffset == 0 &&
        type->tp_dictoffset == b_size &&
        (size_t)t_size == b_size + sizeof(PyObject *))
        return 0; /* "Forgive" adding a __dict__ only */

-- Regards, Benjamin From benjamin at python.org Fri Oct 21 19:17:39 2011 From: benjamin at python.org (Benjamin Peterson) Date: Fri, 21 Oct 2011 13:17:39 -0400 Subject: [Python-Dev] cpython (3.2): adjust braces a bit In-Reply-To: References: <4EA19390.9080604@trueblade.com> Message-ID: 2011/10/21 Tres Seaver : > On 10/21/2011 12:31 PM, Benjamin Peterson wrote: >> 2011/10/21 Eric V. Smith : >>> What's the logic for adding some braces, but removing others? >> >> No braces if everything is a one-liner, otherwise braces >> everywhere. > > Hmm, PEP 7 doesn't show any example of the one-liner exception. Given > that it tends to promote errors, particularly among > indentation-conditioned Python programmers (adding another statement > at the same indentation level), why not just have braces everywhere? Because we're not writing Python?
-- Regards, Benjamin From richismyname at me.com Fri Oct 21 20:23:24 2011 From: richismyname at me.com (Richard Saunders) Date: Fri, 21 Oct 2011 18:23:24 +0000 (GMT) Subject: [Python-Dev] memcmp performance Message-ID: <4b460c0d-e55d-e429-ecf8-e9f34ab033c4@me.com> >>> Richard Saunders >>> I have been doing some performance experiments with memcmp, and I was >>> surprised that memcmp wasn't faster than it was in Python. I did a whole, >>> long analysis and came up with some very simple results. >> >>Antoine Pitrou, 20.10.2011 23:08: >> Thanks for the analysis. Non-bugfix work now happens on Python 3, where >> the str type is Python 2's unicode type. Your recommendations would >> have to be revisited under that light. > > Stefan Behnel >Well, Py3 is quite a bit different now that PEP393 is in. It appears to use >memcmp() or strcmp() a lot less than before, but I think unicode_compare() >should actually receive an optimisation to use a fast memcmp() if both >string kinds are equal, at least when their character unit size is less >than 4 (i.e. especially for ASCII strings). Funny enough, tailmatch() has >such an optimisation. I started looking at the most recent 3.x baseline: a lot of places, the memcmp analysis appears relevant (zlib, arraymodule, datetime, xmlparse): all still use memcmp in about the same way. But I agree that there are some major differences in the unicode portion. As long as the two strings are the same unicode "kind", you can use a memcmp to compare. In that case, I would almost argue some memcmp optimization is even more important: unicode strings are potentially 2 to 4 times larger, so the amount of time spent in memcmp may be more (i.e., I am still rooting for -fno-builtin-memcmp on the compile lines). I went ahead and wrote a quick string_test3.py for comparing strings (similar to what I did in Python 2.7):

# Simple python string comparison test for Python 3.3
a = []; b = []; c = []; d = []
for x in range(0, 1000):
    a.append("the quick brown fox" + str(x))
    b.append("the wuick brown fox" + str(x))
    c.append("the quick brown fox" + str(x))
    d.append("the wuick brown fox" + str(x))
count = 0
for x in range(0, 200000):
    if a == c: count += 1
    if a == c: count += 2
    if a == d: count += 3
    if b == c: count += 5
    if b == d: count += 7
    if c == d: count += 11
print(count)

Timings on my FC14 machine (Intel Xeon W3520 at 2.67GHz):

29.18 seconds: Vanilla build of Python 3.3
29.17 seconds: Python 3.3 compiled with -fno-builtin-memcmp

No change: a little investigation shows unicode_compare is where all the work is. Here's currently the main loop inside unicode_compare:

    for (i = 0; i < len1 && i < len2; ++i) {
        Py_UCS4 c1, c2;
        c1 = PyUnicode_READ(kind1, data1, i);
        c2 = PyUnicode_READ(kind2, data2, i);
        if (c1 != c2)
            return (c1 < c2) ? -1 : 1;
    }
    return (len1 < len2) ? -1 : (len1 != len2);

If both loops are the same unicode kind, we can add memcmp to unicode_compare for an optimization:

    Py_ssize_t len = (len1 < len2) ? len1 : len2;
    /* use memcmp if both the same kind */
    if (kind1 == kind2) {
        int result = memcmp(data1, data2, ((int)kind1)*len);
        if (result != 0)
            return result < 0 ? -1 : +1;
    }

Rerunning the test with this small change to unicode_compare:

17.84 seconds: -fno-builtin-memcmp
36.25 seconds: STANDARD memcmp

The standard memcmp is WORSE than the original unicode_compare code, but if we compile using memcmp with -fno-builtin-memcmp, we get that wonderful 2x performance increase again. I am still rooting for -fno-builtin-memcmp in both Python 2.7 and 3.3 ... (after we put memcmp in unicode_compare)

From benjamin at python.org Fri Oct 21 20:39:45 2011 From: benjamin at python.org (Benjamin Peterson) Date: Fri, 21 Oct 2011 14:39:45 -0400 Subject: [Python-Dev] [Python-checkins] cpython (3.2): adjust braces a bit In-Reply-To: <4EA1BA94.6080006@trueblade.com> References: <4EA19390.9080604@trueblade.com> <4EA1BA94.6080006@trueblade.com> Message-ID: 2011/10/21 Eric V. Smith : > On 10/21/2011 12:31 PM, Benjamin Peterson wrote: >> 2011/10/21 Eric V. Smith : >>> What's the logic for adding some braces, but removing others? >> >> No braces if everything is a one-liner, otherwise braces everywhere. > > Not sure what "everything" means here. My specific question is why > braces were added here:

> -    if (tb == Py_None)
> +    if (tb == Py_None) {
>          tb = NULL;
> +    }

Because an else if follows it.
-- Regards, Benjamin From stefan_ml at behnel.de Fri Oct 21 20:57:45 2011 From: stefan_ml at behnel.de (Stefan Behnel) Date: Fri, 21 Oct 2011 20:57:45 +0200 Subject: [Python-Dev] memcmp performance In-Reply-To: <4b460c0d-e55d-e429-ecf8-e9f34ab033c4@me.com> References: <4b460c0d-e55d-e429-ecf8-e9f34ab033c4@me.com> Message-ID: Richard Saunders, 21.10.2011 20:23: > As long as the two strings are the same unicode "kind", you can use a > memcmp to compare. In that case, I would almost argue some memcmp > optimization is even more important: unicode strings are potentially 2 > to 4 times larger, so the amount of time spent in memcmp may be more > (i.e., I am still rooting for -fno-builtin-memcmp on the compile lines). I would argue that the pure ASCII (1 byte per character) case is even more important than the other cases, and it suffers from the "1 byte per comparison" problem you noted. That's why you got the 2x speed-up for your quick test. Stefan From p.f.moore at gmail.com Fri Oct 21 21:16:49 2011 From: p.f.moore at gmail.com (Paul Moore) Date: Fri, 21 Oct 2011 20:16:49 +0100 Subject: [Python-Dev] PEP397 no command line options to python? In-Reply-To: References: <4E9C1E2F.3040604@gmail.com> <1318921817.49702.YahooMailNeo@web25803.mail.ukl.yahoo.com> <4E9E0948.9030203@gmail.com> Message-ID: On 19 October 2011 13:17, Sam Partington wrote: > Ok ok, I give up. Apparently I am the only one who wants to be able > to run different versions of python based on the shebang line AND add > occasional arguments to the python command line. I don't know if this is of use to anyone, but I attach a Powershell module which does more or less what you're suggesting. It exports a few functions, and an alias "pyx" (just to distinguish it from the launcher, it could be renamed "py" if you prefer) which takes an -X.Y version option similar to the launcher (and a -w option for selecting the GUI version). All the remaining arguments are passed to Python.
You can also add your own aliases - "Add-Python dev ." is a good way of adding a -dev flag to select the current development version you're working on. Hope it's useful to someone... Paul. -------------- next part -------------- A non-text attachment was scrubbed... Name: pyx.psm1 Type: application/octet-stream Size: 3358 bytes Desc: not available URL: From solipsis at pitrou.net Fri Oct 21 21:18:58 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 21 Oct 2011 21:18:58 +0200 Subject: [Python-Dev] memcmp performance References: <4b460c0d-e55d-e429-ecf8-e9f34ab033c4@me.com> Message-ID: <20111021211858.4c54ded2@pitrou.net> On Fri, 21 Oct 2011 18:23:24 +0000 (GMT) Richard Saunders wrote:

> If both loops are the same unicode kind, we can add memcmp
> to unicode_compare for an optimization:
>
>     Py_ssize_t len = (len1 < len2) ? len1 : len2;
>     /* use memcmp if both the same kind */
>     if (kind1 == kind2) {
>         int result = memcmp(data1, data2, ((int)kind1)*len);
>         if (result != 0)
>             return result < 0 ? -1 : +1;
>     }

Hmm, you have to be a bit subtler than that: on a little-endian machine, you can't compare two characters by comparing their bytes representation in memory order. So memcmp() can only be used for the one-byte representation. (actually, it can also be used for equality comparisons on any representation)

> Rerunning the test with this small change to unicode_compare:
>
> 17.84 seconds: -fno-builtin-memcmp
> 36.25 seconds: STANDARD memcmp
>
> The standard memcmp is WORSE than the original unicode_compare
> code, but if we compile using memcmp with -fno-builtin-memcmp, we get that
> wonderful 2x performance increase again.

The standard memcmp being worse is a bit puzzling. Intuitively, it should have roughly the same performance as the original function. I also wonder whether the slowdown could materialize on non-glibc systems.

> I am still rooting for -fno-builtin-memcmp in both Python 2.7 and 3.3 ...
> (after we put memcmp in unicode_compare) A patch for unicode_compare would be a good start. Its performance can then be checked on other systems (such as Windows). Regards Antoine. From victor.stinner at haypocalc.com Fri Oct 21 22:07:55 2011 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Fri, 21 Oct 2011 22:07:55 +0200 Subject: [Python-Dev] Status of the PEP 400? (deprecate codecs.StreamReader/StreamWriter) In-Reply-To: References: <4E308D63.9090901@haypocalc.com> Message-ID: <201110212207.55106.victor.stinner@haypocalc.com> On Friday, 29 July 2011 at 19:01:06, Guido van Rossum wrote: > On Fri, Jul 29, 2011 at 8:37 AM, Nick Coghlan wrote: > > On Sat, Jul 30, 2011 at 1:17 AM, Antoine Pitrou wrote: > >> On Thu, 28 Jul 2011 11:28:43 +0200 > >> > >> Victor Stinner wrote: > >>> I will add your alternative to the PEP (except if you would like to do > >>> that yourself?). If I understood correctly, you propose to: > >>> > >>> * rename codecs.open() to codecs.open_stream() > >>> * change codecs.open() to reuse open() (and so io.TextIOWrapper) > >>> > >>> (and don't deprecate anything) > >> > >> This may be an interesting approach. In a few years, we can evaluate > >> whether users are calling open_stream(), and if there aren't any, we > >> can deprecate the whole thing. > > > > Indeed. I'm also heavily influenced by MAL's opinion on this > > particular topic, so the fact he's OK with this approach counts for a > > lot. It achieves the main benefit I'm interested in (transparently > > migrating users of the codecs.open API to the new IO stack), while > > paving the way for eliminating the redundancy at some point in the > > future. > > +1 I updated the PEP 400 to no longer *remove* deprecated functions in Python 3.4. I don't like the idea of adding a *new* function (codecs.open_stream()) which emits a DeprecationWarning. New functions are not supposed to be (indirectly) deprecated.
Short summary of the updated PEP 400:

- patch codecs.open() to make it reuse TextIOWrapper to access text files (instead of Stream* classes)
- instantiating Stream* classes emits a DeprecationWarning
- that's all

So you can still get stream reader/writer using codecs.getreader() and codecs.getwriter() functions. Victor From solipsis at pitrou.net Fri Oct 21 23:08:08 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 21 Oct 2011 23:08:08 +0200 Subject: [Python-Dev] Buildbot failures Message-ID: <20111021230808.7c101aec@pitrou.net> Hello, There are currently a bunch of various buildbot failures on all 3 branches. I would remind committers to regularly take a look at the buildbots, so that these failures get solved reasonably fast. Regards Antoine. From ncoghlan at gmail.com Sat Oct 22 01:31:53 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 22 Oct 2011 09:31:53 +1000 Subject: [Python-Dev] [Python-checkins] cpython: Issue 13227: Option to make the lru_cache() type specific (suggested by Andrew In-Reply-To: References: Message-ID: On Fri, Oct 21, 2011 at 1:57 AM, raymond.hettinger wrote:

> +   If *typed* is set to True, function arguments of different types will be
> +   cached separately.  For example, ``f(3)`` and ``f(3.0)`` will be treated
> +   as distinct calls with distinct results.

I've been pondering this one a bit since reviewing it on the tracker, and I'm wondering if we have the default behaviour the wrong way around.
For "typed=True":
- never results in accidental type coercion and potentially wrong answers* (see below)
- cache uses additional memory (each entry is larger, more entries may be stored)
- additional cache misses
- differs from current behaviour

For "typed=False" (current default):
- matches current (pre-3.3) behaviour
- can lead to accidental type coercion and incorrect answers

I only just realised this morning that the existing (untyped) caching behaviour can give answers that are *numerically* wrong, not just of the wrong type. This becomes clear once we use division as our test operation rather than multiplication and bring Decimal into the mix:

>>> from functools import lru_cache
>>> @lru_cache()
... def divide(x, y):
...     return x / y
...
>>> from decimal import Decimal
>>> 10 / 9
1.1111111111111112
>>> Decimal(10) / Decimal(9)
Decimal('1.111111111111111111111111111')
>>> divide(10, 9)
1.1111111111111112
>>> divide(Decimal(10), Decimal(9))
1.1111111111111112

At the very least, I think lru_cache should default to typed behaviour in 3.3, with people being able to explicitly switch it off as a cache optimisation technique when they know it doesn't matter. You could even make the case that making the cache type aware under the hood in 3.2 would be a bug fix, and the only new feature in 3.3 would be the ability to switch off the type awareness to save memory. Regards, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Sat Oct 22 01:37:35 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 22 Oct 2011 09:37:35 +1000 Subject: [Python-Dev] [Python-checkins] cpython: Document that packaging doesn't create __init__.py files (#3902). In-Reply-To: References: Message-ID: On Fri, Oct 21, 2011 at 11:52 PM, eric.araujo wrote: > +To distribute extension modules that live in a package (e.g.
``package.ext``),
+you need to create you need to create a :file:`{package}/__init__.py` file to
+let Python recognize and import your module.

"you need to create" is repeated in the new text.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From ncoghlan at gmail.com  Sat Oct 22 01:46:37 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 22 Oct 2011 09:46:37 +1000
Subject: [Python-Dev] PEP397 no command line options to python?
In-Reply-To: References: <4E9C1E2F.3040604@gmail.com> <1318921817.49702.YahooMailNeo@web25803.mail.ukl.yahoo.com> <4E9E0948.9030203@gmail.com>
Message-ID:

On Wed, Oct 19, 2011 at 10:17 PM, Sam Partington wrote:
> Ok ok, I give up. Apparently I am the only one who wants to be able
> to run different versions of python based on the shebang line AND add
> occasional arguments to the python command line.

As a simpler alternative, I suggest the launcher just gain a "--which" long option that displays the full path to the interpreter it found.

So:

C:\> py -2 --which
C:\Python27\python.exe

C:\> py -3 --which
C:\Python32\python.exe

No significant complexity in the launcher, and if you want to add additional arguments like -m, -c, or -i you can do it by running '--which' and switching to invoking that interpreter directly.

"-i" in particular is invaluable for the following scenario:
- app crashes with exception
- rerun with "-i"
- at the interpreter prompt, do "import pdb; pdb.pm()"
- poke around in the offending frame directly rather than sprinkling print statement fairy dust around everywhere potentially relevant

And, of course, the "-m" use case has already been mentioned to invoke modules by module name rather than file name ("python -m timeit", anyone?)

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com |
Brisbane, Australia From stefan_ml at behnel.de Sat Oct 22 06:44:10 2011 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sat, 22 Oct 2011 06:44:10 +0200 Subject: [Python-Dev] [PATCH] unicode subtypes broken in latest py3k debug builds Message-ID: Hi, the py3k debug build has been broken in Cython's integration tests for a couple of weeks now due to a use-after-decref bug. Here's the fix, please apply. BTW, is there a reason unicode_subtype_new() copies the buffer of the unicode object it just created, instead of just stealing it? Stefan -------------- next part -------------- A non-text attachment was scrubbed... Name: unicode_subtype_fix.patch Type: text/x-patch Size: 790 bytes Desc: not available URL: From victor.stinner at haypocalc.com Sat Oct 22 14:05:25 2011 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Sat, 22 Oct 2011 14:05:25 +0200 Subject: [Python-Dev] [PATCH] unicode subtypes broken in latest py3k debug builds In-Reply-To: References: Message-ID: <4EA2B185.5060007@haypocalc.com> > the py3k debug build has been broken in Cython's integration tests for a > couple of weeks now due to a use-after-decref bug. Here's the fix, > please apply. Oops, I introduced this bug when I added "check_content" option to _PyUnicode_CheckUnicode(). > BTW, is there a reason unicode_subtype_new() copies the buffer of the > unicode object it just created, instead of just stealing it? Good question. We can maybe optimize this function. Victor From vinay_sajip at yahoo.co.uk Sat Oct 22 14:15:46 2011 From: vinay_sajip at yahoo.co.uk (Vinay Sajip) Date: Sat, 22 Oct 2011 12:15:46 +0000 (UTC) Subject: [Python-Dev] PEP397 no command line options to python? 
References: <4E9C1E2F.3040604@gmail.com> <1318921817.49702.YahooMailNeo@web25803.mail.ukl.yahoo.com> <4E9E0948.9030203@gmail.com> Message-ID: Nick Coghlan gmail.com> writes: > As a simpler alternative, I suggest the launcher just gain a "--which" > long option that displays the full path to the interpreter it found. > > So: > > C:\> py -2 --which > C:\Python27\python.exe > > C:\> py -3 --which > C:\Python32\python.exe > > No significant complexity in the launcher, and if you want to add > additional arguments like -m, -c, or -i you can do it by running > '--which' and switching to invoking that interpreter directly. Perhaps even simpler would be for the -h option to print the interpreter paths which would be returned for -2 and -3, on separate lines, even without the --which, e.g. Currently configured: -2: c:\Python27\python.exe -3: c:\Python32\python.exe Regards, Vinay Sajip From p.f.moore at gmail.com Sat Oct 22 15:27:26 2011 From: p.f.moore at gmail.com (Paul Moore) Date: Sat, 22 Oct 2011 14:27:26 +0100 Subject: [Python-Dev] PEP397 no command line options to python? In-Reply-To: References: <4E9C1E2F.3040604@gmail.com> <1318921817.49702.YahooMailNeo@web25803.mail.ukl.yahoo.com> <4E9E0948.9030203@gmail.com> Message-ID: (Sorry, should have gone to the list...) On 22 October 2011 13:15, Vinay Sajip wrote: > Nick Coghlan gmail.com> writes: > >> As a simpler alternative, I suggest the launcher just gain a "--which" >> long option that displays the full path to the interpreter it found. >> >> So: >> >> C:\> py -2 --which >> C:\Python27\python.exe >> >> C:\> py -3 --which >> C:\Python32\python.exe >> >> No significant complexity in the launcher, and if you want to add >> additional arguments like -m, -c, or -i you can do it by running >> '--which' and switching to invoking that interpreter directly. 
> > Perhaps even simpler would be for the -h option to print the interpreter paths > which would be returned for -2 and -3, on separate lines, even without the > --which, e.g. > > Currently configured: > -2: c:\Python27\python.exe > -3: c:\Python32\python.exe --which is nice for people who can use Unix-style $() or Powershell & to directly execute the output as a command. & (py -3 --which) Paul > From andrea.crotti.0 at gmail.com Sat Oct 22 21:30:34 2011 From: andrea.crotti.0 at gmail.com (Andrea Crotti) Date: Sat, 22 Oct 2011 20:30:34 +0100 Subject: [Python-Dev] Buildbot failures In-Reply-To: <20111021230808.7c101aec@pitrou.net> References: <20111021230808.7c101aec@pitrou.net> Message-ID: <4EA319DA.2000904@gmail.com> On 10/21/2011 10:08 PM, Antoine Pitrou wrote: > Hello, > > There are currently a bunch of various buildbot failures on all 3 > branches. I would remind committers to regularly take a look at the > buildbots, so that these failures get solved reasonably fast. > > Regards > > Antoine. In my previous workplace if someone broke a build committing something wrong he/she had to bring cake for everyone next meeting. The cake is not really feasible I guess, but isn't it possible to notify the developer that broke the build? If one is not clearly defined, maybe notifying the last N developers that committed between the last successful builds and the failing build, would it be possible and make sense? From solipsis at pitrou.net Sat Oct 22 21:33:42 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sat, 22 Oct 2011 21:33:42 +0200 Subject: [Python-Dev] Buildbot failures In-Reply-To: <4EA319DA.2000904@gmail.com> References: <20111021230808.7c101aec@pitrou.net> <4EA319DA.2000904@gmail.com> Message-ID: <20111022213342.2c2a4b35@pitrou.net> On Sat, 22 Oct 2011 20:30:34 +0100 Andrea Crotti wrote: > > In my previous workplace if someone broke a build committing something > wrong he/she > had to bring cake for everyone next meeting. 
> > The cake is not really feasible I guess, but isn't it possible to notify
> the developer that
> broke the build?

Some of us do the notifying manually, but it's quite boring and bothersome. Automating it is a bit tricky, since some of our tests (as well as some of the buildslaves themselves) are a bit flaky and will produce intermittent failures. But I think it's indeed the good direction.

Regards

Antoine.

From ncoghlan at gmail.com  Sun Oct 23 05:40:16 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 23 Oct 2011 13:40:16 +1000
Subject: [Python-Dev] Buildbot failures
In-Reply-To: <4EA319DA.2000904@gmail.com>
References: <20111021230808.7c101aec@pitrou.net> <4EA319DA.2000904@gmail.com>
Message-ID:

On Sun, Oct 23, 2011 at 5:30 AM, Andrea Crotti wrote:
> If one is not clearly defined, maybe notifying the last N developers that
> committed
> between the last successful builds and the failing build, would it be
> possible and make sense?

Yeah, as Antoine noted, that's where we want to get to eventually, but at the moment, even the "stable" buildbots are a bit too flaky for us to turn that on (essentially, the buildbots end up spamming the alerts, so people start assuming they're *all* false alarms and the notifications become ineffective). We're getting closer though - since the buildbots were put in place, many of the flakier tests have been redesigned to be significantly more reliable.

In the meantime, we rely on committers to check the buildbot pages for a day or two after they make commits to confirm there aren't any lurking cross-platform problems or other issues with their changes.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From peck at us.ibm.com  Sun Oct 23 18:03:39 2011
From: peck at us.ibm.com (Jon K Peck)
Date: Sun, 23 Oct 2011 10:03:39 -0600
Subject: [Python-Dev] AUTO: Jon K Peck is out of the office (returning 10/27/2011)
Message-ID:

I am out of the office until 10/27/2011.
From g.brandl at gmx.net  Sun Oct 23 22:56:41 2011
From: g.brandl at gmx.net (Georg Brandl)
Date: Sun, 23 Oct 2011 22:56:41 +0200
Subject: [Python-Dev] cpython (2.7): Whoops, PyException_GetTraceback() is not documented on 2.7
In-Reply-To: References: Message-ID:

On 10/23/11 20:54, petri.lehtinen wrote:
> http://hg.python.org/cpython/rev/5c4781a237ef
> changeset: 73073:5c4781a237ef
> branch: 2.7
> parent: 73071:11da12600f5b
> user: Petri Lehtinen
> date: Sun Oct 23 21:52:10 2011 +0300
> summary:
> Whoops, PyException_GetTraceback() is not documented on 2.7

If it exists there, why not document it instead? :)

Georg

From martin at v.loewis.de  Sun Oct 23 23:46:08 2011
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 23 Oct 2011 23:46:08 +0200
Subject: [Python-Dev] Modules of plat-* directories
In-Reply-To: <20111017232709.355fcd8f@pitrou.net>
References: <201110170116.36678.victor.stinner@haypocalc.com> <201110170204.38591.victor.stinner@haypocalc.com> <20111017232709.355fcd8f@pitrou.net>
Message-ID: <4EA48B20.80206@v.loewis.de>

> Given the issues you are mentioning, and given they were never
> reported in years before, it seems unlikely anybody is using these
> files.
>
> +1 to remove them, as they don't seem documented either.

-1. If they were broken, and somebody used them, a bug would be reported. That no bug is being reported means that they either work fine, or nobody uses them.

In the former case, removing them will break somebody's code.
In the latter case, nothing is gained by either keeping or removing them.

So why remove them?

Regards,
Martin

From martin at v.loewis.de  Sun Oct 23 23:50:26 2011
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 23 Oct 2011 23:50:26 +0200
Subject: [Python-Dev] Modules of plat-* directories
In-Reply-To: <201110170116.36678.victor.stinner@haypocalc.com>
References: <201110170116.36678.victor.stinner@haypocalc.com>
Message-ID: <4EA48C22.4040101@v.loewis.de>

> I don't understand why we kept modules of the plat-* directories (e.g.
> Lib/plat-linux/CDROM.py).

Because they are useful. There is no reasonable other way of getting at the information in the modules for a Python program that may need them.

> These modules are not regenerated when Python is compiled, so I don't
> understand how values can be correct.

They must be correct. On a specific system, these constants are not just part of the API - they are part of the ABI. So a system vendor cannot just change their values. Once defined, they must stay fixed forever. That's why it's not necessary to regenerate the files.

> For example, IN.INT_MAX is 2147483647,
> whereas it should be 9223372036854775807 on my 64-bit Linux. These values
> don't look reliable.

In general, this system can't deal well with conditional defines. People using them will know that.

> These modules contain non-working functions:
>
> def __STRING(x): return #x
> def __constant_le32_to_cpu(x): return ((__u32)(__le32)(x))

So what? The whole point of h2py is that it is automatically generated. If the output is bogus, users just won't use the fragments that are bogus. Other fragments work just fine, and can be used.
Regards, Martin From martin at v.loewis.de Sun Oct 23 23:44:05 2011 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 23 Oct 2011 23:44:05 +0200 Subject: [Python-Dev] memcmp performance In-Reply-To: <4b460c0d-e55d-e429-ecf8-e9f34ab033c4@me.com> References: <4b460c0d-e55d-e429-ecf8-e9f34ab033c4@me.com> Message-ID: <4EA48AA5.7010607@v.loewis.de> > I am still rooting for -fno-builtin-memcmp in both Python 2.7 and 3.3 ... > (after we put memcmp in unicode_compare) -1. We shouldn't do anything about this. Python has the tradition of not working around platform bugs, except if the work-arounds are necessary to make something work at all - i.e. in particular not for performance issues. If this is a serious problem, then platform vendors need to look into it (CPU vendor, compiler vendor, OS vendor). If they don't act, it's probably not a serious problem. In the specific case, I don't think it's a problem at all. It's not that memcmp is slow with the builtin version - it's just not as fast as it could be. Adding a compiler option would put a maintenance burden on Python - we already have way too many compiler options in configure.in, and there is no good procedure to ever take them out should they not be needed anymore. Regards, Martin From solipsis at pitrou.net Sun Oct 23 23:48:20 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sun, 23 Oct 2011 23:48:20 +0200 Subject: [Python-Dev] Modules of plat-* directories In-Reply-To: <4EA48B20.80206@v.loewis.de> References: <201110170116.36678.victor.stinner@haypocalc.com> <201110170204.38591.victor.stinner@haypocalc.com> <20111017232709.355fcd8f@pitrou.net> <4EA48B20.80206@v.loewis.de> Message-ID: <1319406500.3517.1.camel@localhost.localdomain> > -1. If they were broken, and somebody used them, a bug would be > reported. That no bug is being reported means that they either > work fine, or nobody uses them. > > In the former case, removing them will break somebody's code. 
> In the latter case, nothing is gained by either keeping or removing > them. > > So why remove them? Not worrying whether we should maintain these files or not would be a reason. Not worrying whether we should document them (or provide a better way to access these facilities) is another. Given the messages on the bug tracker issue, it seems that (almost) nobody uses them *and* they are buggy ;) Regards Antoine. From martin at v.loewis.de Mon Oct 24 00:03:42 2011 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon, 24 Oct 2011 00:03:42 +0200 Subject: [Python-Dev] Modules of plat-* directories In-Reply-To: <1319406500.3517.1.camel@localhost.localdomain> References: <201110170116.36678.victor.stinner@haypocalc.com> <201110170204.38591.victor.stinner@haypocalc.com> <20111017232709.355fcd8f@pitrou.net> <4EA48B20.80206@v.loewis.de> <1319406500.3517.1.camel@localhost.localdomain> Message-ID: <4EA48F3E.9020304@v.loewis.de> >> So why remove them? > > Not worrying whether we should maintain these files or not would be a > reason. Not worrying whether we should document them (or provide a > better way to access these facilities) is another. Don't worry whether, I tell you :-) Yes, we maintain them, and no, we make no changes to them unless a user actually requests a change, and no, we don't need to document them. I think there is a section on undocumented modules somewhere; if you worry too much, just add them there. There is little point in documenting them, since what they contain will vary from system to system. People should read the manual of their operating system to find out what all the constants mean (or perhaps the source code of their operating system in case the constants are undocumented even by the system vendor - which many of them are). 
Regards, Martin From skippy.hammond at gmail.com Mon Oct 24 00:15:00 2011 From: skippy.hammond at gmail.com (Mark Hammond) Date: Mon, 24 Oct 2011 09:15:00 +1100 Subject: [Python-Dev] PEP397 no command line options to python? In-Reply-To: References: <4E9C1E2F.3040604@gmail.com> <1318921817.49702.YahooMailNeo@web25803.mail.ukl.yahoo.com> <4E9E0948.9030203@gmail.com> Message-ID: <4EA491E4.80602@gmail.com> On 23/10/2011 12:27 AM, Paul Moore wrote: > (Sorry, should have gone to the list...) > > On 22 October 2011 13:15, Vinay Sajip wrote: >> Nick Coghlan gmail.com> writes: >> >>> As a simpler alternative, I suggest the launcher just gain a "--which" >>> long option that displays the full path to the interpreter it found. >>> >>> So: >>> >>> C:\> py -2 --which >>> C:\Python27\python.exe >>> >>> C:\> py -3 --which >>> C:\Python32\python.exe >>> >>> No significant complexity in the launcher, and if you want to add >>> additional arguments like -m, -c, or -i you can do it by running >>> '--which' and switching to invoking that interpreter directly. >> >> Perhaps even simpler would be for the -h option to print the interpreter paths >> which would be returned for -2 and -3, on separate lines, even without the >> --which, e.g. >> >> Currently configured: >> -2: c:\Python27\python.exe >> -3: c:\Python32\python.exe > > --which is nice for people who can use Unix-style $() or Powershell& > to directly execute the output as a command. > > & (py -3 --which) How about abusing the existing flags for this purpose - eg: % py -3? % py -2.7? etc. Mark From ncoghlan at gmail.com Mon Oct 24 01:36:37 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 24 Oct 2011 09:36:37 +1000 Subject: [Python-Dev] PEP397 no command line options to python? 
In-Reply-To: <4EA491E4.80602@gmail.com> References: <4E9C1E2F.3040604@gmail.com> <1318921817.49702.YahooMailNeo@web25803.mail.ukl.yahoo.com> <4E9E0948.9030203@gmail.com> <4EA491E4.80602@gmail.com> Message-ID: On Mon, Oct 24, 2011 at 8:15 AM, Mark Hammond wrote: > How about abusing the existing flags for this purpose - eg: > > % py -3? > % py -2.7? What does using the magic symbol offer over an explicit separate flag? Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From skippy.hammond at gmail.com Mon Oct 24 02:00:45 2011 From: skippy.hammond at gmail.com (Mark Hammond) Date: Mon, 24 Oct 2011 11:00:45 +1100 Subject: [Python-Dev] PEP397 no command line options to python? In-Reply-To: References: <4E9C1E2F.3040604@gmail.com> <1318921817.49702.YahooMailNeo@web25803.mail.ukl.yahoo.com> <4E9E0948.9030203@gmail.com> <4EA491E4.80602@gmail.com> Message-ID: <4EA4AAAD.7090104@gmail.com> On 24/10/2011 10:36 AM, Nick Coghlan wrote: > On Mon, Oct 24, 2011 at 8:15 AM, Mark Hammond wrote: >> How about abusing the existing flags for this purpose - eg: >> >> % py -3? >> % py -2.7? > > What does using the magic symbol offer over an explicit separate flag? * The "magic" symbol is somewhat self-documenting - it implies a question. Using --which adds another special case that people would need to understand isn't passed to Python. IOW, I like that there is only 1 special option and that one special option can be expressed in the form of a question. * Simplicity - does "py -2.3 --which" work the same as "py --which -2.3"? If not, that's not at all intuitive. If so, it adds complexity to the launcher and the PEP text. * Extensibility - While I've resisted, I predict that due to popular demand, we will wind up supporting additional arguments which are passed directly to Python (eg, "py -2.3 -W scriptName"). If we did, how would we treat --which when it is specified with additional options? 
So to turn the question back around - why introduce a new special option when the existing single special option can be leveraged? Are we opening the door to further special options? I guess the key downside to this suggestion is that it doesn't allow you ask where the default Python is without using "-2?" (or maybe just -?) Mark From ncoghlan at gmail.com Mon Oct 24 02:46:20 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 24 Oct 2011 10:46:20 +1000 Subject: [Python-Dev] PEP397 no command line options to python? In-Reply-To: <4EA4AAAD.7090104@gmail.com> References: <4E9C1E2F.3040604@gmail.com> <1318921817.49702.YahooMailNeo@web25803.mail.ukl.yahoo.com> <4E9E0948.9030203@gmail.com> <4EA491E4.80602@gmail.com> <4EA4AAAD.7090104@gmail.com> Message-ID: On Mon, Oct 24, 2011 at 10:00 AM, Mark Hammond wrote: > * The "magic" symbol is somewhat self-documenting - it implies a question. > ?Using ?--which adds another special case that people would need to > understand isn't passed to Python. ?IOW, I like that there is only 1 special > option and that one special option can be expressed in the form of a > question. This may be a difference in what we're used to. To me, the "-?" is strongly associated with "-h" and "--help", whereas "--which" maps directly to the *nix "which" command: $ which python /usr/bin/python As far as simplicity and extensibility go, I would treat "--which" the way most programs treat "--help" and "--version" - they can appear anywhere on the command line and completely change the expected output of the command: $ python -Ei --version -c "This is never evaluated" Python 2.7.1 So I don't actually see any particularly *new* design decisions to be made in relation to a "--which" option - it's just a workaround for the lack of a native 'which' equivalent on Windows, and it behaves like Python's own "--version" option. Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? 
Brisbane, Australia From mhammond at skippinet.com.au Mon Oct 24 03:15:39 2011 From: mhammond at skippinet.com.au (Mark Hammond) Date: Mon, 24 Oct 2011 12:15:39 +1100 Subject: [Python-Dev] PEP397 no command line options to python? In-Reply-To: References: <4E9C1E2F.3040604@gmail.com> <1318921817.49702.YahooMailNeo@web25803.mail.ukl.yahoo.com> <4E9E0948.9030203@gmail.com> <4EA491E4.80602@gmail.com> <4EA4AAAD.7090104@gmail.com> Message-ID: <4EA4BC3B.4020400@skippinet.com.au> On 24/10/2011 11:46 AM, Nick Coghlan wrote: > On Mon, Oct 24, 2011 at 10:00 AM, Mark Hammond wrote: >> * The "magic" symbol is somewhat self-documenting - it implies a question. >> Using --which adds another special case that people would need to >> understand isn't passed to Python. IOW, I like that there is only 1 special >> option and that one special option can be expressed in the form of a >> question. > > This may be a difference in what we're used to. To me, the "-?" is > strongly associated with "-h" and "--help" Fair enough - and I admit to thinking -? didn't work for Python - but it does! >, whereas "--which" maps directly to the *nix "which" command: Sure, but this isn't for *nix, so I'm not sure it is safe to assume the users of the launcher will make that association. > So I don't actually see any particularly *new* design decisions to be > made in relation to a "--which" option - it's just a workaround for > the lack of a native 'which' equivalent on Windows, Actually I don't think that is true - even with a 'which' on Windows you can't get this information from it. Indeed, this functionality is quite distinct from that offered by which. TBH I'm not that bothered - I just have a slight uneasiness to this new special option which really just helps describe what a different special option does. So - in an effort to talk myself out of my idea... 
:) one advantage --which would have is that it could work without any version qualifiers at all - eg: % py --which path/to/script.py could also tell you what version of Python would be used to execute the named script, taking into account the current defaults, environment variables and shebang line found in the script. Cheers, Mark From ncoghlan at gmail.com Mon Oct 24 03:32:09 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 24 Oct 2011 11:32:09 +1000 Subject: [Python-Dev] PEP397 no command line options to python? In-Reply-To: <4EA4BC3B.4020400@skippinet.com.au> References: <4E9C1E2F.3040604@gmail.com> <1318921817.49702.YahooMailNeo@web25803.mail.ukl.yahoo.com> <4E9E0948.9030203@gmail.com> <4EA491E4.80602@gmail.com> <4EA4AAAD.7090104@gmail.com> <4EA4BC3B.4020400@skippinet.com.au> Message-ID: On Mon, Oct 24, 2011 at 11:15 AM, Mark Hammond wrote: >> So I don't actually see any particularly *new* design decisions to be >> made in relation to a "--which" option - it's just a workaround for >> the lack of a native 'which' equivalent on Windows, > > Actually I don't think that is true - even with a 'which' on Windows you > can't get this information from it. ?Indeed, this functionality is quite > distinct from that offered by which. True, that comparison was a bad one - the launcher takes into account more than just path entries the way the *nix equivalent does. Still, it's a tool in the same spirit, even if the mechanics differs. > TBH I'm not that bothered - I just have a slight uneasiness to this new > special option which really just helps describe what a different special > option does. > > So - in an effort to talk myself out of my idea... 
:) ?one advantage --which > would have is that it could work without any version qualifiers at all - eg: > > % py --which path/to/script.py > > could also tell you what version of Python would be used to execute the > named script, taking into account the current defaults, environment > variables and shebang line found in the script. I was actually just thinking of the simple "py --which" use case, but you're right, it could be extended to shebang line checking as well. Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From murman at gmail.com Mon Oct 24 03:56:57 2011 From: murman at gmail.com (Michael Urman) Date: Sun, 23 Oct 2011 20:56:57 -0500 Subject: [Python-Dev] PEP397 no command line options to python? In-Reply-To: <4EA491E4.80602@gmail.com> References: <4E9C1E2F.3040604@gmail.com> <1318921817.49702.YahooMailNeo@web25803.mail.ukl.yahoo.com> <4E9E0948.9030203@gmail.com> <4EA491E4.80602@gmail.com> Message-ID: On Sun, Oct 23, 2011 at 17:15, Mark Hammond wrote: > How about abusing the existing flags for this purpose - eg: > > % py -3? > % py -2.7? I would have expected that to launch an interactive python shell of the appropriate version. Does it do something else today? Michael From mhammond at skippinet.com.au Mon Oct 24 03:58:35 2011 From: mhammond at skippinet.com.au (Mark Hammond) Date: Mon, 24 Oct 2011 12:58:35 +1100 Subject: [Python-Dev] PEP397 no command line options to python? In-Reply-To: References: <4E9C1E2F.3040604@gmail.com> <1318921817.49702.YahooMailNeo@web25803.mail.ukl.yahoo.com> <4E9E0948.9030203@gmail.com> <4EA491E4.80602@gmail.com> Message-ID: <4EA4C64B.4040701@skippinet.com.au> On 24/10/2011 12:56 PM, Michael Urman wrote: > On Sun, Oct 23, 2011 at 17:15, Mark Hammond wrote: >> How about abusing the existing flags for this purpose - eg: >> >> % py -3? >> % py -2.7? > > I would have expected that to launch an interactive python shell of > the appropriate version. Does it do something else today? 
That is what it does today without the trailing '?' character. My idea was to allow the trailing '?' to behave like the proposed --which. Mark From murman at gmail.com Mon Oct 24 04:15:49 2011 From: murman at gmail.com (Michael Urman) Date: Sun, 23 Oct 2011 21:15:49 -0500 Subject: [Python-Dev] PEP397 no command line options to python? In-Reply-To: <4EA4C64B.4040701@skippinet.com.au> References: <4E9C1E2F.3040604@gmail.com> <1318921817.49702.YahooMailNeo@web25803.mail.ukl.yahoo.com> <4E9E0948.9030203@gmail.com> <4EA491E4.80602@gmail.com> <4EA4C64B.4040701@skippinet.com.au> Message-ID: On Sun, Oct 23, 2011 at 20:58, Mark Hammond wrote: > On 24/10/2011 12:56 PM, Michael Urman wrote: >> >> On Sun, Oct 23, 2011 at 17:15, Mark Hammond >> ?wrote: >>> >>> How about abusing the existing flags for this purpose - eg: >>> >>> % py -3? >>> % py -2.7? >> >> I would have expected that to launch an interactive python shell of >> the appropriate version. Does it do something else today? > > That is what it does today without the trailing '?' character. ?My idea was > to allow the trailing '?' to behave like the proposed --which. Oh, I read right over question mark without seeing it. I wonder if that's a notch against it from a documentation standpoint or just my own personal quirk. (I'm not used to thinking of it as a command line flag, partly due to my unix years.) Thanks for explaining! 
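The version-selection logic debated in this thread — turning a "-X" or "-X.Y" flag, or a script's shebang line, into a concrete interpreter path that a "--which"-style query would print — can be sketched roughly as follows. This is a simplified illustration only, not the launcher's actual implementation: the real launcher consults the Windows registry, whereas here a hypothetical `INSTALLED` dictionary stands in for the set of installed interpreters, and the shebang parsing is deliberately naive.

```python
# Simplified sketch of PEP 397-style version selection (illustrative only).
import re

INSTALLED = {  # hypothetical install locations, standing in for a registry scan
    "2.7": r"C:\Python27\python.exe",
    "3.2": r"C:\Python32\python.exe",
}

def parse_shebang(first_line):
    """Extract a version string such as '3' or '3.2' from a shebang line.

    Returns None when the shebang names no version ('#!/usr/bin/python').
    """
    m = re.match(r"#!\s*(?:/usr/bin/env\s+)?(?:/usr/bin/)?python(\d(?:\.\d)?)?",
                 first_line)
    if not m:
        return None
    return m.group(1)

def which(version=None):
    """Return the interpreter path a '--which'-style query would report."""
    if version is None:
        version = max(INSTALLED)      # pretend the highest version is the default
    if "." not in version:            # bare major version: pick the newest X.Y
        candidates = [v for v in INSTALLED if v.startswith(version + ".")]
        version = max(candidates)
    return INSTALLED[version]
```

With these assumptions, `which("2")` resolves to the 2.7 interpreter and `parse_shebang("#!/usr/bin/env python3.2")` yields "3.2", mirroring the `py -2 --which` examples quoted earlier in the thread.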
From petri at digip.org Mon Oct 24 08:42:23 2011 From: petri at digip.org (Petri Lehtinen) Date: Mon, 24 Oct 2011 09:42:23 +0300 Subject: [Python-Dev] cpython (2.7): Whoops, PyException_GetTraceback() is not documented on 2.7 In-Reply-To: References: Message-ID: <20111024064223.GA2960@p16> Georg Brandl wrote: > On 10/23/11 20:54, petri.lehtinen wrote: > > http://hg.python.org/cpython/rev/5c4781a237ef > > changeset: 73073:5c4781a237ef > > branch: 2.7 > > parent: 73071:11da12600f5b > > user: Petri Lehtinen > > date: Sun Oct 23 21:52:10 2011 +0300 > > summary: > > Whoops, PyException_GetTraceback() is not documented on 2.7 > > If it exists there, why not document it instead? :) Hmm, an interesting idea. I'll see what I can do :) Petri From petri at digip.org Mon Oct 24 08:56:11 2011 From: petri at digip.org (Petri Lehtinen) Date: Mon, 24 Oct 2011 09:56:11 +0300 Subject: [Python-Dev] cpython (2.7): Whoops, PyException_GetTraceback() is not documented on 2.7 In-Reply-To: <20111024064223.GA2960@p16> References: <20111024064223.GA2960@p16> Message-ID: <20111024065610.GB2960@p16> Petri Lehtinen wrote: > Georg Brandl wrote: > > On 10/23/11 20:54, petri.lehtinen wrote: > > > http://hg.python.org/cpython/rev/5c4781a237ef > > > changeset: 73073:5c4781a237ef > > > branch: 2.7 > > > parent: 73071:11da12600f5b > > > user: Petri Lehtinen > > > date: Sun Oct 23 21:52:10 2011 +0300 > > > summary: > > > Whoops, PyException_GetTraceback() is not documented on 2.7 > > > > If it exists there, why not document it instead? :) > > Hmm, an interesting idea. I'll see what I can do :) It seems it doesn't exist on 2.7 after all. I just failed with my hg-fu and thought it was there. 
Petri From stefan_ml at behnel.de Mon Oct 24 09:20:50 2011 From: stefan_ml at behnel.de (Stefan Behnel) Date: Mon, 24 Oct 2011 09:20:50 +0200 Subject: [Python-Dev] memcmp performance In-Reply-To: <4EA48AA5.7010607@v.loewis.de> References: <4b460c0d-e55d-e429-ecf8-e9f34ab033c4@me.com> <4EA48AA5.7010607@v.loewis.de> Message-ID: "Martin v. L?wis", 23.10.2011 23:44: >> I am still rooting for -fno-builtin-memcmp in both Python 2.7 and 3.3 ... >> (after we put memcmp in unicode_compare) > > -1. We shouldn't do anything about this. Python has the tradition of not > working around platform bugs, except if the work-arounds are necessary > to make something work at all - i.e. in particular not for performance > issues. > > If this is a serious problem, then platform vendors need to look into > it (CPU vendor, compiler vendor, OS vendor). If they don't act, it's > probably not a serious problem. > > In the specific case, I don't think it's a problem at all. It's not > that memcmp is slow with the builtin version - it's just not as fast > as it could be. Adding a compiler option would put a maintenance burden > on Python - we already have way too many compiler options in > configure.in, and there is no good procedure to ever take them out > should they not be needed anymore. I agree. Given that the analysis shows that the libc memcmp() is particularly fast on many Linux systems, it should be up to the Python package maintainers for these systems to set that option externally through the optimisation CFLAGS. Stefan From asmodai at in-nomine.org Mon Oct 24 10:38:55 2011 From: asmodai at in-nomine.org (Jeroen Ruigrok van der Werven) Date: Mon, 24 Oct 2011 10:38:55 +0200 Subject: [Python-Dev] memcmp performance In-Reply-To: References: <4b460c0d-e55d-e429-ecf8-e9f34ab033c4@me.com> <4EA48AA5.7010607@v.loewis.de> Message-ID: <20111024083855.GL9752@nexus.in-nomine.org> -On [20111024 09:22], Stefan Behnel (stefan_ml at behnel.de) wrote: >I agree. 
Given that the analysis shows that the libc memcmp() is >particularly fast on many Linux systems, it should be up to the Python >package maintainers for these systems to set that option externally through >the optimisation CFLAGS. In the same stretch, stuff like this needs to be documented. Package maintainers cannot be expected to follow each and every mailinglist's posts for nuggets of information like this. Been there, done that, it's impossible to keep track. -- Jeroen Ruigrok van der Werven / asmodai ????? ?????? ??? ?? ?????? http://www.in-nomine.org/ | GPG: 2EAC625B Only in sleep can one find salvation that resembles Death... From solipsis at pitrou.net Mon Oct 24 13:54:08 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 24 Oct 2011 13:54:08 +0200 Subject: [Python-Dev] cpython (3.2): Issue #13255: wrong docstrings in array module. References: Message-ID: <20111024135408.7bbf1173@pitrou.net> On Mon, 24 Oct 2011 13:17:53 +0200 florent.xicluna wrote: > @@ -2557,7 +2557,7 @@ > extend() -- extend array by appending multiple elements from an iterable\n\ > fromfile() -- read items from a file object\n\ > fromlist() -- append items from the list\n\ > -fromstring() -- append items from the string\n\ > +frombytes() -- append items from the string\n\ > index() -- return index of first occurrence of an object\n\ > insert() -- insert a new item into the array at a provided position\n\ > pop() -- remove and return item (default last)\n\ > @@ -2565,7 +2565,7 @@ > reverse() -- reverse the order of the items in the array\n\ > tofile() -- write all items to a file object\n\ > tolist() -- return the array converted to an ordinary list\n\ > -tostring() -- return the array converted to a string\n\ > +tobytes() -- return the array converted to a string\n\ The alphabetical order should probably be kept. Regards Antoine. 
From victor.stinner at haypocalc.com Mon Oct 24 14:06:20 2011
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Mon, 24 Oct 2011 14:06:20 +0200
Subject: [Python-Dev] Modules of plat-* directories
In-Reply-To: <4EA48F3E.9020304@v.loewis.de>
References: <201110170116.36678.victor.stinner@haypocalc.com> <1319406500.3517.1.camel@localhost.localdomain> <4EA48F3E.9020304@v.loewis.de>
Message-ID: <1402612.iJB2KdNuoB@dsk000552>

There are open issues related to plat-XXX.

Le Lundi 24 Octobre 2011 00:03:42 Martin v. Löwis a écrit :
> no, we make no changes to them unless a user actually requests a change

Matthias Klose asked for socket SIO* constants in September 2006 (5 years ago).
http://bugs.python.org/issue1565071
I would prefer to see such constants in the socket module.

> On a specific system, these constants are not just part of the API -
> they are part of the ABI. So a system vendor cannot just change
> their values. Once defined, they must stay fixed forever.

Thiemo Seufer noticed that "the linux2 platform definition is incorrect for several architectures, namely Alpha, PA-RISC(hppa), MIPS and SPARC." in September 2008 (3 years ago). He proposed to add a sublevel: Lib/plat-linux2/CDROM.py would become:
- Lib/plat-linux2-alpha/CDROM.py
- Lib/plat-linux2-hppa/CDROM.py
- Lib/plat-linux2-mips/CDROM.py
- Lib/plat-linux2-sparc/CDROM.py
- (and a default for other platforms like Intel x86?)
=> http://bugs.python.org/issue3990

I really don't like this idea (of adding the architecture in the directory name) :-p

IMO plat-XXX is wrong by design. It would be better if at least these files were regenerated at build time, but Martin doesn't want to regenerate them. And there is still the problem of Mac OS X, which embeds 3 binaries for 3 architectures in the same "FAT" file.
Victor

From ezio.melotti at gmail.com Mon Oct 24 14:58:11 2011
From: ezio.melotti at gmail.com (Ezio Melotti)
Date: Mon, 24 Oct 2011 15:58:11 +0300
Subject: [Python-Dev] Deprecation policy
Message-ID: <4EA560E3.8060307@gmail.com>

Hi,
our current deprecation policy is not so well defined (see e.g. [0]), and it seems to me that it's something like:
  1) deprecate something and add a DeprecationWarning;
  2) forget about it after a while;
  3) wait a few versions until someone notices it;
  4) actually remove it;

I suggest to follow the following process:
  1) deprecate something and add a DeprecationWarning;
  2) decide how long the deprecation should last;
  3) use the deprecated-removed[1] directive to document it;
  4) add a test that fails after the update so that we remember to remove it[2];

Other related issues:

PendingDeprecationWarnings:
* AFAIK the difference between PDW and DW is that PDW are silenced by default;
* now DW are silenced by default too, so there are no differences;
* I therefore suggest we stop using it, but we can leave it around[3] (other projects might be using it for something different);

Deprecation Progression:
Before, we more or less used to deprecate in release X and remove in X+1, or add a PDW in X, a DW in X+1, and remove it in X+2. I suggest we drop this scheme and just use DW until X+N, where N is >=1 and depends on what is being removed. We can decide to leave the DW for 2-3 versions before removing something widely used, or just deprecate in X and remove in X+1 for things that are less used.

Porting from 2.x to 3.x:
Some people will update directly from 2.7 to 3.2 or even later versions (3.3, 3.4, ...), without going through earlier 3.x versions. If something is deprecated on 3.2 but not in 2.7 and then is removed in 3.3, people updating from 2.7 to 3.3 won't see any warning, and this will make the porting even more difficult.
I suggest that:
* nothing that is available and not deprecated in 2.7 will be removed until 3.x (x needs to be defined);
* possibly we start backporting warnings to 2.7 so that they are visible while running with -3;

Documenting the deprecations:
In order to advertise the deprecations, they should be documented:
* in their doc, using the deprecated-removed directive (and possibly not the 'deprecated' one);
* in the what's new, possibly listing everything that is currently deprecated, and when it will be removed;
Django seems to do something similar[4].
(Another thing I would like is a different rendering for deprecated functions. Some parts of the docs have a deprecation warning on top of the section, and the single functions look normal if you miss that. Also, while linking to a deprecated function it would be nice to have it rendered with a different color or something similar.)

Testing the deprecations:
Tests that fail when a new release is made and the version number is bumped should be added to make sure we don't forget to remove it. The test should have a related issue with a patch to remove the deprecated function and the test. Setting the priority of the issue to release blocker or deferred blocker can be done in addition/instead, but that works well only when N == 1 (the priority could be updated for every release though). The tests could be marked with an expected failure to give some time after the release to remove them. All the deprecation-related tests might be added to the same file, or left in the test file of their module.

Where to add this:
Once we agree about the process we should write it down somewhere. Possible candidates are:
* PEP387: Backwards Compatibility Policy[5] (it has a few lines about this);
* a new PEP;
* the devguide;
I think having it in a PEP would be good, the devguide can then link to it.
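A minimal sketch of the reminder-test idea from the "Testing the deprecations" section might look like this (the helper name and version numbers are placeholders, not an existing API):

```python
import sys

# Placeholder helper, not an existing API: report whether the release
# in which a deprecated API must be removed has been reached.
def deprecation_expired(remove_in, current=None):
    if current is None:
        current = sys.version_info[:2]
    return current >= remove_in

# The reminder test: it passes until the removal release, then starts
# failing, forcing us to delete the deprecated API together with it.
assert not deprecation_expired((3, 4), current=(3, 3))
assert deprecation_expired((3, 4), current=(3, 4))
```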
Best Regards, Ezio Melotti [0]: http://bugs.python.org/issue13248 [1]: deprecated-removed doesn't seem to be documented in the documenting doc, but it was added here: http://hg.python.org/cpython/rev/03296316a892 [2]: see e.g. http://hg.python.org/cpython/file/default/Lib/unittest/test/test_case.py#l1187 [3]: we could also introduce a MetaDeprecationWarning and make PendingDeprecationWarning inherit from it so that it can be used to pending-deprecate itself. Once PendingDeprecationWarning is gone, the MetaDeprecationWarning will become useless and can then be used to meta-deprecate itself. [4]: https://docs.djangoproject.com/en/dev/internals/deprecation/ [5]: http://www.python.org/dev/peps/pep-0387/ From solipsis at pitrou.net Mon Oct 24 15:17:44 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 24 Oct 2011 15:17:44 +0200 Subject: [Python-Dev] Deprecation policy References: <4EA560E3.8060307@gmail.com> Message-ID: <20111024151744.231c7119@pitrou.net> On Mon, 24 Oct 2011 15:58:11 +0300 Ezio Melotti wrote: > > I suggest to follow the following process: > 1) deprecate something and add a DeprecationWarning; > 2) decide how long the deprecation should last; > 3) use the deprecated-remove[1] directive to document it; > 4) add a test that fails after the update so that we remember to > remove it[2]; This sounds like a nice process. > PendingDeprecationWarnings: > * AFAIK the difference between PDW and DW is that PDW are silenced by > default; > * now DW are silence by default too, so there are no differences; > * I therefore suggest we stop using it, but we can leave it around[3] Agreed as well. > [3]: we could also introduce a MetaDeprecationWarning and make > PendingDeprecationWarning inherit from it so that it can be used to > pending-deprecate itself. Once PendingDeprecationWarning is gone, the > MetaDeprecationWarning will become useless and can then be used to > meta-deprecate itself. 
People may start using MetaDeprecationWarning to deprecate their metaclasses. It sounds wrong to deprecate it.

Regards

Antoine.

From jimjjewett at gmail.com Mon Oct 24 19:04:20 2011
From: jimjjewett at gmail.com (Jim Jewett)
Date: Mon, 24 Oct 2011 13:04:20 -0400
Subject: [Python-Dev] Case consistency [was: Re: [Python-checkins] cpython: Cleanup code: remove int/long idioms and simplify a while statement.]
Message-ID: 

Is there a reason to check for

    if s[:5] == 'pass ' or s[:5] == 'PASS ':

instead of

    if s[:5].lower() == 'pass ':

? If so, it should be documented; otherwise, I would rather see the more inclusive form, that would also allow things like "Pass"

-jJ

On Sun, Oct 23, 2011 at 4:21 PM, florent.xicluna wrote:
> http://hg.python.org/cpython/rev/67053b135ed9
> changeset:   73076:67053b135ed9
> user:        Florent Xicluna
> date:        Sun Oct 23 22:11:00 2011 +0200
> summary:
>  Cleanup code: remove int/long idioms and simplify a while statement.
> diff --git a/Lib/ftplib.py b/Lib/ftplib.py
> --- a/Lib/ftplib.py
> +++ b/Lib/ftplib.py
> @@ -175,10 +175,8 @@
>
>     # Internal: "sanitize" a string for printing
>     def sanitize(self, s):
> -        if s[:5] == 'pass ' or s[:5] == 'PASS ':
> -            i = len(s)
> -            while i > 5 and s[i-1] in {'\r', '\n'}:
> -                i = i-1
> +        if s[:5] in {'pass ', 'PASS '}:
> +            i = len(s.rstrip('\r\n'))
>             s = s[:5] + '*'*(i-5) + s[i:]
>         return repr(s)

From brett at python.org Mon Oct 24 21:32:46 2011
From: brett at python.org (Brett Cannon)
Date: Mon, 24 Oct 2011 12:32:46 -0700
Subject: [Python-Dev] Deprecation policy
In-Reply-To: <20111024151744.231c7119@pitrou.net>
References: <4EA560E3.8060307@gmail.com> <20111024151744.231c7119@pitrou.net>
Message-ID: 

On Mon, Oct 24, 2011 at 06:17, Antoine Pitrou wrote:
> On Mon, 24 Oct 2011 15:58:11 +0300
> Ezio Melotti wrote:
>>
>> I suggest to follow the following process:
>>
>>    1) deprecate something and add a DeprecationWarning;
>>    2) decide how long the deprecation should last;
>>    3) use the deprecated-removed[1] directive to document it;
>>    4) add a test that fails after the update so that we remember to
>> remove it[2];
>
> This sounds like a nice process.

I have thought about this extensively when I did the stdlib reorg for Python 3, and the only difference from the approach Ezio is proposing was that I was thinking of introducing a special deprecate() function in warnings or something that took a Python version argument so it would automatically turn into an error once the version bump occurred. But then I realized other apps wouldn't necessarily care, so short of adding an argument which let people specify a different version number to compare against, I kind of sat on the idea.

I also thought about specifying when to go from PendingDeprecationWarning to DeprecationWarning, but as has been suggested, PendingDeprecationWarning is not really useful to the core anymore since DeprecationWarning is now silenced by default as well.

But adding something to test.support for our tests which requires a specified version # would also work and be less invasive to users, e.g.

    with test.support.deprecated(remove_in='3.4'):
        deprecated_func()

And obviously if we don't plan on removing the feature any time soon, the test can specify Python 4.0 as the removal version. But the important thing is to require some specification in the test so we don't forget to stick to our contract of when to remove something.

P.S.: Did we ever discuss naming py3k Python 4 instead, in honor of King Arthur from Holy Grail not being able to ever count straight to three (e.g. the holy hand grenade scene)? Maybe we need to have the next version of Python be Python 6 since the Book of Armaments says you should have 4, and 5 is right out.
=) -Brett

>
>> PendingDeprecationWarnings:
>> * AFAIK the difference between PDW and DW is that PDW are silenced by
>> default;
>> * now DW are silenced by default too, so there are no differences;
>> * I therefore suggest we stop using it, but we can leave it around[3]
>
> Agreed as well.
>
>> [3]: we could also introduce a MetaDeprecationWarning and make
>> PendingDeprecationWarning inherit from it so that it can be used to
>> pending-deprecate itself. Once PendingDeprecationWarning is gone, the
>> MetaDeprecationWarning will become useless and can then be used to
>> meta-deprecate itself.
>
> People may start using MetaDeprecationWarning to deprecate their
> metaclasses. It sounds wrong to deprecate it.
>
> Regards
>
> Antoine.
>
>
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> http://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: http://mail.python.org/mailman/options/python-dev/brett%40python.org
>

From victor.stinner at haypocalc.com Tue Oct 25 00:57:42 2011
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Tue, 25 Oct 2011 00:57:42 +0200
Subject: [Python-Dev] Use our strict mbcs codec instead of the Windows ANSI API
Message-ID: <201110250057.42116.victor.stinner@haypocalc.com>

Hi,

I propose to raise Unicode errors if a filename cannot be decoded on Windows, instead of creating bogus filenames with question marks. Because this change is incompatible with Python 3.2, even if such filenames are unusable and I consider the problem a (Python?) bug, I would like your opinion on such a change before working on a patch.

--

Windows has worked internally on Unicode strings since Windows 95 (or something like that), but also provides an "ANSI" API using the ANSI code page and byte strings for backward compatibility.
It was already proposed to drop the bytes API completely in our nt (os) module, but it may break Python backward compatibility (and it is difficult to list the Python programs using the bytes API to access the file system).

The ANSI API uses the MultiByteToWideChar (decode) and WideCharToMultiByte (encode) functions in the default mode (flags=0): MultiByteToWideChar() replaces undecodable bytes by '?' and WideCharToMultiByte() ignores unencodable characters (!!!). This behaviour produces invalid filenames (see for example the issue #13247) and *the user is unable to detect codec errors*.

In Python 3.2, I changed the MBCS codec to make it strict: it raises a UnicodeEncodeError if a character cannot be encoded to the ANSI code page (e.g. encode ? to cp1252) and a UnicodeDecodeError if a character cannot be decoded from the ANSI code page (e.g. b'\xff' from cp932).

I propose to reuse our MBCS codec in strict mode (error handler="strict"), to detect encode/decode errors directly, with the Windows native (wide character) API. It should simplify the source code: replace 2 versions of a function by 1 version + optional code to decode arguments and/or encode the result.

--

Read also the previous thread:

[Python-Dev] Byte filenames in the posix module on Windows
Wed Jun 8 00:23:20 CEST 2011
http://mail.python.org/pipermail/python-dev/2011-June/111831.html

--

FYI I patched the Python MBCS codec again: it now handles the ignore and replace modes correctly (to encode and decode), and now also supports any error handler.

--

We might use the PEP 383 to store undecodable bytes as surrogates (U+DC80-U+DCFF). But the situation is the opposite of the situation on UNIX: on Windows, the problem is more on encoding (text->bytes) than on decoding (bytes->text). On UNIX, problems occur when the system is misconfigured (e.g. wrong locale encoding).
On Windows, problems occur when your application uses the old (ANSI) API, whereas your filesystem is fully Unicode compliant and you created Unicode filenames with a program using the new (Windows) API.

Only a few programs are fully Unicode compliant. A lot of programs fail if a filename cannot be encoded to the ANSI code page (just 2 examples: Mercurial and Visual Studio).

Victor

From ncoghlan at gmail.com Tue Oct 25 01:55:48 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 25 Oct 2011 09:55:48 +1000
Subject: [Python-Dev] Use our strict mbcs codec instead of the Windows ANSI API
In-Reply-To: <201110250057.42116.victor.stinner@haypocalc.com>
References: <201110250057.42116.victor.stinner@haypocalc.com>
Message-ID: 

On Tue, Oct 25, 2011 at 8:57 AM, Victor Stinner wrote:
> The ANSI API uses MultiByteToWideChar (decode) and WideCharToMultiByte
> (encode) functions in the default mode (flags=0): MultiByteToWideChar()
> replaces undecodable bytes by '?' and WideCharToMultiByte() ignores
> unencodable characters (!!!). This behaviour produces invalid filenames (see
> for example the issue #13247) and *the user is unable to detect codec errors*.
>
> In Python 3.2, I changed the MBCS codec to make it strict: it raises a
> UnicodeEncodeError if a character cannot be encoded to the ANSI code page
> (e.g. encode ? to cp1252) and a UnicodeDecodeError if a character cannot be
> decoded from the ANSI code page (e.g. b'\xff' from cp932).
>
> I propose to reuse our MBCS codec in strict mode (error handler="strict"), to
> notice directly encode/decode errors, with the Windows native (wide character)
> API. It should simplify the source code: replace 2 versions of a function by 1
> version + optional code to decode arguments and/or encode the result.

So we'd be taking existing failures that appear at whatever point the corrupted filename is used and replacing them with explicit failures at the point where the offending string is converted to or from encoded bytes?
That sounds reasonable to me, and a lot closer to the way Python behaves on POSIX based systems.

Regards,
Nick.

--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From richismyname at me.com Tue Oct 25 01:17:55 2011
From: richismyname at me.com (Richard Saunders)
Date: Mon, 24 Oct 2011 23:17:55 +0000 (GMT)
Subject: [Python-Dev] memcmp performance
Message-ID: <4106bcf1-cfd0-6046-db54-80345774d712@me.com>

-On [20111024 09:22], Stefan Behnel (stefan_ml at behnel.de) wrote:
>>I agree.
Package >maintainers cannot be expected to follow each and every mailinglist's posts >for nuggets of information like this. Been there, done that, it's impossible >to keep track. I would like to second that: the whole point of a Makefile/configuration file is to capture knowledge like this so it doesn't get lost. I would prefer the option would be part of a standard build Python distributes, but as long as the information gets captured SOMEWHERE? so that (say) Fedora Core 17 has Python 2.7 built with -fno-builtin-memcmp,? I would be happy. ? Gooday, ? Richie -------------- next part -------------- An HTML attachment was scrubbed... URL: From skippy.hammond at gmail.com Tue Oct 25 05:23:54 2011 From: skippy.hammond at gmail.com (Mark Hammond) Date: Tue, 25 Oct 2011 14:23:54 +1100 Subject: [Python-Dev] Use our strict mbcs codec instead of the Windows ANSI API In-Reply-To: <201110250057.42116.victor.stinner@haypocalc.com> References: <201110250057.42116.victor.stinner@haypocalc.com> Message-ID: <4EA62BCA.1010603@gmail.com> +1 from me! Mark On 25/10/2011 9:57 AM, Victor Stinner wrote: > Hi, > > I propose to raise Unicode errors if a filename cannot be decoded on Windows, > instead of creating a bogus filenames with questions marks. Because this change > is incompatible with Python 3.2, even if such filenames are unusable and I > consider the problem as a (Python?) bug, I would like your opinion on such > change before working on a patch. > > -- > > Windows works internally on Unicode strings since Windows 95 (or something > like that), but provides also an "ANSI" API using the ANSI code page and byte > strings for backward compatibility. It was already proposed to drop completly > the bytes API in our nt (os) module, but it may break the Python backward > compatibility (and it is difficult to list Python programs using the bytes API > to access the file system). 
> > The ANSI API uses MultiByteToWideChar (decode) and WideCharToMultiByte > (encode) functions in the default mode (flags=0): MultiByteToWideChar() > replaces undecodable bytes by '?' and WideCharToMultiByte() ignores > unencodable characters (!!!). This behaviour produces invalid filenames (see > for example the issue #13247) and *the user is unable to detect codec errors*. > > In Python 3.2, I changed the MBCS codec to make it strict: it raises a > UnicodeEncodeError if a character cannot be encoded to the ANSI code page > (e.g. encode ? to cp1252) and a UnicodeDecodeError if a character cannot be > decoded from the ANSI code page (e.g. b'\xff' from cp932). > > I propose to reuse our MBCS codec in strict mode (error handler="strict"), to > notice directly encode/decode errors, with the Windows native (wide character) > API. It should simplify the source code: replace 2 versions of a function by 1 > version + optional code to decode arguments and/or encode the result. > > -- > > Read also the previous thread: > > [Python-Dev] Byte filenames in the posix module on Windows > Wed Jun 8 00:23:20 CEST 2011 > http://mail.python.org/pipermail/python-dev/2011-June/111831.html > > -- > > FYI I patched again Python MBCS codec: it now handles correclty ignore and > replace mode (to encode and decode), but now also supports any error handler. > > -- > > We might use the PEP 383 to store undecoable bytes as surrogates (U+DC80- > U+DCFF). But the situation is the opposite of the situtation on UNIX: on > Windows, the problem is more on encoding (text->bytes) than on decoding > (bytes->text). On UNIX, problems occur when the system is misconfigured (e.g. > wrong locale encoding). On Windows, problems occur when your application uses > the old (ANSI) API, whereas your filesystem is fully Unicode compliant and you > created Unicode filenames with a program using the new (Windows) API. > > Only few programs are fully Unicode compliant. 
A lot of programs fail if a > filename cannot be encoded to the ANSI code page (just 2 examples: Mercurial > and Visual Studio). > > Victor > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/skippy.hammond%40gmail.com From stephen at xemacs.org Tue Oct 25 06:20:12 2011 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Tue, 25 Oct 2011 13:20:12 +0900 Subject: [Python-Dev] Use our strict mbcs codec instead of the Windows ANSI API In-Reply-To: <201110250057.42116.victor.stinner@haypocalc.com> References: <201110250057.42116.victor.stinner@haypocalc.com> Message-ID: <87r52166tf.fsf@uwakimon.sk.tsukuba.ac.jp> Victor Stinner writes: > I propose to raise Unicode errors if a filename cannot be decoded > on Windows, instead of creating a bogus filenames with questions > marks. By "bogus" you mean "sometimes (?) invalid and the OS will refuse to use them, causing a later hard-to-diagnose exception", rather than "not what the user thinks he wants", right? In the "hard errors" case, a hearty +1 (I'm dealing with this in an experimental version of XEmacs and it's a right PITA if the codec doesn't complain). Backward compatibility is important, but here the costs of fixing such bugs outweigh the value of bug-compatibility. In the latter (doing things behind the users back rather than actually breaking the program), I'm basically +1 but do worry about backward compatibility. 
From martin at v.loewis.de Tue Oct 25 09:09:56 2011 From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=) Date: Tue, 25 Oct 2011 09:09:56 +0200 Subject: [Python-Dev] Use our strict mbcs codec instead of the Windows ANSI API In-Reply-To: <201110250057.42116.victor.stinner@haypocalc.com> References: <201110250057.42116.victor.stinner@haypocalc.com> Message-ID: <4EA660C4.1000203@v.loewis.de> > I propose to raise Unicode errors if a filename cannot be decoded on Windows, > instead of creating a bogus filenames with questions marks. Can you please elaborate what APIs you are talking about exactly? If it's the byte APIs (i.e. using bytes as file names), then I'm -1 on this proposal. People that explicitly use bytes for file names deserve to get whatever exact platform semantics the platform has to offer. This is true on Unix, and it is also true on Windows. Regards, Martin From solipsis at pitrou.net Tue Oct 25 10:08:12 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 25 Oct 2011 10:08:12 +0200 Subject: [Python-Dev] Use our strict mbcs codec instead of the Windows ANSI API References: <201110250057.42116.victor.stinner@haypocalc.com> Message-ID: <20111025100812.14f07b4d@pitrou.net> On Tue, 25 Oct 2011 00:57:42 +0200 Victor Stinner wrote: > Hi, > > I propose to raise Unicode errors if a filename cannot be decoded on Windows, > instead of creating a bogus filenames with questions marks. Because this change > is incompatible with Python 3.2, even if such filenames are unusable and I > consider the problem as a (Python?) bug, I would like your opinion on such > change before working on a patch. +1 from me. Regards Antoine. 
From victor.stinner at haypocalc.com Tue Oct 25 10:22:26 2011 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Tue, 25 Oct 2011 10:22:26 +0200 Subject: [Python-Dev] Use our strict mbcs codec instead of the Windows ANSI API In-Reply-To: <87r52166tf.fsf@uwakimon.sk.tsukuba.ac.jp> References: <201110250057.42116.victor.stinner@haypocalc.com> <87r52166tf.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <1496225.YmzZrbhPeg@dsk000552> Le Mardi 25 Octobre 2011 13:20:12 vous avez ?crit : > Victor Stinner writes: > > I propose to raise Unicode errors if a filename cannot be decoded > > on Windows, instead of creating a bogus filenames with questions > > marks. > > By "bogus" you mean "sometimes (?) invalid and the OS will refuse to > use them, causing a later hard-to-diagnose exception", rather than > "not what the user thinks he wants", right? If the ("Unicode") filename cannot be encoded to the ANSI code page, which is usually a small charset (e.g. cp1252 contains 256 code points), Windows replaces unencodable characters by question marks. Imagine that the code page is ASCII, the ("Unicode") filename "h?ho.txt" will be encoded to b"h?ho.txt". You can display this string in a dialog, but you cannot open the file to read its content... If you pass the filename to os.listdir(), it is even worse because "?" is interpreted ("?" means any character, it's a pattern to match a filename). I would like to raise an error on such situation, because currently the user cannot be noticed otherwise. The user may search "?" in the filename, but Windows replaces also unencodable characters by *similar glyph* (e.g. "?" replaced by "e"). > In the "hard errors" case, a hearty +1 (I'm dealing with this in an > experimental version of XEmacs and it's a right PITA if the codec > doesn't complain). If you use MultiByteToWideChar and WideCharToMultiByte, you can be noticed on error using some flags, but functions of the ANSI API doesn't give access to these flags... 
> Backward compatibility is important, but here the > costs of fixing such bugs outweigh the value of bug-compatibility. I only want to change how unencodable filenames are handled, the bytes API will still be available. If you filesystem has the "8dot3name" feature enable, it may work even for unencodable filenames (Windows generates names like HEHO~1.TXT). Victor From victor.stinner at haypocalc.com Tue Oct 25 10:31:56 2011 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Tue, 25 Oct 2011 10:31:56 +0200 Subject: [Python-Dev] Use our strict mbcs codec instead of the Windows ANSI API In-Reply-To: <4EA660C4.1000203@v.loewis.de> References: <201110250057.42116.victor.stinner@haypocalc.com> <4EA660C4.1000203@v.loewis.de> Message-ID: <2078391.ezBZFOimXZ@dsk000552> Le Mardi 25 Octobre 2011 09:09:56 vous avez ?crit : > > I propose to raise Unicode errors if a filename cannot be decoded on > > Windows, instead of creating a bogus filenames with questions marks. > > Can you please elaborate what APIs you are talking about exactly? Basically, all functions processing filenames, so most functions of posixmodule.c. Some examples: - os.listdir(): FindFirstFileA, FindNextFileA, FindCloseA - os.lstat(): CreateFileA - os.getcwdb(): getcwd() - os.mkdir(): CreateDirectoryA - os.chmod(): SetFileAttributesA - ... > If it's the byte APIs (i.e. using bytes as file names), then I'm -1 on > this proposal. People that explicitly use bytes for file names deserve > to get whatever exact platform semantics the platform has to offer. This > is true on Unix, and it is also true on Windows. My proposition is a fix to user reported by a user: http://bugs.python.org/issue13247 I want to keep the bytes API for backward compatibility, and it will still work for non-ASCII characters, but only for non-ASCII characters encodable to the ANSI code page. In practice, characters not encodable to the ANSI code page are very rare. 
For example: it's difficult to write such characters directly with the keyboard. I bet that very few people will notify the change. Victor From stefan_ml at behnel.de Tue Oct 25 10:44:16 2011 From: stefan_ml at behnel.de (Stefan Behnel) Date: Tue, 25 Oct 2011 10:44:16 +0200 Subject: [Python-Dev] memcmp performance In-Reply-To: <4106bcf1-cfd0-6046-db54-80345774d712@me.com> References: <4106bcf1-cfd0-6046-db54-80345774d712@me.com> Message-ID: Richard Saunders, 25.10.2011 01:17: > -On [20111024 09:22], Stefan Behnel wrote: > >>I agree. Given that the analysis shows that the libc memcmp() is > >>particularly fast on many Linux systems, it should be up to the Python > >>package maintainers for these systems to set that option externally through > >>the optimisation CFLAGS. > > Indeed, this is how I constructed my Python 3.3 and Python 2.7 : > setenv CFLAGS '-fno-builtin-memcmp' > just before I configured. > > I would like to revisit changing unicode_compare: adding a > special arm for using memcmp when the "unicode kinds" are the > same will only work in two specific instances: > > (1) the strings are the same kind, the char size is 1 > * We could add THIS to unicode_compare, but it seems extremely > specialized by itself But also extremely likely to happen. This means that the strings are pure ASCII, which is highly likely and one of the main reasons why the unicode string layout was rewritten for CPython 3.3. It allows CPython to save a lot of memory (thus clearly proving how likely this case is!), and it would also allow it to do faster comparisons for these strings. 
Stefan

From victor.stinner at haypocalc.com  Tue Oct 25 10:52:43 2011
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Tue, 25 Oct 2011 10:52:43 +0200
Subject: [Python-Dev] Use our strict mbcs codec instead of the Windows ANSI API
In-Reply-To: <4EA660C4.1000203@v.loewis.de>
References: <201110250057.42116.victor.stinner@haypocalc.com>
	<4EA660C4.1000203@v.loewis.de>
Message-ID: <2286663.IsKJyCQ0pc@dsk000552>

On Tuesday 25 October 2011 09:09:56, you wrote:
> If it's the byte APIs (i.e. using bytes as file names), then I'm -1 on
> this proposal. People that explicitly use bytes for file names deserve
> to get whatever exact platform semantics the platform has to offer. This
> is true on Unix, and it is also true on Windows.

For your information, it took me something like 3 months (when I was
working on issue #12281) to understand exactly how Windows handles
undecodable bytes and unencodable characters. I did a lot of tests on
different Windows versions (XP, Vista and Seven; the behaviour changed in
Windows Vista). I had to take notes because it is really complex.

Well, I wanted to understand exactly *all* code pages, including CP_UTF7
and CP_UTF8, not only the most common ones like cp1252 or cp932.
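The behaviour difference at the heart of this thread can be sketched portably with Python's codec machinery, using cp1252 as a stand-in for the ANSI code page (the real 'mbcs' codec only exists on Windows; this illustrates the strict-vs-replace distinction, it is not the proposed patch):

```python
# cp1252 stands in for the Windows ANSI code page ('mbcs' is
# Windows-only). The ANSI API silently substitutes '?' for unencodable
# characters; a strict codec raises UnicodeEncodeError instead.
s = "caf\xe9 \u265e"  # U+00E9 is encodable in cp1252, U+265E is not

# ANSI-API-like behaviour: silent substitution produces a bogus name
print(s.encode("cp1252", "replace"))  # b'caf\xe9 ?'

# Strict behaviour: fail loudly instead of returning a bogus name
try:
    s.encode("cp1252", "strict")
except UnicodeEncodeError as exc:
    print("strict codec raises at position", exc.start)
```

The decoding direction has the same split: the ANSI API's lossy conversion roughly corresponds to the 'replace' error handler, while the proposal makes the strict behaviour the default for filenames returned by the os module.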
See the dedicated section in my book to learn more about these functions:
http://www.haypocalc.com/tmp/unicode-2011-07-20/html/operating_systems.html#encode-and-decode-functions

Some information is available in the MultiByteToWideChar and
WideCharToMultiByte documentation, but it is not well explained :-p

Victor

From victor.stinner at haypocalc.com  Tue Oct 25 11:27:36 2011
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Tue, 25 Oct 2011 11:27:36 +0200
Subject: [Python-Dev] memcmp performance
In-Reply-To: 
References: <4106bcf1-cfd0-6046-db54-80345774d712@me.com>
Message-ID: <1819493.Er5iXUrEHF@dsk000552>

On Tuesday 25 October 2011 10:44:16, Stefan Behnel wrote:
> Richard Saunders, 25.10.2011 01:17:
> > -On [20111024 09:22], Stefan Behnel wrote:
> > >> I agree. Given that the analysis shows that the libc memcmp() is
> > >> particularly fast on many Linux systems, it should be up to the
> > >> Python package maintainers for these systems to set that option
> > >> externally through the optimisation CFLAGS.
> >
> > Indeed, this is how I constructed my Python 3.3 and Python 2.7 :
> >     setenv CFLAGS '-fno-builtin-memcmp'
> > just before I configured.
> >
> > I would like to revisit changing unicode_compare: adding a
> > special arm for using memcmp when the "unicode kinds" are the
> > same will only work in two specific instances:
> >
> > (1) the strings are the same kind, the char size is 1
> >     * We could add THIS to unicode_compare, but it seems extremely
> >       specialized by itself
>
> But also extremely likely to happen. This means that the strings are pure
> ASCII, which is highly likely and one of the main reasons why the unicode
> string layout was rewritten for CPython 3.3. It allows CPython to save a
> lot of memory (thus clearly proving how likely this case is!), and it would
> also allow it to do faster comparisons for these strings.
Python 3.3 already has some optimizations for latin1: the CPU and the C
language process char* strings more efficiently than Py_UCS2 and Py_UCS4
strings. For example, we are using memchr() to search for a single
character in a latin1 string.

Victor

From petri at digip.org  Tue Oct 25 14:45:24 2011
From: petri at digip.org (Petri Lehtinen)
Date: Tue, 25 Oct 2011 15:45:24 +0300
Subject: [Python-Dev] [Python-checkins] cpython: #13251: update string description in datamodel.rst.
In-Reply-To: 
References: 
Message-ID: <20111025124524.GB15772@p16>

Hi,

ezio.melotti wrote:
> http://hg.python.org/cpython/rev/11d18ebb2dd1
> changeset:   73116:11d18ebb2dd1
> user:        Ezio Melotti
> date:        Tue Oct 25 09:23:42 2011 +0300
> summary:
>   #13251: update string description in datamodel.rst.
>
> files:
>   Doc/reference/datamodel.rst |  20 ++++++++++----------
>   1 files changed, 10 insertions(+), 10 deletions(-)
>
>
> diff --git a/Doc/reference/datamodel.rst b/Doc/reference/datamodel.rst
> --- a/Doc/reference/datamodel.rst
> +++ b/Doc/reference/datamodel.rst
> @@ -276,16 +276,16 @@
>      single: integer
>      single: Unicode
>
> -   The items of a string object are Unicode code units. A Unicode code
> -   unit is represented by a string object of one item and can hold either
> -   a 16-bit or 32-bit value representing a Unicode ordinal (the maximum
> -   value for the ordinal is given in ``sys.maxunicode``, and depends on
> -   how Python is configured at compile time). Surrogate pairs may be
> -   present in the Unicode object, and will be reported as two separate
> -   items. The built-in functions :func:`chr` and :func:`ord` convert
> -   between code units and nonnegative integers representing the Unicode
> -   ordinals as defined in the Unicode Standard 3.0. Conversion from and to
> -   other encodings are possible through the string method :meth:`encode`.
> +   A string is a sequence of values that represent Unicode codepoints.
> +   All the codepoints in range ``U+0000 - U+10FFFF`` can be represented
> +   in a string.
> +   Python doesn't have a :c:type:`chr` type, and
> +   every characters in the string is represented as a string object

typo ^ Should be "character", right?

> +   with length ``1``. The built-in function :func:`chr` converts a
> +   character to its codepoint (as an integer); :func:`ord` converts
> +   an integer in range ``0 - 10FFFF`` to the corresponding character.

Actually chr() converts an integer to a string and ord() converts a
string to an integer. chr and ord are swapped in your text.

> +   :meth:`str.encode` can be used to convert a :class:`str` to
> +   :class:`bytes` using the given encoding, and :meth:`bytes.decode` can
> +   be used to achieve the opposite.

Petri

From petri at digip.org  Tue Oct 25 14:50:44 2011
From: petri at digip.org (Petri Lehtinen)
Date: Tue, 25 Oct 2011 15:50:44 +0300
Subject: [Python-Dev] [Python-checkins] cpython: Issue #13226: Add RTLD_xxx constants to the os module. These constants can by
In-Reply-To: 
References: 
Message-ID: <20111025125044.GC15772@p16>

Hi,

victor.stinner wrote:
> http://hg.python.org/cpython/rev/c75427c0da06
> changeset:   73127:c75427c0da06
> user:        Victor Stinner
> date:        Tue Oct 25 13:34:04 2011 +0200
> summary:
>   Issue #13226: Add RTLD_xxx constants to the os module. These constants can by
>   used with sys.setdlopenflags().
>
> files:
>   Doc/library/os.rst     |  13 +++++++++++++
>   Doc/library/sys.rst    |  10 +++++-----
>   Lib/test/test_posix.py |   7 +++++++
>   Misc/NEWS              |   3 +++
>   Modules/posixmodule.c  |  26 ++++++++++++++++++++++++++
>   5 files changed, 54 insertions(+), 5 deletions(-)

[snip]

> diff --git a/Misc/NEWS b/Misc/NEWS
> --- a/Misc/NEWS
> +++ b/Misc/NEWS
> @@ -341,6 +341,9 @@
>  Library
>  -------
>
> +- Issue #13226: Add RTLD_xxx constants to the os module. These constants can by

Typo: s/by/be/

> +  used with sys.setdlopenflags().
> +
>  - Issue #10278: Add clock_getres(), clock_gettime() and CLOCK_xxx constants to
>    the time module.
>    time.clock_gettime(time.CLOCK_MONOTONIC) provides a monotonic clock

Petri

From martin at v.loewis.de  Tue Oct 25 20:00:11 2011
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 25 Oct 2011 20:00:11 +0200
Subject: [Python-Dev] Use our strict mbcs codec instead of the Windows ANSI API
In-Reply-To: <2078391.ezBZFOimXZ@dsk000552>
References: <201110250057.42116.victor.stinner@haypocalc.com>
	<4EA660C4.1000203@v.loewis.de> <2078391.ezBZFOimXZ@dsk000552>
Message-ID: <4EA6F92B.8060104@v.loewis.de>

> My proposal is a fix for a bug reported by a user:
> http://bugs.python.org/issue13247

So your proposal is that abspath(b".") shall raise a UnicodeError in
this case? Are you serious???

> In practice, characters not encodable to the ANSI code page are very
> rare. For example: it's difficult to write such characters directly
> with the keyboard. I bet that very few people will notice the change.

Except people running into the very issues you are trying to resolve.
I'm not sure these people are really helped by having their applications
crash all of a sudden.

Regards,
Martin

From martin at v.loewis.de  Tue Oct 25 20:13:29 2011
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 25 Oct 2011 20:13:29 +0200
Subject: [Python-Dev] Modules of plat-* directories
In-Reply-To: <1402612.iJB2KdNuoB@dsk000552>
References: <201110170116.36678.victor.stinner@haypocalc.com>
	<1319406500.3517.1.camel@localhost.localdomain>
	<4EA48F3E.9020304@v.loewis.de> <1402612.iJB2KdNuoB@dsk000552>
Message-ID: <4EA6FC49.5040004@v.loewis.de>

On 24.10.2011 14:06, Victor Stinner wrote:
> There are open issues related to plat-XXX.
>
> On Monday 24 October 2011 00:03:42, Martin v. Löwis wrote:
>> no, we make no changes to them unless a user actually requests a change
>
> Matthias Klose asked for socket SIO* constants in September 2006 (5 years
> ago).
> http://bugs.python.org/issue1565071
>
> I would prefer to see such constants in the socket module.

These are not mutually exclusive. You can regenerate IN.py and still add
the constants to the socket module.

> Thiemo Seufer noticed that "the linux2 platform definition is incorrect for
> several architectures, namely Alpha, PA-RISC(hppa), MIPS and SPARC." in
> September 2008 (3 years ago). He proposed to add a sublevel:
> Lib/plat-linux2/CDROM.py would become:
>
> - Lib/plat-linux2-alpha/CDROM.py
> - Lib/plat-linux2-hppa/CDROM.py
> - Lib/plat-linux2-mips/CDROM.py
> - Lib/plat-linux2-sparc/CDROM.py
> - (and a default for other platforms like Intel x86?)
>
> => http://bugs.python.org/issue3990
>
> I really don't like this idea (of adding the architecture in the directory
> name) :-p

Neither do I. In the specific case, I'd generate four versions of
CDROM.py (with differing names), and provide a CDROM.py that imports the
right one.

> IMO plat-XXX is wrong by design.

I disagree. It's limited, not wrong.

> It would be better if at least these files were regenerated at build
> time, but Martin doesn't want to regenerate them. And there is still
> the problem of Mac OS X, which embeds 3 binaries for 3 architectures in
> the same "FAT" file.

These are problems, but not necessarily issues. Even if some of the
values are incorrect, the values that are correct may still be useful.
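Martin's alternative, four generated per-architecture modules plus a top-level CDROM.py that imports the right one, could look roughly like the following dispatcher. All module names here are invented for illustration; this is a sketch of the idea, not code from CPython:

```python
import platform

# Hypothetical per-architecture constant modules (names invented for
# this sketch); each would be generated at build time from the platform
# headers.
_ARCH_MODULES = {
    "alpha": "CDROM_alpha",
    "parisc": "CDROM_hppa",
    "mips": "CDROM_mips",
    "sparc": "CDROM_sparc",
}

def pick_module(machine=None):
    """Return the name of the module holding this architecture's constants."""
    machine = (machine or platform.machine()).lower()
    for prefix, module_name in _ARCH_MODULES.items():
        if machine.startswith(prefix):
            return module_name
    return "CDROM_default"  # e.g. x86 values

# The real CDROM.py would then re-export the chosen module, e.g. via
# importlib; here we only show the dispatch decision.
print(pick_module("sparc64"))  # CDROM_sparc
print(pick_module("x86_64"))   # CDROM_default
```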
Regards,
Martin

From victor.stinner at haypocalc.com  Tue Oct 25 22:18:13 2011
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Tue, 25 Oct 2011 22:18:13 +0200
Subject: [Python-Dev] Use our strict mbcs codec instead of the Windows ANSI API
In-Reply-To: <201110250057.42116.victor.stinner@haypocalc.com>
References: <201110250057.42116.victor.stinner@haypocalc.com>
Message-ID: <201110252218.13224.victor.stinner@haypocalc.com>

On Tuesday 25 October 2011 00:57:42, Victor Stinner wrote:
> I propose to raise Unicode errors if a filename cannot be decoded on
> Windows, instead of creating bogus filenames with question marks.
> Because this change is incompatible with Python 3.2, even if such
> filenames are unusable and I consider the problem as a (Python?) bug, I
> would like your opinion on such a change before working on a patch.

Most people like the idea, so I wrote a patch and attached it to:
http://bugs.python.org/issue13247

The patch only changes os.getcwdb() and os.listdir().

> We might use PEP 383 to store undecodable bytes as surrogates (U+DC80-
> U+DCFF). But the situation is the opposite of the situation on UNIX: on
> Windows, the problem is more on encoding (text->bytes) than on decoding
> (bytes->text). On UNIX, problems occur when the system is misconfigured
> (e.g. wrong locale encoding). On Windows, problems occur when your
> application uses the old (ANSI) API, whereas your filesystem is fully
> Unicode compliant and you created Unicode filenames with a program using
> the new (Windows) API.

I only changed functions returning filenames, so os.mkdir() is unchanged
for example. We may also patch the other functions to simplify the source
code.

Victor

From victor.stinner at haypocalc.com  Wed Oct 26 01:43:05 2011
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Wed, 26 Oct 2011 01:43:05 +0200
Subject: [Python-Dev] [Python-checkins] cpython: Issue #13226: Add RTLD_xxx constants to the os module.
	These constants can by
In-Reply-To: <20111025125044.GC15772@p16>
References: <20111025125044.GC15772@p16>
Message-ID: <201110260143.05371.victor.stinner@haypocalc.com>

On Tuesday 25 October 2011 14:50:44, Petri Lehtinen wrote:
> Hi,
>
> victor.stinner wrote:
> > http://hg.python.org/cpython/rev/c75427c0da06
> > changeset:   73127:c75427c0da06
> > user:        Victor Stinner
> > date:        Tue Oct 25 13:34:04 2011 +0200
> >
> > summary:
> > Issue #13226: Add RTLD_xxx constants to the os module. These constants
> > can by
> >
> > used with sys.setdlopenflags().
> >
> > files:
> >   Doc/library/os.rst     |  13 +++++++++++++
> >   Doc/library/sys.rst    |  10 +++++-----
> >   Lib/test/test_posix.py |   7 +++++++
> >   Misc/NEWS              |   3 +++
> >   Modules/posixmodule.c  |  26 ++++++++++++++++++++++++++
> >   5 files changed, 54 insertions(+), 5 deletions(-)
>
> [snip]
>
> > diff --git a/Misc/NEWS b/Misc/NEWS
> > --- a/Misc/NEWS
> > +++ b/Misc/NEWS
> > @@ -341,6 +341,9 @@
> >
> > Library
> > -------
> >
> > +- Issue #13226: Add RTLD_xxx constants to the os module. These constants
> > can by
>
> Typo: s/by/be/

Fixed, thanks.

Victor

From tjreedy at udel.edu  Wed Oct 26 02:49:43 2011
From: tjreedy at udel.edu (Terry Reedy)
Date: Tue, 25 Oct 2011 20:49:43 -0400
Subject: [Python-Dev] Use our strict mbcs codec instead of the Windows ANSI API
In-Reply-To: <2078391.ezBZFOimXZ@dsk000552>
References: <201110250057.42116.victor.stinner@haypocalc.com>
	<4EA660C4.1000203@v.loewis.de> <2078391.ezBZFOimXZ@dsk000552>
Message-ID: 

On 10/25/2011 4:31 AM, Victor Stinner wrote:
> On Tuesday 25 October 2011 09:09:56, you wrote:
>>> I propose to raise Unicode errors if a filename cannot be decoded on
>>> Windows, instead of creating bogus filenames with question marks.
>>
>> Can you please elaborate what APIs you are talking about exactly?
>
> Basically, all functions processing filenames, so most functions of
> posixmodule.c. Some examples:
From you previous posts, I presumed that you only propose to change behavior when the user asks for the bytes versions of a unicode name that cannot be properly converted to a bytes version. > - os.listdir(): os.listdir(unicode) works fine and should not be changed. os.listdir(bytes) is what OP of issue wants changed. > FindFirstFileA, FindNextFileA, FindCloseA There are not Python names. Are they Windows API names? > - os.lstat(): CreateFileA This does not create a path and should not be changed as far as I can see. > - os.getcwdb(): This you might change. > getcwd() This should not be, as no bytes are involved. > - os.mkdir(): CreateDirectoryA > - os.chmod(): SetFileAttributesA Like os.lstat, these accept only accept a path and should do what they are supposed to do. >> If it's the byte APIs (i.e. using bytes as file names), then I'm -1 on >> this proposal. People that explicitly use bytes for file names deserve >> to get whatever exact platform semantics the platform has to offer. This >> is true on Unix, and it is also true on Windows. > > My proposition is a fix to user reported by a user: > http://bugs.python.org/issue13247 > > I want to keep the bytes API for backward compatibility, and it will still > work for non-ASCII characters, but only for non-ASCII characters encodable to > the ANSI code page. > > In practice, characters not encodable to the ANSI code page are very rare. For > example: it's difficult to write such characters directly with the keyboard. I > bet that very few people will notify the change. Actually, Windows makes switching keyboard setups rather easy once you enable the feature. It might be that people who routinely use non-'ansi' characters in file and directory names do not routinely ask for bytes versions thereof. The doc says "All functions accepting path or file names accept both bytes and string objects, and result in an object of the same type, if a path or file name is returned." 
It does that now, though it says nothing about the encoding assumed for input bytes or used for output bytes. It does not mention raising exceptions, so doing so is a feature-change that would likely break code. Currently, exceptional situations are signalled with "'?' in returned_path" rather than with an exception object. It ('?') is a bad choice of signal though, given the other uses of '?' in paths. -- Terry Jan Reedy From stephen at xemacs.org Wed Oct 26 05:31:36 2011 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Wed, 26 Oct 2011 12:31:36 +0900 Subject: [Python-Dev] Use our strict mbcs codec instead of the Windows ANSI API In-Reply-To: References: <201110250057.42116.victor.stinner@haypocalc.com> <4EA660C4.1000203@v.loewis.de> <2078391.ezBZFOimXZ@dsk000552> Message-ID: <87zkgo4eef.fsf@uwakimon.sk.tsukuba.ac.jp> In general I agree with what you write, Terry. One clarification and one comment, though. Terry Reedy writes: > The doc says "All functions accepting path or file names accept both > bytes and string objects, and result in an object of the same type, if a > path or file name is returned." It does that now, though it says nothing > about the encoding assumed for input bytes or used for output > bytes. That's determined by the OS, and figuring that out is the end user's problem. > It does not mention raising exceptions, so doing so is a > feature-change that would likely break code. Currently, exceptional > situations are signalled with "'?' in returned_path" rather than > with an exception object. It ('?') is a bad choice of signal > though, given the other uses of '?' in paths. True, but this isn't really Python's problem. And IIUC Martin's post, it is hardly "exceptional": isn't Python doing this, it's just standard Windows behavior, which results in pathnames that are perfectly acceptable to Windows APIs, but unreliable in use because they have different semantics in different Windows APIs. 
If that is true, there are almost surely user programs that depend on
this behavior, even though it sucks.[1]

My original "hearty +1" was dependent on my understanding from Victor's
post that this substitution could cause later exceptions because the
filename is invalid (eg, contains illegal characters causing Windows to
signal an error).

If that's not true, I think the proper remedy is to add a strong warning
to pylint that use of those APIs is supported (eg, for interaction with
existing programs that use them) but that they require careful
error-checking for robust use.

As a card-carrying Unicode nazi I wouldn't mind tagging the bytes APIs
with a DeprecationWarning but I know that proposal is going nowhere so I
withdraw it in advance.

Footnotes:
[1] Note that the original rationale for this was surely "since users
will have a very hard time using file names with this character in them,
using it as a substitution character internally will make the problem
evident and Sufficiently Smart Programs can deal with it."

From victor.stinner at haypocalc.com  Wed Oct 26 10:52:16 2011
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Wed, 26 Oct 2011 10:52:16 +0200
Subject: [Python-Dev] Use our strict mbcs codec instead of the Windows ANSI API
In-Reply-To: <87zkgo4eef.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <201110250057.42116.victor.stinner@haypocalc.com>
	<87zkgo4eef.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <7382481.IsoqDAHput@dsk000552>

On Tuesday 25 October 2011 10:31:56, Victor Stinner wrote:
>> Basically, all functions processing filenames, so most functions of
>> posixmodule.c. Some examples:
>>
>> - os.listdir(): FindFirstFileA, FindNextFileA, FindCloseA
>> - os.lstat(): CreateFileA
>> - os.getcwdb(): getcwd()
>> - os.mkdir(): CreateDirectoryA
>> - os.chmod(): SetFileAttributesA
>> - ...

> This seems way too broad.

I changed my mind about this list: I only want to change how filenames
are encoded, not how filenames are decoded.
So only os.listdir() & os.getcwdb() should be changed, as I wrote in
another email in this thread and in issue #13247.

>> - os.getcwdb():

> This you might change.

Issue #13247 combines os.getcwdb() and os.listdir(). Read the issue for
more information.

> It ('?') is a bad choice of signal though, given the other uses
> of '?' in paths.

If I understood correctly, '?' is a pattern to match any character in
FindFirstFile/FindNextFile. Python cannot configure the replacement
character; it's hardcoded to "?" (U+003F).

> it's just
> standard Windows behavior, which results in pathnames that are
> perfectly acceptable to Windows APIs, but unreliable in use because
> they have different semantics in different Windows APIs.

I think that such filenames cannot be used with any Windows function
accessing the filesystem. Extract of the issue: "Such filenames cannot
be used, open() fails with OSError(22, "invalid argument: '?'") for
example." They can only be used to display the content of a directory,
but don't expect to be able to read the file content.

--

Anyway, you must use Unicode on Windows! The bytes API was just kept for
backward compatibility.

Victor

From berker.peksag at gmail.com  Wed Oct 26 11:39:06 2011
From: berker.peksag at gmail.com (=?UTF-8?Q?Berker_Peksa=C4=9F?=)
Date: Wed, 26 Oct 2011 12:39:06 +0300
Subject: [Python-Dev] [Python-checkins] r88914 - tracker/instances/python-dev/html/issue.item.js
In-Reply-To: <3ST4gT603mzP7n@mail.python.org>
References: <3ST4gT603mzP7n@mail.python.org>
Message-ID: 

Hi,

On Wed, Oct 26, 2011 at 11:45 AM, ezio.melotti wrote:
> Author: ezio.melotti
> Date: Wed Oct 26 10:45:41 2011
> New Revision: 88914
>
> Log:
> Mark automated messages with a different background.
>
> Modified:
>    tracker/instances/python-dev/html/issue.item.js
>
> Modified: tracker/instances/python-dev/html/issue.item.js
> ==============================================================================
> --- tracker/instances/python-dev/html/issue.item.js     (original)
> +++ tracker/instances/python-dev/html/issue.item.js     Wed Oct 26 10:45:41 2011
> @@ -313,3 +313,14 @@
>      if (link.length != 0)
>          link.attr('href', link.attr('href').split('?')[0]);
>  });
> +
> +
> +$(document).ready(function() {
> +    /* Mark automated messages with a different background */
> +    $('table.messages th:nth-child(2)').each(function (i, e) {
> +        var e = $(e);
> +        if (/\(python-dev\)$/.test(e.text()))
> +            e.parent().next().find('td.content').css(
> +                'background-color', '#efeff9');
> +    });
> +});

I think this is shorter than $(document).ready();

$(function() {
    // ...
});

See: http://stackoverflow.com/questions/3528509/document-readyfunction-vs-function/3528528#3528528

--Berker

> _______________________________________________
> Python-checkins mailing list
> Python-checkins at python.org
> http://mail.python.org/mailman/listinfo/python-checkins
>

From ezio.melotti at gmail.com  Thu Oct 27 06:06:02 2011
From: ezio.melotti at gmail.com (Ezio Melotti)
Date: Thu, 27 Oct 2011 07:06:02 +0300
Subject: [Python-Dev] [Python-checkins] r88914 - tracker/instances/python-dev/html/issue.item.js
In-Reply-To: 
References: <3ST4gT603mzP7n@mail.python.org>
Message-ID: <4EA8D8AA.8090000@gmail.com>

Hi,

On 26/10/2011 12.39, Berker Peksağ wrote:
> Hi,
> I think this is shorter than $(document).ready();
>
> $(function() {
>     // ...
> });
>
> See: http://stackoverflow.com/questions/3528509/document-readyfunction-vs-function/3528528#3528528

Thanks a lot for the review, I didn't know about this shortcut!
However I think I'll just leave $(document).ready(...); because, even if
longer, it is more explicit and readable.
Best Regards, Ezio Melotti > --Berker > From status at bugs.python.org Fri Oct 28 18:07:30 2011 From: status at bugs.python.org (Python tracker) Date: Fri, 28 Oct 2011 18:07:30 +0200 (CEST) Subject: [Python-Dev] Summary of Python tracker Issues Message-ID: <20111028160730.7C7261CC11@psf.upfronthosting.co.za> ACTIVITY SUMMARY (2011-10-21 - 2011-10-28) Python tracker at http://bugs.python.org/ To view or respond to any of the issues listed below, click on the issue. Do NOT respond to this message. Issues counts and deltas: open 3108 (+13) closed 21961 (+34) total 25069 (+47) Open issues with patches: 1324 Issues opened (33) ================== #10278: add time.wallclock() method http://bugs.python.org/issue10278 reopened by haypo #13241: llvm-gcc-4.2 miscompiles Python (XCode 4.1 on Mac OS 10.7) http://bugs.python.org/issue13241 opened by Oleg.Plakhotnyuk #13244: WebSocket schemes in urllib.parse http://bugs.python.org/issue13244 opened by oberstet #13245: sched.py kwargs addition and default time functions http://bugs.python.org/issue13245 opened by clach04 #13246: Py_UCS4_strlen and friends needn't be public http://bugs.python.org/issue13246 opened by pitrou #13247: os.path.abspath returns unicode paths as question marks http://bugs.python.org/issue13247 opened by ubershmekel #13248: deprecated in 3.2, should be removed in 3.3 http://bugs.python.org/issue13248 opened by flox #13249: argparse.ArgumentParser() lists arguments in the wrong order http://bugs.python.org/issue13249 opened by roysmith #13252: new decumulate() function in itertools module http://bugs.python.org/issue13252 opened by carlo.verre #13253: 2to3 fix_renames renames sys.maxint only in imports http://bugs.python.org/issue13253 opened by ezio.melotti #13254: maildir.items() broken http://bugs.python.org/issue13254 opened by marco.ghidinelli #13256: Document and test new socket options http://bugs.python.org/issue13256 opened by haypo #13257: Move importlib over to PEP 3151 exceptions 
http://bugs.python.org/issue13257 opened by brett.cannon #13262: IDLE opens partially hidden http://bugs.python.org/issue13262 opened by Aivar.Annamaa #13263: Group some os functions in submodules http://bugs.python.org/issue13263 opened by ezio.melotti #13264: Monkeypatching using metaclass http://bugs.python.org/issue13264 opened by Artem.Tomilov #13265: IDLE crashes when printing some unprintable characters. http://bugs.python.org/issue13265 opened by maniram.maniram #13266: Add inspect.unwrap(f) to easily unravel "__wrapped__" chains http://bugs.python.org/issue13266 opened by ncoghlan #13271: When -h is used with argparse, default values that fail should http://bugs.python.org/issue13271 opened by Joshua.Chia #13272: 2to3 fix_renames doesn't rename string.lowercase/uppercase/let http://bugs.python.org/issue13272 opened by ezio.melotti #13274: heapq pure python version uses islice without guarding for neg http://bugs.python.org/issue13274 opened by Ronny.Pfannschmidt #13275: Recommend xml.etree for XML processing http://bugs.python.org/issue13275 opened by ash #13276: distutils bdist_wininst created installer does not run the pos http://bugs.python.org/issue13276 opened by francisco #13277: tzinfo subclasses information http://bugs.python.org/issue13277 opened by orsenthil #13279: Add memcmp into unicode_compare for optimizing comparisons http://bugs.python.org/issue13279 opened by RichIsMyName #13280: argparse should use the new Formatter class http://bugs.python.org/issue13280 opened by poke #13281: robotparser.RobotFileParser ignores rules preceeded by a blank http://bugs.python.org/issue13281 opened by bernie9998 #13282: the table of contents in epub file is too long http://bugs.python.org/issue13282 opened by wrobell #13283: removal of two unused variable in locale.py http://bugs.python.org/issue13283 opened by nicoe #13284: email.utils.formatdate function does not handle timezones corr http://bugs.python.org/issue13284 opened by burak.arslan #13285: 
signal module ignores external signal changes http://bugs.python.org/issue13285 opened by vilya #13286: PEP 3151 breaks backward compatibility: it should be documente http://bugs.python.org/issue13286 opened by haypo #13287: urllib.request exposes too many names http://bugs.python.org/issue13287 opened by flox Most recent 15 issues with no replies (15) ========================================== #13283: removal of two unused variable in locale.py http://bugs.python.org/issue13283 #13282: the table of contents in epub file is too long http://bugs.python.org/issue13282 #13277: tzinfo subclasses information http://bugs.python.org/issue13277 #13276: distutils bdist_wininst created installer does not run the pos http://bugs.python.org/issue13276 #13272: 2to3 fix_renames doesn't rename string.lowercase/uppercase/let http://bugs.python.org/issue13272 #13262: IDLE opens partially hidden http://bugs.python.org/issue13262 #13257: Move importlib over to PEP 3151 exceptions http://bugs.python.org/issue13257 #13231: sys.settrace - document 'some other code blocks' for 'call' ev http://bugs.python.org/issue13231 #13229: Add shutil.filter_walk http://bugs.python.org/issue13229 #13217: Missing header dependencies in Makefile http://bugs.python.org/issue13217 #13213: generator.throw() behavior http://bugs.python.org/issue13213 #13204: sys.flags.__new__ crashes http://bugs.python.org/issue13204 #13198: Remove duplicate definition of write_record_file http://bugs.python.org/issue13198 #13191: Typo in argparse documentation http://bugs.python.org/issue13191 #13190: ConfigParser uses wrong newline on Windows http://bugs.python.org/issue13190 Most recent 15 issues waiting for review (15) ============================================= #13283: removal of two unused variable in locale.py http://bugs.python.org/issue13283 #13281: robotparser.RobotFileParser ignores rules preceeded by a blank http://bugs.python.org/issue13281 #13279: Add memcmp into unicode_compare for optimizing comparisons 
http://bugs.python.org/issue13279 #13256: Document and test new socket options http://bugs.python.org/issue13256 #13249: argparse.ArgumentParser() lists arguments in the wrong order http://bugs.python.org/issue13249 #13247: os.path.abspath returns unicode paths as question marks http://bugs.python.org/issue13247 #13245: sched.py kwargs addition and default time functions http://bugs.python.org/issue13245 #13244: WebSocket schemes in urllib.parse http://bugs.python.org/issue13244 #13240: sysconfig gives misleading results for USE_COMPUTED_GOTOS http://bugs.python.org/issue13240 #13238: Add shell command helpers to shutil module http://bugs.python.org/issue13238 #13234: os.listdir breaks with literal paths http://bugs.python.org/issue13234 #13224: Change str(class) to return only the class name http://bugs.python.org/issue13224 #13223: pydoc removes 'self' in HTML for method docstrings with exampl http://bugs.python.org/issue13223 #13218: test_ssl failures on Debian/Ubuntu http://bugs.python.org/issue13218 #13217: Missing header dependencies in Makefile http://bugs.python.org/issue13217 Top 10 most discussed issues (10) ================================= #13244: WebSocket schemes in urllib.parse http://bugs.python.org/issue13244 20 msgs #13238: Add shell command helpers to shutil module http://bugs.python.org/issue13238 16 msgs #13241: llvm-gcc-4.2 miscompiles Python (XCode 4.1 on Mac OS 10.7) http://bugs.python.org/issue13241 16 msgs #13247: os.path.abspath returns unicode paths as question marks http://bugs.python.org/issue13247 16 msgs #13237: subprocess docs should emphasise convenience functions http://bugs.python.org/issue13237 14 msgs #13234: os.listdir breaks with literal paths http://bugs.python.org/issue13234 12 msgs #13218: test_ssl failures on Debian/Ubuntu http://bugs.python.org/issue13218 11 msgs #13263: Group some os functions in submodules http://bugs.python.org/issue13263 10 msgs #12394: packaging: generate scripts from callable (dotted paths) 
http://bugs.python.org/issue12394 8 msgs #13193: test_packaging and test_distutils failures http://bugs.python.org/issue13193 8 msgs Issues closed (35) ================== #6216: Raise Unicode KEEPALIVE_SIZE_LIMIT from 9 to 32? http://bugs.python.org/issue6216 closed by pitrou #9980: str(float) failure http://bugs.python.org/issue9980 closed by mark.dickinson #10332: Multiprocessing maxtasksperchild results in hang http://bugs.python.org/issue10332 closed by neologix #10925: Document pure Python version of integer-to-float correctly-rou http://bugs.python.org/issue10925 closed by mark.dickinson #11183: Finer-grained exceptions for the ssl module http://bugs.python.org/issue11183 closed by pitrou #11447: test_pydoc refleak http://bugs.python.org/issue11447 closed by pitrou #12753: \N{...} neglects formal aliases and named sequences from Unico http://bugs.python.org/issue12753 closed by ezio.melotti #13016: selectmodule.c: refleak http://bugs.python.org/issue13016 closed by petri.lehtinen #13017: pyexpat.c: refleak http://bugs.python.org/issue13017 closed by petri.lehtinen #13018: dictobject.c: refleak http://bugs.python.org/issue13018 closed by petri.lehtinen #13105: Please elaborate on how 2.x and 3.x are different heads http://bugs.python.org/issue13105 closed by ncoghlan #13132: distutils sends non-RFC compliant HTTP request http://bugs.python.org/issue13132 closed by eric.araujo #13141: get rid of old threading API in the examples http://bugs.python.org/issue13141 closed by flox #13201: Implement comparison operators for range objects http://bugs.python.org/issue13201 closed by mark.dickinson #13207: os.path.expanduser breaks when using unicode character in the http://bugs.python.org/issue13207 closed by haypo #13216: Add cp65001 codec http://bugs.python.org/issue13216 closed by haypo #13226: Expose RTLD_* constants in the posix module http://bugs.python.org/issue13226 closed by haypo #13228: Add "Quick Start" section to the devguide index 
     http://bugs.python.org/issue13228  closed by ezio.melotti
#13232: Logging: Unicode Error
     http://bugs.python.org/issue13232  closed by python-dev
#13242: Crash in test_pydoc
     http://bugs.python.org/issue13242  closed by pitrou
#13243: _Py_identifier should be _Py_IDENTIFER
     http://bugs.python.org/issue13243  closed by meador.inge
#13250: ctypes: reference leak in POINTER code
     http://bugs.python.org/issue13250  closed by loewis
#13251: Update string description in datamodel.rst
     http://bugs.python.org/issue13251  closed by ezio.melotti
#13255: wrong docstrings in array module
     http://bugs.python.org/issue13255  closed by flox
#13258: replace hasattr(obj, '__call__') with callable(obj)
     http://bugs.python.org/issue13258  closed by python-dev
#13259: __bytes__ not documented
     http://bugs.python.org/issue13259  closed by python-dev
#13260: distutils and cross-compiling the extensions
     http://bugs.python.org/issue13260  closed by eric.araujo
#13261: time.clock () has very low resolution on Linux
     http://bugs.python.org/issue13261  closed by haypo
#13267: Add an option to disable importing orphaned bytecode files
     http://bugs.python.org/issue13267  closed by ncoghlan
#13268: assert statement violates the documentation
     http://bugs.python.org/issue13268  closed by python-dev
#13269: Document that "Remote hg repo" accepts remote branches
     http://bugs.python.org/issue13269  closed by python-dev
#13270: all classes are new style
     http://bugs.python.org/issue13270  closed by flox
#13273: HTMLParser improperly handling open tags when strict is False
     http://bugs.python.org/issue13273  closed by ezio.melotti
#13278: Typo in documentation for sched module
     http://bugs.python.org/issue13278  closed by ezio.melotti
#1548891: shlex (or perhaps cStringIO) and unicode strings
     http://bugs.python.org/issue1548891  closed by pitrou

From carl at oddbird.net  Fri Oct 28 20:37:35 2011
From: carl at oddbird.net (Carl Meyer)
Date: Fri, 28 Oct 2011 12:37:35 -0600
Subject: [Python-Dev] draft PEP: virtual environments
Message-ID: <4EAAF66F.9020603@oddbird.net>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hello python-dev,

As has been discussed here previously, Vinay Sajip and I are working
on a PEP for making "virtual Python environments" a la virtualenv [1]
a built-in feature of Python 3.3.

This idea was first proposed on python-dev by Ian Bicking in February
2010 [2]. It was revived at PyCon 2011 and has seen discussion on
distutils-sig [3], more recently again on python-dev [4] [5], and most
recently on python-ideas [6].

Full text of the draft PEP is pasted below, and also available on
Bitbucket [7]. The reference implementation is also on Bitbucket [8].
For known issues in the reference implementation and cases where it
does not yet match the PEP, see the open issues list [9].

In particular, please note the "Open Questions" section of the draft
PEP. These are areas where we are still unsure of the best approach,
or where we've received conflicting feedback and haven't reached
consensus. We welcome your thoughts on anything in the PEP, but
feedback on the open questions is especially useful. We'd also
especially like to hear from Windows and OSX users, from authors of
packaging-related tools (packaging/distutils2, zc.buildout) and from
Python implementors (PyPy, IronPython, Jython).

If it is easier to review and comment on the PEP after it is published
on python.org, I can submit it to the PEP editors anytime. Otherwise
I'll wait until we've resolved a few more of the open questions, as
it's easier for me to update the PEP on Bitbucket.

Thanks!
Carl

[1] http://virtualenv.org
[2] http://mail.python.org/pipermail/python-dev/2010-February/097787.html
[3] http://mail.python.org/pipermail/distutils-sig/2011-March/017498.html
[4] http://mail.python.org/pipermail/python-dev/2011-June/111903.html
[5] http://mail.python.org/pipermail/python-dev/2011-October/113883.html
[6] http://mail.python.org/pipermail/python-ideas/2011-October/012500.html
[7] https://bitbucket.org/carljm/pythonv-pep/src/
[8] https://bitbucket.org/vinay.sajip/pythonv/
[9] https://bitbucket.org/vinay.sajip/pythonv/issues?status=new&status=open


PEP: XXX
Title: Python Virtual Environments
Version: $Revision$
Last-Modified: $Date$
Author: Carl Meyer
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 13-Jun-2011
Python-Version: 3.3
Post-History: 24-Oct-2011, 28-Oct-2011


Abstract
========

This PEP proposes to add to Python a mechanism for lightweight
"virtual environments" with their own site directories, optionally
isolated from system site directories. Each virtual environment has
its own Python binary (allowing creation of environments with various
Python versions) and can have its own independent set of installed
Python packages in its site directories, but shares the standard
library with the base installed Python.


Motivation
==========

The utility of Python virtual environments has already been well
established by the popularity of existing third-party
virtual-environment tools, primarily Ian Bicking's `virtualenv`_.
Virtual environments are already widely used for dependency management
and isolation, ease of installing and using Python packages without
system-administrator access, and automated testing of Python software
across multiple Python versions, among other uses.

Existing virtual environment tools suffer from lack of support from
the behavior of Python itself. Tools such as `rvirtualenv`_, which do
not copy the Python binary into the virtual environment, cannot
provide reliable isolation from system site directories.
Virtualenv, which does copy the Python binary, is forced to duplicate
much of Python's ``site`` module and manually symlink/copy an
ever-changing set of standard-library modules into the virtual
environment in order to perform a delicate bootstrapping dance at
every startup. (Virtualenv copies the binary because symlinking it
does not provide isolation, as Python dereferences a symlinked
executable before searching for ``sys.prefix``.)

The ``PYTHONHOME`` environment variable, Python's only existing
built-in solution for virtual environments, requires
copying/symlinking the entire standard library into every environment.
Copying the whole standard library is not a lightweight solution, and
cross-platform support for symlinks remains inconsistent (even on
Windows platforms that do support them, creating them often requires
administrator privileges).

A virtual environment mechanism integrated with Python and drawing on
years of experience with existing third-party tools can be lower
maintenance, more reliable, and more easily available to all Python
users.

.. _virtualenv: http://www.virtualenv.org
.. _rvirtualenv: https://github.com/kvbik/rvirtualenv


Specification
=============

When the Python binary is executed, it attempts to determine its
prefix (which it stores in ``sys.prefix``), which is then used to find
the standard library and other key files, and by the ``site`` module
to determine the location of the site-package directories. Currently
the prefix is found (assuming ``PYTHONHOME`` is not set) by first
walking up the filesystem tree looking for a marker file (``os.py``)
that signifies the presence of the standard library, and if none is
found, falling back to the build-time prefix hardcoded in the binary.

This PEP proposes to add a new first step to this search. If a
``pyvenv.cfg`` file is found either adjacent to the Python executable,
or one directory above it, this file is scanned for lines of the form
``key = value``.
If a ``home`` key is found, this signifies that the Python binary
belongs to a virtual environment, and the value of the ``home`` key is
the directory containing the Python executable used to create this
virtual environment.

In this case, prefix-finding continues as normal using the value of
the ``home`` key as the effective Python binary location, which
results in ``sys.prefix`` being set to the system installation prefix,
while ``sys.site_prefix`` is set to the directory containing
``pyvenv.cfg``. (If ``pyvenv.cfg`` is not found or does not contain
the ``home`` key, prefix-finding continues normally, and
``sys.site_prefix`` will be equal to ``sys.prefix``.)

The ``site`` and ``sysconfig`` standard-library modules are modified
such that site-package directories ("purelib" and "platlib", in
``sysconfig`` terms) are found relative to ``sys.site_prefix``, while
other directories (the standard library, include files) are still
found relative to ``sys.prefix``. (Also, ``sys.site_exec_prefix`` is
added, and handled similarly with regard to ``sys.exec_prefix``.)

Thus, a Python virtual environment in its simplest form would consist
of nothing more than a copy or symlink of the Python binary
accompanied by a ``pyvenv.cfg`` file and a site-packages directory.
The ``venv`` module also adds a ``pysetup3`` script into each venv, as
well as necessary DLLs and ``.pyd`` files on Windows.

In order to allow Python package managers to install packages into the
virtual environment the same way they would install into a normal
Python installation, and avoid special-casing virtual environments in
``sysconfig`` beyond using ``sys.site_prefix`` in place of
``sys.prefix``, the internal virtual environment layout mimics the
layout of the Python installation itself on each platform.
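The ``pyvenv.cfg`` search just described can be sketched in Python.
(This is a hypothetical illustration: the real prefix-finding logic
lives in the interpreter's C startup code, and the helper name below
is invented.)

```python
import os

def find_venv_home(executable):
    """Sketch of the pyvenv.cfg search described above: look for the
    file adjacent to the executable or one directory above it, scan
    ``key = value`` lines, and return (home, env_dir) if a ``home``
    key is found, else (None, None)."""
    exe_dir = os.path.dirname(os.path.abspath(executable))
    for candidate_dir in (exe_dir, os.path.dirname(exe_dir)):
        cfg = os.path.join(candidate_dir, "pyvenv.cfg")
        if os.path.isfile(cfg):
            with open(cfg) as f:
                for line in f:
                    key, sep, value = line.partition("=")
                    if sep and key.strip() == "home":
                        # sys.prefix follows value; sys.site_prefix
                        # stays at the directory holding pyvenv.cfg
                        return value.strip(), candidate_dir
    return None, None
```

If no ``pyvenv.cfg`` with a ``home`` key is found, the sketch returns
``(None, None)``, corresponding to prefix-finding continuing normally.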
So a typical virtual environment layout on a POSIX system would be::

    pyvenv.cfg
    bin/python3
    bin/python
    bin/pysetup3
    lib/python3.3/site-packages/

While on a Windows system::

    pyvenv.cfg
    Scripts/python.exe
    Scripts/python3.dll
    Scripts/pysetup3.exe
    Scripts/pysetup3-script.py
    ... other DLLs and pyds...
    Lib/site-packages/

Third-party packages installed into the virtual environment will have
their Python modules placed in the ``site-packages`` directory, and
their executables placed in ``bin/`` or ``Scripts\``.

.. note::

    On a normal Windows system-level installation, the Python binary
    itself wouldn't go inside the "Scripts/" subdirectory, as it does
    in the default venv layout. This is useful in a virtual
    environment so that a user only has to add a single directory to
    their shell PATH in order to effectively "activate" the virtual
    environment.

.. note::

    On Windows, it is necessary to also copy or symlink DLLs and pyd
    files from compiled stdlib modules into the env, because if the
    venv is created from a non-system-wide Python installation,
    Windows won't be able to find the Python installation's copies of
    those files when Python is run from the venv.


Isolation from system site-packages
-----------------------------------

By default, a virtual environment is entirely isolated from the
system-level site-packages directories.

If the ``pyvenv.cfg`` file also contains a key
``include-system-site-packages`` with a value of ``true`` (not case
sensitive), the ``site`` module will also add the system site
directories to ``sys.path`` after the virtual environment site
directories. Thus system-installed packages will still be importable,
but a package of the same name installed in the virtual environment
will take precedence.

:pep:`370` user-level site-packages are considered part of the system
site-packages for venv purposes: they are not available from an
isolated venv, but are available from an
``include-system-site-packages = true`` venv.
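The "simplest form" layout described above could in principle be
assembled by hand. A sketch (POSIX layout only; the helper name, the
choice to copy rather than symlink, and the hardcoded ``python3.3``
path are all illustrative assumptions, not part of the proposed API):

```python
import os
import shutil
import sys

def make_minimal_venv(env_dir, home):
    """Sketch: assemble the simplest-form POSIX layout described
    above. This is NOT the proposed ``venv`` module, just an
    illustration of how little the mechanism requires."""
    bin_dir = os.path.join(env_dir, "bin")
    site_packages = os.path.join(
        env_dir, "lib", "python3.3", "site-packages")
    os.makedirs(bin_dir)
    os.makedirs(site_packages)
    # A copy (or symlink) of the Python binary...
    shutil.copy(sys.executable, os.path.join(bin_dir, "python3"))
    # ...plus a pyvenv.cfg pointing back at the base installation.
    with open(os.path.join(env_dir, "pyvenv.cfg"), "w") as f:
        f.write("home = %s\n" % home)
        f.write("include-system-site-packages = false\n")
```

Everything else (the standard library in particular) continues to be
found via the ``home`` value, which is the point of the design.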
Creating virtual environments
-----------------------------

This PEP also proposes adding a new ``venv`` module to the standard
library which implements the creation of virtual environments. This
module can be executed using the ``-m`` flag::

    python3 -m venv /path/to/new/virtual/environment

A ``pyvenv`` installed script is also provided to make this more
convenient::

    pyvenv /path/to/new/virtual/environment

Running this command creates the target directory (creating any parent
directories that don't exist already) and places a ``pyvenv.cfg`` file
in it with a ``home`` key pointing to the Python installation the
command was run from. It also creates a ``bin/`` (or ``Scripts`` on
Windows) subdirectory containing a copy (or symlink) of the
``python3`` executable, and the ``pysetup3`` script from the
``packaging`` standard library module (to facilitate easy installation
of packages from PyPI into the new virtualenv). And it creates an
(initially empty) ``lib/pythonX.Y/site-packages`` (or
``Lib\site-packages`` on Windows) subdirectory.

If the target directory already exists an error will be raised, unless
the ``--clear`` option was provided, in which case the target
directory will be deleted and virtual environment creation will
proceed as usual.

The created ``pyvenv.cfg`` file also includes the
``include-system-site-packages`` key, set to ``true`` if ``venv`` is
run with the ``--system-site-packages`` option, ``false`` by default.

Multiple paths can be given to ``venv``, in which case an identical
virtualenv will be created, according to the given options, at each
provided path.


Copies versus symlinks
----------------------

The technique in this PEP works equally well in general with a copied
or symlinked Python binary (and other needed DLLs on Windows). Some
users prefer a copied binary (for greater isolation from system
changes) and some prefer a symlinked one (so that e.g. security
updates automatically propagate to virtual environments).
There are some cross-platform difficulties with symlinks:

* Not all Windows versions support symlinks, and even on those that
  do, creating them often requires administrator privileges.

* On OSX framework builds of Python, ``sys.executable`` is just a
  stub that executes the real Python binary. Symlinking this stub
  does not work with the implementation in this PEP; it must be
  copied. (Fortunately the stub is also small, so copying it is not
  an issue.)

Because of these issues, this PEP proposes to copy the Python binary
by default, to maintain cross-platform consistency in the default
behavior.

The ``pyvenv`` script accepts a ``--symlink`` option. If this option
is provided, the script will attempt to symlink instead of copy. If a
symlink fails (e.g. because symlinks are not supported by the
platform, or additional privileges are needed), the script will warn
the user and fall back to a copy.

On OSX framework builds, where a symlink of the executable would
succeed but create a non-functional virtual environment, the script
will fail with an error message that symlinking is not supported on
OSX framework builds.


API
---

The high-level method described above will make use of a simple API
which provides mechanisms for third-party virtual environment
creators to customize environment creation according to their needs.

The ``venv`` module will contain an ``EnvBuilder`` class which
accepts the following keyword arguments on instantiation:

* ``system_site_packages`` - a Boolean value indicating that the
  system Python site-packages should be available to the environment
  (defaults to ``False``).

* ``clear`` - a Boolean value which, if true, will delete any
  existing target directory instead of raising an exception (defaults
  to ``False``).

* ``use_symlinks`` - a Boolean value indicating whether to attempt to
  symlink the Python binary (and any necessary DLLs or other
  binaries, e.g. ``pythonw.exe``), rather than copying. Defaults to
  ``True``.
The returned env-builder is an object which is expected to have a
single method, ``create``, which takes as required argument the path
(absolute or relative to the current directory) of the target
directory which is to contain the virtual environment. The ``create``
method will either create the environment in the specified directory,
or raise an appropriate exception.

Creators of third-party virtual environment tools will be free to use
the provided ``EnvBuilder`` class as a base class.

The ``venv`` module will also provide a module-level function as a
convenience::

    def create(env_dir,
               system_site_packages=False, clear=False,
               use_symlinks=True):
        builder = EnvBuilder(
            system_site_packages=system_site_packages,
            clear=clear,
            use_symlinks=use_symlinks)
        builder.create(env_dir)

The ``create`` method of the ``EnvBuilder`` class illustrates the
hooks available for customization::

    def create(self, env_dir):
        """
        Create a virtualized Python environment in a directory.

        :param env_dir: The target directory to create an environment
                        in.
        """
        env_dir = os.path.abspath(env_dir)
        context = self.create_directories(env_dir)
        self.create_configuration(context)
        self.setup_python(context)
        self.setup_packages(context)
        self.setup_scripts(context)

Each of the methods ``create_directories``, ``create_configuration``,
``setup_python``, ``setup_packages`` and ``setup_scripts`` can be
overridden. The functions of these methods are:

* ``create_directories`` - creates the environment directory and all
  necessary directories, and returns a context object. This is just a
  holder for attributes (such as paths), for use by the other
  methods.

* ``create_configuration`` - creates the ``pyvenv.cfg`` configuration
  file in the environment.

* ``setup_python`` - creates a copy of the Python executable (and,
  under Windows, DLLs) in the environment.

* ``setup_packages`` - a placeholder method which can be overridden
  in third-party implementations to pre-install packages in the
  virtual environment.
* ``setup_scripts`` - a placeholder method which can be overridden in
  third-party implementations to pre-install scripts (such as
  activation and deactivation scripts) in the virtual environment.

The ``DistributeEnvBuilder`` subclass in the reference implementation
illustrates how these last two methods can be used in practice. It's
not envisaged that ``DistributeEnvBuilder`` will actually be added to
Python core, but it makes the reference implementation more
immediately useful for testing and exploratory purposes.

* The ``setup_packages`` method installs Distribute in the target
  environment. This is needed at the moment in order to actually
  install most packages in an environment, since most packages are
  not yet packaging / setup.cfg based.

* The ``setup_scripts`` method installs shell activation scripts in
  the environment. This is also done in a configurable way: a
  ``scripts`` property on the builder is expected to provide a buffer
  which is a base64-encoded zip file. The zip file contains
  directories "common", "linux2", "darwin", "win32", each containing
  scripts destined for the bin directory in the environment. The
  contents of "common" and the directory corresponding to
  ``sys.platform`` are copied after doing some text replacement of
  placeholders:

  * ``__VIRTUAL_ENV__`` is replaced with the absolute path of the
    environment directory.

  * ``__VIRTUAL_PROMPT__`` is replaced with the environment prompt
    prefix.

  * ``__BIN_NAME__`` is replaced with the name of the bin directory.

  * ``__ENV_PYTHON__`` is replaced with the absolute path of the
    environment's executable.

The "shell activation scripts" provided by ``DistributeEnvBuilder``
simply add the virtual environment's ``bin/`` (or ``Scripts\``)
directory to the front of the user's shell PATH. This is not strictly
necessary for use of a virtual environment (as an explicit path to
the venv's python binary or scripts can just as well be used), but it
is convenient.
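The override pattern described above can be sketched as follows. To
keep the sketch self-contained, it defines a minimal stand-in for the
proposed ``EnvBuilder`` hook structure; the subclass, its script
template, and all names here are hypothetical illustrations, not the
reference implementation's ``DistributeEnvBuilder``:

```python
import os

class EnvBuilderSketch:
    """Minimal stand-in for the proposed ``EnvBuilder`` hook
    structure, so the override pattern below runs on its own."""

    def create(self, env_dir):
        env_dir = os.path.abspath(env_dir)
        context = self.create_directories(env_dir)
        self.create_configuration(context)
        self.setup_python(context)
        self.setup_packages(context)
        self.setup_scripts(context)
        return context

    def create_directories(self, env_dir):
        os.makedirs(env_dir)
        # The context is "just a holder for attributes".
        return type("Context", (), {"env_dir": env_dir,
                                    "bin_name": "bin"})()

    # Placeholder hooks, as in the proposed API.
    def create_configuration(self, context): pass
    def setup_python(self, context): pass
    def setup_packages(self, context): pass
    def setup_scripts(self, context): pass

class ActivationScriptBuilder(EnvBuilderSketch):
    """Third-party-style subclass overriding a placeholder hook, in
    the spirit of ``DistributeEnvBuilder`` (hypothetical code)."""

    SCRIPT_TEMPLATE = 'export PATH="__VIRTUAL_ENV__/__BIN_NAME__:$PATH"\n'

    def setup_scripts(self, context):
        # Perform the placeholder text replacement described above.
        script = (self.SCRIPT_TEMPLATE
                  .replace("__VIRTUAL_ENV__", context.env_dir)
                  .replace("__BIN_NAME__", context.bin_name))
        bin_dir = os.path.join(context.env_dir, context.bin_name)
        os.makedirs(bin_dir)
        with open(os.path.join(bin_dir, "activate"), "w") as f:
            f.write(script)
```

A real third-party tool would subclass the actual ``EnvBuilder`` in
the same way, overriding only ``setup_packages`` and/or
``setup_scripts``.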
This PEP does not propose that the ``venv`` module in core Python
will add such activation scripts by default, as they are
shell-specific. Adding activation scripts for the wide variety of
possible shells is an added maintenance burden, and is left to
third-party extension tools.

No doubt the process of PEP review will show up any customization
requirements which have not yet been considered.


Backwards Compatibility
=======================

Splitting the meanings of ``sys.prefix``
----------------------------------------

Any virtual environment tool along these lines (which attempts to
isolate site-packages, while still making use of the base Python's
standard library with no need for it to be symlinked into the virtual
environment) is proposing a split between two different meanings
(among others) that are currently both wrapped up in ``sys.prefix``:
the answers to the questions "Where is the standard library?" and
"Where is the site-packages location where third-party modules should
be installed?"

This split could be handled by introducing a new ``sys`` attribute
for either the former prefix or the latter prefix. Either option
potentially introduces some backwards-incompatibility with software
written to assume the other meaning for ``sys.prefix``. (Such
software should preferably be using the APIs in the ``site`` and
``sysconfig`` modules to answer these questions rather than using
``sys.prefix`` directly, in which case there is no
backwards-compatibility issue, but in practice ``sys.prefix`` is
sometimes used.)

The `documentation`__ for ``sys.prefix`` describes it as "A string
giving the site-specific directory prefix where the platform
independent Python files are installed," and specifically mentions
the standard library and header files as found under ``sys.prefix``.
It does not mention ``site-packages``.
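Code written against this proposal that must also run on Python
versions predating it (where the new attribute would not exist) can
hedge with a one-line fallback. A sketch of the compatibility idiom,
not an API defined by the PEP itself:

```python
import sys

# Prefer the proposed sys.site_prefix when present; fall back to
# sys.prefix on interpreters that predate this PEP.
site_prefix = getattr(sys, "site_prefix", sys.prefix)
```

On a pre-PEP interpreter this simply yields ``sys.prefix``, so the
code behaves as it always did outside a virtual environment.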
__ http://docs.python.org/dev/library/sys.html#sys.prefix

This PEP currently proposes to leave ``sys.prefix`` pointing to the
base system installation (which is where the standard library and
header files are found), and introduce a new value in ``sys``
(``sys.site_prefix``) to point to the prefix for ``site-packages``.
This maintains the documented semantics of ``sys.prefix``, but risks
breaking isolation if third-party code uses ``sys.prefix`` rather
than ``sys.site_prefix`` or the appropriate ``site`` API to find
site-packages directories.

The most notable case is probably `setuptools`_ and its fork
`distribute`_, which mostly use ``distutils``/``sysconfig`` APIs, but
do use ``sys.prefix`` directly to build up a list of site directories
for pre-flight checking where ``pth`` files can usefully be placed.

It would be trivial to modify these tools (currently only
`distribute`_ is Python 3 compatible) to check ``sys.site_prefix``
and fall back to ``sys.prefix`` if it doesn't exist (for earlier
versions of Python). If Distribute is modified in this way and
released before Python 3.3 is released with the ``venv`` module,
there would be no likely reason for an older version of Distribute to
ever be installed in a virtual environment.

In terms of other third-party usage, a `Google Code Search`_ turns up
what appears to be a roughly even mix of usage between packages using
``sys.prefix`` to build up a site-packages path and packages using it
to e.g. eliminate the standard library from code-execution tracing.
Either choice that's made here will require one or the other of these
uses to be updated.

.. _setuptools: http://peak.telecommunity.com/DevCenter/setuptools
.. _distribute: http://packages.python.org/distribute/
.. _Google Code Search:
   http://www.google.com/codesearch#search/&q=sys\.prefix&p=1&type=cs


Open Questions
==============

Naming of the new ``sys`` prefix attributes
-------------------------------------------

The name ``sys.site_prefix`` was chosen with the following
considerations in mind:

* Inasmuch as "site" has a meaning in Python, it means a combination
  of Python version, standard library, and specific set of
  site-packages. This is, fundamentally, what a venv is (although it
  shares the standard library with its "base" site).

* It is the Python ``site`` module which implements adding
  site-packages directories to ``sys.path``, so ``sys.site_prefix``
  is a prefix used (and set) primarily by the ``site`` module.

A concern has been raised that the term ``site`` in Python is already
overloaded and of unclear meaning, and this usage will increase the
overload.

One proposed alternative is ``sys.venv_prefix``, which has the
advantage of being clearly related to the venv implementation. The
downside of this proposal is that it implies the attribute is only
useful/relevant when in a venv and should be absent or ``None`` when
not in a venv. This imposes an unnecessary extra burden on code using
the attribute: ``sys.venv_prefix if sys.venv_prefix else
sys.prefix``. The prefix attributes are more usable and general if
they are always present and set, and split by meaning (stdlib vs
site-packages, roughly), rather than specifically tied to venv. Also,
third-party code should be encouraged to not know or care whether it
is running in a virtual environment or not; this option seems to work
against that goal.

Another option would be ``sys.local_prefix``, which has both the
advantage and disadvantage, depending on perspective, that it
introduces the new term "local" rather than drawing on existing
associations with the term "site".


Why not modify sys.prefix?
--------------------------

As discussed above under `Backwards Compatibility`_, this PEP
proposes to add ``sys.site_prefix`` as "the prefix relative to which
site-package directories are found". This maintains compatibility
with the documented meaning of ``sys.prefix`` (as the location
relative to which the standard library can be found), but means that
code assuming that site-packages directories are found relative to
``sys.prefix`` will not respect the virtual environment correctly.

Since it is unable to modify ``distutils``/``sysconfig``,
`virtualenv`_ is forced to instead re-point ``sys.prefix`` at the
virtual environment.

An argument could be made that this PEP should follow virtualenv's
lead here (and introduce something like ``sys.base_prefix`` to point
to the standard library and header files), since virtualenv already
does this and it doesn't appear to have caused major problems with
existing code.

Another argument in favor of this is that it would be preferable to
err on the side of greater, rather than lesser, isolation. Changing
``sys.prefix`` to point to the virtual environment and introducing a
new ``sys.base_prefix`` attribute would err on the side of greater
isolation in the face of existing code's use of ``sys.prefix``.


What about include files?
-------------------------

For example, ZeroMQ installs zmq.h and zmq_utils.h in $VE/include,
whereas SIP (part of PyQt4) installs sip.h by default in
$VE/include/pythonX.Y. With virtualenv, everything works because the
pythonX.Y include is symlinked, so everything that's needed is in
$VE/include. At the moment the reference implementation doesn't do
anything with include files, besides creating the include directory;
this might need to change, to copy/symlink $VE/include/pythonX.Y.
Since Python has no abstraction for a site-specific include
directory, other than for platform-specific files, the user
expectation would seem to be that all include files anyone could ever
want should be found in one of just two locations, with the sysconfig
labels "include" & "platinclude".

There's another issue: what if includes are Python-version-specific?
For example, SIP installs by default into $VE/include/pythonX.Y
rather than $VE/include, presumably because there's version-specific
stuff in there - but even if that's not the case with SIP, it could
be the case with some other package. The problem that causes is that
you can't just symlink the include/pythonX.Y directory, but actually
have to provide a writable directory and symlink/copy the contents
from the system include/pythonX.Y. Of course this is not hard to do,
but it does seem inelegant. OTOH it's really because there's no
supporting concept in Python/sysconfig.


Interface with packaging tools
------------------------------

Some work will be needed in packaging tools (Python 3.3 packaging,
Distribute) to support implementation of this PEP. For example:

* How Distribute and packaging use ``sys.prefix`` and/or
  ``sys.site_prefix``. Clearly, in practice we'll need to use
  Distribute for a while, until packages have migrated over to usage
  of setup.cfg.

* How packaging and Distribute set up shebang lines in scripts which
  they install in virtual environments.


Testability and Source Build Issues
-----------------------------------

Currently in the reference implementation, virtual environments must
be created with an installed Python, rather than a source build, as
the base installation. In order to be able to fully test the ``venv``
module in the Python regression test suite, some anomalies in how
sysconfig data is configured in source builds will need to be
removed.
For example, ``sysconfig.get_paths()`` in a source build gives
(partial output)::

    {
     'include': '/home/vinay/tools/pythonv/Include',
     'libdir': '/usr/lib ; or /usr/lib64 on a multilib system',
     'platinclude': '/home/vinay/tools/pythonv',
     'platlib': '/usr/local/lib/python3.3/site-packages',
     'platstdlib': '/usr/local/lib/python3.3',
     'purelib': '/usr/local/lib/python3.3/site-packages',
     'stdlib': '/usr/local/lib/python3.3'
    }


Need for ``install_name_tool`` on OSX?
--------------------------------------

`Virtualenv uses`_ ``install_name_tool``, a tool provided in the
Xcode developer tools, to modify the copied executable on OSX. We
need input from OSX developers on whether this is actually necessary
in this PEP's implementation of virtual environments, and if so,
whether there is an alternative to ``install_name_tool`` that would
allow ``venv`` not to require that Xcode is installed.

.. _Virtualenv uses: https://github.com/pypa/virtualenv/issues/168


Activation and Utility Scripts
------------------------------

Virtualenv provides shell "activation" scripts as a user convenience,
to put the virtual environment's Python binary first on the shell
PATH. This is a maintenance burden, as separate activation scripts
need to be provided and maintained for every supported shell. For
this reason, this PEP proposes to leave such scripts to be provided
by third-party extensions; virtual environments created by the core
functionality would be used by directly invoking the environment's
Python binary or scripts.

If we are going to rely on external code to provide these
conveniences, we need to check with existing third-party projects in
this space (virtualenv, zc.buildout) and ensure that the proposed API
meets their needs. (Virtualenv would be fine with the proposed API;
it would become a relatively thin wrapper with a subclass of the env
builder that adds shell activation and automatic installation of
``pip`` inside the virtual environment.)
Provide a mode that is isolated only from user site packages?
-------------------------------------------------------------

Is there sufficient rationale for providing a mode that isolates the
venv from :pep:`370` user site packages, but not from the
system-level site-packages?


Other Python implementations?
-----------------------------

We should get feedback from Jython, IronPython, and PyPy about
whether there's anything in this PEP that they foresee as a
difficulty for their implementation.


Reference Implementation
========================

The in-progress reference implementation is found in `a clone of the
CPython Mercurial repository`_. To test it, build and install it (the
virtual environment tool currently does not run from a source tree).
From the installed Python, run ``bin/python3 -m venv
/path/to/new/virtualenv`` to create a virtual environment. The
reference implementation (like this PEP!) is a work in progress.

.. _a clone of the CPython Mercurial repository:
   https://bitbucket.org/vinay.sajip/pythonv


Copyright
=========

This document has been placed in the public domain.


..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End:

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAk6q9m4ACgkQ8W4rlRKtE2fnGwCeJzRo6YVQMjNyykkGARbfCi2Y
ADgAoOlSK3IttiZWYKxtA9KIOCoJknh/
=EIPa
-----END PGP SIGNATURE-----

From victor.stinner at haypocalc.com  Sat Oct 29 00:52:41 2011
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Sat, 29 Oct 2011 00:52:41 +0200
Subject: [Python-Dev] Emit a BytesWarning on bytes filenames on Windows
Message-ID: <201110290052.41619.victor.stinner@haypocalc.com>

Hi,

I am no longer convinced that raising a UnicodeEncodeError on
unencodable characters is the right fix for issue #13247.
The problem with this solution is that you have to wait until a user
gets a UnicodeEncodeError.

I have yet another proposition: emit a warning when a bytes filename is
used. So it doesn't affect the default behaviour, but you can use
-Werror to test whether your program is fully Unicode compliant on
Windows (without having to test invalid filenames).

I don't know if a BytesWarning or a DeprecationWarning is more
appropriate. It depends on whether we plan to drop support for bytes
filenames on Windows later (in Python 3.5 or later).

List of impacted functions:

 os._getfinalpathname(bytes)
 os._getfullpathname(bytes)
 os._isdir(bytes)
 os.access(bytes)
 os.chdir(bytes)
 os.chmod(bytes)
 os.getcwdb()
 os.link(bytes, bytes)
 os.listdir(bytes)
 os.lstat(bytes)
 os.mkdir(bytes)
 os.readlink(bytes)
 os.rename(bytes, bytes)
 os.rmdir(bytes)
 os.stat(bytes)
 os.symlink(bytes, bytes)
 os.unlink(bytes)
 os.utime(bytes, time)

Note: Unicode filenames are not affected by this change. For example,
os.listdir(str) will not emit any warning.

Victor

From chrism at plope.com Sat Oct 29 01:10:08 2011
From: chrism at plope.com (Chris McDonough)
Date: Fri, 28 Oct 2011 19:10:08 -0400
Subject: [Python-Dev] draft PEP: virtual environments
In-Reply-To: <4EAAF66F.9020603@oddbird.net>
References: <4EAAF66F.9020603@oddbird.net>
Message-ID: <1319843408.10593.3.camel@thinko>

This is really very comprehensive, thank you!

> Why not modify sys.prefix?
> --------------------------
>
> As discussed above under `Backwards Compatibility`_, this PEP proposes
> to add ``sys.site_prefix`` as "the prefix relative to which
> site-package directories are found". This maintains compatibility with
> the documented meaning of ``sys.prefix`` (as the location relative to
> which the standard library can be found), but means that code assuming
> that site-packages directories are found relative to ``sys.prefix``
> will not respect the virtual environment correctly.
> > Since it is unable to modify ``distutils``/``sysconfig``, > `virtualenv`_ is forced to instead re-point ``sys.prefix`` at the > virtual environment. > > An argument could be made that this PEP should follow virtualenv's > lead here (and introduce something like ``sys.base_prefix`` to point > to the standard library and header files), since virtualenv already > does this and it doesn't appear to have caused major problems with > existing code. > > Another argument in favor of this is that it would be preferable to > err on the side of greater, rather than lesser, isolation. Changing > ``sys.prefix`` to point to the virtual environment and introducing a > new ``sys.base_prefix`` attribute would err on the side of greater > isolation in the face of existing code's use of ``sys.prefix``. It would seem to make sense to me to err on the side of greater isolation, introducing sys.base_prefix to indicate the base prefix (as opposed to sys.site_prefix indicating the venv prefix). Bugs introduced via a semi-isolated virtual environment are very difficult to troubleshoot. It would also make changes to existing code unnecessary. I have encountered no issues with virtualenv doing this so far. - C From ncoghlan at gmail.com Sat Oct 29 03:55:08 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 29 Oct 2011 11:55:08 +1000 Subject: [Python-Dev] draft PEP: virtual environments In-Reply-To: <4EAAF66F.9020603@oddbird.net> References: <4EAAF66F.9020603@oddbird.net> Message-ID: On Sat, Oct 29, 2011 at 4:37 AM, Carl Meyer wrote: > If it is easier to review and comment on the PEP after it is published > on python.org, I can submit it to the PEP editors anytime. Otherwise > I'll wait until we've resolved a few more of the open questions, as it's > easier for me to update the PEP on Bitbucket. 
It's best to get it posted, firstly so it has an assigned PEP number (although some may argue having to call it "the virtualenv PEP" is a feature!), secondly so that it's easy for people to get hold of a formatted version. All the core committers can actually publish PEPs via the PEP hg repo, so Vinay could probably handle pushing the updates to python.org. Submission via the PEP editors is mainly there as a backstop for cases where there's no current core dev directly involved in the PEP. Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From skippy.hammond at gmail.com Sat Oct 29 07:47:01 2011 From: skippy.hammond at gmail.com (Mark Hammond) Date: Sat, 29 Oct 2011 16:47:01 +1100 Subject: [Python-Dev] Emit a BytesWarning on bytes filenames on Windows In-Reply-To: <201110290052.41619.victor.stinner@haypocalc.com> References: <201110290052.41619.victor.stinner@haypocalc.com> Message-ID: <4EAB9355.8010906@gmail.com> On 29/10/2011 9:52 AM, Victor Stinner wrote: > Hi, > > I am not more conviced that raising a UnicodeEncodeError on unencodable > characters is the right fix for the issue #13247. The problem with this > solution is that you have to wait until an user get a UnicodeEncodeError. > > I have yet another proposition: emit a warning when a bytes filename is used. > So it doesn't affect the default behaviour, but you can use -Werror to test if > your program is fully Unicode compliant on Windows (without having to test > invalid filenames). > > I don't know if a BytesWarning or a DeprecationWarning is more apropriate. It > depends if we plan to drop support of bytes filenames on Windows later (in > Python 3.5 or later). When previously discussing this issue, I was under the impression that the problem was unencodable bytes passed from the Python code to Windows - but the reverse is true - only the data coming back from Windows isn't encodable. 
This changes my opinion significantly :) I don't think raising an error is the right choice - there are almost certainly use-cases where the current behaviour works OK and we would break them (eg, not all files in a directory are likely to be unencodable). As the data came externally, the only solution the programmer has is to change to the unicode version of the api - so we recommend the bytes version not be used by anyone, anytime - which means it is conceptually deprecated already. Therefore, as you imply, I think the solution to this issue is to start the process of deprecating the bytes version of the api in py3k with a view to removing it completely - possibly with a less aggressive timeline than normal. In Python 2.7, I think documenting the issue and a recommendation to always use unicode is sufficient (ie, we can't deprecate it and a new BytesWarning seems gratuitous.) Cheers, Mark From g.brandl at gmx.net Sat Oct 29 10:01:45 2011 From: g.brandl at gmx.net (Georg Brandl) Date: Sat, 29 Oct 2011 10:01:45 +0200 Subject: [Python-Dev] cpython (3.2): I should be someone In-Reply-To: References: Message-ID: On 10/28/11 22:05, florent.xicluna wrote: > http://hg.python.org/cpython/rev/6f56e81da8f6 > changeset: 73171:6f56e81da8f6 > branch: 3.2 > parent: 73167:09d0510e1c50 > user: Florent Xicluna > date: Fri Oct 28 22:03:55 2011 +0200 > summary: > I should be someone > > files: > Doc/library/urllib.request.rst | 8 ++++---- > 1 files changed, 4 insertions(+), 4 deletions(-) > > > diff --git a/Doc/library/urllib.request.rst b/Doc/library/urllib.request.rst > --- a/Doc/library/urllib.request.rst > +++ b/Doc/library/urllib.request.rst > @@ -1257,11 +1257,11 @@ > pair: HTTP; protocol > pair: FTP; protocol > > -* Currently, only the following protocols are supported: HTTP, (versions 0.9 and > - 1.0), FTP, and local files. > +* Currently, only the following protocols are supported: HTTP (versions 0.9 and > + 1.0), FTP, and local files. 
> > -* The caching feature of :func:`urlretrieve` has been disabled until I find the > - time to hack proper processing of Expiration time headers. > +* The caching feature of :func:`urlretrieve` has been disabled until someone find > + the time to hack proper processing of Expiration time headers. In this case, s/find/finds/ :) Georg From solipsis at pitrou.net Sat Oct 29 17:23:42 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sat, 29 Oct 2011 17:23:42 +0200 Subject: [Python-Dev] draft PEP: virtual environments References: <4EAAF66F.9020603@oddbird.net> Message-ID: <20111029172342.61dbbd71@pitrou.net> On Fri, 28 Oct 2011 12:37:35 -0600 Carl Meyer wrote: > What about include files? > - ------------------------- > > For example, ZeroMQ installs zmq.h and zmq_utils.h in $VE/include, > whereas SIP (part of PyQt4) installs sip.h by default in > $VE/include/pythonX.Y. With virtualenv, everything works because the > PythonX.Y include is symlinked, so everything that's needed is in > $VE/include. At the moment the reference implementation doesn't do > anything with include files, besides creating the include directory; > this might need to change, to copy/symlink $VE/include/pythonX.Y. > > As in Python there's no abstraction for a site-specific include > directory, other than for platform-specific stuff, then the user > expectation would seem to be that all include files anyone could ever > want should be found in one of just two locations, with sysconfig > labels "include" & "platinclude". > > There's another issue: what if includes are Python-version-specific? > For example, SIP installs by default into $VE/include/pythonX.Y rather > than $VE/include, presumably because there's version-specific stuff in > there - but even if that's not the case with SIP, it could be the case > with some other package. Why would that be a problem? Do you plan to install several versions of Python in a single VE? 
> Activation and Utility Scripts > - ------------------------------ > > Virtualenv provides shell "activation" scripts as a user convenience, > to put the virtual environment's Python binary first on the shell > PATH. This is a maintenance burden, as separate activation scripts > need to be provided and maintained for every supported shell. We already have Unix shell scripts and BAT files in the source tree. Is it really complicated to maintain these additional shell scripts? Is there a lot of code in them? Regards Antoine. From martin at v.loewis.de Sat Oct 29 19:47:48 2011 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat, 29 Oct 2011 19:47:48 +0200 Subject: [Python-Dev] Emit a BytesWarning on bytes filenames on Windows In-Reply-To: <4EAB9355.8010906@gmail.com> References: <201110290052.41619.victor.stinner@haypocalc.com> <4EAB9355.8010906@gmail.com> Message-ID: <4EAC3C44.7050606@v.loewis.de> > Therefore, as you imply, I think the solution to this issue is to start > the process of deprecating the bytes version of the api in py3k with a > view to removing it completely - possibly with a less aggressive > timeline than normal. In Python 2.7, I think documenting the issue and > a recommendation to always use unicode is sufficient (ie, we can't > deprecate it and a new BytesWarning seems gratuitous.) That sounds all fine to me. Regards, Martin From stephen at xemacs.org Sun Oct 30 02:31:20 2011 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Sun, 30 Oct 2011 09:31:20 +0900 Subject: [Python-Dev] Emit a BytesWarning on bytes filenames on Windows In-Reply-To: <4EAC3C44.7050606@v.loewis.de> References: <201110290052.41619.victor.stinner@haypocalc.com> <4EAB9355.8010906@gmail.com> <4EAC3C44.7050606@v.loewis.de> Message-ID: <878vo3pbfr.fsf@uwakimon.sk.tsukuba.ac.jp> "Martin v. 
Löwis" writes:

 > > Therefore, as you imply, I think the solution to this issue is to start
 > > the process of deprecating the bytes version of the api in py3k with a
 > > view to removing it completely

 > That sounds all fine to me.

As quoted above, deprecation of the bytes version of the API sounds
fine to me, but isn't this going to run into the usual objections from
the "we need bytes for efficiency" crowd? It's OK with me to say "in
this restricted area you must convert to Unicode", but is that going
to fly with that constituency?

From martin at v.loewis.de Sun Oct 30 09:00:51 2011
From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=)
Date: Sun, 30 Oct 2011 09:00:51 +0100
Subject: [Python-Dev] Emit a BytesWarning on bytes filenames on Windows
In-Reply-To: <878vo3pbfr.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <201110290052.41619.victor.stinner@haypocalc.com> <4EAB9355.8010906@gmail.com> <4EAC3C44.7050606@v.loewis.de> <878vo3pbfr.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <4EAD0433.50005@v.loewis.de>

> As quoted above, deprecation of the bytes version of the API sounds
> fine to me, but isn't this going to run into the usual objections from
> the "we need bytes for efficiency" crowd? It's OK with me to
> say "in this restricted area you must convert to Unicode", but is that
> going to fly with that constituency?

I don't think this "we need bytes for efficiency" crowd actually exists.
We are talking about file names here. The relevant crowd is the
"we need bytes for correctness", and that crowd focuses primarily on
Unix. It splits into the "we only care about Unix" crowd (A), the "we
want correctness everywhere" crowd (B), and the "we want portable code"
crowd (C). (A) can accept the deprecation. (B) will support it. Only (C)
might protest, as we are going to break their code, hence the
deprecation period.
Regards, Martin From ncoghlan at gmail.com Sun Oct 30 09:45:26 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 30 Oct 2011 18:45:26 +1000 Subject: [Python-Dev] Emit a BytesWarning on bytes filenames on Windows In-Reply-To: <4EAD0433.50005@v.loewis.de> References: <201110290052.41619.victor.stinner@haypocalc.com> <4EAB9355.8010906@gmail.com> <4EAC3C44.7050606@v.loewis.de> <878vo3pbfr.fsf@uwakimon.sk.tsukuba.ac.jp> <4EAD0433.50005@v.loewis.de> Message-ID: On Sun, Oct 30, 2011 at 6:00 PM, "Martin v. L?wis" wrote: >> As quoted above, deprecation of the bytes version of the API sounds >> fine to me, but isn't this going to run into the usual objections from >> the "we need bytes for efficiency" crowd? ?It's OK with me to >> say "in this restricted area you must convert to Unicode", but is that >> going to fly with that constituency? > > I don't think this "we need bytes for efficiency" crowd actually exists. I think that crowd does exist, but I've only ever seen them complain about URLs and other wire protocols (where turnaround time can matter a lot in terms of responsiveness of network applications for short requests, and encode()/decode() cycles can really add up). Filesystem access is dominated by I/O time, and there's often going to be some encoding or decoding going anyway (since the app and the filesystem have to get the data into a common format). Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From vinay_sajip at yahoo.co.uk Sun Oct 30 10:56:46 2011 From: vinay_sajip at yahoo.co.uk (Vinay Sajip) Date: Sun, 30 Oct 2011 09:56:46 +0000 (UTC) Subject: [Python-Dev] draft PEP: virtual environments References: <4EAAF66F.9020603@oddbird.net> Message-ID: Nick Coghlan gmail.com> writes: > All the core committers can actually publish PEPs via the PEP hg repo, > so Vinay could probably handle pushing the updates to python.org. 
> Submission via the PEP editors is mainly there as a backstop for cases > where there's no current core dev directly involved in the PEP. Added as PEP 404 - hope y'all can find it ;-) Regards, Vinay Sajip From vinay_sajip at yahoo.co.uk Sun Oct 30 13:10:18 2011 From: vinay_sajip at yahoo.co.uk (Vinay Sajip) Date: Sun, 30 Oct 2011 12:10:18 +0000 (UTC) Subject: [Python-Dev] draft PEP: virtual environments References: <4EAAF66F.9020603@oddbird.net> <20111029172342.61dbbd71@pitrou.net> Message-ID: Antoine Pitrou pitrou.net> writes: > Why would that be a problem? Do you plan to install several versions of > Python in a single VE? No, but some packages might install headers in /include and others in /include/pythonX.Y. I wasn't sure whether this would cause a problem with files not being found during build, though I realise this can be worked around with specific -I flags to the compiler. At present, we only create a /include in the venv, but not /include/pythonX.Y. > We already have Unix shell scripts and BAT files in the source tree. Is > it really complicated to maintain these additional shell scripts? Is > there a lot of code in them? No, they're pretty small: wc -l gives 76 posix/activate (Bash script, contains deactivate() function) 31 nt/activate.bat 17 nt/deactivate.bat The question is whether we should stop at that, or whether there should be support for tcsh, fish etc. such as virtualenv provides. IMO, if we provide the above as a bare minimum + an easy way for third-party tools to install replacements/additions, then we probably don't need to worry too much about an additional support burden in the stdlib - third parties can take up the responsibility for supporting additional shells or helper scripts. 
Regards,

Vinay Sajip

From solipsis at pitrou.net Sun Oct 30 13:28:37 2011
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Sun, 30 Oct 2011 13:28:37 +0100
Subject: [Python-Dev] draft PEP: virtual environments
References: <4EAAF66F.9020603@oddbird.net> <20111029172342.61dbbd71@pitrou.net>
Message-ID: <20111030132837.220c124d@pitrou.net>

On Sun, 30 Oct 2011 12:10:18 +0000 (UTC)
Vinay Sajip wrote:
>
> > We already have Unix shell scripts and BAT files in the source tree. Is
> > it really complicated to maintain these additional shell scripts? Is
> > there a lot of code in them?
>
> No, they're pretty small: wc -l gives
>
> 76 posix/activate (Bash script, contains deactivate() function)
> 31 nt/activate.bat
> 17 nt/deactivate.bat
>
> The question is whether we should stop at that, or whether there should be
> support for tcsh, fish etc. such as virtualenv provides.

I don't think we need additional support for more or less obscure
shells. Also, if posix/activate is sufficiently well written (don't ask
me how :-)), it should presumably be compatible with all Unix shells?

Regards

Antoine.

From vinay_sajip at yahoo.co.uk Sun Oct 30 13:35:20 2011
From: vinay_sajip at yahoo.co.uk (Vinay Sajip)
Date: Sun, 30 Oct 2011 12:35:20 +0000 (UTC)
Subject: [Python-Dev] draft PEP: virtual environments
References: <4EAAF66F.9020603@oddbird.net> <20111029172342.61dbbd71@pitrou.net>
Message-ID: 

Antoine Pitrou pitrou.net> writes:

> We already have Unix shell scripts and BAT files in the source tree.

Do we have a blessed location in the stdlib for data files in general?
Although we're talking in this instance about scripts, they're just
data as far as the venv module is concerned. While it's not uncommon
for data which is included with packages to be installed in the source
tree for that package (e.g. packaging's test data), I'm not sure what
one would do with data which belongs to a top-level module.
At the moment it's in the source as a base64-encoded string, but I'm not sure that's ideal - it's workable only because the data is so small. I don't really want to add a Lib/scripts.zip adjacent to venv.py, which venv accesses via os.path.dirname(__file__), because if every module did this, it would be a tad untidy. The other alternative would be to make venv a package with all its code in venv/__init__.py and a scripts.zip adjacent to that. Does that seem like a better solution? Can anyone suggest better alternatives? Sorry if this has come up before and I've missed something obvious. Regards, Vinay Sajip From solipsis at pitrou.net Sun Oct 30 13:39:58 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sun, 30 Oct 2011 13:39:58 +0100 Subject: [Python-Dev] draft PEP: virtual environments References: <4EAAF66F.9020603@oddbird.net> <20111029172342.61dbbd71@pitrou.net> Message-ID: <20111030133958.0c336199@pitrou.net> On Sun, 30 Oct 2011 12:35:20 +0000 (UTC) Vinay Sajip wrote: > The other alternative would be to make venv a package with all its code > in venv/__init__.py and a scripts.zip adjacent to that. Does that seem > like a better solution? Please don't make it a zip file. We want code to be easily trackable and editable. Regards Antoine. From p.f.moore at gmail.com Sun Oct 30 15:09:58 2011 From: p.f.moore at gmail.com (Paul Moore) Date: Sun, 30 Oct 2011 14:09:58 +0000 Subject: [Python-Dev] Packaging and binary distributions Message-ID: I'd like to reopen the discussions on how the new packaging module will handle/support binary distributions in Python 3.3. The previous thread (see http://mail.python.org/pipermail/python-dev/2011-October/113956.html) included a lot of good information and discussion, but ultimately didn't reach any firm conclusions. First question - is this a Windows only problem, or do Unix/MacOS users want binary support? 
My feeling is that it's not an issue for them, at least not enough that anyone has done anything about it in the past, so I'll focus on Windows here. Second question - is there a problem at all? For the majority of Windows users, I suspect not. The existing bdist_wininst and bdist_msi formats have worked fine for a long time, offer Windows integration and a GUI installer, and in the case of MSI offer options for integrating with corporate distribution policies that some users consider significant, if not essential. (Binary eggs are a third, and somewhat odd, case - a number of projects have started distributing binary eggs, but I don't know what benefits they have over bdist_wininst in particular, as easy_install will read bdist_wininst installers. Perhaps a setuptools/distribute user could comment. For now I'll assume that binary eggs will slowly go away as packaging gets more widely adopted). So that leaves a minority who (1) prefer integration with packaging, (2) need to work with virtual environments or custom local builds, (3) need binary extensions in some or all of their environments and (4) don't want to have to build all the binaries they need from scratch. Given the scale of the issue, it seems likely that putting significant effort into addressing it is unwise. In particular, it seems unlikely that developers are going to move en masse to a new distribution format just to cater for this minority. On the other hand, for people who care, the fact that packaging (currently) offers no direct support for consuming binary distributions is a fairly obvious hole. And having to build from source just to install into a virtual environment could be a showstopper. The bdist_wininst format is relatively amenable to manipulation - it's little more than a zip file, after all. 
So writing 3rd party code to install the contents via packaging
shouldn't be hard (I've done some proof of concept work, and it isn't
:-)) Vinay's proposal to use the resource mechanism and some custom
hooks would work, but I'd like to see a small amount of extra direct
support added to packaging to make things cleaner. Also, if packaging
supported plugins to recognise new distribution formats, this would
make it possible to integrate the extra code seamlessly.

The MSI format is a little more tricky, mainly because it is a more
complex format and (as far as I can tell from a brief check) files are
stored in the opaque CAB format, so the only way of getting data out
is to do a temporary install somewhere. But I see no reason why that
isn't achievable.

So, my proposal is as follows:

1. I will write a 3rd party module to take bdist_wininst and bdist_msi
   distributions and install them using packaging
2. Where packaging changes are useful to make installing binaries
   easier, I'll request them (by supplying patches)
3. I'll look at creating a format-handling plugin mechanism for
   packaging. If it's viable, I'll post patches
4. If it seems useful, my module could be integrated into the core
   packaging module

I don't intend to do anything about a GUI, or modify the existing
formats at all. These don't interest me, particularly, so I'll leave
them to someone who has a clear picture of what they want in those
areas, and the time to develop it.

For 3.3 at least, I'd expect developers to continue distributing
bdist_wininst or bdist_msi format files. We'll see what happens with
binary eggs. Unix/MacOS users who care will need to propose something
themselves.

Does anyone have any comments?

Paul.
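Reading a bdist_wininst installer's payload needs nothing beyond the standard library: the .exe is an executable stub with a zip archive appended, and ``zipfile`` locates the archive by scanning from the end of the file, so the stub is simply skipped. A rough sketch (the stand-in file below is fabricated for the demo, since no real installer is at hand):

```python
import io
import os
import tempfile
import zipfile


def list_installer_contents(path):
    """List the archive members of a bdist_wininst-style installer.

    The .exe is a zip archive appended to an executable stub; zipfile
    finds the archive from the end of the file, ignoring the stub.
    """
    with zipfile.ZipFile(path) as zf:
        return zf.namelist()


# Fake an installer: arbitrary binary stub followed by a zip archive,
# which is the same on-disk shape as a real bdist_wininst .exe.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("PURELIB/demo/__init__.py", "print('hello')\n")

fake = os.path.join(tempfile.mkdtemp(), "demo.exe")
with open(fake, "wb") as f:
    f.write(b"MZ\x90\x00 fake executable stub ")  # stand-in for the stub
    f.write(buf.getvalue())

print(list_installer_contents(fake))
```

From there, installing would mean extracting the members and handing them to packaging's install machinery; the hard part is the mapping of archive prefixes (PURELIB, SCRIPTS, etc.) to install locations, not the unpacking.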
From tseaver at palladion.com Sun Oct 30 16:40:31 2011
From: tseaver at palladion.com (Tres Seaver)
Date: Sun, 30 Oct 2011 11:40:31 -0400
Subject: [Python-Dev] draft PEP: virtual environments
In-Reply-To: References: <4EAAF66F.9020603@oddbird.net> <20111029172342.61dbbd71@pitrou.net>
Message-ID: 

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 10/30/2011 08:35 AM, Vinay Sajip wrote:
> Antoine Pitrou pitrou.net> writes:
>
>> We already have Unix shell scripts and BAT files in the source
>> tree.
>
> Do we have a blessed location in the stdlib for data files in
> general? Although we're talking in this instance about scripts,
> they're just data as far as the venv module is concerned. While
> it's not uncommon for data which is included with packages to be
> installed in the source tree for that package (e.g. packaging's test
> data), I'm not sure what one would do with data which belongs to a
> top-level module. At the moment it's in the source as a
> base64-encoded string, but I'm not sure that's ideal - it's
> workable only because the data is so small. I don't really want to
> add a Lib/scripts.zip adjacent to venv.py, which venv accesses via
> os.path.dirname(__file__), because if every module did this, it
> would be a tad untidy.
>
> The other alternative would be to make venv a package with all its
> code in venv/__init__.py and a scripts.zip adjacent to that. Does
> that seem like a better solution? Can anyone suggest better
> alternatives? Sorry if this has come up before and I've missed
> something obvious.

+1 to making it a package and keeping the data in the package.

-1 to a zip file: each script should be a normal, version-controlled
entity.

Tres.
- -- =================================================================== Tres Seaver +1 540-429-0999 tseaver at palladion.com Palladion Software "Excellence by Design" http://palladion.com -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAk6tb+8ACgkQ+gerLs4ltQ463wCfZoOOYK1c7XgAaihSdM9+0dxn /YgAoMVlq+ZRGA6xZUFNrajSbdr4aUQZ =P6zT -----END PGP SIGNATURE----- From vinay_sajip at yahoo.co.uk Sun Oct 30 16:42:11 2011 From: vinay_sajip at yahoo.co.uk (Vinay Sajip) Date: Sun, 30 Oct 2011 15:42:11 +0000 (UTC) Subject: [Python-Dev] draft PEP: virtual environments References: <4EAAF66F.9020603@oddbird.net> <20111029172342.61dbbd71@pitrou.net> <20111030133958.0c336199@pitrou.net> Message-ID: Antoine Pitrou pitrou.net> writes: > Please don't make it a zip file. We want code to be easily trackable > and editable. Of course. I was thinking of a directory tree in the source, subject to our normal revision control, but processed during make or installation to be available as a zip file once deployed. It was a general point about data that I was making; in this particular case, that data just happens to be source code. 
Regards, Vinay Sajip From solipsis at pitrou.net Sun Oct 30 18:20:34 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sun, 30 Oct 2011 18:20:34 +0100 Subject: [Python-Dev] cpython (merge 3.2 -> default): Fix the return value of set_discard (issue #10519) References: Message-ID: <20111030182034.7f62e18f@pitrou.net> On Sun, 30 Oct 2011 13:38:35 +0100 petri.lehtinen wrote: > http://hg.python.org/cpython/rev/f634102aca01 > changeset: 73204:f634102aca01 > parent: 73201:a5c4ae15b59d > parent: 73203:b643458a0108 > user: Petri Lehtinen > date: Sun Oct 30 14:35:39 2011 +0200 > summary: > Fix the return value of set_discard (issue #10519) I get the following compiler warning here: Objects/setobject.c: In function ?set_discard?: Objects/setobject.c:1909:24: attention : unused variable ?result? Regards Antoine. From solipsis at pitrou.net Sun Oct 30 18:35:51 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sun, 30 Oct 2011 18:35:51 +0100 Subject: [Python-Dev] draft PEP: virtual environments References: <4EAAF66F.9020603@oddbird.net> <20111029172342.61dbbd71@pitrou.net> <20111030133958.0c336199@pitrou.net> Message-ID: <20111030183551.2ce82c11@pitrou.net> On Sun, 30 Oct 2011 15:42:11 +0000 (UTC) Vinay Sajip wrote: > Antoine Pitrou pitrou.net> writes: > > > Please don't make it a zip file. We want code to be easily trackable > > and editable. > > Of course. I was thinking of a directory tree in the source, subject to our > normal revision control, but processed during make or installation to be > available as a zip file once deployed. It would be even simpler not to process it at all, but install the scripts as-is (without the execute bit) :) Regards Antoine. 
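Antoine's "install the scripts as-is, adding the execute bit at install time" suggestion is easy to sketch. The directory names below are invented for the demo and are not part of any proposal:

```python
import os
import shutil
import stat
import tempfile


def install_scripts(source_dir, env_bin_dir):
    """Copy utility scripts from a package data directory into a
    virtual environment's bin directory, setting the execute bit at
    install time (the copies in the source tree need not carry it)."""
    os.makedirs(env_bin_dir, exist_ok=True)
    for name in os.listdir(source_dir):
        src = os.path.join(source_dir, name)
        dst = os.path.join(env_bin_dir, name)
        shutil.copyfile(src, dst)
        mode = os.stat(dst).st_mode
        os.chmod(dst, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)


# Demo with a throwaway "package data" directory standing in for
# something like a scripts/posix/ directory shipped inside the package.
scripts_dir = tempfile.mkdtemp()
with open(os.path.join(scripts_dir, "activate"), "w") as f:
    f.write('export VIRTUAL_ENV="$1"\n')

bin_dir = os.path.join(tempfile.mkdtemp(), "bin")
install_scripts(scripts_dir, bin_dir)
print(sorted(os.listdir(bin_dir)))
```

This keeps each script a plain, version-controlled file in the source tree, with no zip packing or base64 encoding step in between.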
From nad at acm.org Sun Oct 30 19:04:08 2011 From: nad at acm.org (Ned Deily) Date: Sun, 30 Oct 2011 11:04:08 -0700 Subject: [Python-Dev] Packaging and binary distributions References: Message-ID: In article , Paul Moore wrote: > I'd like to reopen the discussions on how the new packaging module > will handle/support binary distributions in Python 3.3. The previous > thread (see > http://mail.python.org/pipermail/python-dev/2011-October/113956.html) > included a lot of good information and discussion, but ultimately > didn't reach any firm conclusions. > > First question - is this a Windows only problem, or do Unix/MacOS > users want binary support? My feeling is that it's not an issue for > them, at least not enough that anyone has done anything about it in > the past, so I'll focus on Windows here. I haven't been following this discussion that closely but I'm rather surprised that the need for binary distributions for Python packages on non-Windows platforms would be in question. Just as on Windows, it's not a given that all Unix or Mac OS X end-user systems will have the necessary development tools installed (C compiler, etc) to build C extension modules. Today, the most platform-independent way of distributing these are with binary eggs: the individual binary eggs are, of course, not platform-independent but the distribution and installation mechanism is or should be. Sure, there are other ways, like pushing the problem back to the OS distributor (e.g. Debian, Red Hat, et al) or, as in the case of Mac OS X where there isn't a system package manager in the same sense, to a third-party package distributor (like MacPorts, Homebrew, or Fink). Or you can produce platform-specific installers for each platform which also seems heavy-weight. Has anyone analyzed the current packages on PyPI to see how many provide binary distributions and in what format? 
-- Ned Deily, nad at acm.org From tseaver at palladion.com Sun Oct 30 22:14:55 2011 From: tseaver at palladion.com (Tres Seaver) Date: Sun, 30 Oct 2011 17:14:55 -0400 Subject: [Python-Dev] Packaging and binary distributions In-Reply-To: References: Message-ID: -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 10/30/2011 02:04 PM, Ned Deily wrote: > In article > , > > Paul Moore wrote: > >> I'd like to reopen the discussions on how the new packaging >> module will handle/support binary distributions in Python 3.3. >> The previous thread (see >> http://mail.python.org/pipermail/python-dev/2011-October/113956.html) >> >> included a lot of good information and discussion, but ultimately >> didn't reach any firm conclusions. >> >> First question - is this a Windows only problem, or do >> Unix/MacOS users want binary support? My feeling is that it's not >> an issue for them, at least not enough that anyone has done >> anything about it in the past, so I'll focus on Windows here. > > I haven't been following this discussion that closely but I'm > rather surprised that the need for binary distributions for Python > packages on non-Windows platforms would be in question. Just as on > Windows, it's not a given that all Unix or Mac OS X end-user > systems will have the necessary development tools installed (C > compiler, etc) to build C extension modules. Today, the most > platform-independent way of distributing these are with binary > eggs: the individual binary eggs are, of course, not > platform-independent but the distribution and installation > mechanism is or should be. Sure, there are other ways, like > pushing the problem back to the OS distributor (e.g. Debian, Red > Hat, et al) or, as in the case of Mac OS X where there isn't a > system package manager in the same sense, to a third-party package > distributor (like MacPorts, Homebrew, or Fink). Or you can produce > platform-specific installers for each platform which also seems > heavy-weight. 
> > Has anyone analyzed the current packages on PyPI to see how many > provide binary distributions and in what format? Practically speaking, nobody but Windows consumers *needs* binary packages on PyPI: even if the target ("production") box is crippled^Wstripped of its compiler, such environments always have "staging" hosts which can be used to build binary packages for internal distribution. Windows users are the only ones who routinely don't have access to a compiler at all. Even trying to push binary distributions to PyPI for Linux is a nightmare (e.g., due to UCS2 / UCS4 incompatibility). Tres. - -- =================================================================== Tres Seaver +1 540-429-0999 tseaver at palladion.com Palladion Software "Excellence by Design" http://palladion.com -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAk6tvk4ACgkQ+gerLs4ltQ7zLwCfa0tvsRUtkwC3OkhYwGD7eGvL pbwAoLAm416vdyS3qbGDf/2R9iEtw2rH =tcS+ -----END PGP SIGNATURE----- From victor.stinner at haypocalc.com Sun Oct 30 22:26:20 2011 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Sun, 30 Oct 2011 22:26:20 +0100 Subject: [Python-Dev] Emit a BytesWarning on bytes filenames on Windows In-Reply-To: <4EAD0433.50005@v.loewis.de> References: <201110290052.41619.victor.stinner@haypocalc.com> <4EAB9355.8010906@gmail.com> <4EAC3C44.7050606@v.loewis.de> <878vo3pbfr.fsf@uwakimon.sk.tsukuba.ac.jp> <4EAD0433.50005@v.loewis.de> Message-ID: <4EADC0FC.2040401@haypocalc.com> On 30/10/2011 09:00, "Martin v. Löwis" wrote: >> As quoted above, deprecation of the bytes version of the API sounds >> fine to me, but isn't this going to run into the usual objections from >> the "we need bytes for efficiency" crowd? It's OK with me to >> say "in this restricted area you must convert to Unicode", but is that >> going to fly with that constituency?
> > I don't think this "we need bytes for efficiency" crowd actually exists. > We are talking about file names here. The relevant crowd is the > "we need bytes for correctness", and that crowd focuses primarily on > Unix. Oh, by the way, it is important to know that Unicode filenames are the best way to write portable programs with Python 3. On UNIX, since Python 3.1, undecodable filenames don't raise Unicode errors: undecodable bytes are stored as surrogates (see PEP 383). So even if the computer is completely misconfigured, it "just works". On Windows, you must use Unicode for filenames for correctness. Anyway, with Python 3, it's easier to manipulate Unicode strings than bytes strings. Martin finally agreed with me, I should hurry to implement my idea! :-) Victor From victor.stinner at haypocalc.com Sun Oct 30 22:39:27 2011 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Sun, 30 Oct 2011 22:39:27 +0100 Subject: [Python-Dev] Emit a BytesWarning on bytes filenames on Windows In-Reply-To: <4EAB9355.8010906@gmail.com> References: <201110290052.41619.victor.stinner@haypocalc.com> <4EAB9355.8010906@gmail.com> Message-ID: <4EADC40F.5050607@haypocalc.com> On 29/10/2011 07:47, Mark Hammond wrote: > When previously discussing this issue, I was under the impression that > the problem was unencodable bytes passed from the Python code to Windows > - but the reverse is true - only the data coming back from Windows isn't > encodable. The undecodable filenames issue occurs mostly on os.listdir(bytes) and os.getcwdb(). The unencodable filenames issue occurs on the rest of my function list. > As the data came externally, the only solution the programmer > has is to change to the unicode version of the api > - so we recommend the bytes version not be used by anyone, > anytime - which means it is conceptually deprecated already.
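The PEP 383 behaviour Victor refers to can be demonstrated in a couple of lines; a minimal sketch (the filename bytes here are invented for illustration):

```python
# A byte string such as os.listdir(b'.') might return on a misconfigured
# UNIX box: 0xFF can never appear in valid UTF-8.
raw = b"caf\xff.txt"

# Decoding with the "surrogateescape" error handler (what the filesystem
# encoding uses since Python 3.1, per PEP 383) smuggles each undecodable
# byte into the str as a lone surrogate instead of raising an error.
name = raw.decode("utf-8", "surrogateescape")
print(ascii(name))  # 'caf\udcff.txt'

# The round-trip back to bytes is lossless, so the file can still be opened.
assert name.encode("utf-8", "surrogateescape") == raw
```

This is why a completely misconfigured locale still "just works" on UNIX: nothing is lost as long as the same error handler is used in both directions.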
I proposed to raise a Unicode error on undecodable filenames, instead of returning invalid filenames (with question marks), to force the developer to move to the Unicode API. But as I explained in my previous message, you have to wait for a user to actually hit the problem before being notified of it. Terry J. Reedy is also concerned about backward compatibility (3.2 -> 3.3). Emitting a warning, disabled by default, is a softer solution :-) > Therefore, as you imply, I think the solution to this issue is to start > the process of deprecating the bytes version of the api in py3k with a > view to removing it completely - possibly with a less aggressive > timeline than normal. If there is a warning, I don't really care about removing the bytes API before Python 4. PendingDeprecationWarning can be used, or maybe a DeprecationWarning mentioning that the code will stay for a long time. > In Python 2.7, I think documenting the issue and a > recommendation to always use unicode is sufficient (ie, we can't > deprecate it and a new BytesWarning seems gratuitous.) Sorry, I don't understand "gratuitous" here: do you mean that a new warning would be annoying, and that it is cheap and useful to add it to Python 2.7.x?
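A warning "disabled by default" behaves the way PendingDeprecationWarning already does: silent unless the user opts in via a warnings filter (or -W on the command line). A quick sketch of the mechanics, using a hypothetical stand-in function for the bytes API:

```python
import warnings

def listdir_bytes(path):
    """Hypothetical stand-in for a bytes-filename API on Windows."""
    warnings.warn("bytes filenames are not recommended on Windows; use str",
                  PendingDeprecationWarning, stacklevel=2)
    return []

# Ignored by default, so existing code keeps running silently:
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("ignore", PendingDeprecationWarning)
    listdir_bytes(b".")
print(len(caught))  # 0

# A developer who opts in sees the warning:
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    listdir_bytes(b".")
print(caught[0].category.__name__)  # PendingDeprecationWarning
```

The same pattern would apply whatever the eventual category (PendingDeprecationWarning, DeprecationWarning, or a BytesWarning variant).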
Victor From vinay_sajip at yahoo.co.uk Sun Oct 30 23:47:13 2011 From: vinay_sajip at yahoo.co.uk (Vinay Sajip) Date: Sun, 30 Oct 2011 22:47:13 +0000 (UTC) Subject: [Python-Dev] draft PEP: virtual environments References: <4EAAF66F.9020603@oddbird.net> <20111029172342.61dbbd71@pitrou.net> <20111030133958.0c336199@pitrou.net> <20111030183551.2ce82c11@pitrou.net> Message-ID: Antoine Pitrou pitrou.net> writes: > > It would be even simpler not to process it at all, but install the > scripts as-is (without the execute bit) :) > Sure, but such an approach makes it difficult to provide a mechanism which is easily extensible; for example, with the current approach, it is straightforward for third-party tools to completely replace, selectively update or simply augment the scripts provided by base classes. Regards, Vinay Sajip From p.f.moore at gmail.com Sun Oct 30 23:52:44 2011 From: p.f.moore at gmail.com (Paul Moore) Date: Sun, 30 Oct 2011 22:52:44 +0000 Subject: [Python-Dev] Packaging and binary distributions In-Reply-To: References: Message-ID: On 30 October 2011 18:04, Ned Deily wrote: > Has anyone analyzed the current packages on PyPI to see how many provide > binary distributions and in what format? A very quick and dirty check:

dmg: 5
rpm: 12
msi: 23
dumb: 132
wininst: 364
egg: 2570

That's the number of packages with binary distributions in that format. It's hard to be sure about egg distributions, as many of these could be pure-python (there's no way I know, from the PyPI metadata, to check this). This is 2913 packages with some form of binary distribution out of 16615 that have a single release on PyPI. I skipped 398 with multiple releases as I wasn't sure how to capture the data for those... I suspect they include some important cases, though (I know lxml is in there, for example). So: 17% of packages have any binary release. Of those, 88% have eggs, 12% have wininst and the rest are under 5%. Put another way, 2% of all packages have wininst installers.
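Paul's counts presumably come from bucketing each release file by its type; a rough re-creation of that bucketing from filenames alone (the sample filenames are made up, and a real analysis would use PyPI's per-file metadata rather than guessing from extensions):

```python
from collections import Counter

def dist_format(filename):
    """Crudely classify a PyPI release file by its extension."""
    table = [(".egg", "egg"), (".exe", "wininst"), (".msi", "msi"),
             (".rpm", "rpm"), (".dmg", "dmg"),
             (".tar.gz", "sdist"), (".zip", "sdist"), (".tar.bz2", "sdist")]
    for suffix, fmt in table:
        if filename.endswith(suffix):
            return fmt
    return "other"

sample = ["lxml-2.3.win32-py2.7.exe",
          "lxml-2.3.tar.gz",
          "simplejson-2.2.1-py2.7-win32.egg",
          "foo-1.0.win-amd64.msi"]
print(Counter(dist_format(f) for f in sample))
```

Note the ambiguity Paul mentions: an .egg (or a bdist_dumb .tar.gz) may be pure Python, so extension-based counts can only overestimate the truly binary distributions.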
And 15% have eggs. That's not a lot. Paul. From solipsis at pitrou.net Sun Oct 30 23:59:44 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sun, 30 Oct 2011 23:59:44 +0100 Subject: [Python-Dev] draft PEP: virtual environments References: <4EAAF66F.9020603@oddbird.net> <20111029172342.61dbbd71@pitrou.net> <20111030133958.0c336199@pitrou.net> <20111030183551.2ce82c11@pitrou.net> Message-ID: <20111030235944.240770f2@pitrou.net> On Sun, 30 Oct 2011 22:47:13 +0000 (UTC) Vinay Sajip wrote: > Antoine Pitrou pitrou.net> writes: > > > > > It would be even simpler not to process it at all, but install the > > scripts as-is (without the execute bit) :) > > > > Sure, but such an approach makes it difficult to provide a mechanism which is > easily extensible; for example, with the current approach, it is straightforward > for third party tools to either easily replace completely, update selectively or > augment simply the scripts provided by base classes. I don't understand why a zip file makes this easier (especially the "update selectively" part). Regards Antoine. From vinay_sajip at yahoo.co.uk Mon Oct 31 00:17:17 2011 From: vinay_sajip at yahoo.co.uk (Vinay Sajip) Date: Sun, 30 Oct 2011 23:17:17 +0000 (UTC) Subject: [Python-Dev] Packaging and binary distributions References: Message-ID: Paul Moore gmail.com> writes: > The MSI format is a little more tricky, mainly because it is a more > complex format and (as far as I can tell from a brief check) files are > stored in the opaque CAB format, so the only way of getting data out > is to do a temporary install somewhere. But I see no reason why that > isn't achievable. It's not just about getting the data out of the CAB, though - it's also about integration with Add/Remove Programs and the rest of the Windows Installer ecosystem. > 1. 
I will write a 3rd party module to take bdist_wininst and bdist_msi > modules and install them using packaging It would be important to retain the flexibility currently offered by setup.cfg hooks, as I don't believe any out-of-the-box approach will work for the wide range of use cases on Windows (think Powershell scripts, Visio templates and other Microsoft Office integration components). I'm also not sure if these formats provide all the flexibility required - e.g. they may be fine for extension modules, but how do they handle packaging include files? > For 3.3 at least, I'd expect developers to continue distributing > bdist_wininst or bdist_msi format files. We'll see what happens with > binary eggs. > > Unix/MacOS users who care will need to propose something themselves. I'm not sure there's anything especially Windows-specific about the bdist_wininst format, apart from the prepended GUI executable. One drawback of any current scheme is that if you're packaging an extension module that runs on say Windows, Linux and Mac OS X, there's no easy way to build or distribute a single archive (for a given version of Python, say) which has all the binary variants you want to include, such that at installation time, only the bits relevant to the target platform are installed. The current packaging functionality does sort of support this, but it entails potentially tedious manual editing of the setup.cfg file to add information about what resources apply to which platform - the kind of tedious editing which would be obviated by the right kind of additional support code. Regards, Vinay Sajip From songofacandy at gmail.com Mon Oct 31 00:26:27 2011 From: songofacandy at gmail.com (INADA Naoki) Date: Mon, 31 Oct 2011 08:26:27 +0900 Subject: [Python-Dev] Packaging and binary distributions In-Reply-To: References: Message-ID: I like binary distribution even under Linux.
I access some Linux machines running the same Linux distribution, and some of them don't have the "python-dev" package or even "build-essential" (because they are netbooting and so have a restricted rootfs size). So I want to build binary packages myself and distribute them to virtualenvs on such machines. In this case, the absolute path of the virtualenv is not fixed. So "bdist_dumb --relative" or egg is good for me. On Sun, Oct 30, 2011 at 11:09 PM, Paul Moore wrote: > I'd like to reopen the discussions on how the new packaging module > will handle/support binary distributions in Python 3.3. The previous > thread (see http://mail.python.org/pipermail/python-dev/2011-October/113956.html) > included a lot of good information and discussion, but ultimately > didn't reach any firm conclusions. > > First question - is this a Windows only problem, or do Unix/MacOS > users want binary support? My feeling is that it's not an issue for > them, at least not enough that anyone has done anything about it in > the past, so I'll focus on Windows here. > > Second question - is there a problem at all? For the majority of > Windows users, I suspect not. The existing bdist_wininst and bdist_msi > formats have worked fine for a long time, offer Windows integration > and a GUI installer, and in the case of MSI offer options for > integrating with corporate distribution policies that some users > consider significant, if not essential. (Binary eggs are a third, and > somewhat odd, case - a number of projects have started distributing > binary eggs, but I don't know what benefits they have over > bdist_wininst in particular, as easy_install will read bdist_wininst > installers. Perhaps a setuptools/distribute user could comment. For > now I'll assume that binary eggs will slowly go away as packaging gets > more widely adopted).
> > So that leaves a minority who (1) prefer integration with packaging, > (2) need to work with virtual environments or custom local builds, (3) > need binary extensions in some or all of their environments and (4) > don't want to have to build all the binaries they need from scratch. > > Given the scale of the issue, it seems likely that putting significant > effort into addressing it is unwise. In particular, it seems unlikely > that developers are going to move en masse to a new distribution > format just to cater for this minority. On the other hand, for people > who care, the fact that packaging (currently) offers no direct support > for consuming binary distributions is a fairly obvious hole. And > having to build from source just to install into a virtual environment > could be a showstopper. > > The bdist_wininst format is relatively amenable to manipulation - it's > little more than a zip file, after all. So writing 3rd party code to > install the contents via packaging shouldn't be hard (I've done some > proof of concept work, and it isn't :-)) Vinay's proposal to use the > resource mechanism and some custom hooks would work, but I'd like to > see a small amount of extra direct support added to packaging to make > things cleaner. Also, if packaging supported plugins to recognise new > distribution formats, this would make it possible to integrate the > extra code seamlessly. > > The MSI format is a little more tricky, mainly because it is a more > complex format and (as far as I can tell from a brief check) files are > stored in the opaque CAB format, so the only way of getting data out > is to do a temporary install somewhere. But I see no reason why that > isn't achievable. > > So, my proposal is as follows: > > 1. I will write a 3rd party module to take bdist_wininst and bdist_msi > modules and install them using packaging > 2. Where packaging changes are useful to make installing binaries > easier, I'll request them (by supplying patches) > 3.
I'll look at creating a format-handling plugin mechanism for > packaging. If it's viable, I'll post patches > 4. If it seems useful, my module could be integrated into the core > packaging module > > I don't intend to do anything about a GUI, or modify the existing > formats at all. These don't interest me, particularly, so I'll leave > them to someone who has a clear picture of what they want in those > areas, and the time to develop it. > > For 3.3 at least, I'd expect developers to continue distributing > bdist_wininst or bdist_msi format files. We'll see what happens with > binary eggs. > > Unix/MacOS users who care will need to propose something themselves. > > Does anyone have any comments? > > Paul. > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/songofacandy%40gmail.com > -- INADA Naoki? From p.f.moore at gmail.com Mon Oct 31 00:54:25 2011 From: p.f.moore at gmail.com (Paul Moore) Date: Sun, 30 Oct 2011 23:54:25 +0000 Subject: [Python-Dev] Packaging and binary distributions In-Reply-To: References: Message-ID: On 30 October 2011 23:17, Vinay Sajip wrote: > Paul Moore gmail.com> writes: > >> The MSI format is a little more tricky, mainly because it is a more >> complex format and (as far as I can tell from a brief check) files are >> stored in the opaque CAB format, so the only way of getting data out >> is to do a temporary install somewhere. But I see no reason why that >> isn't achievable. > > It's not just about getting the data out of the CAB, though - it's also about > integration with Add/Remove Programs and the rest of the Windows Installer > ecosystem. Hang on. I'm talking here about repackaging the binary files in the MSI file for use in a pysetup install invocation. As pysetup has no GUI, and doesn't integrate with Add/Remove, there's no issue here. 
If you want a GUI and Add/Remove integration, just run the MSI. Or am I missing something? We seem to be at cross purposes here, I suspect I'm missing your point. >> 1. I will write a 3rd party module to take bdist_wininst and bdist_msi >> modules and install them using packaging > > It would be important to retain the flexibility currently offered by setup.cfg > hooks, as I don't believe any out-of-the-box approach will work for the wide > range of use cases on Windows (think Powershell scripts, Visio templates and > other Microsoft Office integration components). Why? Again, if this is purely as a means to consume bdist_xxx files, then the only flexibility needed is enough to cater for any variations in data stored in the bdist_xxx format. The wininst format is easy here - it has directories PLATLIB, PURELIB, DATA, SCRIPTS and HEADERS (corresponding to the installation --install-xxx parameters) and that's all. As long as the module is flexible enough to deal with that, it can read anything bdist_wininst can produce. > I'm also not sure if these formats provide all the flexibility required - e.g. > they may be fine for extension modules, but how do they handle packaging include > files? Ah, I think I see what you are getting at. If someone uses the new features and flexibility of packaging to create a fancy custom install scheme, how do they bundle up a binary distribution from that? My (current) answer is that I don't know. The packaging module as it stands only offers the legacy bdist_xxx formats, so the answer is "run pysetup run bdist_wininst on it". If that breaks (as it is likely to - wininst format isn't very flexible) then tough, you're out of luck. I 100% agree that having a "native" packaging means of building binary distributions from source ones, which captures all of the necessary information to cover any flexibility available to setup.cfg, would be good. But that's potentially a much bigger project than I can manage.
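Paul's earlier observation that a bdist_wininst installer is "little more than a zip file" can be exploited directly: the installer is a GUI stub executable with an ordinary zip archive appended, and since zipfile locates the archive from the end of the file, it can usually read such files as-is. A sketch, with the "installer" simulated here by prepending stub bytes to an in-memory zip:

```python
import io
import zipfile

INSTALL_DIRS = ("PLATLIB", "PURELIB", "DATA", "SCRIPTS", "HEADERS")

def wininst_members(fileobj):
    """Group archive members under bdist_wininst's top-level directories."""
    groups = {}
    with zipfile.ZipFile(fileobj) as zf:
        for name in zf.namelist():
            top, _, rest = name.partition("/")
            if top in INSTALL_DIRS:
                groups.setdefault(top, []).append(rest)
    return groups

# Build a fake installer: an executable stub followed by the payload zip.
payload = io.BytesIO()
with zipfile.ZipFile(payload, "w") as zf:
    zf.writestr("PURELIB/demo/__init__.py", "")
    zf.writestr("SCRIPTS/demo-script.py", "print('hello')")
fake_installer = io.BytesIO(b"MZ...pretend GUI stub..." + payload.getvalue())

print(wininst_members(fake_installer))
# {'PURELIB': ['demo/__init__.py'], 'SCRIPTS': ['demo-script.py']}
```

Each top-level directory then maps onto the corresponding --install-xxx location at install time; a real tool would also handle the installer's metadata and postinstall script, which this sketch ignores.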
My bdist_simple format was based off bdist_dumb/bdist_wininst and had the same limitations as that. You might be able to get somewhere by running build, then zipping up the whole directory, source, build subdirectory and all. Then on the target machine, unzip and do a --skip-build install. That's a bit of a hack, but should in theory work. Whether it's the basis of a sensible distribution format I don't know. >> For 3.3 at least, I'd expect developers to continue distributing >> bdist_wininst or bdist_msi format files. We'll see what happens with >> binary eggs. >> >> Unix/MacOS users who care will need to propose something themselves. > > I'm not sure there's anything especially Windows-specific about the > bdist_wininst format, apart from the prepended GUI executable. One drawback of > any current scheme is that if you're packaging an extension module that runs on > say Windows, Linux and Mac OS X, there's no easy way to build or distribute a > single archive (for a given version of Python, say) which has all the binary > variants you want to include, such that at installation time, only the bits > relevant to the target platform are installed. The current packaging > functionality does sort of support this, but it entails potentially tedious > manual editing of the setup.cfg file to add information about what resources > apply to which platform - the kind of tedious editing which would be obviated by > the right kind of additional support code. Again, I agree that this would be useful. Not something I have the time to look at though (although if someone else picks it up, I'd be interested in doing some testing and maybe contributing to the work). I think I now see why we're not understanding each other. 
I'm coming from the position that the projects I care about (as an end user) use bdist_wininst or bdist_msi at the moment, so all I want is a way of using, as a consumer, those existing distributions (or something equivalent in power) to install the packages via pysetup (which gets me the ability to install in development builds and venvs). I see why a more powerful binary format would be nice for developers, but as an end user I have no direct need for it. Thanks for your patience. Paul. From skippy.hammond at gmail.com Mon Oct 31 03:51:28 2011 From: skippy.hammond at gmail.com (Mark Hammond) Date: Mon, 31 Oct 2011 13:51:28 +1100 Subject: [Python-Dev] Emit a BytesWarning on bytes filenames on Windows In-Reply-To: <4EADC40F.5050607@haypocalc.com> References: <201110290052.41619.victor.stinner@haypocalc.com> <4EAB9355.8010906@gmail.com> <4EADC40F.5050607@haypocalc.com> Message-ID: <4EAE0D30.5030805@gmail.com> On 31/10/2011 8:39 AM, Victor Stinner wrote: > On 29/10/2011 07:47, Mark Hammond wrote: >> When previously discussing this issue, I was under the impression that >> the problem was unencodable bytes passed from the Python code to Windows >> - but the reverse is true - only the data coming back from Windows isn't >> encodable. > > The undecodable filenames issue occurs mostly on os.listdir(bytes) and > os.getcwdb(). > > The unencodable filenames issue occurs on the rest of my function list. > >> As the data came externally, the only solution the programmer >> has is to change to the unicode version of the api >> - so we recommend the bytes version not be used by anyone, >> anytime - which means it is conceptually deprecated already. > > I proposed to raise a Unicode error on undecodable filenames, instead of > returning invalid filenames (with question marks), to force the > developer to move to the Unicode API. But as I explained in my previous > message, you have to wait for a user to actually hit the problem before > being notified of it. > > Terry J.
Reedy is also concerned about backward compatibility (3.2 -> > 3.3). Emitting a warning, disabled by default, is a softer solution :-) Right - and just to be clear, we are all now agreeing that the UnicodeDecodeError isn't appropriate and a warning will be issued instead? > >> Therefore, as you imply, I think the solution to this issue is to start >> the process of deprecating the bytes version of the api in py3k with a >> view to removing it completely - possibly with a less aggressive >> timeline than normal. > > If there is a warning, I don't really care about removing the bytes API > before Python 4. Agreed - I was trying to say that I think we should start the deprecation process of the bytes API, so a [Pending]DeprecationWarning would then be appropriate. The actual timing of the removal isn't important. > > PendingDeprecationWarning can be used, or maybe a DeprecationWarning > mentioning that the code will stay for a long time. > >> In Python 2.7, I think documenting the issue and a >> recommendation to always use unicode is sufficient (ie, we can't >> deprecate it and a new BytesWarning seems gratuitous.) > > Sorry, I don't understand "gratuitous" here: do you mean that a new > warning would be annoying, and that it is cheap and useful to add it to > Python 2.7.x? I mean "Uncalled for; lacking good reason; unwarranted." IOW, I don't think we need to take any action for 2.7, apart from possibly documentation changes. Mark From tjreedy at udel.edu Mon Oct 31 05:28:17 2011 From: tjreedy at udel.edu (Terry Reedy) Date: Mon, 31 Oct 2011 00:28:17 -0400 Subject: [Python-Dev] Emit a BytesWarning on bytes filenames on Windows In-Reply-To: <4EADC40F.5050607@haypocalc.com> References: <201110290052.41619.victor.stinner@haypocalc.com> <4EAB9355.8010906@gmail.com> <4EADC40F.5050607@haypocalc.com> Message-ID: On 10/30/2011 5:39 PM, Victor Stinner wrote: > Terry J. Reedy is also concerned about backward compatibility (3.2 -> > 3.3).
Emitting a warning, disabled by default, is a softer solution :-) The fact that Mark, Martin, and someone else, I believe, agree with you that the bytes api is not useful at all in 3.x and should go away reduces my concern. This fact does suggest that it is not worth changing anything to make those APIs easier to use. Instead, better to encourage people not to use those APIs in any 3.x code. Removal is ultimately, of course, the hardest solution. -- Terry Jan Reedy From vinay_sajip at yahoo.co.uk Mon Oct 31 08:50:24 2011 From: vinay_sajip at yahoo.co.uk (Vinay Sajip) Date: Mon, 31 Oct 2011 07:50:24 +0000 (UTC) Subject: [Python-Dev] draft PEP: virtual environments References: <4EAAF66F.9020603@oddbird.net> <20111029172342.61dbbd71@pitrou.net> <20111030133958.0c336199@pitrou.net> <20111030183551.2ce82c11@pitrou.net> <20111030235944.240770f2@pitrou.net> Message-ID: Antoine Pitrou pitrou.net> writes: > I don't understand why a zip file makes this easier (especially the > "update selectively" part). Not a zip file specifically - just a binary stream which organises scripts to be installed. If each class in a hierarchy has access to a binary stream, then subclasses have access to the streams for base classes as well as their own stream, and can install selectively from base class streams and their own stream.

class Base:
    scripts = ...  # zip stream containing scripts A, B

    def install_scripts(self, stream):
        # ...

    def setup_scripts(self):
        self.install_scripts(self.scripts)

class Derived(Base):
    scripts = ...  # zip stream containing modified script B, new script C

    def setup_scripts(self):
        self.install_scripts(Base.scripts)  # adds A, B
        self.install_scripts(self.scripts)  # adds C, overwrites B

I'm not saying you couldn't do this with e.g. directory trees; it just seems neater to have the scripts in a black box once they're deployed, with a zip file representing that black box.
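Vinay's sketch above can be fleshed out into something runnable with zipfile and in-memory streams (the script names and contents are invented; real code would also set execute permissions and so on):

```python
import io
import os
import tempfile
import zipfile

def make_zip(files):
    """Build an in-memory zip stream from a {name: text} mapping."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name, text in files.items():
            zf.writestr(name, text)
    return buf.getvalue()

class Base:
    scripts = make_zip({"A": "echo base A", "B": "echo base B"})

    def __init__(self, target):
        self.target = target

    def install_scripts(self, stream):
        # Later streams silently overwrite files installed by earlier ones.
        with zipfile.ZipFile(io.BytesIO(stream)) as zf:
            zf.extractall(self.target)

    def setup_scripts(self):
        self.install_scripts(self.scripts)

class Derived(Base):
    scripts = make_zip({"B": "echo derived B", "C": "echo derived C"})

    def setup_scripts(self):
        self.install_scripts(Base.scripts)   # adds A, B
        self.install_scripts(self.scripts)   # adds C, overwrites B

with tempfile.TemporaryDirectory() as target:
    Derived(target).setup_scripts()
    print(sorted(os.listdir(target)))  # ['A', 'B', 'C']
```

The selective-update point is visible in the result: the derived class's stream wins for B while the base class's A survives untouched.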
Regards, Vinay Sajip From vinay_sajip at yahoo.co.uk Mon Oct 31 09:07:12 2011 From: vinay_sajip at yahoo.co.uk (Vinay Sajip) Date: Mon, 31 Oct 2011 08:07:12 +0000 (UTC) Subject: [Python-Dev] Packaging and binary distributions References: Message-ID: Paul Moore gmail.com> writes: > Hang on. I'm talking here about repackaging the binary files in the > MSI file for use in a pysetup install invocation. As pysetup has no > GUI, and doesn't integrate with Add/Remove, there's no issue here. If > you want a GUI and Add/Remove integration, just run the MSI. Or am I > missing something? We seem to be at cross purposes here, I suspect I'm > missing your point. As you say later in your post, we're probably just coming at this from two different perspectives. I think you mentioned the possible need to install to a temporary location just to extract files from the CAB; then you would presumably need to uninstall again to remove the Add/Remove Programs entry created when you installed to the temporary location (or else I misunderstood your meaning here). >> It would be important to retain the flexibility offered by setup.cfg >> hooks, as I don't believe any out-of-the-box approach will work for the >> range of use cases on Windows (think Powershell scripts, Visio templates >> and other Microsoft Office integration components). > > Why? Again, if this is purely as a means to consume bdist_xxx files, > then the only flexibility needed is enough to cater for any variations > in data stored in the bdist_xxx format. The wininst format is easy > here - it has directories PLATLIB, PURELIB, DATA, SCRIPTS and HEADERS > (corresponding to the installation --install-xxx parameters) and > that's all. As long as the module is flexible enough to deal with > that, it can read anything bdist_wininst can produce. My point is really that a one-size-fits-all DATA location is unlikely to cater to all use cases. 
The flexibility offered by setup.cfg together with hooks gets around the limitation of a single location for data. > Ah, I think I see what you are getting at. If someone uses the new > features and flexibility of packaging to create a fancy custom install > scheme, how do they bundle up a binary distribution from that? My > (current) answer is that I don't know. The packaging module as it > stands only offers the legacy bdist_xxx formats, so the answer is "run > pysetup run bdist_wininst on it". If that breaks (as it is likely to - > wininst format isn't very flexible) then tough, you're out of luck. Yes, that's what I was getting at. Regards, Vinay Sajip From eric at trueblade.com Mon Oct 31 10:59:09 2011 From: eric at trueblade.com (Eric V. Smith) Date: Mon, 31 Oct 2011 05:59:09 -0400 Subject: [Python-Dev] Packaging and binary distributions In-Reply-To: References: Message-ID: <4EAE716D.8090703@trueblade.com> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 10/30/2011 5:14 PM, Tres Seaver wrote: > On 10/30/2011 02:04 PM, Ned Deily wrote: >> In article >> , > >> > > Paul Moore wrote: > >>> I'd like to reopen the discussions on how the new packaging >>> module will handle/support binary distributions in Python 3.3. >>> The previous thread (see >>> http://mail.python.org/pipermail/python-dev/2011-October/113956.html) >>> >>> > >>> included a lot of good information and discussion, but ultimately >>> didn't reach any firm conclusions. >>> >>> First question - is this a Windows only problem, or do >>> Unix/MacOS users want binary support? My feeling is that it's >>> not an issue for them, at least not enough that anyone has >>> done anything about it in the past, so I'll focus on Windows >>> here. > >> I haven't been following this discussion that closely but I'm >> rather surprised that the need for binary distributions for >> Python packages on non-Windows platforms would be in question. 
>> Just as on Windows, it's not a given that all Unix or Mac OS X >> end-user systems will have the necessary development tools >> installed (C compiler, etc) to build C extension modules. Today, >> the most platform-independent way of distributing these is with >> binary eggs: the individual binary eggs are, of course, not >> platform-independent but the distribution and installation >> mechanism is or should be. Sure, there are other ways, like >> pushing the problem back to the OS distributor (e.g. Debian, Red >> Hat, et al) or, as in the case of Mac OS X where there isn't a >> system package manager in the same sense, to a third-party >> package distributor (like MacPorts, Homebrew, or Fink). Or you >> can produce platform-specific installers for each platform which >> also seems heavy-weight. I don't think pushing it back to the OS vendor solves the problem. Say I want to install these binary packages with buildout: how would it go about consuming an RPM to install into an isolated buildout directory? >> Has anyone analyzed the current packages on PyPI to see how many >> provide binary distributions and in what format? > > Practically speaking, nobody but Windows consumers *needs* binary > packages on PyPI: even if the target ("production") box is > crippled^Wstripped of its compiler, such environments always have > "staging" hosts which can be used to build binary packages for > internal distribution. It might be true that such systems don't need binary packages on PyPI, but the original question is about binary package support for the packaging module on non-Windows systems. I think the answer is clearly "yes": I have such systems without compilers. If I build packages on a staging server, I would want to put them on an internal PyPI-like server, for consumption by packaging. So packaging would need to consume these binary packages. Eric.
-----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.9 (Cygwin) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iQEcBAEBAgAGBQJOrnFiAAoJENxauZFcKtNxLG0H/03d0uRXw/MvlCA9q92OlwWk +X2PqpZ/F5aFBuN3lsichr/qLiHm69tNu3K++JyLXypT7hzbiB8QEbVUn5Z8X2ds is/6wKIX5Hmd//UlX+VtlYZQSXd/1k7FbqFY0CPTRFGrE+I9ipfCnO3h1OiBwHpY eejoR4Lr/6MXZ+v7DdlyRC9mWZV/uNKnR0ec5ABbQIEC13/j91gR/57ua/ryhRmT hco4ssRSP9pqO058aVJ1ivw2q+9364f7DgWynafRjkrcTy80gZ90LTz7WtteeFPr QO2yFW8ZI0UsxUxNRsDBj1N91AVHngU6HJa1evgegUPRjl94neSQLLWLla37qfQ= =2b7E -----END PGP SIGNATURE----- From martin at v.loewis.de Mon Oct 31 11:42:58 2011 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon, 31 Oct 2011 11:42:58 +0100 Subject: [Python-Dev] Packaging and binary distributions In-Reply-To: References: Message-ID: <4EAE7BB2.2050502@v.loewis.de> Am 31.10.2011 09:07, schrieb Vinay Sajip: > Paul Moore gmail.com> writes: > >> Hang on. I'm talking here about repackaging the binary files in the >> MSI file for use in a pysetup install invocation. As pysetup has no >> GUI, and doesn't integrate with Add/Remove, there's no issue here. If >> you want a GUI and Add/Remove integration, just run the MSI. Or am I >> missing something? We seem to be at cross purposes here, I suspect I'm >> missing your point. > > As you say later in your post, we're probably just coming at this from two > different perspectives. I think you mentioned the possible need to install to a > temporary location just to extract files from the CAB; then you would > presumably need to uninstall again to remove the Add/Remove Programs entry > created when you installed to the temporary location (or else I misunderstood > your meaning here). This presumption is false (as is the claim that you need to install the MSI to get at the files). It's quite possible to extract the files from the MSI without performing the installation. 
There are actually two ways to do that: a) perform an "administrative" installation, which unpacks the files to disk but doesn't actually perform any installation procedure, or b) use the MSI API to extract first the CAB file, and then the files in the CAB file. This would be a bit of work to do if you want to find out the full path names of the individual files, but it could work in theory. >> Why? Again, if this is purely as a means to consume bdist_xxx files, >> then the only flexibility needed is enough to cater for any variations >> in data stored in the bdist_xxx format. The wininst format is easy >> here - it has directories PLATLIB, PURELIB, DATA, SCRIPTS and HEADERS >> (corresponding to the installation --install-xxx parameters) and >> that's all. As long as the module is flexible enough to deal with >> that, it can read anything bdist_wininst can produce. > > My point is really that a one-size-fits-all DATA location is unlikely to cater > to all use cases. The flexibility offered by setup.cfg together with hooks gets > around the limitation of a single location for data. I'm sure bdist_wininst can be augmented to support arbitrary "base prefixes" (assuming that is the flexibility you talk about). It would just need a list of what directory names are prefixes. The MSI format is designed to provide exactly that flexibility of arbitrarily mapping source folders to destination folders during installation. bdist_msi would just need to be taught to interpret setup.cfg files. >> Ah, I think I see what you are getting at. If someone uses the new >> features and flexibility of packaging to create a fancy custom install >> scheme, how do they bundle up a binary distribution from that? My >> (current) answer is that I don't know. The packaging module as it >> stands only offers the legacy bdist_xxx formats, so the answer is "run >> pysetup run bdist_wininst on it".
If that breaks (as it is likely to - >> wininst format isn't very flexible) then tough, you're out of luck. > > Yes, that's what I was getting at. Hmm. You are just describing a bug, not an inherent limitation. Regards, Martin From p.f.moore at gmail.com Mon Oct 31 12:21:45 2011 From: p.f.moore at gmail.com (Paul Moore) Date: Mon, 31 Oct 2011 11:21:45 +0000 Subject: [Python-Dev] Packaging and binary distributions In-Reply-To: <4EAE7BB2.2050502@v.loewis.de> References: <4EAE7BB2.2050502@v.loewis.de> Message-ID: On 31 October 2011 10:42, "Martin v. Löwis" wrote: > Am 31.10.2011 09:07, schrieb Vinay Sajip: > This presumption is false (as is the claim that you need to install the > MSI to get at the files). It's quite possible to extract the files from > the MSI without performing the installation. There are actually two ways > to do that: > a) perform an "administrative" installation, which unpacks the files to > disk but doesn't actually perform any installation procedure, or > b) use the MSI API to extract first the CAB file, and then the files in > the CAB file. This would be a bit of work to do if you want to find out > the full path names of the individual files, but it could work in > theory. Yes, I'm currently doing an administrative install via msiexec to get the files out. It's simple enough to do. >> My point is really that a one-size-fits-all DATA location is unlikely to cater >> to all use cases. The flexibility offered by setup.cfg together with hooks gets >> around the limitation of a single location for data. > > I'm sure bdist_wininst can be augmented to support arbitrary "base > prefixes" (assuming that is the flexibility you talk about). It would > just need a list of what directory names are prefixes. > > The MSI format is designed to provide exactly that flexibility of > arbitrarily mapping source folders to destination folders during > installation. bdist_msi would just need to be taught to interpret > setup.cfg files.
Agreed - the "one size fits all" data location is a limitation. I'm not sure that in practical terms it is a big issue, though - it's been like that since the wininst format was designed, and nobody has ever complained. There are certainly cases where packages have needed to implement more or less clumsy workarounds (for example, not including documentation in binary distributions) but it's obviously never been enough of an issue to prompt people to fix it. The egg format has the same limitation, as far as I'm aware, so clearly even the "eggs solve everything" crowd don't feel it's a real issue :-) >>> Ah, I think I see what you are getting at. If someone uses the new >>> features and flexibility of packaging to create a fancy custom install >>> scheme, how do they bundle up a binary distribution from that? My >>> (current) answer is that I don't know. The packaging module as it >>> stands only offers the legacy bdist_xxx formats, so the answer is "run >>> pysetup run bdist_wininst on it". If that breaks (as it is likely to - >>> wininst format isn't very flexible) then tough, you're out of luck. >> >> Yes, that's what I was getting at. > > Hmm. You are just describing a bug, not an inherent limitation. Precisely. And it's a bug that no-one has felt the need to fix in many years. The flexibility is not new - distutils had at least as much flexibility if not more. I'd love to see a binary format that was as flexible and powerful as building from source, which allowed OS integration where the user wanted it while still supporting venvs and non-system installations, and which was widely adopted by distribution authors. Oh, and can I have a pony? :-) Sadly, I don't have the time or understanding of the various requirements to deliver something like that. Realistically, I'd just like to be able to benefit from the generosity of existing distribution authors who make compiled versions of their code available, however they choose to do so. 
Hence my current focus on consuming existing formats (and even the bdist_simple proposal/patch was little more than a tidied up bdist_wininst made OS-neutral). Paul. From vinay_sajip at yahoo.co.uk Mon Oct 31 12:38:24 2011 From: vinay_sajip at yahoo.co.uk (Vinay Sajip) Date: Mon, 31 Oct 2011 11:38:24 +0000 (UTC) Subject: [Python-Dev] Packaging and binary distributions References: <4EAE7BB2.2050502@v.loewis.de> Message-ID: Paul Moore gmail.com> writes: > Agreed - the "one size fits all" data location is a limitation. I'm > not sure that in practical terms it is a big issue, though - it's been > like that since the wininst format was designed, and nobody has ever > complained. There are certainly cases where packages have needed to > implement more or less clumsy workarounds (for example, not including > documentation in binary distributions) but it's obviously never been > enough of an issue to prompt people to fix it. The egg format has the > same limitation, as far as I'm aware, so clearly even the "eggs solve > everything" crowd don't feel it's a real issue Yes, but with setup.py you had the option of running any Python code to move things around using a post-install script, so people could get around those limitations, albeit in a completely ad hoc way. So there was nothing to fix, but no standard way of achieving what you wanted in out-of-the-ordinary scenarios. > I'd love to see a binary format that was as flexible and powerful as > building from source, which allowed OS integration where the user > wanted it while still supporting venvs and non-system installations, > and which was widely adopted by distribution authors. Oh, and can I > have a pony? Sadly, I don't have the time or understanding of the > various requirements to deliver something like that. Well, from the point of view of venvs and PEP 404, it's certainly topical and worth trying to get some traction behind this particular pony. 
If bdist_pony is easy enough to use and doesn't close any existing doors, then there's no obvious reason why distribution authors wouldn't use it for future releases of their distributions. Regards, Vinay Sajip From vinay_sajip at yahoo.co.uk Mon Oct 31 12:40:07 2011 From: vinay_sajip at yahoo.co.uk (Vinay Sajip) Date: Mon, 31 Oct 2011 11:40:07 +0000 (UTC) Subject: [Python-Dev] Packaging and binary distributions References: <4EAE7BB2.2050502@v.loewis.de> Message-ID: Martin v. Löwis v.loewis.de> writes: > This presumption is false (as is the claim that you need to install the > MSI to get at the files). It's quite possible to extract the files from > the MSI without performing the installation. There are actually two ways > to do that: > a) perform an "administrative" installation, which unpacks the files to > disk but doesn't actually perform any installation procedure, or > b) use the MSI API to extract first the CAB file, and then the files in > the CAB file. This would be a bit of work to do if you want to find out > the full path names of the individual files, but it could work in > theory. I'd completely forgotten about the administrative installation - thanks for reminding me. > The MSI format is designed to provide exactly that flexibility of > arbitrarily mapping source folders to destination folders during > installation. bdist_msi would just need to be taught to interpret > setup.cfg files. I agree in principle, but one thing you get with setup.cfg which seems harder to achieve with MSI is the use of Python to do things at installation time. For example, with setup.cfg hooks, you can use ctypes to make Windows API calls at installation time to decide where to put things. While this same flexibility exists in the MSI format (with custom actions and so forth) it's not as readily accessible to someone who wants to use Python to code this type of installation logic. > > Hmm. You are just describing a bug, not an inherent limitation.
You're right that it's not an inherent limitation, but I'm not sure which bug you're referring to. Do you mean just a current limitation? Regards, Vinay Sajip From carl at oddbird.net Mon Oct 31 15:06:26 2011 From: carl at oddbird.net (Carl Meyer) Date: Mon, 31 Oct 2011 08:06:26 -0600 Subject: [Python-Dev] draft PEP: virtual environments In-Reply-To: <1319843408.10593.3.camel@thinko> References: <4EAAF66F.9020603@oddbird.net> <1319843408.10593.3.camel@thinko> Message-ID: <4EAEAB62.3030506@oddbird.net> On 10/28/2011 05:10 PM, Chris McDonough wrote: >> Why not modify sys.prefix? >> -------------------------- >> >> As discussed above under `Backwards Compatibility`_, this PEP proposes >> to add ``sys.site_prefix`` as "the prefix relative to which >> site-package directories are found". This maintains compatibility with >> the documented meaning of ``sys.prefix`` (as the location relative to >> which the standard library can be found), but means that code assuming >> that site-packages directories are found relative to ``sys.prefix`` >> will not respect the virtual environment correctly. >> >> Since it is unable to modify ``distutils``/``sysconfig``, >> `virtualenv`_ is forced to instead re-point ``sys.prefix`` at the >> virtual environment. >> >> An argument could be made that this PEP should follow virtualenv's >> lead here (and introduce something like ``sys.base_prefix`` to point >> to the standard library and header files), since virtualenv already >> does this and it doesn't appear to have caused major problems with >> existing code. >> >> Another argument in favor of this is that it would be preferable to >> err on the side of greater, rather than lesser, isolation. Changing >> ``sys.prefix`` to point to the virtual environment and introducing a >> new ``sys.base_prefix`` attribute would err on the side of greater >> isolation in the face of existing code's use of ``sys.prefix``.
> > It would seem to make sense to me to err on the side of greater > isolation, introducing sys.base_prefix to indicate the base prefix (as > opposed to sys.site_prefix indicating the venv prefix). Bugs introduced > via a semi-isolated virtual environment are very difficult to > troubleshoot. It would also make changes to existing code unnecessary. > I have encountered no issues with virtualenv doing this so far. I'm convinced that this is the better tradeoff. I'll begin working on a branch of the reference implementation that does things this way. Thanks for the feedback. Carl From solipsis at pitrou.net Mon Oct 31 15:13:11 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 31 Oct 2011 15:13:11 +0100 Subject: [Python-Dev] draft PEP: virtual environments References: <4EAAF66F.9020603@oddbird.net> <20111029172342.61dbbd71@pitrou.net> <20111030133958.0c336199@pitrou.net> <20111030183551.2ce82c11@pitrou.net> <20111030235944.240770f2@pitrou.net> Message-ID: <20111031151311.43bed84d@pitrou.net> On Mon, 31 Oct 2011 07:50:24 +0000 (UTC) Vinay Sajip wrote: > Antoine Pitrou pitrou.net> writes: > > > I don't understand why a zip file makes this easier (especially the > > "update selectively" part). > > Not a zip file specifically - just a binary stream which organises scripts to be > installed. If each class in a hierarchy has access to a binary stream, then > subclasses have access to the streams for base classes as well as their own > stream, and can install selectively from base class streams and their own stream. Isn't that overengineered? We're talking about a couple of files.
It's not even obvious that third-party tools will want to modify them, instead of writing their own (if the venv API is stable, it should be relatively easy). > I'm not saying you couldn't do this with e.g. directory trees; it just seems > neater to have the scripts in a black box once they're deployed, with a zip file > representing that black box. I don't know why it's neater. After all, we install .py files in their original form, not in a zipfile (even though Python supports the latter). Regards Antoine. From solipsis at pitrou.net Mon Oct 31 15:22:01 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 31 Oct 2011 15:22:01 +0100 Subject: [Python-Dev] Packaging and binary distributions References: <4EAE716D.8090703@trueblade.com> Message-ID: <20111031152201.30ce2087@pitrou.net> On Mon, 31 Oct 2011 05:59:09 -0400 "Eric V. Smith" wrote: > > It might be true that such systems don't need binary packages on PyPI, > but the original question is about binary package support for the > packaging module on non-Windows systems. I think the answer is clearly > "yes": I have such systems without compilers. If I build packages on a > staging server, I would want to put them on an internal PyPI-like > server, for consumption by packaging. So packaging would need to > consume these binary packages. And it's not only compilers, it's also external libraries (which are generally not installed by default). For example, to compile pyOpenSSL, you first need to fetch the OpenSSL development headers. Regards Antoine. 
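[Editorial note: the "production box without a compiler" scenario above is easy to probe for. The following is a minimal sketch, assuming a hypothetical helper an installer might use to decide whether falling back to a source build is even possible; it checks only for a C compiler on PATH, not for the development headers (e.g. OpenSSL's) that a given extension also needs.]

```python
import shutil

def can_build_extensions():
    """Return True if a C compiler appears to be available on PATH.

    A crude availability probe only: finding a compiler says nothing
    about required third-party headers, which would have to be
    checked separately (e.g. by attempting a tiny test compile).
    """
    compilers = ("cc", "gcc", "clang", "cl")
    return any(shutil.which(name) for name in compilers)
```

An installer could use a check like this to prefer a binary distribution when no toolchain is present, rather than failing halfway through a source build.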
From vinay_sajip at yahoo.co.uk Mon Oct 31 15:42:39 2011 From: vinay_sajip at yahoo.co.uk (Vinay Sajip) Date: Mon, 31 Oct 2011 14:42:39 +0000 (UTC) Subject: [Python-Dev] draft PEP: virtual environments References: <4EAAF66F.9020603@oddbird.net> <20111029172342.61dbbd71@pitrou.net> <20111030133958.0c336199@pitrou.net> <20111030183551.2ce82c11@pitrou.net> <20111030235944.240770f2@pitrou.net> <20111031151311.43bed84d@pitrou.net> Message-ID: Antoine Pitrou pitrou.net> writes: > Isn't that overengineered? We're talking about a couple of files. We're not talking about a lot of code to do this, either - just the interface to the existing code (which is needed anyway to install the minimal scripts in the venv). > It's not even obvious that third-party tools will want to modify them, > instead of writing their own (if the venv API is stable, it should be > relatively easy). Well, virtualenvwrapper is a pretty popular addon to virtualenv which delivers additional scripts, even though virtualenv already supplies more scripts than we're proposing to do in the stdlib. Example use cases for such scripts might be things like environment manipulation when environments are activated/deactivated (e.g. for LD_LIBRARY_PATH) - we can't always predict all the different needs that arise, so I'm just leaving the door open to third parties to be able to do what they need. > I don't know why it's neater. After all, we install .py files in their > original form, not in a zipfile (even though Python supports the > latter). Perhaps it's a matter of taste. The files we're talking about are actually data in the context we're discussing.
Regards, Vinay Sajip From p.f.moore at gmail.com Mon Oct 31 15:59:06 2011 From: p.f.moore at gmail.com (Paul Moore) Date: Mon, 31 Oct 2011 14:59:06 +0000 Subject: [Python-Dev] Packaging and binary distributions In-Reply-To: <20111031152201.30ce2087@pitrou.net> References: <4EAE716D.8090703@trueblade.com> <20111031152201.30ce2087@pitrou.net> Message-ID: On 31 October 2011 14:22, Antoine Pitrou wrote: > On Mon, 31 Oct 2011 05:59:09 -0400 > "Eric V. Smith" wrote: >> >> It might be true that such systems don't need binary packages on PyPI, >> but the original question is about binary package support for the >> packaging module on non-Windows systems. I think the answer is clearly >> "yes": I have such systems without compilers. If I build packages on a >> staging server, I would want to put them on an internal PyPI-like >> server, for consumption by packaging. So packaging would need to >> consume these binary packages. > > And it's not only compilers, it's also external libraries (which are > generally not installed by default). > For example, to compile pyOpenSSL, you first need to fetch the OpenSSL > development headers. It sounds to me like there's a clear interest in some level of binary distribution support from packaging. Could anyone comment on whether the current level of support is sufficient? (My instinct says it isn't, but I don't want to put words in people's mouths). If not, a PEP may be the best way to move this forward, but as things stand I'm not entirely clear what that PEP should be proposing. My inclination (to make packaging and pysetup install capable of reading existing binary formats) doesn't seem to be sufficient for most people. Does anyone want to work with me on coming up with a PEP? Paul. PS Should this discussion move somewhere else? Maybe python-ideas or distutils-sig? I'm not sure it's well-formed enough for python-dev at the moment... 
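[Editorial note: one detail worth recording for a PEP along these lines is that a bdist_wininst installer is an ordinary zip archive appended to an executable stub, so its contents can be inspected with the stdlib alone (zipfile locates the archive directory from the end of the file). The sketch below is illustrative; the helper name is invented, and it simply groups members by the PLATLIB/PURELIB/DATA/SCRIPTS/HEADERS directories mentioned earlier in the thread.]

```python
import zipfile
from collections import defaultdict

CATEGORIES = ("PURELIB", "PLATLIB", "SCRIPTS", "DATA", "HEADERS")

def wininst_members(path):
    """Group the files in a bdist_wininst archive by install category.

    Works on the .exe directly because the zip directory sits at the
    end of the file, after the executable stub.
    Returns {category: [paths relative to that category]}.
    """
    members = defaultdict(list)
    with zipfile.ZipFile(path) as zf:
        for name in zf.namelist():
            prefix, _, rest = name.partition("/")
            if prefix in CATEGORIES and rest:
                members[prefix].append(rest)
    return dict(members)
```

A consumer like the one Paul describes could then relocate each category according to the active install scheme instead of running the installer's GUI.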
From carl at oddbird.net Mon Oct 31 16:27:02 2011 From: carl at oddbird.net (Carl Meyer) Date: Mon, 31 Oct 2011 09:27:02 -0600 Subject: [Python-Dev] draft PEP: virtual environments In-Reply-To: References: <4EAAF66F.9020603@oddbird.net> <20111029172342.61dbbd71@pitrou.net> <20111030133958.0c336199@pitrou.net> <20111030183551.2ce82c11@pitrou.net> Message-ID: <4EAEBE46.5060707@oddbird.net> On 10/30/2011 04:47 PM, Vinay Sajip wrote: > Antoine Pitrou pitrou.net> writes: >> It would be even simpler not to process it at all, but install the >> scripts as-is (without the execute bit) :) > Sure, but such an approach makes it difficult to provide a mechanism which is > easily extensible; for example, with the current approach, it is straightforward > for third party tools to either easily replace completely, update selectively or > augment simply the scripts provided by base classes. I don't understand this point either. It seems to me too that having the scripts installed as plain data files inside a package is just as easy or easier for third-party tools to work with flexibly in all of the ways you mention, compared to having them available in any kind of zipped format. The current os.name-based directory structure can still be used, and we can still provide the helper to take such a directory structure and install the appropriate scripts based on os.name. I don't see any advantage to zipping. If done at install-time (which is necessary to make the scripts maintainable in the source tree) it also has the downside of introducing another difficulty in supporting source builds equivalently to installed builds.
Carl From vinay_sajip at yahoo.co.uk Mon Oct 31 16:35:42 2011 From: vinay_sajip at yahoo.co.uk (Vinay Sajip) Date: Mon, 31 Oct 2011 15:35:42 +0000 (UTC) Subject: [Python-Dev] draft PEP: virtual environments References: <4EAAF66F.9020603@oddbird.net> <20111029172342.61dbbd71@pitrou.net> <20111030133958.0c336199@pitrou.net> <20111030183551.2ce82c11@pitrou.net> <4EAEBE46.5060707@oddbird.net> Message-ID: Carl Meyer oddbird.net> writes: > I don't see any advantage to zipping. If done at install-time (which is > necessary to make the scripts maintainable in the source tree) it also > has the downside of introducing another difficulty in supporting source > builds equivalently to installed builds. That's true, I hadn't thought of that. So then it sounds like the thing to do is make venv a package and have the code in venv/__init__.py, then have the scripts in a 'scripts' subdirectory below that. The API would then change to take the absolute pathname of the scripts directory to install from, right? Regards, Vinay Sajip From carl at oddbird.net Mon Oct 31 16:40:33 2011 From: carl at oddbird.net (Carl Meyer) Date: Mon, 31 Oct 2011 09:40:33 -0600 Subject: [Python-Dev] draft PEP: virtual environments In-Reply-To: References: <4EAAF66F.9020603@oddbird.net> <20111029172342.61dbbd71@pitrou.net> <20111030133958.0c336199@pitrou.net> <20111030183551.2ce82c11@pitrou.net> <4EAEBE46.5060707@oddbird.net> Message-ID: <4EAEC171.9080806@oddbird.net> On 10/31/2011 09:35 AM, Vinay Sajip wrote: > That's true, I hadn't thought of that.
So then it sounds like the thing to do is > make venv a package and have the code in venv/__init__.py, then have the scripts > in a 'scripts' subdirectory below that. The API would then change to take the > absolute pathname of the scripts directory to install from, right? That sounds right to me. Carl From carl at oddbird.net Mon Oct 31 16:50:37 2011 From: carl at oddbird.net (Carl Meyer) Date: Mon, 31 Oct 2011 09:50:37 -0600 Subject: [Python-Dev] draft PEP: virtual environments In-Reply-To: <20111030132837.220c124d@pitrou.net> References: <4EAAF66F.9020603@oddbird.net> <20111029172342.61dbbd71@pitrou.net> <20111030132837.220c124d@pitrou.net> Message-ID: <4EAEC3CD.50408@oddbird.net> On 10/30/2011 06:28 AM, Antoine Pitrou wrote: > On Sun, 30 Oct 2011 12:10:18 +0000 (UTC) > Vinay Sajip wrote: >> >>> We already have Unix shell scripts and BAT files in the source tree. Is >>> it really complicated to maintain these additional shell scripts? Is >>> there a lot of code in them? >> >> No, they're pretty small: wc -l gives >> >> 76 posix/activate (Bash script, contains deactivate() function) >> 31 nt/activate.bat >> 17 nt/deactivate.bat >> >> The question is whether we should stop at that, or whether there should be >> support for tcsh, fish etc. such as virtualenv provides. > > I don't think we need additional support for more or less obscure > shells. > Also, if posix/activate is sufficiently well written (don't ask me > how :-)), it should presumably be compatible with all Unix shells? I have no problem including the basic posix/nt activate scripts if no one else is concerned about the added maintenance burden there.
I'm not sure that my cross-shell-scripting fu is sufficient to write posix/activate in a cross-shell-compatible way; I use bash and am not very familiar with other shells. If it runs under /bin/sh is that sufficient to make it compatible with "all Unix shells" (for some definition of "all")? If so, I can work on this. Carl From tseaver at palladion.com Mon Oct 31 17:08:20 2011 From: tseaver at palladion.com (Tres Seaver) Date: Mon, 31 Oct 2011 12:08:20 -0400 Subject: [Python-Dev] draft PEP: virtual environments In-Reply-To: <4EAEC3CD.50408@oddbird.net> References: <4EAAF66F.9020603@oddbird.net> <20111029172342.61dbbd71@pitrou.net> <20111030132837.220c124d@pitrou.net> <4EAEC3CD.50408@oddbird.net> Message-ID: On 10/31/2011 11:50 AM, Carl Meyer wrote: > I have no problem including the basic posix/nt activate scripts if > no one else is concerned about the added maintenance burden there. > > I'm not sure that my cross-shell-scripting fu is sufficient to > write posix/activate in a cross-shell-compatible way; I use bash > and am not very familiar with other shells. If it runs under > /bin/sh is that sufficient to make it compatible with "all Unix > shells" (for some definition of "all")? If so, I can work on this. I would say this is a perfect "opportunity to delegate," in this case to the devotees of other cults^Wshells than bash. Tres.
-- =================================================================== Tres Seaver +1 540-429-0999 tseaver at palladion.com Palladion Software "Excellence by Design" http://palladion.com From p.f.moore at gmail.com Mon Oct 31 17:28:34 2011 From: p.f.moore at gmail.com (Paul Moore) Date: Mon, 31 Oct 2011 16:28:34 +0000 Subject: [Python-Dev] draft PEP: virtual environments In-Reply-To: References: <4EAAF66F.9020603@oddbird.net> <20111029172342.61dbbd71@pitrou.net> <20111030132837.220c124d@pitrou.net> <4EAEC3CD.50408@oddbird.net> Message-ID: On 31 October 2011 16:08, Tres Seaver wrote: > On 10/31/2011 11:50 AM, Carl Meyer wrote: > >> I have no problem including the basic posix/nt activate scripts if >> no one else is concerned about the added maintenance burden there. >> >> I'm not sure that my cross-shell-scripting fu is sufficient to >> write posix/activate in a cross-shell-compatible way; I use bash >> and am not very familiar with other shells. If it runs under >> /bin/sh is that sufficient to make it compatible with "all Unix >> shells" (for some definition of "all")? If so, I can work on this. > > > I would say this is a perfect "opportunity to delegate," in this case > to the devotees of other cults^Wshells than bash. For Windows, can you point me at the nt scripts? If they aren't too complex, I'd be willing to port to Powershell. Paul.
From merwok at netwok.org Mon Oct 31 18:09:22 2011 From: merwok at netwok.org (Éric Araujo) Date: Mon, 31 Oct 2011 18:09:22 +0100 Subject: [Python-Dev] Packaging and binary distributions In-Reply-To: References: Message-ID: <4EAED642.5050704@netwok.org> Hi, > I'd like to reopen the discussions on how the new packaging module > will handle/support binary distributions in Python 3.3. The previous > thread (see http://mail.python.org/pipermail/python-dev/2011-October/113956.html) > included a lot of good information and discussion, but ultimately > didn't reach any firm conclusions. I'm sorry there was no reply from the core group of packaging contributors. I read the messages as they flew by and wanted to reply on a lot of points, but didn't get the time to do it. I hope the list subscribers won't mind if I go through the threads in the coming days and make many replies. Cheers From merwok at netwok.org Mon Oct 31 18:19:15 2011 From: merwok at netwok.org (Éric Araujo) Date: Mon, 31 Oct 2011 18:19:15 +0100 Subject: [Python-Dev] [Python-checkins] cpython (2.7): Add a button to the code examples in the doc to show/hide the prompts and In-Reply-To: References: Message-ID: <4EAED893.5070100@netwok.org> Hi Ezio, > http://hg.python.org/cpython/rev/18bbfed9aafa > user: Ezio Melotti > summary: > Add a button to the code examples in the doc to show/hide the prompts and output. Looks cool! I hope this will stop our use of two or three different styles for Python code in the docs (doctest-compatible vs. source-file-style vs. copy-paste-ready) and ultimately help make them doctest-compatible. > +$(document).ready(function() { > + /* Add a [>>>] button on the top-right corner of code samples to hide > + * the >>> and ... prompts and the output and thus make the code > + * copyable. */ I think it would be more user-friendly if the button/trigger would use real English text like "Hide prompts"/"Show prompts" rather than symbols.
Cheers From merwok at netwok.org Mon Oct 31 18:20:16 2011 From: merwok at netwok.org (Éric Araujo) Date: Mon, 31 Oct 2011 18:20:16 +0100 Subject: [Python-Dev] [Python-checkins] cpython (3.2): I should be someone In-Reply-To: References: Message-ID: <4EAED8D0.80701@netwok.org> Hi, > http://hg.python.org/cpython/rev/6f56e81da8f6 > user: Florent Xicluna > summary: > I should be someone Without quotation marks, that sounded like a philosophical commit :) Cheers From merwok at netwok.org Mon Oct 31 18:23:00 2011 From: merwok at netwok.org (Éric Araujo) Date: Mon, 31 Oct 2011 18:23:00 +0100 Subject: [Python-Dev] Code cleanups in stable branches? In-Reply-To: References: Message-ID: <4EAED974.602@netwok.org> Hi, > http://hg.python.org/cpython/rev/72de2ac8bb4f > branch: 2.7 > user: Petri Lehtinen > date: Sun Oct 30 13:55:02 2011 +0200 > summary: > Avoid unnecessary recursive function calls (closes #10519) > http://hg.python.org/cpython/rev/0694ebb5db99 > branch: 2.7 > user: Jesus Cea > date: Mon Oct 31 16:02:12 2011 +0100 > summary: > Closes #13283: removal of two unused variable in locale.py I thought that patches that clean up code but don't fix actual bugs were not done in stable branches. Has this changed? (The first commit quoted above may be a performance fix, which would be okay IIRC.) Regards From ezio.melotti at gmail.com Mon Oct 31 18:51:54 2011 From: ezio.melotti at gmail.com (Ezio Melotti) Date: Mon, 31 Oct 2011 19:51:54 +0200 Subject: [Python-Dev] [Python-checkins] cpython (2.7): Add a button to the code examples in the doc to show/hide the prompts and In-Reply-To: <4EAED893.5070100@netwok.org> References: <4EAED893.5070100@netwok.org> Message-ID: <4EAEE03A.6080105@gmail.com> Hi Éric, On 31/10/2011 19.19, Éric Araujo wrote: > Hi Ezio, > >> http://hg.python.org/cpython/rev/18bbfed9aafa >> user: Ezio Melotti >> summary: >> Add a button to the code examples in the doc to show/hide the prompts and output. > Looks cool!
I hope this will stop our use of two or three different > styles for Python code in the docs (doctest-compatible vs. > source-file-style vs. copy-paste-ready) and ultimately help make them > doctest-compatible. My main concern about this is that it works only with the HTML doc and not with the pdf/chm, so converting the examples would make the situation a bit worse for the pdf/chm users (and also for js-less users). I think in the examples we should just use what makes more sense -- either copy/paste from an interpreter session when the output is relevant, copy only the source when it's not, and avoid the "hybrid style" (i.e. include the '>>>' and the output but omit the '...'). Nonetheless I think most of the users use the HTML doc, so it should be an improvement for them :) > >> +$(document).ready(function() { >> + /* Add a [>>>] button on the top-right corner of code samples to hide >> + * the >>> and ... prompts and the output and thus make the code >> + * copyable. */ > I think it would be more user-friendly if the button/trigger would use > real English text like "Hide prompts"/"Show prompts" rather than symbols. I thought about that and ended up adding a title="" with a more accurate description, while leaving the button short and unobtrusive. A bigger button might interfere/overlap with the code if the code contains long lines and/or if the window is too narrow. I expect that people will anyway try it out as soon as they notice it and learn quickly what it does. In a couple of years this whole script could be replaced with a couple of lines of CSS using the CSS3 "user-select" property (so that only the code and not the rest is actually copied), but at the moment the support for it is still a bit lacking and inconsistent.
Best Regards,
Ezio Melotti

> Cheers
>

From solipsis at pitrou.net Mon Oct 31 19:16:25 2011
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Mon, 31 Oct 2011 19:16:25 +0100
Subject: [Python-Dev] cpython (2.7): test_protocol_sslv2(): Skip this test if ssl.PROTOCOL_SSLv2 is not
References: Message-ID: <20111031191625.2b6e11a4@pitrou.net>

On Mon, 31 Oct 2011 19:09:01 +0100 barry.warsaw wrote:
> http://hg.python.org/cpython/rev/1de4d92cd6a4
> changeset: 73258:1de4d92cd6a4
> branch: 2.7
> parent: 73253:7ce3b719306e
> user: Barry Warsaw
> date: Mon Oct 31 14:08:15 2011 -0400
> summary:
> test_protocol_sslv2(): Skip this test if ssl.PROTOCOL_SSLv2 is not
> defined (as is the case with Ubuntu 11.10).
>
> files:
>   Lib/test/test_ssl.py | 2 ++
>   1 files changed, 2 insertions(+), 0 deletions(-)
>
> diff --git a/Lib/test/test_ssl.py b/Lib/test/test_ssl.py
> --- a/Lib/test/test_ssl.py
> +++ b/Lib/test/test_ssl.py
> @@ -981,6 +981,8 @@
>      @skip_if_broken_ubuntu_ssl
>      def test_protocol_sslv2(self):
>          """Connecting to an SSLv2 server with various client options"""
> +        if not hasattr(ssl, 'PROTOCOL_SSLv2'):
> +            raise unittest.SkipTest('No SSLv2 available')
>          if test_support.verbose:
>              sys.stdout.write("\n")
>          if not hasattr(ssl, 'PROTOCOL_SSLv2'):

I'm not sure, but I think you've just committed a no-op. (see http://hg.python.org/cpython/rev/5a080ebd311c)

Regards

Antoine.

From nad at acm.org Mon Oct 31 19:36:43 2011
From: nad at acm.org (Ned Deily)
Date: Mon, 31 Oct 2011 11:36:43 -0700
Subject: [Python-Dev] Packaging and binary distributions
References: Message-ID:

In article , Paul Moore wrote:
> On 30 October 2011 18:04, Ned Deily wrote:
> > Has anyone analyzed the current packages on PyPI to see how many provide
> > binary distributions and in what format?
>
> A very quick and dirty check:
>
> dmg: 5
> rpm: 12
> msi: 23
> dumb: 132
> wininst: 364
> egg: 2570
>
> That's number of packages with binary distributions in that format.
> It's hard to be sure about egg distributions, as many of these could
> be pure-python (there's no way I know, from the PyPI metadata, to
> check this).

Thanks. If you have access to the egg file name, you should be able to tell. AFAIK, eggs with extension modules include the Distutils platform name in the file name preceded by a '-', so '-linux', '-win32', '-macosx' for the main ones. Pure python eggs do not contain a platform name. http://pypi.python.org/pypi/pyinterval/ is a random example of the former.

--
Ned Deily, nad at acm.org

From p.f.moore at gmail.com Mon Oct 31 20:55:28 2011
From: p.f.moore at gmail.com (Paul Moore)
Date: Mon, 31 Oct 2011 19:55:28 +0000
Subject: [Python-Dev] Packaging and binary distributions
In-Reply-To: References: Message-ID:

On 31 October 2011 18:36, Ned Deily wrote:
> In article , Paul Moore wrote:
>> On 30 October 2011 18:04, Ned Deily wrote:
>> > Has anyone analyzed the current packages on PyPI to see how many provide
>> > binary distributions and in what format?
>>
>> A very quick and dirty check:
>>
>> dmg: 5
>> rpm: 12
>> msi: 23
>> dumb: 132
>> wininst: 364
>> egg: 2570
>>
>> That's number of packages with binary distributions in that format.
>> It's hard to be sure about egg distributions, as many of these could
>> be pure-python (there's no way I know, from the PyPI metadata, to
>> check this).
>
> Thanks. If you have access to the egg file name, you should be able to
> tell. AFAIK, eggs with extension modules include the Distutils platform
> name in the file name preceded by a '-', so '-linux', '-win32',
> '-macosx' for the main ones. Pure python eggs do not contain a platform
> name. http://pypi.python.org/pypi/pyinterval/ is a random example of
> the former.

136 architecture-specific
2502 architecture independent

About 5%. The numbers don't quite add up, so there's some funnies in there (possibly bad data that I'm not handling well) but it gives an idea.
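Ned's file-name heuristic can be sketched roughly as follows. This is an illustrative approximation, not the exact script used for these counts -- the function name and the list of platform-tag prefixes are guesses, and real egg names are messier (as the "bad data" cases below show):

```python
import re

# Distutils platform tags that can appear in a binary egg's file name.
# Illustrative subset -- the real universe of tags is messier.
_PLATFORM_TAG = re.compile(
    r'-(linux|win32|win-amd64|macosx|cygwin|solaris)', re.IGNORECASE)

def egg_platform(filename):
    """Return the platform tag of an egg file name, or None if it looks
    like a pure-Python (architecture-independent) egg.

    Heuristic: binary eggs embed a distutils platform string after the
    'pyX.Y' tag, e.g. 'pyinterval-1.0b21-py2.5-linux-x86_64.egg', while
    pure-Python eggs stop at the version tag, e.g. 'foo-1.0-py2.7.egg'.
    """
    stem = filename[:-4] if filename.endswith('.egg') else filename
    match = _PLATFORM_TAG.search(stem)
    # Everything from the tag onwards is the platform string.
    return stem[match.start() + 1:] if match else None
```

A classifier like this would mislabel a package whose name happens to contain one of the tag prefixes, which may account for some of the one-off entries in the table below.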
Counts by architecture:

win32                   70
linux-i686              43
win-amd64               33
linux-x86_64            26
macosx-10.3-fat         12
macosx-10.5-i386        11
macosx-10.6-universal    9
macosx-10.6-fat          8
macosx-10.3-i386         7
macosx-10.6-i386         6
macosx-10.7-intel        4
macosx-10.6-intel        3
macosx-10.6-x86_64       2
macosx-10.3-ppc          2
macosx-10.4-i386         2
macosx-10.4-ppc          2
py2.3-linux-i686         1
py2.4-linux-i686         1
gnu-0.3-i686-AT386       1
linux-ppc                1
cygwin-1.5.25-i686       1
py2.3                    1
py2.4                    1
py2.5                    1
macosx-10.7-x86_64       1
macosx-10.4-universal    1
py2.5-linux-i686         1

Most of the 1-counts are bad data in some form. I'm not sure what this proves, to be honest, but what I take from it is:

- Nearly all binary distributions are for Windows
- Architecture-neutral eggs are common (but not relevant here as packaging can install from source with these)
- Ignoring architecture-neutral eggs, most popular formats are wininst, egg, dumb(!!!) and msi
- Even the most popular binary format (wininst) only accounts for 2% of all packages.

Having said all of this, there are two major caveats I'd include:

- Not everything is on PyPI.
- This analysis ignores relative importance. It's hard to claim that numpy is no more significant than, say, "Products.CMFDynamicViewFTI" (whatever that might be - I picked it at random, so apologies to the author :-))

Paul.

From carl at oddbird.net Mon Oct 31 21:10:11 2011
From: carl at oddbird.net (Carl Meyer)
Date: Mon, 31 Oct 2011 14:10:11 -0600
Subject: [Python-Dev] draft PEP: virtual environments
In-Reply-To: References: <4EAAF66F.9020603@oddbird.net> <20111029172342.61dbbd71@pitrou.net> <20111030132837.220c124d@pitrou.net> <4EAEC3CD.50408@oddbird.net>
Message-ID: <4EAF00A3.90400@oddbird.net>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 10/31/2011 10:28 AM, Paul Moore wrote:
> On 31 October 2011 16:08, Tres Seaver wrote:
>> On 10/31/2011 11:50 AM, Carl Meyer wrote:
>>> I have no problem including the basic posix/nt activate scripts if
>>> no one else is concerned about the added maintenance burden there.
>>>
>>> I'm not sure that my cross-shell-scripting fu is sufficient to
>>> write posix/activate in a cross-shell-compatible way; I use bash
>>> and am not very familiar with other shells. If it runs under
>>> /bin/sh is that sufficient to make it compatible with "all Unix
>>> shells" (for some definition of "all")? If so, I can work on this.
>>
>> I would say this is a perfect "opportunity to delegate," in this case
>> to the devotees of other cults^Wshells than bash.

Good call - we'll stick with what we've got until such devotees show up :-)

Hey devotees, if you're listening, this is what you want to test/port:
https://bitbucket.org/vinay.sajip/pythonv/src/6d057cfaaf53/Lib/venv/scripts/posix/activate

For reference, here's what virtualenv ships with (includes a .fish and .csh script):
https://github.com/pypa/virtualenv/tree/develop/virtualenv_support

> For Windows, can you point me at the nt scripts? If they aren't too
> complex, I'd be willing to port to Powershell.

Thanks! They are here:
https://bitbucket.org/vinay.sajip/pythonv/src/6d057cfaaf53/Lib/venv/scripts/nt

Carl

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAk6vAKMACgkQ8W4rlRKtE2eEfwCgtpzQtUktUSU8ZyDDeqjD0yEe
QXgAoLoCD8EQ74jHR1lWPFjgnwQFkM46
=6+Rn
-----END PGP SIGNATURE-----

From ncoghlan at gmail.com Mon Oct 31 21:24:09 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 1 Nov 2011 06:24:09 +1000
Subject: [Python-Dev] Code cleanups in stable branches?
In-Reply-To: <4EAED974.602@netwok.org>
References: <4EAED974.602@netwok.org>
Message-ID:

Removing dead code and bypassing redundant code are both reasonable bug fixes. The kind of change to be avoided is gratuitous replacement of older idioms with newer ones.

--
Nick Coghlan (via Gmail on Android, so likely to be more terse than usual)
From brett at python.org Mon Oct 31 21:28:06 2011
From: brett at python.org (Brett Cannon)
Date: Mon, 31 Oct 2011 13:28:06 -0700
Subject: [Python-Dev] Code cleanups in stable branches?
In-Reply-To: References: <4EAED974.602@netwok.org>
Message-ID:

On Mon, Oct 31, 2011 at 13:24, Nick Coghlan wrote:
> Removing dead code and bypassing redundant code are both reasonable bug
> fixes. The kind of change to be avoided is gratuitous replacement of older
> idioms with newer ones.

What Nick said as I was in the middle of typing when he sent this. =)

-Brett

> --
> Nick Coghlan (via Gmail on Android, so likely to be more terse than usual)
>
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> http://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: http://mail.python.org/mailman/options/python-dev/brett%40python.org