From ned at nedbatchelder.com Sat Jan 1 18:18:10 2011 From: ned at nedbatchelder.com (Ned Batchelder) Date: Sat, 01 Jan 2011 12:18:10 -0500 Subject: [Python-Dev] Possible optimization for LOAD_FAST ? In-Reply-To: References: <4D1B3592.3040403@v.loewis.de> <4D1DFCF8.8080109@stoneleaf.us> <19742.1684.846410.770977@montanaro.dyndns.org> Message-ID: <4D1F61D2.1010909@nedbatchelder.com> On 12/31/2010 12:51 PM, Cesare Di Mauro wrote: > 2010/12/31 > > > > >> Another example. I can totally remove the variable i, just > using the > >> stack, so a debugger (or, in general, having the tracing enabled) > >> cannot even find something to change about it. > > Ethan> -1 > > Ethan> Debugging is challenging enough as it is -- why would > you want to > Ethan> make it even more difficult? > > > I don't know. Maybe he wants his program to run faster. > > > > :D > > "Aggressive" optimizations can be enabled with explicit options, in > order to leave normal "debugger-prone" code. I wish the Python compiler would adopt a strategy of being able to disable optimizations. I wrote a bug about a "leaky abstraction" optimization messing up coverage testing 2.5 years ago, and it was closed as won't fix: http://bugs.python.org/issue2506. The debate there centered around, "but that line isn't executed, because it's been optimized away." It's common in sophisticated compilers (as in, any C compiler) to be able to choose whether you want optimizations for speed, or disabling optimizations for debugging and reasoning about the code. Python would benefit from the same choice. --Ned. > > If you use print statements for the bulk of your debugging (many > people do), > unrolling loops doesn't affect your debugging ability. > > Skip > > > It's a common practice. Also IDEs helps a lot, and advanced > interactive shells too (such as DreamPie). 
> > Cesare
>
>
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> http://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: http://mail.python.org/mailman/options/python-dev/ned%40nedbatchelder.com

From tjreedy at udel.edu Sat Jan 1 22:07:02 2011
From: tjreedy at udel.edu (Terry Reedy)
Date: Sat, 01 Jan 2011 16:07:02 -0500
Subject: [Python-Dev] [Python-checkins] r87603 - python/branches/py3k/Misc/NEWS
In-Reply-To: <20110101100730.BE5B9EE987@mail.python.org>
References: <20110101100730.BE5B9EE987@mail.python.org>
Message-ID: <4D1F9776.10006@udel.edu>

On 1/1/2011 5:07 AM, georg.brandl wrote:
> Author: georg.brandl
> Date: Sat Jan 1 11:07:30 2011
> New Revision: 87603
>
> Log:
> Fix issue references.

(add '#' to issue numbers). Whoops, two of those are mine. I am still learning and will try to remember to include it in both log messages and NEWS entries.

Terry

From victor.stinner at haypocalc.com Sun Jan 2 14:09:39 2011
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Sun, 02 Jan 2011 14:09:39 +0100
Subject: [Python-Dev] Issue #10348: concurrent.futures doesn't work on BSD
In-Reply-To: <4D1CADB1.6060803@v.loewis.de>
References: <1293628653.16756.13.camel@marge> <4D1B5394.10802@v.loewis.de> <4D1B7F37.3080201@v.loewis.de> <4D1B9EE2.70806@v.loewis.de> <1293663348.18690.163.camel@marge> <1FC42F4E-BFAE-47E0-984F-0040AACD6804@sweetapp.com> <4D1C4DFB.30302@v.loewis.de> <20101230164014.218a1f27@pitrou.net> <4D1CADB1.6060803@v.loewis.de>
Message-ID: <1293973779.23881.36.camel@marge>

On Thursday, 30 December 2010 at 17:05 +0100, "Martin v. Löwis" wrote:
> > I really don't think it is our job to maintain a list of OS/versions
> > which work and don't work.
>
> Of course not. I would propose a dynamic test: check how many POSIX
> semaphores the installation supports, and fail if it's less than
> 200 (say).
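
A minimal sketch of the dynamic test proposed in the quoted text above, in Python. It assumes the limit can be read with os.sysconf("SC_SEM_NSEMS_MAX") and that a negative result means no definite limit is reported. The helper names and the choice to let an unknown limit pass are hypothetical; the threshold of 200 is the "say" figure from the quote.

```python
import os

MIN_SEMAPHORES = 200  # the "200 (say)" figure from the quoted proposal


def semaphore_limit():
    """Return the reported POSIX semaphore limit, or None if unknown."""
    try:
        value = os.sysconf("SC_SEM_NSEMS_MAX")
    except (AttributeError, ValueError, OSError):
        # os.sysconf is missing on this platform, or the name is unsupported.
        return None
    # A negative value is treated here as "no definite limit reported".
    return value if value >= 0 else None


def enough_semaphores(limit, minimum=MIN_SEMAPHORES):
    """Hypothetical policy: fail only when a known limit is below the minimum."""
    return limit is None or limit >= minimum


if __name__ == "__main__":
    limit = semaphore_limit()
    print("semaphore limit:", limit, "ok:", enough_semaphores(limit))
```

Keeping the policy (enough_semaphores) separate from the platform query (semaphore_limit) makes the threshold easy to exercise without depending on the host system.
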

I added information about FreeBSD, NetBSD, Darwin and OpenBSD to issue #10348:
http://bugs.python.org/issue10348#msg125042

The maximum number of POSIX semaphores can be read with sysctl:
 - FreeBSD: "p1003_1b.sem_nsems_max"
 - NetBSD: "kern.posix.semmax"
 - Darwin: "kern.posix.sem.max"
 - OpenBSD: (no support)

Victor

From victor.stinner at haypocalc.com Sun Jan 2 14:17:27 2011
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Sun, 02 Jan 2011 14:17:27 +0100
Subject: [Python-Dev] Possible optimization for LOAD_FAST ?
In-Reply-To: <4D1B3592.3040403@v.loewis.de>
References: <4D1B3592.3040403@v.loewis.de>
Message-ID: <1293974247.23881.39.camel@marge>

On Wednesday, 29 December 2010 at 14:20 +0100, "Martin v. Löwis" wrote:
> On 28.12.2010 18:08, Lukas Lueg wrote:
> > Also, the
> > load_fast in line 22 to reference x could be taken out of the loop as x
> > will always point to the same object....
>
> That's not true; a debugger may change the value of x.

That's why Python has the following option:

-O : optimize generated bytecode slightly; also PYTHONOPTIMIZE=x

I regularly recompile programs with gcc -O0 -g to debug them. It is very difficult to debug (with gdb) a program compiled with gcc -O2: many variables are stored in registers, and gdb doesn't support that correctly.

Victor

From rdmurray at bitdance.com Sun Jan 2 17:02:18 2011
From: rdmurray at bitdance.com (R. David Murray)
Date: Sun, 02 Jan 2011 11:02:18 -0500
Subject: [Python-Dev] [Python-checkins] r87603 - python/branches/py3k/Misc/NEWS
In-Reply-To: <4D1F9776.10006@udel.edu>
References: <20110101100730.BE5B9EE987@mail.python.org> <4D1F9776.10006@udel.edu>
Message-ID: <20110102160218.AEF61239FFC@kimball.webabinitio.net>

On Sat, 01 Jan 2011 16:07:02 -0500, Terry Reedy wrote:
> On 1/1/2011 5:07 AM, georg.brandl wrote:
> > Author: georg.brandl
> > Date: Sat Jan 1 11:07:30 2011
> > New Revision: 87603
> >
> > Log:
> > Fix issue references.
>
> (add '#' to issue numbers). Whoops, two of those are mine.
> I am still learning and will try to remember to include it in both log messages and
> NEWS entries.

Heh. I think two of them were mine, and I'm supposed to know better by now.

--
R. David Murray www.bitdance.com

From g.brandl at gmx.net Sun Jan 2 19:13:48 2011
From: g.brandl at gmx.net (Georg Brandl)
Date: Sun, 02 Jan 2011 19:13:48 +0100
Subject: [Python-Dev] r87577 - in python/branches/py3k: Makefile.pre.in configure configure.in pyconfig.h.in
In-Reply-To: <20101230145548.20AE7EE9A8@mail.python.org>
References: <20101230145548.20AE7EE9A8@mail.python.org>
Message-ID:

On 30.12.2010 15:55, martin.v.loewis wrote:
> Author: martin.v.loewis
> Date: Thu Dec 30 15:55:47 2010
> New Revision: 87577
>
> Log:
> Build and install libpython3.so.
>
> Modified: python/branches/py3k/configure.in
> ==============================================================================
> --- python/branches/py3k/configure.in (original)
> +++ python/branches/py3k/configure.in Thu Dec 30 15:55:47 2010
> @@ -737,6 +738,10 @@
>        BLDLIBRARY='-Wl,-R,$(LIBDIR) -L. -lpython$(LDVERSION)'
>        RUNSHARED=LD_LIBRARY_PATH=`pwd`:${LD_LIBRARY_PATH}
>        INSTSONAME="$LDLIBRARY".$SOVERSION
> +      if test $with_pydebug == no
> +      then
> +          PY3LIBRARY=libpython3.so
> +      fi
>        ;;
>      Linux*|GNU*|NetBSD*|FreeBSD*|DragonFly*)
>        LDLIBRARY='libpython$(LDVERSION).so'
> @@ -748,6 +753,11 @@
>            ;;
>        esac
>        INSTSONAME="$LDLIBRARY".$SOVERSION
> +      PY3LIBRARY=libpython3.so
> +      if test $with_pydebug == no
> +      then
> +          PY3LIBRARY=libpython3.so
> +      fi
>        ;;
>      hp*|HP*)
>        case `uname -m` in

These changes do not work as written: if --with-pydebug is not given, $with_pydebug is empty, not "no". Also, in the second case the unconditional PY3LIBRARY assignment should probably be deleted.

cheers,
Georg

From martin at v.loewis.de Mon Jan 3 00:59:04 2011
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Mon, 03 Jan 2011 00:59:04 +0100
Subject: [Python-Dev] Issue #10348: concurrent.futures doesn't work on BSD
In-Reply-To: <1293973779.23881.36.camel@marge>
References: <1293628653.16756.13.camel@marge> <4D1B5394.10802@v.loewis.de> <4D1B7F37.3080201@v.loewis.de> <4D1B9EE2.70806@v.loewis.de> <1293663348.18690.163.camel@marge> <1FC42F4E-BFAE-47E0-984F-0040AACD6804@sweetapp.com> <4D1C4DFB.30302@v.loewis.de> <20101230164014.218a1f27@pitrou.net> <4D1CADB1.6060803@v.loewis.de> <1293973779.23881.36.camel@marge>
Message-ID: <4D211148.10808@v.loewis.de>

> I added information about FreeBSD, NetBSD, Darwin and OpenBSD to
> issue #10348:
> http://bugs.python.org/issue10348#msg125042
>
> The maximum number of POSIX semaphores can be read with sysctl:
> - FreeBSD: "p1003_1b.sem_nsems_max"
> - NetBSD: "kern.posix.semmax"
> - Darwin: "kern.posix.sem.max"
> - OpenBSD: (no support)

I've been using os.sysconf("SC_SEM_NSEMS_MAX"), which seems to have worked fine (it's also what POSIX mandates). See #10798.

Regards,
Martin

From ned at nedbatchelder.com Mon Jan 3 02:25:56 2011
From: ned at nedbatchelder.com (Ned Batchelder)
Date: Sun, 02 Jan 2011 20:25:56 -0500
Subject: [Python-Dev] Possible optimization for LOAD_FAST ?
In-Reply-To: <1293974247.23881.39.camel@marge>
References: <4D1B3592.3040403@v.loewis.de> <1293974247.23881.39.camel@marge>
Message-ID: <4D2125A4.2090409@nedbatchelder.com>

On 1/2/2011 8:17 AM, Victor Stinner wrote:
> On Wednesday, 29 December 2010 at 14:20 +0100, "Martin v. Löwis" wrote:
>> On 28.12.2010 18:08, Lukas Lueg wrote:
>>> Also, the
>>> load_fast in line 22 to reference x could be taken out of the loop as x
>>> will always point to the same object....
>> That's not true; a debugger may change the value of x.
> That's why Python has the following option:
>
> -O : optimize generated bytecode slightly; also PYTHONOPTIMIZE=x
>
> I regularly recompile programs with gcc -O0 -g to debug them. It is very
> difficult to debug (with gdb) a program compiled with gcc -O2: many
> variables are stored in registers, and gdb doesn't support that
> correctly.
>
Victor, you seem to be equating the gcc -O flag with the Python -O flag. They are described similarly, but can't be used the same way. In particular, there is no Python equivalent to gcc's -O0: there is no way to disable the Python peephole optimizer.

--Ned.

> Victor

From alex.gaynor at gmail.com Mon Jan 3 02:50:47 2011
From: alex.gaynor at gmail.com (Alex Gaynor)
Date: Mon, 3 Jan 2011 01:50:47 +0000 (UTC)
Subject: [Python-Dev] Possible optimization for LOAD_FAST ?
References:
Message-ID:

Cesare Di Mauro writes:
>
> 2010/12/28 Lukas Lueg
>
> Consider the following code:
>
> def foobar(x):
>     for i in range(5):
>         x[i] = i
>
> The bytecode in python 2.7 is the following:
>
>   2           0 SETUP_LOOP              30 (to 33)
>               3 LOAD_GLOBAL              0 (range)
>               6 LOAD_CONST               1 (5)
>               9 CALL_FUNCTION            1
>              12 GET_ITER
>         >>   13 FOR_ITER                16 (to 32)
>              16 STORE_FAST               1 (i)
>
>   3          19 LOAD_FAST                1 (i)
>              22 LOAD_FAST                0 (x)
>              25 LOAD_FAST                1 (i)
>              28 STORE_SUBSCR
>              29 JUMP_ABSOLUTE           13
>         >>   32 POP_BLOCK
>         >>   33 LOAD_CONST               0 (None)
>              36 RETURN_VALUE
>
> Can't we optimize the LOAD_FAST in lines 19 and 25 to a single load
> and put the reference twice on the stack? There is no way that the
> reference of i might change in between the two lines. Also, the
> load_fast in line 22 to reference x could be taken out of the loop as x will always point to the same object....
>
> Yes, you can, but you need:
> - a better AST evaluator (to mark symbols/variables with proper attributes);
> - a better optimizer (usually located on compile.c) which has a "global vision" (not limited to single instructions and/or single expressions).
>
> It's not that simple, and the results aren't guaranteed to be good.
>
> Also, consider that Python, as a dynamic-and-not-statically-compiled language, needs to find a good trade-off between compilation time and execution.
>
> Just to be clear, a C program is usually compiled once, then executed, so you can spend even *hours* to better optimize the final binary code.
>
> With a dynamic language, usually the code is compiled and then executed as needed, in "realtime". So it is neither practical nor desirable to wait too long before execution begins (the "startup" problem).
>
> Python stays in a "gray area", because modules are usually compiled once (when they are first used), and executed many times, but it isn't the only case.
>
> You cannot assume that optimization techniques used on other (static) languages can be used/ported in Python.
>
> Cesare
>

No, it's singularly impossible to prove that any global load will be any given value at compile time. Any optimization based on this premise is wrong.

Alex

From guido at python.org Mon Jan 3 04:18:09 2011
From: guido at python.org (Guido van Rossum)
Date: Sun, 2 Jan 2011 19:18:09 -0800
Subject: [Python-Dev] Possible optimization for LOAD_FAST ?
In-Reply-To: References: Message-ID: On Sun, Jan 2, 2011 at 5:50 PM, Alex Gaynor wrote: > No, it's singularly impossible to prove that any global load will be any given > value at compile time. ?Any optimization based on this premise is wrong. True. My proposed way out of this conundrum has been to change the language semantics slightly so that global names which (a) coincide with a builtin, and (b) have no explicit assignment to them in the current module, would be fair game for such optimizations, with the understanding that the presence of e.g. "len = len" anywhere in the module (even in dead code!) would be sufficient to disable the optimization. But barring someone interested in implementing something based on this rule, the proposal has languished for many years. FWIW, this is reminiscent of Fortran's rules for "intrinsics" (its name for builtins), which have a similar optimization behavior (except there the potential overrides that the compiler doesn't need to take into account are load-time definitions). -- --Guido van Rossum (python.org/~guido) From tjreedy at udel.edu Mon Jan 3 06:36:24 2011 From: tjreedy at udel.edu (Terry Reedy) Date: Mon, 03 Jan 2011 00:36:24 -0500 Subject: [Python-Dev] Possible optimization for LOAD_FAST ? In-Reply-To: References: Message-ID: On 1/2/2011 10:18 PM, Guido van Rossum wrote: > My proposed way out of this conundrum has been to change the language > semantics slightly so that global names which (a) coincide with a > builtin, and (b) have no explicit assignment to them in the current > module, would be fair game for such optimizations, with the > understanding that the presence of e.g. "len = len" anywhere in the > module (even in dead code!) would be sufficient to disable the > optimization. I believe this amounts to saying 1) Python code executes in three scopes (rather than two): global builtin, modular (misleadingly call global), and local. This much is a possible viewpoint today. 
2) A name that is not an assignment target anywhere -- and that matches a builtin name -- is treated as a builtin. This is the new part, and it amounts to a rule for entire modules that is much like the current rule for separating local and global names within a function. The difference from the global/local rule would be that unassigned non-builtin names would be left to runtime resolution in globals. It would seem that this new rule would simplify the lookup of module ('global') names since if xxx in not in globals, there is no need to look in builtins. This is assuming that following 'len=len' with 'del len' cannot 'unmodularize' the name. For the rule to work 'retroactively' within a module as it does within functions would require a similar preliminary pass. So it could not work interactively. Should batch mode main modules work the same as when imported? Interactive mode could work as it does at present or with slight modification, which would be that builtin names within functions, if not yet overriden, also get resolved when the function is compiled. -- Terry Jan Reedy From cesare.di.mauro at gmail.com Mon Jan 3 07:45:20 2011 From: cesare.di.mauro at gmail.com (Cesare Di Mauro) Date: Mon, 3 Jan 2011 07:45:20 +0100 Subject: [Python-Dev] Possible optimization for LOAD_FAST ? In-Reply-To: <4D1F61D2.1010909@nedbatchelder.com> References: <4D1B3592.3040403@v.loewis.de> <4D1DFCF8.8080109@stoneleaf.us> <19742.1684.846410.770977@montanaro.dyndns.org> <4D1F61D2.1010909@nedbatchelder.com> Message-ID: 2011/1/1 Ned Batchelder > On 12/31/2010 12:51 PM, Cesare Di Mauro wrote: > > "Aggressive" optimizations can be enabled with explicit options, in order > to leave normal "debugger-prone" code. > > I wish the Python compiler would adopt a strategy of being able to disable > optimizations. I wrote a bug about a "leaky abstraction" optimization > messing up coverage testing 2.5 years ago, and it was closed as won't fix: > http://bugs.python.org/issue2506. 
The debate there centered around, "but > that line isn't executed, because it's been optimized away." It's common in > sophisticated compilers (as in, any C compiler) to be able to choose whether > you want optimizations for speed, or disabling optimizations for debugging > and reasoning about the code. Python would benefit from the same choice. > > --Ned. > Command line parameters and/or environment variables are suitable for this, but they aren't immediate and, also, have global effect. I wish an explicit ("Explicit is better than implicit") and a finer control over optimizations, with a per-module usage: from __compiler__ import disable_peepholer, strict_syntax, static_builtins, globals_as_fasts Cesare -------------- next part -------------- An HTML attachment was scrubbed... URL: From cesare.di.mauro at gmail.com Mon Jan 3 07:52:58 2011 From: cesare.di.mauro at gmail.com (Cesare Di Mauro) Date: Mon, 3 Jan 2011 07:52:58 +0100 Subject: [Python-Dev] Possible optimization for LOAD_FAST ? In-Reply-To: References: Message-ID: 2011/1/3 Alex Gaynor > No, it's singularly impossible to prove that any global load will be any > given > value at compile time. Any optimization based on this premise is wrong. > > Alex > That's your opinion, but I have very different ideas. Of course we can't leave the problem only on the compiler shoulders, but I think that can be ways to threat builtins as "static" variables, and globals like local (fast) variables too, taking into account changes on the builtins' and modules dictionaries. But it doesn't make sense to invest time in these things: JITs are becoming a good alternative, and may be they will be ready soon to take the CPython place as the "mainstream" implementation. Cesare -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ncoghlan at gmail.com Mon Jan 3 08:28:58 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 3 Jan 2011 17:28:58 +1000 Subject: [Python-Dev] Possible optimization for LOAD_FAST ? In-Reply-To: References: Message-ID: On Mon, Jan 3, 2011 at 3:36 PM, Terry Reedy wrote: > I believe this amounts to saying > > 1) Python code executes in three scopes (rather than two): global builtin, > modular (misleadingly call global), and local. This much is a possible > viewpoint today. > > 2) A name that is not an assignment target anywhere -- and that matches a > builtin name -- is treated as a builtin. This is the new part, and it > amounts to a rule for entire modules that is much like the current rule for > separating local and global names within a function. The difference from the > global/local rule would be that unassigned non-builtin names would be left > to runtime resolution in globals. > > It would seem that this new rule would simplify the lookup of module > ('global') names since if xxx in not in globals, there is no need to look in > builtins. This is assuming that following 'len=len' with 'del len' cannot > 'unmodularize' the name. > > For the rule to work 'retroactively' within a module as it does within > functions would require a similar preliminary pass. So it could not work > interactively. Should batch mode main modules work the same as when > imported? > > Interactive mode could work as it does at present or with slight > modification, which would be that builtin names within functions, if not yet > overriden, also get resolved when the function is compiled. This could potentially be handled by having the "exec" mode in compile() assume it can see all the global assignments (and hence assume builtin names refer to the builtins), while "single" would assume this was not the case (and hence skip the optimisation). 
It may also need an additional parameter to tell the compiler which names are known to be visible in the current locals and globals (e.g. to allow exec() to do the right thing).

This kind of issue is the reason Guido pointed out the idea really needs someone else to pick it up and run with it - not just to figure out the implementation details, but also to ferret out the full implications for the language semantics and backwards compatibility.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From orsenthil at gmail.com Mon Jan 3 10:57:53 2011
From: orsenthil at gmail.com (Senthil Kumaran)
Date: Mon, 3 Jan 2011 17:57:53 +0800
Subject: [Python-Dev] [Python-checkins] r87677 - python/branches/py3k/py3rsa.py
In-Reply-To: <20110103094710.07327EE993@mail.python.org>
References: <20110103094710.07327EE993@mail.python.org>
Message-ID:

Sorry, folks. I committed to the wrong repository. I was testing it against the latest version of py3k and I thought I had moved it back to my original repository. Apologies for the trouble; I shall remove it immediately.

--
Senthil

On Mon, Jan 3, 2011 at 5:47 PM, senthil.kumaran wrote:
> Author: senthil.kumaran
> Date: Mon Jan 3 10:47:09 2011
> New Revision: 87677
>
> Log:
> py3k implmentation of RSA algorithm,
>
>
>
> Added:
>    python/branches/py3k/py3rsa.py   (contents, props changed)
>
> Added: python/branches/py3k/py3rsa.py
> ==============================================================================
> --- (empty file)
> +++ python/branches/py3k/py3rsa.py      Mon Jan 3 10:47:09 2011
> @@ -0,0 +1,181 @@
> +# Copyright (c) 2010 Russell Dias
> +# Licensed under the MIT licence.
> +# http://www.inversezen.com
> +#
> +# This is an implementation of the RSA public key
> +# encryption written in Python by Russell Dias
> +
> +__author__ = 'Russell Dias // inversezen.com'
> +# Py3k port done by Senthil (senthil at uthcode.com)
> +__date__ = '05/12/2010'
> +__version__ = '0.0.1'
> +
> +import random
> +from math import log
> +
> +def gcd(u, v):
> +    """ The Greatest Common Divisor, returns
> +        the largest positive integer that divides
> +        u with v without a remainder.
> +    """
> +    while v:
> +        u, v = u, u % v
> +    return u
> +
> +def eec(u, v):
> +    """ The Extended Eculidean Algorithm
> +        For u and v this algorithm finds (u1, u2, u3)
> +        such that uu1 + vu2 = u3 = gcd(u, v)
> +
> +        We also use auxiliary vectors (v1, v2, v3) and
> +        (tmp1, tmp2, tmp3)
> +    """
> +    (u1, u2, u3) = (1, 0, u)
> +    (v1, v2, v3) = (0, 1, v)
> +    while (v3 != 0):
> +        quotient = u3 // v3
> +        tmp1 = u1 - quotient * v1
> +        tmp2 = u2 - quotient * v2
> +        tmp3 = u3 - quotient * v3
> +        (u1, u2, u3) = (v1, v2, v3)
> +        (v1, v2, v3) = (tmp1, tmp2, tmp3)
> +    return u3, u1, u2
> +
> +def stringEncode(string):
> +    """ Brandon Sterne's algorithm to convert
> +        string to long
> +    """
> +    message = 0
> +    messageCount = len(string) - 1
> +
> +    for letter in string:
> +        message += (256**messageCount) * ord(letter)
> +        messageCount -= 1
> +    return message
> +
> +def stringDecode(number):
> +    """ Convert long back to string
> +    """
> +
> +    letters = []
> +    text = ''
> +    integer = int(log(number, 256))
> +
> +    while(integer >= 0):
> +        letter = number // (256**integer)
> +        letters.append(chr(letter))
> +        number -= letter * (256**integer)
> +        integer -= 1
> +    for char in letters:
> +        text += char
> +
> +    return text
> +
> +def split_to_odd(n):
> +    """ Return values 2 ^ k, such that 2^k*q = n;
> +        or an odd integer to test for primiality
> +        Let n be an odd prime. Then n-1 is even,
> +        where k is a positive integer.
> +    """
> +    k = 0
> +    while (n > 0) and (n % 2 == 0):
> +        k += 1
> +        n >>= 1
> +    return (k, n)
> +
> +def prime(a, q, k, n):
> +    if pow(a, q, n) == 1:
> +        return True
> +    elif (n - 1) in [pow(a, q*(2**j), n) for j in range(k)]:
> +        return True
> +    else:
> +        return False
> +
> +def miller_rabin(n, trials):
> +    """
> +        There is still a small chance that n will return a
> +        false positive. To reduce risk, it is recommended to use
> +        more trials.
> +    """
> +    # 2^k * q = n - 1; q is an odd int
> +    (k, q) = split_to_odd(n - 1)
> +
> +    for trial in range(trials):
> +        a = random.randint(2, n-1)
> +        if not prime(a, q, k, n):
> +            return False
> +    return True
> +
> +def get_prime(k):
> +    """ Generate prime of size k bits, with 50 tests
> +        to ensure accuracy.
> +    """
> +    prime = 0
> +    while (prime == 0):
> +        prime = random.randrange(pow(2,k//2-1) + 1, pow(2, k//2), 2)
> +        if not miller_rabin(prime, 50):
> +            prime = 0
> +    return prime
> +
> +def modular_inverse(a, m):
> +    """ To calculate the decryption exponent such that
> +        (d * e) mod phi(N) = 1 OR g == 1 in our implementation.
> +        Where m is Phi(n) (PHI = (p-1) * (q-1) )
> +
> +        s % m or d (decryption exponent) is the multiplicative inverse of
> +        the encryption exponent e.
> +    """
> +    g, s, t = eec(a, m)
> +    if g == 1:
> +        return s % m
> +    else:
> +        return None
> +
> +def key_gen(bits):
> +    """ The public encryption exponent e,
> +        can be an artibrary prime number.
> +
> +        Obviously, the higher the number,
> +        the more secure the key pairs are.
> +    """
> +    e = 17
> +    p = get_prime(bits)
> +    q = get_prime(bits)
> +    d = modular_inverse(e, (p-1)*(q-1))
> +    return p*q,d,e
> +
> +def write_to_file(e, d, n):
> +    """ Write our public and private keys to file
> +    """
> +    public = open("publicKey", "w")
> +    public.write(str(e))
> +    public.write("\n")
> +    public.write(str(n))
> +    public.close()
> +
> +    private = open("privateKey", "w")
> +    private.write(str(d))
> +    private.write("\n")
> +    private.write(str(n))
> +    private.close()
> +
> +
> +if __name__ == '__main__':
> +    bits = input("Enter the size of your key pairs, in bits: ")
> +
> +    n, d, e = key_gen(int(bits))
> +
> +    #Write keys to file
> +    write_to_file(e, d, n)
> +
> +    print("Your keys pairs have been saved to file")
> +
> +    m = input("Enter the message you would like to encrypt: ")
> +
> +    m = stringEncode(m)
> +    encrypted = pow(m, e, n)
> +
> +    print("Your encrypted message is: %s" % encrypted)
> +    decrypted = pow(encrypted, d, n)
> +    message = stringDecode(decrypted)
> +    print("You message decrypted is: %s" % message)
> _______________________________________________
> Python-checkins mailing list
> Python-checkins at python.org
> http://mail.python.org/mailman/listinfo/python-checkins
>

--
Senthil

From eric at trueblade.com Mon Jan 3 10:56:28 2011
From: eric at trueblade.com (Eric Smith)
Date: Mon, 03 Jan 2011 04:56:28 -0500
Subject: [Python-Dev] [Python-checkins] r87677 - python/branches/py3k/py3rsa.py
In-Reply-To: <20110103094710.07327EE993@mail.python.org>
References: <20110103094710.07327EE993@mail.python.org>
Message-ID: <4D219D4C.2070009@trueblade.com>

On 1/3/2011 4:47 AM, senthil.kumaran wrote:
> Author: senthil.kumaran
> Date: Mon Jan 3 10:47:09 2011
> New Revision: 87677
>
> Log:
> py3k implmentation of RSA algorithm,
>
>
>
> Added:
> python/branches/py3k/py3rsa.py (contents, props changed)

Did you really mean this to go in the py3k top-level directory?

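
A note on the gcd helper in the diff quoted above: its loop body is `u, v = u, u % v`, which never changes u, so as written the function returns its first argument. The textbook Euclidean step swaps the operands instead. A minimal sketch of that standard form follows, offered as an assumption about the intent rather than as the committed code, and cross-checked against math.gcd from the standard library.

```python
import math


def gcd(u, v):
    """Greatest common divisor by the Euclidean algorithm."""
    while v:
        # Replace (u, v) with (v, u mod v) until the remainder is zero.
        u, v = v, u % v
    return u


if __name__ == "__main__":
    for a, b in [(12, 8), (17, 5), (0, 9), (270, 192)]:
        assert gcd(a, b) == math.gcd(a, b)
    print(gcd(12, 8))
```

The helper is not called anywhere else in the quoted file (modular_inverse goes through eec), so the difference would not affect key generation there.
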
From glyph at twistedmatrix.com Mon Jan 3 15:12:12 2011 From: glyph at twistedmatrix.com (Glyph Lefkowitz) Date: Mon, 3 Jan 2011 09:12:12 -0500 Subject: [Python-Dev] Possible optimization for LOAD_FAST ? In-Reply-To: References: Message-ID: <7B7E90AE-C18C-435D-A111-24B307213D88@twistedmatrix.com> On Jan 2, 2011, at 10:18 PM, Guido van Rossum wrote: > On Sun, Jan 2, 2011 at 5:50 PM, Alex Gaynor wrote: >> No, it's singularly impossible to prove that any global load will be any given >> value at compile time. Any optimization based on this premise is wrong. > > True. > > My proposed way out of this conundrum has been to change the language > semantics slightly so that global names which (a) coincide with a > builtin, and (b) have no explicit assignment to them in the current > module, would be fair game for such optimizations, with the > understanding that the presence of e.g. "len = len" anywhere in the > module (even in dead code!) would be sufficient to disable the > optimization. > > But barring someone interested in implementing something based on this > rule, the proposal has languished for many years. Wouldn't this optimization break things like mocking out 'open' for testing via 'module.open = fakeopen'? I confess I haven't ever wanted to change 'len' but that one seems pretty useful. If CPython wants such optimizations, it should do what PyPy and its ilk do, which is to notice the assignment, but recompile code in that module to disable the fast path at runtime, preserving the existing semantics. From michael at voidspace.org.uk Mon Jan 3 16:33:48 2011 From: michael at voidspace.org.uk (Michael Foord) Date: Mon, 03 Jan 2011 15:33:48 +0000 Subject: [Python-Dev] Tools/unicode Message-ID: <4D21EC5C.6040500@voidspace.org.uk> Hello all, In the Tools/ directory (py3k) we have a tool/directory called "unicode". The description in Tools/README is: unicode Tools used to generate unicode database files for Python 2.0 (by Fredrik Lundh). 
As described this is not at all useful for Python 3.2. I'm removing the "for Python 2.0" from the description, but I would rather remove the tool altogether. If someone knows if this tool is still used/useful then please let us know how the description should best be updated. If there are no replies I'll remove it. All the best, Michael Foord -- http://www.voidspace.org.uk/ May you do good and not evil May you find forgiveness for yourself and forgive others May you share freely, never taking more than you give. -- the sqlite blessing http://www.sqlite.org/different.html From alexander.belopolsky at gmail.com Mon Jan 3 16:39:12 2011 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Mon, 3 Jan 2011 10:39:12 -0500 Subject: [Python-Dev] Tools/unicode In-Reply-To: <4D21EC5C.6040500@voidspace.org.uk> References: <4D21EC5C.6040500@voidspace.org.uk> Message-ID: On Mon, Jan 3, 2011 at 10:33 AM, Michael Foord wrote: .. > If someone knows if this tool is still used/useful then please let us know > how the description should best be updated. If there are no replies I'll > remove it. If you are talking about Tools/unicode/, this is definitely a very useful tool used to generate unicodedata and encoding modules from raw unicode.org files. From michael at voidspace.org.uk Mon Jan 3 16:41:36 2011 From: michael at voidspace.org.uk (Michael Foord) Date: Mon, 03 Jan 2011 15:41:36 +0000 Subject: [Python-Dev] Tools/unicode In-Reply-To: References: <4D21EC5C.6040500@voidspace.org.uk> Message-ID: <4D21EE30.4020209@voidspace.org.uk> On 03/01/2011 15:39, Alexander Belopolsky wrote: > On Mon, Jan 3, 2011 at 10:33 AM, Michael Foord wrote: > .. >> If someone knows if this tool is still used/useful then please let us know >> how the description should best be updated. If there are no replies I'll >> remove it. > If you are talking about Tools/unicode/, this is definitely a very > useful tool used to generate unicodedata and encoding modules from raw > unicode.org files. 
The description currently reads "Tools used to generate unicode database files". I'll update it to read: "tool used to generate unicodedata and encoding modules from raw unicode.org files" All the best, Michael Foord -- http://www.voidspace.org.uk/ May you do good and not evil May you find forgiveness for yourself and forgive others May you share freely, never taking more than you give. -- the sqlite blessing http://www.sqlite.org/different.html From guido at python.org Mon Jan 3 16:58:50 2011 From: guido at python.org (Guido van Rossum) Date: Mon, 3 Jan 2011 07:58:50 -0800 Subject: [Python-Dev] Possible optimization for LOAD_FAST ? In-Reply-To: References: Message-ID: On Sun, Jan 2, 2011 at 9:36 PM, Terry Reedy wrote: > On 1/2/2011 10:18 PM, Guido van Rossum wrote: > >> My proposed way out of this conundrum has been to change the language >> semantics slightly so that global names which (a) coincide with a >> builtin, and (b) have no explicit assignment to them in the current >> module, would be fair game for such optimizations, with the >> understanding that the presence of e.g. "len = len" anywhere in the >> module (even in dead code!) would be sufficient to disable the >> optimization. > > I believe this amounts to saying > > 1) Python code executes in three scopes (rather than two): global builtin, > modular (misleadingly call global), and local. This much is a possible > viewpoint today. In fact it is the specification today. > 2) A name that is not an assignment target anywhere -- and that matches a > builtin name -- is treated as a builtin. This is the new part, and it > amounts to a rule for entire modules that is much like the current rule for > separating local and global names within a function. The difference from the > global/local rule would be that unassigned non-builtin names would be left > to runtime resolution in globals. 
> > It would seem that this new rule would simplify the lookup of module > ('global') names since if xxx is not in globals, there is no need to look in > builtins. This is assuming that following 'len=len' with 'del len' cannot > 'unmodularize' the name. Actually I would leave the lookup mechanism for names that don't get special treatment the same -- the only difference would be for builtins in contexts where the compiler can generate better code (typically involving a new opcode) based on all the conditions being met. > For the rule to work 'retroactively' within a module as it does within > functions would require a similar preliminary pass. We actually already do such a pass. > So it could not work interactively. That's fine. We could also disable it automatically when eval() or exec() is the source of the code. > Should batch mode main modules work the same as when > imported? Yes. > Interactive mode could work as it does at present or with slight > modification, which would be that builtin names within functions, if not yet > overridden, also get resolved when the function is compiled. Interactive mode would just work as it does today. I would also make a rule saying that 'open' is not treated this way. It is the only one where I can think of legitimate reasons for changing the semantics dynamically in a way that is not detectable by the compiler, assuming it only sees the source code for one module at a time. Some things that could be optimized this way: len(x), isinstance(x, (int, float)), range(10), issubclass(x, str), bool(x), int(x), hash(x), etc... in general, the less the function does the better a target for this optimization it is. One more thing: to avoid heisenbugs, I propose that, for any particular builtin, if this optimization is used anywhere in a module, it should be used everywhere in that module (except in scopes where the name has a different meaning).
This means that we can tell users about it and they can observe it without too much of a worry that a slight change to their program might disable it. (I've seen this with optimizations in gcc, and it makes performance work tricky.) Still, it's all academic until someone implements some of the optimizations. (The rest of the work is all in the docs and in the users' minds.) -- --Guido van Rossum (python.org/~guido) From guido at python.org Mon Jan 3 17:01:31 2011 From: guido at python.org (Guido van Rossum) Date: Mon, 3 Jan 2011 08:01:31 -0800 Subject: [Python-Dev] Possible optimization for LOAD_FAST ? In-Reply-To: <7B7E90AE-C18C-435D-A111-24B307213D88@twistedmatrix.com> References: <7B7E90AE-C18C-435D-A111-24B307213D88@twistedmatrix.com> Message-ID: On Mon, Jan 3, 2011 at 6:12 AM, Glyph Lefkowitz wrote: > Wouldn't this optimization break things like mocking out 'open' for testing via 'module.open = fakeopen'? I confess I haven't ever wanted to change 'len' but that one seems pretty useful. I am explicitly excluding open from this optimization, for that very reason. > If CPython wants such optimizations, it should do what PyPy and its ilk do, which is to notice the assignment, but recompile code in that module to disable the fast path at runtime, preserving the existing semantics. In general I am against duplicating bytecode -- it can blow up too much. (It is an entirely appropriate technique for JIT compilers -- but my point here is that bytecode is different.) Recompiling a module is not a trivial change -- for example, either code objects would have to become mutable, or we'd have to track down all the code objects and replace them. Neither sounds attractive to me. -- --Guido van Rossum (python.org/~guido) From mal at egenix.com Mon Jan 3 17:19:08 2011 From: mal at egenix.com (M.-A.
Lemburg) Date: Mon, 03 Jan 2011 17:19:08 +0100 Subject: [Python-Dev] Tools/unicode In-Reply-To: <4D21EE30.4020209@voidspace.org.uk> References: <4D21EC5C.6040500@voidspace.org.uk> <4D21EE30.4020209@voidspace.org.uk> Message-ID: <4D21F6FC.1010500@egenix.com> Michael Foord wrote: > On 03/01/2011 15:39, Alexander Belopolsky wrote: >> On Mon, Jan 3, 2011 at 10:33 AM, Michael >> Foord wrote: >> .. >>> If someone knows if this tool is still used/useful then please let us >>> know >>> how the description should best be updated. If there are no replies I'll >>> remove it. >> If you are talking about Tools/unicode/, this is definitely a very >> useful tool used to generate unicodedata and encoding modules from raw >> unicode.org files. > The description currently reads "Tools used to generate unicode database > files". I'll update it to read: > > "tool used to generate unicodedata and encoding modules from raw > unicode.org files" Make that "Tools for generating unicodedata and codecs from unicode.org and other mapping files". The scripts in that dir are not just one tool, but several tools needed to maintain the Unicode database in Python as well as generate new codecs from mapping files. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Jan 03 2011) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try our new mxODBC.Connect Python Database Interface for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. 
Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From fuzzyman at voidspace.org.uk Mon Jan 3 17:20:41 2011 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Mon, 03 Jan 2011 16:20:41 +0000 Subject: [Python-Dev] Tools/unicode In-Reply-To: <4D21F6FC.1010500@egenix.com> References: <4D21EC5C.6040500@voidspace.org.uk> <4D21EE30.4020209@voidspace.org.uk> <4D21F6FC.1010500@egenix.com> Message-ID: <4D21F759.40405@voidspace.org.uk> On 03/01/2011 16:19, M.-A. Lemburg wrote: > Michael Foord wrote: >> On 03/01/2011 15:39, Alexander Belopolsky wrote: >>> On Mon, Jan 3, 2011 at 10:33 AM, Michael >>> Foord wrote: >>> .. >>>> If someone knows if this tool is still used/useful then please let us >>>> know >>>> how the description should best be updated. If there are no replies I'll >>>> remove it. >>> If you are talking about Tools/unicode/, this is definitely a very >>> useful tool used to generate unicodedata and encoding modules from raw >>> unicode.org files. >> The description currently reads "Tools used to generate unicode database >> files". I'll update it to read: >> >> "tool used to generate unicodedata and encoding modules from raw >> unicode.org files" > Make that "Tools for generating unicodedata and codecs from unicode.org > and other mapping files". > > The scripts in that dir are not just one tool, but several tools needed > to maintain the Unicode database in Python as well as generate new > codecs from mapping files. > Thanks Marc-Andre. I'll add you and Martin as maintainers in the README description as well. All the best, Michael Foord -- http://www.voidspace.org.uk/ May you do good and not evil May you find forgiveness for yourself and forgive others May you share freely, never taking more than you give. 
-- the sqlite blessing http://www.sqlite.org/different.html From dmalcolm at redhat.com Mon Jan 3 18:52:26 2011 From: dmalcolm at redhat.com (David Malcolm) Date: Mon, 03 Jan 2011 12:52:26 -0500 Subject: [Python-Dev] Possible optimization for LOAD_FAST ? In-Reply-To: References: Message-ID: <1294077146.26496.26.camel@radiator.bos.redhat.com> On Sun, 2011-01-02 at 19:18 -0800, Guido van Rossum wrote: > On Sun, Jan 2, 2011 at 5:50 PM, Alex Gaynor wrote: > > No, it's singularly impossible to prove that any global load will be any given > > value at compile time. Any optimization based on this premise is wrong. > > True. > > My proposed way out of this conundrum has been to change the language > semantics slightly so that global names which (a) coincide with a > builtin, and (b) have no explicit assignment to them in the current > module, would be fair game for such optimizations, with the > understanding that the presence of e.g. "len = len" anywhere in the > module (even in dead code!) would be sufficient to disable the > optimization. > > But barring someone interested in implementing something based on this > rule, the proposal has languished for many years. Is there a PEP for this? > > FWIW, this is reminiscent of Fortran's rules for "intrinsics" (its > name for builtins), which have a similar optimization behavior (except > there the potential overrides that the compiler doesn't need to take > into account are load-time definitions). I've been attempting another way in: http://bugs.python.org/issue10399 using a new "JUMP_IF_SPECIALIZABLE" opcode. This compares what a value is against a compile-time prediction, branching to an optimized implementation if the guess was correct. I use this to implement function-call inlining within the generated bytecode. 
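In rough Python terms, the guard described above (compare the current value against a compile-time prediction and branch to a specialized path only when the guess holds) might look like the sketch below. This is an illustration of the idea only, not the patch on the tracker: make_guarded_call and lookup_global are made-up names, and the real mechanism would live in the generated bytecode rather than in a wrapper function.

```python
import builtins

_MISSING = object()

def lookup_global(namespace, name):
    """Resolve a name the way a global load does: module globals, then builtins."""
    value = namespace.get(name, _MISSING)
    if value is _MISSING:
        value = getattr(builtins, name)
    return value

def make_guarded_call(name, fast_path):
    """Build a call helper that takes fast_path only while the prediction holds."""
    predicted = getattr(builtins, name)  # the "compile-time" guess

    def call(namespace, *args):
        current = lookup_global(namespace, name)
        if current is predicted:
            return fast_path(*args)      # guess was right: specialized code
        return current(*args)            # guess was wrong: ordinary call

    return call

# Hypothetical specialization of len(): call __len__ directly.
guarded_len = make_guarded_call("len", lambda obj: obj.__len__())

module_globals = {}
print(guarded_len(module_globals, [1, 2, 3]))   # prediction holds

module_globals["len"] = lambda obj: 42          # someone rebinds len in the module
print(guarded_len(module_globals, [1, 2, 3]))   # guard fails, the override is honoured
```

The point of the guard is that existing semantics are preserved: a rebinding is still seen, at the cost of one identity check per call.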
Caveat-of-doom: That code's very much a work-in-progress at this stage, though: sometimes it doesn't segfault :) and the way that I track the predicted values is taking some other liberties with semantics (see that URL and the dmalcolm-ast-optimization-branch in SVN). (There's probably at least 2 PEPs in the above idea, though I have yet to write my first PEP) Hope this is helpful Dave From alexander.belopolsky at gmail.com Tue Jan 4 01:06:04 2011 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Mon, 3 Jan 2011 19:06:04 -0500 Subject: [Python-Dev] Checking input range in time.asctime and time.ctime Message-ID: There are several reports of bugs caused by the fact that the behavior of the C functions asctime and ctime is undefined when they are asked to format a time with a year of more than 4 digits: http://bugs.python.org/issue8013 http://bugs.python.org/issue6608 (closed) http://bugs.python.org/issue10563 (superseded by #8013) I have a patch ready at issue 8013 that adds a check for large values and causes time.asctime and time.ctime to raise ValueError instead of producing system-dependent results or in some cases crashing or corrupting the python process. There is little dispute that python should not crash on invalid input, but I would like to ask for a second opinion on whether it would be better to produce some distinct 24-character string, say 'Mon Jan  1 00:00:00 *999', instead of raising an exception. Note that on some Windows systems, the current behavior is to produce '%c999' % (year // 1000 + ord('0')) for at least some large values of year. Linux asctime produces strings that are longer than 26 characters, but I don't think we should support this behavior because POSIX defines asctime() result as a 26 character string and Python manual defines time.asctime() result as a 24 character string. Producing longer timestamps is likely to break as many applications as accepting large years will fix. OSX asctime returns a NULL pointer for large years.
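To make the candidate behaviours concrete, here is a sketch of a formatter that avoids the platform asctime() altogether, so the result no longer depends on the C library. It assumes the usual asctime layout; format_asctime and its strict flag are hypothetical names for illustration and are not taken from the patch on issue 8013.

```python
_DAYS = ("Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun")
_MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
           "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")

def format_asctime(t, strict=True):
    """Format a time tuple in asctime layout without calling the C library.

    With strict=True, years above 9999 raise ValueError.
    With strict=False, such years simply produce a longer string.
    """
    year, mon, mday, hour, minute, sec, wday = t[:7]
    if not (1 <= mon <= 12 and 0 <= wday <= 6):
        raise ValueError("month or weekday out of range")
    if strict and year > 9999:
        raise ValueError("year out of range")
    return "%s %s %2d %02d:%02d:%02d %d" % (
        _DAYS[wday], _MONTHS[mon - 1], mday, hour, minute, sec, year)

print(format_asctime((2011, 1, 3, 19, 6, 4, 0, 3, 0)))
print(format_asctime((12345, 1, 1, 0, 0, 0, 0, 1, 0), strict=False))
```

For 4-digit years the result is the familiar 24-character string; the only open question is what to do beyond that.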
My position is that raising an error is the right solution. This is consistent with year range supported by datetime. Another small issue that I would like to raise here is issue6608 patch resulting in time.asctime() accepting 0 as a valid entry at any position of the timetuple. This is consistent with the behavior of time.strftime(), but was overlooked when issue6608 was reviewed. I find the case for accepting say 0 month or 0 day in time.asctime() weaker than that for time.strftime() where month or day values may be ignored. From guido at python.org Tue Jan 4 01:47:07 2011 From: guido at python.org (Guido van Rossum) Date: Mon, 3 Jan 2011 16:47:07 -0800 Subject: [Python-Dev] Checking input range in time.asctime and time.ctime In-Reply-To: References: Message-ID: Given the rule garbage in -> garbage out, I'd do the most useful thing, which would be to produce a longer output string (and update the docs). This would match the behavior of e.g. '%04d' % y when y > 9999. If that means the platform libc asctime/ctime can't be used, too bad. --Guido On Mon, Jan 3, 2011 at 4:06 PM, Alexander Belopolsky wrote: > There are several reports of bugs caused by the fact that the behavior > of C functions asctime and ctime is undefined when they are asked to > format time for more than 4-digit years: > > http://bugs.python.org/issue8013 > http://bugs.python.org/issue6608 (closed) > http://bugs.python.org/issue10563 (superseded by #8013) > > I have a patch ready at issue 8013 that adds a check for large values > and causes time.asctime and time.ctime raise ValueError instead of > producing system-dependent results or in some cases crashing or > corrupting the python process. > > There is little dispute that python should not crash on invalid input, > but I would like to ask for a second opinion on whether it would be > better to produce some distinct 24-character string, say 'Mon Jan  1 > 00:00:00 *999', instead of raising an exception.
> > Note that on some Windows systems, the current behavior is to produce > '%c999' % (year // 1000 + ord('0')) for at least some large values of > year. Linux asctime produces strings that are longer than 26 > characters, but I don't think we should support this behavior because > POSIX defines asctime() result as a 26 character string and Python > manual defines time.asctime() result as a 24 character string. > Producing longer timestamps is likely to break as many applications as > accepting large years will fix. OSX asctime returns a NULL pointer for > large years. > > My position is that raising an error is the right solution. This is > consistent with year range supported by datetime. > > Another small issue that I would like to raise here is issue6608 patch > resulting in time.asctime() accepting 0 as a valid entry at any > position of the timetuple. This is consistent with the behavior of > time.strftime(), but was overlooked when issue6608 was reviewed. I > find the case for accepting say 0 month or 0 day in time.asctime() > weaker than that for time.strftime() where month or day values may be > ignored. > -- --Guido van Rossum (python.org/~guido) From guido at python.org Tue Jan 4 02:02:35 2011 From: guido at python.org (Guido van Rossum) Date: Mon, 3 Jan 2011 17:02:35 -0800 Subject: [Python-Dev] Possible optimization for LOAD_FAST ?
In-Reply-To: <1294077146.26496.26.camel@radiator.bos.redhat.com> References: <1294077146.26496.26.camel@radiator.bos.redhat.com> Message-ID: On Mon, Jan 3, 2011 at 9:52 AM, David Malcolm wrote: > On Sun, 2011-01-02 at 19:18 -0800, Guido van Rossum wrote: >> On Sun, Jan 2, 2011 at 5:50 PM, Alex Gaynor wrote: >> > No, it's singularly impossible to prove that any global load will be any given >> > value at compile time. Any optimization based on this premise is wrong. >> >> True. >> >> My proposed way out of this conundrum has been to change the language >> semantics slightly so that global names which (a) coincide with a >> builtin, and (b) have no explicit assignment to them in the current >> module, would be fair game for such optimizations, with the >> understanding that the presence of e.g. "len = len" anywhere in the >> module (even in dead code!) would be sufficient to disable the >> optimization. >> >> But barring someone interested in implementing something based on this >> rule, the proposal has languished for many years. > > Is there a PEP for this? Not that I know of, otherwise I'd have mentioned it. :-) It would be useful if someone wrote it up, since the idea comes back in one form or another regularly. >> FWIW, this is reminiscent of Fortran's rules for "intrinsics" (its >> name for builtins), which have a similar optimization behavior (except >> there the potential overrides that the compiler doesn't need to take >> into account are load-time definitions). > > I've been attempting another way in: > http://bugs.python.org/issue10399 > using a new "JUMP_IF_SPECIALIZABLE" opcode. This compares what a value > is against a compile-time prediction, branching to an optimized > implementation if the guess was correct. I use this to implement > function-call inlining within the generated bytecode. Yeah, that's what everybody proposes to keep the language semantics unchanged.
But I claim that an easier solution is to say to hell with those semantics, let's change them to make the implementation simpler. That's from the Zen of Python: "If the implementation is easy to explain, it may be a good idea." I guess few people can seriously propose to change Python's semantics, that's why *I* am proposing it. :-) Note that the semantics of locals (e.g. UnboundLocalError) were also changed specifically to allow a significant optimization -- again by me. > Caveat-of-doom: That code's very much a work-in-progress at this stage, > though: sometimes it doesn't segfault :) and the way that I track the > predicted values is taking some other liberties with semantics (see that > URL and the dmalcolm-ast-optimization-branch in SVN). > > (There's probably at least 2 PEPs in the above idea, though have yet to > write my first PEP) If you want to write up a PEP for the semantics change I am proposing, everything you need is in this thread. -- --Guido van Rossum (python.org/~guido) From victor.stinner at haypocalc.com Tue Jan 4 03:44:53 2011 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Tue, 04 Jan 2011 03:44:53 +0100 Subject: [Python-Dev] PEP 3333: wsgi_string() function Message-ID: <1294109093.14661.4.camel@marge> Hi, In the PEP 3333, I read:
--------------
import os, sys

enc, esc = sys.getfilesystemencoding(), 'surrogateescape'

def wsgi_string(u):
    # Convert an environment variable to a WSGI "bytes-as-unicode" string
    return u.encode(enc, esc).decode('iso-8859-1')

def run_with_cgi(application):
    environ = {k: wsgi_string(v) for k,v in os.environ.items()}
    environ['wsgi.input'] = sys.stdin
    environ['wsgi.errors'] = sys.stderr
    environ['wsgi.version'] = (1, 0)
    ...
--------------
What is this horrible encoding "bytes-as-unicode"? os.environ is supposed to be correctly decoded and contain valid unicode characters.
If WSGI uses another encoding than the locale encoding (which is a bad idea), it should use os.environb and decodes keys and values using its own encoding. If you really want to store bytes in unicode, str is not the right type: use the bytes type and use os.environb instead. Victor From fuzzyman at voidspace.org.uk Tue Jan 4 11:49:04 2011 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Tue, 04 Jan 2011 10:49:04 +0000 Subject: [Python-Dev] Possible optimization for LOAD_FAST ? In-Reply-To: References: <1294077146.26496.26.camel@radiator.bos.redhat.com> Message-ID: <4D22FB20.9010808@voidspace.org.uk> On 04/01/2011 01:02, Guido van Rossum wrote: > On Mon, Jan 3, 2011 at 9:52 AM, David Malcolm wrote: >> On Sun, 2011-01-02 at 19:18 -0800, Guido van Rossum wrote: >>> On Sun, Jan 2, 2011 at 5:50 PM, Alex Gaynor wrote: >>>> No, it's singularly impossible to prove that any global load will be any given >>>> value at compile time. Any optimization based on this premise is wrong. >>> True. >>> >>> My proposed way out of this conundrum has been to change the language >>> semantics slightly so that global names which (a) coincide with a >>> builtin, and (b) have no explicit assignment to them in the current >>> module, would be fair game for such optimizations, with the >>> understanding that the presence of e.g. "len = len" anywhere in the >>> module (even in dead code!) would be sufficient to disable the >>> optimization. >>> >>> But barring someone interested in implementing something based on this >>> rule, the proposal has languished for many years. >> Is there a PEP for this? > Not that I know of, otherwise I'd have mentioned it. :-) > > It would be useful if someone wrote it up, since the idea comes back > in one form or another regularly. 
> >>> FWIW, this is reminiscent of Fortran's rules for "intrinsics" (its >>> name for builtins), which have a similar optimization behavior (except >>> there the potential overrides that the compiler doesn't need to take >>> into account are load-time definitions). >> I've been attempting another way in: >> http://bugs.python.org/issue10399 >> using a new "JUMP_IF_SPECIALIZABLE" opcode. This compares what a value >> is against a compile-time prediction, branching to an optimized >> implementation if the guess was correct. I use this to implement >> function-call inlining within the generated bytecode. > Yeah, that's what everybody proposes to keep the language semantics > unchanged. But I claim that an easier solution is to say to hell with > those semantics, let's change them to make the implementation simpler. > That's from the Zen of Python: "If the implementation is easy to > explain, it may be a good idea." I guess few people can seriously > propose to change Python's semantics, that's why *I* am proposing it. > :-) Note that the semantics of locals (e.g. UnboundLocalError) were > also changed specifically to allow a significant optimization -- again > by me. > I think someone else pointed this out, but replacing builtins externally to a module is actually common for testing. In particular replacing the open function, but also other builtins, is often done temporarily to replace it with a mock. It seems like this optimisation would break those tests. Michael >> Caveat-of-doom: That code's very much a work-in-progress at this stage, >> though: sometimes it doesn't segfault :) and the way that I track the >> predicted values is taking some other liberties with semantics (see that >> URL and the dmalcolm-ast-optimization-branch in SVN). >> >> (There's probably at least 2 PEPs in the above idea, though have yet to >> write my first PEP) > If you want to write up a PEP for the semantics change I am proposing, > everything you need is in this thread. 
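The rule proposed in this thread (a name is eligible when it matches a builtin and is never an assignment target in the module, with 'open' and '__import__' left alone) can be approximated with the ast module. This is a rough sketch under stated assumptions: optimizable_builtins is a hypothetical helper, it is more conservative than the proposal because any binding in any scope disqualifies the name, and a real implementation would sit in the compiler's existing symbol-table pass.

```python
import ast
import builtins

EXCLUDED = {"open", "__import__"}  # names the thread proposes to leave alone

def optimizable_builtins(source):
    """Return builtin names that are read but never bound anywhere in source."""
    tree = ast.parse(source)
    used, bound = set(), set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Name):
            if isinstance(node.ctx, ast.Load):
                used.add(node.id)
            else:                       # Store or Del both count as a binding
                bound.add(node.id)
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            bound.add(node.name)
        elif isinstance(node, ast.arg):
            bound.add(node.arg)
        elif isinstance(node, ast.alias):
            bound.add((node.asname or node.name).split(".")[0])
    candidates = used - bound - EXCLUDED
    return {name for name in candidates
            if callable(getattr(builtins, name, None))}

module = "def f(x):\n    return len(x) + int(x[0])\n"
print(sorted(optimizable_builtins(module)))

# "len = len" anywhere, even in dead code, switches the optimization off for len.
print(sorted(optimizable_builtins(module + "if 0:\n    len = len\n")))
```

Because the check is purely syntactic and per-module, it cannot see 'othermodule.len = fake' done from outside, which is exactly the semantic change being debated.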
> -- http://www.voidspace.org.uk/ May you do good and not evil May you find forgiveness for yourself and forgive others May you share freely, never taking more than you give. -- the sqlite blessing http://www.sqlite.org/different.html From solipsis at pitrou.net Tue Jan 4 13:20:59 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 4 Jan 2011 13:20:59 +0100 Subject: [Python-Dev] PEP 3333: wsgi_string() function References: <1294109093.14661.4.camel@marge> Message-ID: <20110104132059.1cbad273@pitrou.net> On Tue, 04 Jan 2011 03:44:53 +0100 Victor Stinner wrote: > def wsgi_string(u): > # Convert an environment variable to a WSGI "bytes-as-unicode" > string > return u.encode(enc, esc).decode('iso-8859-1') > > def run_with_cgi(application): > environ = {k: wsgi_string(v) for k,v in os.environ.items()} > environ['wsgi.input'] = sys.stdin > environ['wsgi.errors'] = sys.stderr > environ['wsgi.version'] = (1, 0) > ... > -------------- > > What is this horrible encoding "bytes-as-unicode"? os.environ is > supposed to be correctly decoded and contain valid unicode characters. > If WSGI uses another encoding than the locale encoding (which is a bad > idea), it should use os.environb and decodes keys and values using its > own encoding. > > If you really want to store bytes in unicode, str is not the right type: > use the bytes type and use os.environb instead. +1. We should minimize such reencoding dances, and avoid promoting them. Regards Antoine. From victor.stinner at haypocalc.com Tue Jan 4 14:33:37 2011 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Tue, 04 Jan 2011 14:33:37 +0100 Subject: [Python-Dev] PEP 3333: wsgi_string() function In-Reply-To: <20110104132059.1cbad273@pitrou.net> References: <1294109093.14661.4.camel@marge> <20110104132059.1cbad273@pitrou.net> Message-ID: <1294148017.17901.7.camel@marge> On Tuesday 04 January 2011 at 13:20 +0100, Antoine Pitrou wrote: > On Tue, 04 Jan 2011 03:44:53 +0100 > Victor Stinner wrote: > > def wsgi_string(u): > > # Convert an environment variable to a WSGI "bytes-as-unicode" > > string > > return u.encode(enc, esc).decode('iso-8859-1') > > > > def run_with_cgi(application): > > environ = {k: wsgi_string(v) for k,v in os.environ.items()} > > environ['wsgi.input'] = sys.stdin > > environ['wsgi.errors'] = sys.stderr > > environ['wsgi.version'] = (1, 0) > > ... > > -------------- > > > > What is this horrible encoding "bytes-as-unicode"? os.environ is > > supposed to be correctly decoded and contain valid unicode characters. > > If WSGI uses another encoding than the locale encoding (which is a bad > > idea), it should use os.environb and decodes keys and values using its > > own encoding. > > > > If you really want to store bytes in unicode, str is not the right type: > > use the bytes type and use os.environb instead. > > +1. We should minimize such reencoding dances, and avoid promoting them. The example from the PEP is specific to CGI and is a little bit special. The reference implementation (wsgiref in py3k) only re-decodes ("transcodes") some variables:
---
_is_request = {
    'SCRIPT_NAME', 'PATH_INFO', 'QUERY_STRING', 'REQUEST_METHOD', 'AUTH_TYPE',
    'CONTENT_TYPE', 'CONTENT_LENGTH', 'HTTPS', 'REMOTE_USER', 'REMOTE_IDENT',
}.__contains__

def _needs_transcode(k):
    return _is_request(k) or k.startswith('HTTP_') or k.startswith('SSL_') \
        or (k.startswith('REDIRECT_') and _needs_transcode(k[9:]))
---
My problem is that I don't understand how I can know whether a variable was converted to "bytes-as-unicode" or not. GrahamDumpleton told me on IRC that the framework is supposed to re-decode some variables (e.g. PATH_INFO) one more time. But this is not explicit in the PEP and _needs_transcode() is a private function. Since the environ already contains different types (eg.
wsgi.version is a tuple, wsgi.multithread is a boolean, ...), why not keep these variables as raw bytes? Victor From solipsis at pitrou.net Tue Jan 4 14:51:30 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 4 Jan 2011 14:51:30 +0100 Subject: [Python-Dev] PEP 3333: wsgi_string() function References: <1294109093.14661.4.camel@marge> <20110104132059.1cbad273@pitrou.net> <1294148017.17901.7.camel@marge> Message-ID: <20110104145130.7fbde8a3@pitrou.net> On Tue, 04 Jan 2011 14:33:37 +0100 Victor Stinner wrote: > On Tuesday 04 January 2011 at 13:20 +0100, Antoine Pitrou wrote: > > On Tue, 04 Jan 2011 03:44:53 +0100 > > Victor Stinner wrote: > > > def wsgi_string(u): > > > # Convert an environment variable to a WSGI "bytes-as-unicode" > > > string > > > return u.encode(enc, esc).decode('iso-8859-1') > > > > > > def run_with_cgi(application): > > > environ = {k: wsgi_string(v) for k,v in os.environ.items()} > > > environ['wsgi.input'] = sys.stdin > > > environ['wsgi.errors'] = sys.stderr > > > environ['wsgi.version'] = (1, 0) > > > ... > > > -------------- > > > > > > What is this horrible encoding "bytes-as-unicode"? os.environ is > > > supposed to be correctly decoded and contain valid unicode characters. > > > If WSGI uses another encoding than the locale encoding (which is a bad > > > idea), it should use os.environb and decodes keys and values using its > > > own encoding. > > > > > > If you really want to store bytes in unicode, str is not the right type: > > > use the bytes type and use os.environb instead. > > > > +1. We should minimize such reencoding dances, and avoid promoting them. > > The example from the PEP is specific to CGI and is a little bit special. Well, it would be better if it used os.environb anyway ;) Regards Antoine.
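The two routes discussed in this thread can be compared side by side. The sketch below assumes an ASCII-compatible filesystem encoding (so that surrogateescape round-trips the original bytes); wsgi_string follows the PEP excerpt quoted above, while wsgi_string_from_bytes and native_string are hypothetical helpers added for illustration. On POSIX the raw bytes would come from os.environb; a literal is used here so the example runs anywhere.

```python
import sys

ENC, ESC = sys.getfilesystemencoding(), "surrogateescape"

def wsgi_string(u):
    # PEP 3333 example: str decoded with the locale encoding -> "bytes-as-unicode" str
    return u.encode(ENC, ESC).decode("iso-8859-1")

def wsgi_string_from_bytes(b):
    # Alternative route: start from the raw bytes (e.g. a value from os.environb)
    return b.decode("iso-8859-1")

def native_string(s, encoding="utf-8"):
    # What a framework has to do to recover real text from a bytes-as-unicode value
    return s.encode("iso-8859-1").decode(encoding)

raw = "/caf\u00e9".encode("utf-8")        # bytes as sent by the client
as_environ = raw.decode(ENC, ESC)          # what os.environ would hold for them

# Both routes end up with the same bytes-as-unicode string...
print(wsgi_string(as_environ) == wsgi_string_from_bytes(raw))

# ...and the application still needs one more decoding step to get text back.
print(native_string(wsgi_string_from_bytes(raw)) == "/caf\u00e9")
```

Starting from bytes removes the encode step and the dependence on the locale encoding, which is the re-encoding dance the thread objects to.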
From barry at python.org Tue Jan 4 16:54:43 2011 From: barry at python.org (Barry Warsaw) Date: Tue, 4 Jan 2011 10:54:43 -0500 Subject: [Python-Dev] Backport troubles with mercurial In-Reply-To: <20101230075046.B4DBD238F3D@kimball.webabinitio.net> References: <87wrmsk6r7.fsf@uwakimon.sk.tsukuba.ac.jp> <20101229152510.1E766239DB5@kimball.webabinitio.net> <87tyhvke93.fsf@uwakimon.sk.tsukuba.ac.jp> <20101230075046.B4DBD238F3D@kimball.webabinitio.net> Message-ID: <20110104105443.5aa1ba14@mission> On Dec 30, 2010, at 02:50 AM, R. David Murray wrote: >You are welcome; thanks for the feedback. (I sometimes feel >like I'm working in a bit of a vacuum, though Barry does chime in >occasionally...but I do realize that people are busy; that's >why I inherited this job in the first place, after all :) It's your own fault for doing such a damn good job. :) -Barry From ncoghlan at gmail.com Tue Jan 4 17:03:35 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 5 Jan 2011 02:03:35 +1000 Subject: [Python-Dev] Possible optimization for LOAD_FAST ? In-Reply-To: <4D22FB20.9010808@voidspace.org.uk> References: <1294077146.26496.26.camel@radiator.bos.redhat.com> <4D22FB20.9010808@voidspace.org.uk> Message-ID: On Tue, Jan 4, 2011 at 8:49 PM, Michael Foord wrote: > I think someone else pointed this out, but replacing builtins externally to > a module is actually common for testing. In particular replacing the open > function, but also other builtins, is often done temporarily to replace it > with a mock. It seems like this optimisation would break those tests. print() and input() come to mind.
However, so long as appropriate tools are provided to recompile a module with this optimisation disabled for certain builtins (with some builtins such as open(), print() and input() blacklisted by default) then that issue should be manageable. I've extracted the full list of 68 builtin functions from the table in the docs below. I've placed asterisks next to the ones I think we would *want* to be eligible for optimisation. Aside from the 3 mentioned above, we could fairly easily omit ones which are used primarily at the interactive prompt (i.e. dir(), help()), as well as __import__() (since that lookup is handled specially anyway). Cheers, Nick.
__import__()
abs() *
all() *
any() *
ascii() *
bin() *
bool() *
bytearray() *
bytes() *
callable() *
chr() *
classmethod() *
compile() *
complex() *
delattr() *
dict() *
dir()
divmod() *
enumerate() *
eval() *
exec() *
filter() *
float() *
format() *
frozenset() *
getattr() *
globals() *
hasattr() *
hash() *
help()
hex() *
id() *
input()
int() *
isinstance() *
issubclass() *
iter() *
len() *
list() *
locals() *
map() *
max() *
memoryview() *
min() *
next() *
object() *
oct() *
open()
ord() *
pow() *
print()
property() *
range() *
repr() *
reversed() *
round() *
set() *
setattr() *
slice() *
sorted() *
staticmethod() *
str() *
sum() *
super() *
tuple() *
type() *
vars() *
zip() *
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From guido at python.org Tue Jan 4 16:58:46 2011 From: guido at python.org (Guido van Rossum) Date: Tue, 4 Jan 2011 07:58:46 -0800 Subject: [Python-Dev] Possible optimization for LOAD_FAST ? In-Reply-To: <4D22FB20.9010808@voidspace.org.uk> References: <1294077146.26496.26.camel@radiator.bos.redhat.com> <4D22FB20.9010808@voidspace.org.uk> Message-ID: On Tue, Jan 4, 2011 at 2:49 AM, Michael Foord wrote: > I think someone else pointed this out, but replacing builtins externally to > a module is actually common for testing.
In particular replacing the open > function, but also other builtins, is often done temporarily to replace it > with a mock. It seems like this optimisation would break those tests. Hm, I already suggested to make an exception for open, (and one should be added for __import__) but if this is done for other builtins that is indeed a problem. Can you point to example code doing this? -- --Guido van Rossum (python.org/~guido) From alexander.belopolsky at gmail.com Tue Jan 4 17:12:11 2011 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Tue, 4 Jan 2011 11:12:11 -0500 Subject: [Python-Dev] Checking input range in time.asctime and time.ctime In-Reply-To: References: Message-ID: On Mon, Jan 3, 2011 at 7:47 PM, Guido van Rossum wrote: > Given the rule garbage in -> garbage out, I'd do the most useful > thing, which would be to produce a longer output string (and update > the docs). I did not know that GIGO was a design rule, but after thinking about it some more, I agree. It is very unlikely that a Python program would care about precise length of the string produced by time.asctime() and these strings are not well suited for passing timestamps to other programs that may care. (Use of asctime() timestamps in internet protocols has been long deprecated and surely won't be in use in 10-th millennium :-) From ncoghlan at gmail.com Tue Jan 4 17:20:29 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 5 Jan 2011 02:20:29 +1000 Subject: [Python-Dev] Possible optimization for LOAD_FAST ? In-Reply-To: References: <1294077146.26496.26.camel@radiator.bos.redhat.com> <4D22FB20.9010808@voidspace.org.uk> Message-ID: On Wed, Jan 5, 2011 at 1:58 AM, Guido van Rossum wrote: > On Tue, Jan 4, 2011 at 2:49 AM, Michael Foord wrote: >> I think someone else pointed this out, but replacing builtins externally to >> a module is actually common for testing. 
In particular replacing the open >> function, but also other builtins, is often done temporarily to replace it >> with a mock. It seems like this optimisation would break those tests. > > Hm, I already suggested to make an exception for open, (and one should > be added for __import__) but if this is done for other builtins that > is indeed a problem. Can you point to example code doing this? I've seen it done to write tests for simple CLI behaviour by mocking input() and print() (replacing sys.stdin and sys.stdout instead is far more common, but replacing the functions works too). If compile() accepted a blacklist of builtins that it wasn't allowed to optimise, then that should deal with the core of the problem as far as testing goes. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From alex.gaynor at gmail.com Tue Jan 4 17:21:57 2011 From: alex.gaynor at gmail.com (Alex Gaynor) Date: Tue, 4 Jan 2011 10:21:57 -0600 Subject: [Python-Dev] Possible optimization for LOAD_FAST ? In-Reply-To: References: <1294077146.26496.26.camel@radiator.bos.redhat.com> <4D22FB20.9010808@voidspace.org.uk> Message-ID: On Tue, Jan 4, 2011 at 10:20 AM, Nick Coghlan wrote: > On Wed, Jan 5, 2011 at 1:58 AM, Guido van Rossum wrote: > > On Tue, Jan 4, 2011 at 2:49 AM, Michael Foord > wrote: > >> I think someone else pointed this out, but replacing builtins externally > to > >> a module is actually common for testing. In particular replacing the > open > >> function, but also other builtins, is often done temporarily to replace > it > >> with a mock. It seems like this optimisation would break those tests. > > > > Hm, I already suggested to make an exception for open, (and one should > > be added for __import__) but if this is done for other builtins that > > is indeed a problem. Can you point to example code doing this?
> > I've seen it done to write tests for simple CLI behaviour by mocking > input() and print() (replacing sys.stdin and sys.stdout instead is far > more common, but replacing the functions works too). If compile() > accepted a blacklist of builtins that it wasn't allowed to optimise, > then that should deal with the core of the problem as far as testing > goes. > > Cheers, > Nick. > > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > Ugh, I can't be the only one who finds these special cases to be a little nasty? Special cases aren't special enough to break the rules. Alex -- "I disapprove of what you say, but I will defend to the death your right to say it." -- Evelyn Beatrice Hall (summarizing Voltaire) "The people's good is the highest law." -- Cicero "Code can always be simpler than you think, but never as simple as you want" -- Me -------------- next part -------------- An HTML attachment was scrubbed... URL: From tseaver at palladion.com Tue Jan 4 17:22:13 2011 From: tseaver at palladion.com (Tres Seaver) Date: Tue, 04 Jan 2011 11:22:13 -0500 Subject: [Python-Dev] PEP 3333: wsgi_string() function In-Reply-To: <1294109093.14661.4.camel@marge> References: <1294109093.14661.4.camel@marge> Message-ID: -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 01/03/2011 09:44 PM, Victor Stinner wrote: > Hi, > > In the PEP 3333, I read: > -------------- > import os, sys > > enc, esc = sys.getfilesystemencoding(), 'surrogateescape' > > def wsgi_string(u): > # Convert an environment variable to a WSGI "bytes-as-unicode" > string > return u.encode(enc, esc).decode('iso-8859-1') > > def run_with_cgi(application): > environ = {k: wsgi_string(v) for k,v in os.environ.items()} > environ['wsgi.input'] = sys.stdin > environ['wsgi.errors'] = sys.stderr > environ['wsgi.version'] = (1, 0) > ... > -------------- > > What is this horrible encoding "bytes-as-unicode"? os.environ is > supposed to be correctly decoded and contain valid unicode characters. 
> If WSGI uses another encoding than the locale encoding (which is a bad > idea), it should use os.environb and decodes keys and values using its > own encoding. > > If you really want to store bytes in unicode, str is not the right type: > use the bytes type and use os.environb instead. I'm not clear on the semantics here, but I'm pretty sure you'll find that the web-SIG does know them well. I have CC'ed that list (via gmane). Note that Guido just recently wrote on that list that he considers that PEP to be de facto accepted. Tres. - -- =================================================================== Tres Seaver +1 540-429-0999 tseaver at palladion.com Palladion Software "Excellence by Design" http://palladion.com -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAk0jSTUACgkQ+gerLs4ltQ4cCQCgyc9QsRfzC2lrtnDO0v8TvK6W rVwAnjvvwD47J1chgupqM3unt5c2jd6p =8LEf -----END PGP SIGNATURE----- From pje at telecommunity.com Tue Jan 4 17:27:53 2011 From: pje at telecommunity.com (P.J. Eby) Date: Tue, 04 Jan 2011 11:27:53 -0500 Subject: [Python-Dev] PEP 3333: wsgi_string() function In-Reply-To: <1294109093.14661.4.camel@marge> References: <1294109093.14661.4.camel@marge> Message-ID: <20110104162750.B58EC3A40A8@sparrow.telecommunity.com> At 03:44 AM 1/4/2011 +0100, Victor Stinner wrote: >Hi, > >In the PEP 3333, I read: >-------------- >import os, sys > >enc, esc = sys.getfilesystemencoding(), 'surrogateescape' > >def wsgi_string(u): > # Convert an environment variable to a WSGI "bytes-as-unicode" >string > return u.encode(enc, esc).decode('iso-8859-1') > >def run_with_cgi(application): > environ = {k: wsgi_string(v) for k,v in os.environ.items()} > environ['wsgi.input'] = sys.stdin > environ['wsgi.errors'] = sys.stderr > environ['wsgi.version'] = (1, 0) >... >-------------- > >What is this horrible encoding "bytes-as-unicode"? 
os.environ is >supposed to be correctly decoded and contain valid unicode characters. >If WSGI uses another encoding than the locale encoding (which is a bad >idea), it should use os.environb and decodes keys and values using its >own encoding. > >If you really want to store bytes in unicode, str is not the right type: >use the bytes type and use os.environb instead. If you want to discuss this, the Web-SIG is the appropriate place. Also, it was the appropriate place months ago, when the final decision on the environ encoding was made. ;-) IOW, the above change to the PEP is merely fixing the code example to be correct for Python 3, where it previously was correct only for Python 2. The PEP itself has already required this since the previous revisions, and wsgiref in the stdlib is already compliant with the above (although it uses a more sophisticated approach for dealing with win32 compatibility). The rationale for this choice is described in the PEP, and was also discussed in the mailing list emails back when the work was being done. IOW, this particular ship already sailed a long time ago. In fact, for Jython this bytes-as-unicode approach has been the PEP 333-defined encoding for at least *six years*... so it's REALLY late to complain about it now! ;-) PEP 3333 is merely a mapping of PEP 333 to allow WSGI apps to be ported from Python 2 to Python 3. There is work in progress on the Web-SIG now on PEP 444, which will support only Python 2.6+, where 'b' literals and the 'bytes' alias are available. It is as yet uncertain what environ encoding will be used, but at the moment I'm not convinced that either pure bytes or pure unicode are acceptable replacements for the PEP 333-compatible approach. In any event, that is a discussion for the Web-SIG, not Python-Dev. From reid.kleckner at gmail.com Tue Jan 4 17:49:14 2011 From: reid.kleckner at gmail.com (Reid Kleckner) Date: Tue, 4 Jan 2011 08:49:14 -0800 Subject: [Python-Dev] Possible optimization for LOAD_FAST ? 
In-Reply-To: References: <1294077146.26496.26.camel@radiator.bos.redhat.com> <4D22FB20.9010808@voidspace.org.uk> Message-ID: On Tue, Jan 4, 2011 at 8:21 AM, Alex Gaynor wrote: > Ugh, I can't be the only one who finds these special cases to be a little > nasty? > Special cases aren't special enough to break the rules. > Alex +1, I don't think nailing down a few builtins is that helpful for optimizing Python. Anyone attempting to seriously optimize Python is going to need to use more general techniques that apply to non-builtins as well. In unladen swallow (I'm sure something similar is done in PyPy) we have some infrastructure for watching dictionaries for changes, and in particular we tend to watch the builtins and module dictionaries. Reid From barry at python.org Tue Jan 4 17:54:35 2011 From: barry at python.org (Barry Warsaw) Date: Tue, 4 Jan 2011 11:54:35 -0500 Subject: [Python-Dev] Possible optimization for LOAD_FAST ? In-Reply-To: References: <1294077146.26496.26.camel@radiator.bos.redhat.com> <4D22FB20.9010808@voidspace.org.uk> Message-ID: <20110104115435.6945d7d5@mission> On Jan 04, 2011, at 10:21 AM, Alex Gaynor wrote: >Ugh, I can't be the only one who finds these special cases to be a little >nasty? > >Special cases aren't special enough to break the rules. Yeah, I agree. Still it would be interesting to see what kind of performance improvement this would result in. That seems to be the only way to decide whether the cost is worth the benefit. Outside of testing, I do agree that most of the builtins could be pretty safely optimized (even open()). There needs to be a way to stop all optimizations for testing purposes. Perhaps a sys variable, plus command line option and/or environment variable? -Barry -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: not available URL: From fuzzyman at voidspace.org.uk Tue Jan 4 17:57:24 2011 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Tue, 04 Jan 2011 16:57:24 +0000 Subject: [Python-Dev] Possible optimization for LOAD_FAST ? In-Reply-To: <20110104115435.6945d7d5@mission> References: <1294077146.26496.26.camel@radiator.bos.redhat.com> <4D22FB20.9010808@voidspace.org.uk> <20110104115435.6945d7d5@mission> Message-ID: <4D235174.8030209@voidspace.org.uk> On 04/01/2011 16:54, Barry Warsaw wrote: > On Jan 04, 2011, at 10:21 AM, Alex Gaynor wrote: > >> Ugh, I can't be the only one who finds these special cases to be a little >> nasty? >> >> Special cases aren't special enough to break the rules. > Yeah, I agree. Still it would be interesting to see what kind of performance > improvement this would result in. That seems to be the only way to decide > whether the cost is worth the benefit. > > Outside of testing, I do agree that most of the builtins could be pretty > safely optimized (even open()). There needs to be a way to stop all > optimizations for testing purposes. Perhaps a sys variable, plus command line > option and/or environment variable? Although testing in an environment deliberately different from production is a recipe for hard to diagnose bugs. Michael > -Barry > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/fuzzyman%40voidspace.org.uk -- http://www.voidspace.org.uk/ May you do good and not evil May you find forgiveness for yourself and forgive others May you share freely, never taking more than you give. -- the sqlite blessing http://www.sqlite.org/different.html -------------- next part -------------- An HTML attachment was scrubbed... 
From lukas.lueg at googlemail.com Tue Jan 4 18:33:31 2011 From: lukas.lueg at googlemail.com (Lukas Lueg) Date: Tue, 4 Jan 2011 18:33:31 +0100 Subject: [Python-Dev] Possible optimization for LOAD_FAST ? In-Reply-To: <4D235174.8030209@voidspace.org.uk> References: <1294077146.26496.26.camel@radiator.bos.redhat.com> <4D22FB20.9010808@voidspace.org.uk> <20110104115435.6945d7d5@mission> <4D235174.8030209@voidspace.org.uk> Message-ID: Doesn't this all boil down to being able to monitor PyDict for changes to its key-space? The keys are immutable anyway so the instances of PyDict could manage an opaque value (in fact, a counter) that changes every time a new value is written to any key. Once we get a reference out of the dict, we can do very fast lookups by passing the key, the reference we know from the last lookup and our last state. The lookup returns a new reference and the new state. If the dict has not changed, the state doesn't change and the reference is simply taken from the value passed to the lookup. That way the code remains the same no matter if the dict has changed or not. 2011/1/4 Michael Foord : > On 04/01/2011 16:54, Barry Warsaw wrote: > > On Jan 04, 2011, at 10:21 AM, Alex Gaynor wrote: > > Ugh, I can't be the only one who finds these special cases to be a little > nasty? > > Special cases aren't special enough to break the rules. > > Yeah, I agree. Still it would be interesting to see what kind of > performance > improvement this would result in. That seems to be the only way to decide > whether the cost is worth the benefit. > > Outside of testing, I do agree that most of the builtins could be pretty > safely optimized (even open()). There needs to be a way to stop all > optimizations for testing purposes. Perhaps a sys variable, plus command > line > option and/or environment variable? > > Although testing in an environment deliberately different from production is > a recipe for hard to diagnose bugs.
> > Michael > > -Barry > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > http://mail.python.org/mailman/options/python-dev/fuzzyman%40voidspace.org.uk > > > -- > http://www.voidspace.org.uk/ > > May you do good and not evil > May you find forgiveness for yourself and forgive others > May you share freely, never taking more than you give. > -- the sqlite blessing http://www.sqlite.org/different.html > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > http://mail.python.org/mailman/options/python-dev/lukas.lueg%40gmail.com > > From nas at arctrix.com Tue Jan 4 18:51:17 2011 From: nas at arctrix.com (Neil Schemenauer) Date: Tue, 4 Jan 2011 17:51:17 +0000 (UTC) Subject: [Python-Dev] Possible optimization for LOAD_FAST ? References: <1294077146.26496.26.camel@radiator.bos.redhat.com> <4D22FB20.9010808@voidspace.org.uk> <20110104115435.6945d7d5@mission> Message-ID: Barry Warsaw wrote: > On Jan 04, 2011, at 10:21 AM, Alex Gaynor wrote: > >>Ugh, I can't be the only one who finds these special cases to be a little >>nasty? >> >>Special cases aren't special enough to break the rules. > > Yeah, I agree. Still it would be interesting to see what kind of > performance improvement this would result in. That seems to be > the only way to decide whether the cost is worth the benefit. Yuck from me as well. I would guess that attribute lookups would be just as significant as global variable lookups (depending on coding style, of course). In contrast, the local variable semantic change provided a big speed increase for a minor language complexity cost. 
Neil From guido at python.org Tue Jan 4 19:38:16 2011 From: guido at python.org (Guido van Rossum) Date: Tue, 4 Jan 2011 10:38:16 -0800 Subject: [Python-Dev] PEP 3333: wsgi_string() function In-Reply-To: References: <1294109093.14661.4.camel@marge> Message-ID: On Tue, Jan 4, 2011 at 8:22 AM, Tres Seaver wrote: > Note that Guido just recently wrote on that list that he considers that > PEP to be de facto accepted. That was conditional on there not being any objections in the next 24 hours. There have been plenty, so I'm retracting that. -- --Guido van Rossum (python.org/~guido) From steve at pearwood.info Tue Jan 4 22:50:44 2011 From: steve at pearwood.info (Steven D'Aprano) Date: Wed, 05 Jan 2011 08:50:44 +1100 Subject: [Python-Dev] Possible optimization for LOAD_FAST ? In-Reply-To: References: <1294077146.26496.26.camel@radiator.bos.redhat.com> <4D22FB20.9010808@voidspace.org.uk> Message-ID: <4D239634.3000209@pearwood.info> Guido van Rossum wrote: > On Tue, Jan 4, 2011 at 2:49 AM, Michael Foord wrote: >> I think someone else pointed this out, but replacing builtins externally to >> a module is actually common for testing. In particular replacing the open >> function, but also other builtins, is often done temporarily to replace it >> with a mock. It seems like this optimisation would break those tests. > > Hm, I already suggested to make an exception for open, (and one should > be added for __import__) but if this is done for other builtins that > is indeed a problem. Can you point to example code doing this? > I've been known to monkey-patch builtins in the interactive interpreter and in test code. One example that comes to mind is that I had some over-complicated recursive while loop (!), and I wanted to work out the Big Oh behaviour so I knew exactly how horrible it was. Working it out from first principles was too hard, so I cheated: I knew each iteration called len() exactly once, so I monkey-patched len() to count how many times it was called. 
Problem solved. I also have a statistics package that has its own version of sum, and I rely on calls to sum() from within the package picking up my version rather than the builtin one. -- Steven From guido at python.org Tue Jan 4 23:15:16 2011 From: guido at python.org (Guido van Rossum) Date: Tue, 4 Jan 2011 14:15:16 -0800 Subject: [Python-Dev] Possible optimization for LOAD_FAST ? In-Reply-To: <4D239634.3000209@pearwood.info> References: <1294077146.26496.26.camel@radiator.bos.redhat.com> <4D22FB20.9010808@voidspace.org.uk> <4D239634.3000209@pearwood.info> Message-ID: On Tue, Jan 4, 2011 at 1:50 PM, Steven D'Aprano wrote: > I've been known to monkey-patch builtins in the interactive interpreter and > in test code. One example that comes to mind is that I had some > over-complicated recursive while loop (!), and I wanted to work out the Big > Oh behaviour so I knew exactly how horrible it was. Working it out from > first principles was too hard, so I cheated: I knew each iteration called > len() exactly once, so I monkey-patched len() to count how many times it was > called. Problem solved. But why couldn't you edit the source code? > I also have a statistics package that has its own version of sum, and I rely > on calls to sum() from within the package picking up my version rather than > the builtin one. As long as you have a definition or import of sum at the top of (or really anywhere in) the module, that will still work. It's only if you were to do things like import builtins builtins.len = ... (whether inside your package or elsewhere) that things would stop working with the proposed optimization. -- --Guido van Rossum (python.org/~guido) From stutzbach at google.com Tue Jan 4 23:15:18 2011 From: stutzbach at google.com (Daniel Stutzbach) Date: Tue, 4 Jan 2011 14:15:18 -0800 Subject: [Python-Dev] Possible optimization for LOAD_FAST ? 
In-Reply-To: References: <1294077146.26496.26.camel@radiator.bos.redhat.com> <4D22FB20.9010808@voidspace.org.uk> <20110104115435.6945d7d5@mission> <4D235174.8030209@voidspace.org.uk> Message-ID: On Tue, Jan 4, 2011 at 9:33 AM, Lukas Lueg wrote: > The keys are immutable anyway so the instances of PyDict could manage > a opaque value (in fact, a counter) that changes every time a new > value is written to any key. Once we get a reference out of the dict, > we can can do very fast lookups by passing the key, the reference we > know from the last lookup and our last state. The lookup returns a new > reference and the new state. > If the dict has not changed, the state doesnt change and the reference > is simply taken from the passed value passed to the lookup. That way > the code remains the same no matter if the dict has changed or not. > I have had similar ideas in the past but have never found time to explore them. The same mechanism could also be used to speed up attribute access on objects. -- Daniel Stutzbach -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Tue Jan 4 23:30:36 2011 From: guido at python.org (Guido van Rossum) Date: Tue, 4 Jan 2011 14:30:36 -0800 Subject: [Python-Dev] Possible optimization for LOAD_FAST ? In-Reply-To: References: <1294077146.26496.26.camel@radiator.bos.redhat.com> <4D22FB20.9010808@voidspace.org.uk> <20110104115435.6945d7d5@mission> <4D235174.8030209@voidspace.org.uk> Message-ID: On Tue, Jan 4, 2011 at 2:15 PM, Daniel Stutzbach wrote: > On Tue, Jan 4, 2011 at 9:33 AM, Lukas Lueg > wrote: >> >> The keys are immutable anyway so the instances of PyDict could manage >> a opaque value (in fact, a counter) that changes every time a new >> value is written to any key. Once we get a reference out of the dict, >> we can can do very fast lookups by passing the key, the reference we >> know from the last lookup and our last state. The lookup returns a new >> reference and the new state. 
>> If the dict has not changed, the state doesnt change and the reference >> is simply taken from the passed value passed to the lookup. That way >> the code remains the same no matter if the dict has changed or not. > > I have had similar ideas in the past but have never found time to explore > them. The same mechanism could also be used to speed up attribute access on > objects. Check out the various approaches in PEP 266, PEP 267, and PEP 280. -- --Guido van Rossum (python.org/~guido) From steve at pearwood.info Wed Jan 5 00:13:31 2011 From: steve at pearwood.info (Steven D'Aprano) Date: Wed, 05 Jan 2011 10:13:31 +1100 Subject: [Python-Dev] Possible optimization for LOAD_FAST ? In-Reply-To: References: <1294077146.26496.26.camel@radiator.bos.redhat.com> <4D22FB20.9010808@voidspace.org.uk> <4D239634.3000209@pearwood.info> Message-ID: <4D23A99B.8040007@pearwood.info> Guido van Rossum wrote: > On Tue, Jan 4, 2011 at 1:50 PM, Steven D'Aprano wrote: >> I've been known to monkey-patch builtins in the interactive interpreter and >> in test code. One example that comes to mind is that I had some >> over-complicated recursive while loop (!), and I wanted to work out the Big >> Oh behaviour so I knew exactly how horrible it was. Working it out from >> first principles was too hard, so I cheated: I knew each iteration called >> len() exactly once, so I monkey-patched len() to count how many times it was >> called. Problem solved. > > But why couldn't you edit the source code? Because there was no source code -- I was experimenting in the interactive interpreter. I could have just re-created the function by using the readline history, but it was just easier to redefine len. Oh... it's just occurred to me that you were asking for use-cases for assigning to builtins.len directly, rather than just to len. No, I've never done that -- sorry for the noise.
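For illustration, the "just redefine len" trick might look roughly like the sketch below. It shadows len in the current namespace only and leaves builtins.len untouched. The function depth() is a made-up stand-in for the code being measured, not the original loop, and the counter name is likewise an assumption.

```python
import builtins

calls = 0

def len(obj):
    """Counting wrapper that shadows the builtin in this namespace only."""
    global calls
    calls += 1
    return builtins.len(obj)

def depth(seq):
    """Hypothetical recursive function that calls len() once per invocation."""
    if len(seq) <= 1:
        return 0
    return 1 + depth(seq[::2])

result = depth(list(range(16)))
print(result, calls)  # prints "4 5": four halvings, five len() calls
```

Because the name is rebound where the code under study lives, other modules keep seeing the ordinary builtin.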
>> I also have a statistics package that has its own version of sum, and I rely >> on calls to sum() from within the package picking up my version rather than >> the builtin one. > > As long as you have a definition or import of sum at the top of (or > really anywhere in) the module, that will still work. It's only if you > were to do things like > > import builtins > builtins.len = ... > > (whether inside your package or elsewhere) that things would stop > working with the proposed optimization. Ha, well, that's the sort of thing that gives monkey-patching a bad name, surely? Is there a use-case for globally replacing builtins for all modules, everywhere? I suppose that's what you're asking. The only example I can think of might be the use of mocks for testing purposes, but even there I'd prefer to inject the mock into the module I was testing: mymodule.len = mylen But I haven't done much work with mocks, so I'm just guessing. -- Steven From lukas.lueg at googlemail.com Tue Jan 4 23:13:19 2011 From: lukas.lueg at googlemail.com (Lukas Lueg) Date: Tue, 4 Jan 2011 23:13:19 +0100 Subject: [Python-Dev] Possible optimization for LOAD_FAST ? In-Reply-To: <4D239634.3000209@pearwood.info> References: <1294077146.26496.26.camel@radiator.bos.redhat.com> <4D22FB20.9010808@voidspace.org.uk> <4D239634.3000209@pearwood.info> Message-ID: I very much like the fact that python has *very* little black magic revealed to the user. Strong -1 on optimizing picked builtins in a picked way. 2011/1/4 Steven D'Aprano : > Guido van Rossum wrote: >> >> On Tue, Jan 4, 2011 at 2:49 AM, Michael Foord >> wrote: >>> >>> I think someone else pointed this out, but replacing builtins externally >>> to >>> a module is actually common for testing. In particular replacing the open >>> function, but also other builtins, is often done temporarily to replace >>> it >>> with a mock. It seems like this optimisation would break those tests. 
>> >> Hm, I already suggested to make an exception for open, (and one should >> be added for __import__) but if this is done for other builtins that >> is indeed a problem. Can you point to example code doing this? >> > > I've been known to monkey-patch builtins in the interactive interpreter and > in test code. One example that comes to mind is that I had some > over-complicated recursive while loop (!), and I wanted to work out the Big > Oh behaviour so I knew exactly how horrible it was. Working it out from > first principles was too hard, so I cheated: I knew each iteration called > len() exactly once, so I monkey-patched len() to count how many times it was > called. Problem solved. > > I also have a statistics package that has its own version of sum, and I rely > on calls to sum() from within the package picking up my version rather than > the builtin one. > > > -- > Steven > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > http://mail.python.org/mailman/options/python-dev/lukas.lueg%40gmail.com > From guido at python.org Wed Jan 5 00:29:37 2011 From: guido at python.org (Guido van Rossum) Date: Tue, 4 Jan 2011 15:29:37 -0800 Subject: [Python-Dev] Possible optimization for LOAD_FAST ? In-Reply-To: <4D23A99B.8040007@pearwood.info> References: <1294077146.26496.26.camel@radiator.bos.redhat.com> <4D22FB20.9010808@voidspace.org.uk> <4D239634.3000209@pearwood.info> <4D23A99B.8040007@pearwood.info> Message-ID: On Tue, Jan 4, 2011 at 3:13 PM, Steven D'Aprano wrote: > Guido van Rossum wrote: >> But why couldn't you edit the source code? > > Because there was no source code -- I was experimenting in the interactive > interpreter. I could have just re-created the function by using the readline > history, but it was just easier to redefine len. 
The interactive interpreter will always be excluded from this kind of optimization (well, in my proposal anyway). > Oh... it's just occurred to me that you were asking for use-cases for > assigning to builtins.len directly, rather than just to len. No, I've never > done that -- sorry for the noise. There are two versions of the "assign to global named 'len'" idiom. One is benign: if the assignment occurs in the source code of the module (i.e., where the compiler can see it when it is compiling the module) the optimization will be disabled in that module. The second is not: if a module-global named 'len' is set in a module from outside that module, the compiler cannot see that assignment when it considers the optimization, and it may generate optimized code that will not take the global by that name into account (it would use an opcode that computes the length of an object directly). The third way to mess with the optimization is messing with builtins.len. This one is also outside what the compiler can see. [...] > Ha, well, that's the sort of thing that gives monkey-patching a bad name, > surely? Monkey-patching intentionally has a bad name -- there's always a code smell. (And it looks like one, too. :-) > Is there a use-case for globally replacing builtins for all modules, > everywhere? I suppose that's what you're asking. I think the folks referring to monkey-patching builtins in unittests were referring to this. But they might also be referring to the second option above. > The only example I can think of might be the use of mocks for testing > purposes, but even there I'd prefer to inject the mock into the module I was > testing: > > mymodule.len = mylen > > But I haven't done much work with mocks, so I'm just guessing. Same here. But it would fail (i.e. not be picked up by the optimization) either way. 
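A minimal sketch of the three cases, showing how they behave in CPython today, i.e. without the proposed optimization. The helper load() and the module name "mymodule" are invented for the example. The comments on what an optimizing compiler could or could not see restate the description above and are not tested behaviour.

```python
import builtins
import types

def load(source, name="mymodule"):
    """Compile source into a fresh in-memory module (hypothetical helper)."""
    mod = types.ModuleType(name)
    exec(compile(source, name + ".py", "exec"), mod.__dict__)
    return mod

plain = load("def size(x):\n    return len(x)\n")
baseline = plain.size([1, 2, 3])            # ordinary builtin len

# Case 1: the assignment is in the module's own source, so a compiler
# could see it and leave the lookup alone.
shadowed = load("len = lambda obj: 0\ndef size(x):\n    return len(x)\n")
case1 = shadowed.size([1, 2, 3])

# Case 2: a module global named 'len' is set from outside the module,
# after compilation; the compiler never sees this assignment.
plain.len = lambda obj: 42
case2 = plain.size([1, 2, 3])
del plain.len

# Case 3: builtins.len itself is replaced; also invisible to the compiler.
saved = builtins.len
builtins.len = lambda obj: -1
try:
    case3 = plain.size([1, 2, 3])
finally:
    builtins.len = saved                    # always restore the real builtin

print(baseline, case1, case2, case3)        # prints "3 0 42 -1" today
```

Under the proposal as described, cases 2 and 3 are the ones where the compiled code might stop noticing the replacement.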
-- --Guido van Rossum (python.org/~guido) From guido at python.org Wed Jan 5 00:30:36 2011 From: guido at python.org (Guido van Rossum) Date: Tue, 4 Jan 2011 15:30:36 -0800 Subject: [Python-Dev] Possible optimization for LOAD_FAST ? In-Reply-To: References: <1294077146.26496.26.camel@radiator.bos.redhat.com> <4D22FB20.9010808@voidspace.org.uk> <4D239634.3000209@pearwood.info> Message-ID: On Tue, Jan 4, 2011 at 2:13 PM, Lukas Lueg wrote: > I very much like the fact that python has *very* little black magic > revealed to the user. Strong -1 on optimizing picked builtins in a > picked way. That's easy for you to say. -- --Guido van Rossum (python.org/~guido) From fuzzyman at voidspace.org.uk Wed Jan 5 00:32:57 2011 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Tue, 04 Jan 2011 23:32:57 +0000 Subject: [Python-Dev] Possible optimization for LOAD_FAST ? In-Reply-To: References: <1294077146.26496.26.camel@radiator.bos.redhat.com> <4D22FB20.9010808@voidspace.org.uk> <4D239634.3000209@pearwood.info> <4D23A99B.8040007@pearwood.info> Message-ID: <4D23AE29.3060406@voidspace.org.uk> On 04/01/2011 23:29, Guido van Rossum wrote: > On Tue, Jan 4, 2011 at 3:13 PM, Steven D'Aprano wrote: >> Guido van Rossum wrote: >>> But why couldn't you edit the source code? >> Because there was no source code -- I was experimenting in the interactive >> interpreter. I could have just re-created the function by using the readline >> history, but it was just easier to redefine len. > The interactive interpreter will always be excluded from this kind of > optimization (well, in my proposal anyway). > >> Oh... it's just occurred to me that you were asking for use-cases for >> assigning to builtins.len directly, rather than just to len. No, I've never >> done that -- sorry for the noise. > There are two versions of the "assign to global named 'len'" idiom. 
> > One is benign: if the assignment occurs in the source code of the > module (i.e., where the compiler can see it when it is compiling the > module) the optimization will be disabled in that module. > > The second is not: if a module-global named 'len' is set in a module > from outside that module, the compiler cannot see that assignment when > it considers the optimization, and it may generate optimized code that > will not take the global by that name into account (it would use an > opcode that computes the length of an object directly). > > The third way to mess with the optimization is messing with > builtins.len. This one is also outside what the compiler can see. > > [...] >> Ha, well, that's the sort of thing that gives monkey-patching a bad name, >> surely? > Monkey-patching intentionally has a bad name -- there's always a code > smell. (And it looks like one, too. :-) > >> Is there a use-case for globally replacing builtins for all modules, >> everywhere? I suppose that's what you're asking. > I think the folks referring to monkey-patching builtins in unittests > were referring to this. But they might also be referring to the second > option above. > I prefer monkey patching builtins (where I do such a thing) in the namespace where they are used. I know it is *common* to monkeypatch __builtins__.open (python 2) however. I don't recall monkey patching anything other than open and raw_input myself but I *bet* there are people doing it for reasons they see as legitimate in tests. :-) Michael >> The only example I can think of might be the use of mocks for testing >> purposes, but even there I'd prefer to inject the mock into the module I was >> testing: >> >> mymodule.len = mylen >> >> But I haven't done much work with mocks, so I'm just guessing. > Same here. But it would fail (i.e. not be picked up by the > optimization) either way. 
> -- http://www.voidspace.org.uk/ May you do good and not evil May you find forgiveness for yourself and forgive others May you share freely, never taking more than you give. -- the sqlite blessing http://www.sqlite.org/different.html From fuzzyman at voidspace.org.uk Wed Jan 5 00:36:32 2011 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Tue, 04 Jan 2011 23:36:32 +0000 Subject: [Python-Dev] Possible optimization for LOAD_FAST ? In-Reply-To: References: <1294077146.26496.26.camel@radiator.bos.redhat.com> <4D22FB20.9010808@voidspace.org.uk> <4D239634.3000209@pearwood.info> <4D23A99B.8040007@pearwood.info> Message-ID: <4D23AF00.1010808@voidspace.org.uk> On 04/01/2011 23:29, Guido van Rossum wrote: > On Tue, Jan 4, 2011 at 3:13 PM, Steven D'Aprano wrote: >> Guido van Rossum wrote: >>> But why couldn't you edit the source code? >> Because there was no source code -- I was experimenting in the interactive >> interpreter. I could have just re-created the function by using the readline >> history, but it was just easier to redefine len. > The interactive interpreter will always be excluded from this kind of > optimization (well, in my proposal anyway). > >> Oh... it's just occurred to me that you were asking for use-cases for >> assigning to builtins.len directly, rather than just to len. No, I've never >> done that -- sorry for the noise. > There are two versions of the "assign to global named 'len'" idiom. > > One is benign: if the assignment occurs in the source code of the > module (i.e., where the compiler can see it when it is compiling the > module) the optimization will be disabled in that module. > > The second is not: if a module-global named 'len' is set in a module > from outside that module, the compiler cannot see that assignment when > it considers the optimization, and it may generate optimized code that > will not take the global by that name into account (it would use an > opcode that computes the length of an object directly). 
> > The third way to mess with the optimization is messing with > builtins.len. This one is also outside what the compiler can see. > > [...] >> Ha, well, that's the sort of thing that gives monkey-patching a bad name, >> surely? > Monkey-patching intentionally has a bad name -- there's always a code > smell. (And it looks like one, too. :-) > >> Is there a use-case for globally replacing builtins for all modules, >> everywhere? I suppose that's what you're asking. > I think the folks referring to monkey-patching builtins in unittests > were referring to this. But they might also be referring to the second > option above. > The only examples I could find from a quick search (using the patch decorator from my mock module which is reasonably well used) patch __builtins__.open and raw_input. https://www.google.com/codesearch?hl=en&lr=&q=%22patch%28%27__builtin__.%22+lang%3Apython+case%3Ayes :-) Michael Foord >> The only example I can think of might be the use of mocks for testing >> purposes, but even there I'd prefer to inject the mock into the module I was >> testing: >> >> mymodule.len = mylen >> >> But I haven't done much work with mocks, so I'm just guessing. > Same here. But it would fail (i.e. not be picked up by the > optimization) either way. > -- http://www.voidspace.org.uk/ May you do good and not evil May you find forgiveness for yourself and forgive others May you share freely, never taking more than you give. -- the sqlite blessing http://www.sqlite.org/different.html From guido at python.org Wed Jan 5 00:39:49 2011 From: guido at python.org (Guido van Rossum) Date: Tue, 4 Jan 2011 15:39:49 -0800 Subject: [Python-Dev] Possible optimization for LOAD_FAST ? 
In-Reply-To: <4D23AF00.1010808@voidspace.org.uk> References: <1294077146.26496.26.camel@radiator.bos.redhat.com> <4D22FB20.9010808@voidspace.org.uk> <4D239634.3000209@pearwood.info> <4D23A99B.8040007@pearwood.info> <4D23AF00.1010808@voidspace.org.uk> Message-ID: On Tue, Jan 4, 2011 at 3:36 PM, Michael Foord wrote: > The only examples I could find from a quick search (using the patch > decorator from my mock module which is reasonably well used) patch > __builtins__.open and raw_input. > > https://www.google.com/codesearch?hl=en&lr=&q=%22patch%28%27__builtin__.%22+lang%3Apython+case%3Ayes So, that significantly weakens the argument that this optimization will break unit tests, since I am happy to promise never to optimize these builtins, and any other builtins intended for I/O. Surely it will break *somebody's* code. That hasn't stopped us with other changes. The crux is whether it breaks significant amounts of code or code that would be really hard to write in another way. -- --Guido van Rossum (python.org/~guido) From skip at pobox.com Wed Jan 5 01:10:35 2011 From: skip at pobox.com (skip at pobox.com) Date: Tue, 4 Jan 2011 18:10:35 -0600 Subject: [Python-Dev] Steroidal builtins (was: Possible optimization for LOAD_FAST ?) In-Reply-To: <4D239634.3000209@pearwood.info> References: <1294077146.26496.26.camel@radiator.bos.redhat.com> <4D22FB20.9010808@voidspace.org.uk> <4D239634.3000209@pearwood.info> Message-ID: <19747.46843.253222.506387@montanaro.dyndns.org> Steven> I've been known to monkey-patch builtins in the interactive Steven> interpreter and in test code. Me too. I use a slightly beefed up dir() function which identifies modules within a package which haven't been imported yet. Handy for quick-n-dirty introspection.
>>> import email
>>> dir(email)
['Charset', 'Encoders', 'Errors', 'FeedParser', 'Generator', 'Header', 'Iterators', 'LazyImporter', 'MIMEAudio', 'MIMEBase', 'MIMEImage', 'MIMEMessage', 'MIMEMultipart', 'MIMENonMultipart', 'MIMEText', 'Message', 'Parser', 'Utils', '[_parseaddr]', '[base64mime]', '[charset]', '[encoders]', '[errors]', '[feedparser]', '[generator]', '[header]', '[iterators]', '[message]', '[parser]', '[quoprimime]', '[test/]', '[utils]', '_LOWERNAMES', '_MIMENAMES', '__all__', '__builtins__', '__doc__', '__file__', '__name__', '__package__', '__path__', '__version__', '_name', 'base64MIME', 'email', 'importer', 'message_from_file', 'message_from_string', 'mime', 'quopriMIME', 'sys']
The names bracketed with [...] are submodules or subpackages of the email package which haven't been imported yet. Skip From brett at python.org Wed Jan 5 01:36:42 2011 From: brett at python.org (Brett Cannon) Date: Tue, 4 Jan 2011 16:36:42 -0800 Subject: [Python-Dev] Started my PSF core grant today Message-ID: For those of you who don't know, the PSF has given me a two month grant to work on the core. It's mostly focused on the long overdue overhaul of the dev docs (now being called the devguide) and writing a HOWTO on porting Python 2 code to Python 3. If I have time left over it will be spent on the test suite. I have a blog post with links to my original grant proposal at http://sayspy.blogspot.com/2011/01/psf-core-grant-day-1.html . From raymond.hettinger at gmail.com Wed Jan 5 02:44:48 2011 From: raymond.hettinger at gmail.com (Raymond Hettinger) Date: Tue, 4 Jan 2011 17:44:48 -0800 Subject: [Python-Dev] Started my PSF core grant today In-Reply-To: References: Message-ID: On Jan 4, 2011, at 4:36 PM, Brett Cannon wrote: > For those of you who don't know, the PSF has given me a two month > grant to work on the core.
It's mostly focused on the long overdue > overhaul of the dev docs (now being called the devguide) and writing a > HOWTO on porting Python 2 code to Python 3. If I have time left over > it will be spent on the test suite. Woohoo. Nice to have you working on these tasks. Raymond From ncoghlan at gmail.com Wed Jan 5 03:13:31 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 5 Jan 2011 12:13:31 +1000 Subject: [Python-Dev] [Python-checkins] devguide: Strip out all generic svn instructions from the FAQ. It's not only In-Reply-To: References: Message-ID: On Wed, Jan 5, 2011 at 5:55 AM, brett.cannon wrote: > brett.cannon pushed 72a286c3452d to devguide: > > http://hg.python.org/devguide/rev/72a286c3452d > changeset: 13:72a286c3452d > user: Brett Cannon > date: Tue Jan 04 11:48:38 2011 -0800 > summary: > Strip out all generic svn instructions from the FAQ. It's not only > silly to duplicate instructions that can be found all over the > internet that are maintained by the creators of the tools under > discussion, but it's a maintenance burden that is unneeded. Your call as the author, but please reconsider this one. I've found it *hugely* convenient over the years to have these task oriented answers in the FAQ. The problem with the answers all over the internet is that I (or someone new to our source control tool) may not know enough to ask the right question, and hence those answers may as well not exist. Even if these FAQ answers don't always provide everything needed, they usually provide enough information to let me search for the full answers. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From eliben at gmail.com Wed Jan 5 07:18:20 2011 From: eliben at gmail.com (Eli Bendersky) Date: Wed, 5 Jan 2011 08:18:20 +0200 Subject: [Python-Dev] [Python-checkins] devguide: Strip out all generic svn instructions from the FAQ.
It's not only In-Reply-To: References: Message-ID: On Wed, Jan 5, 2011 at 04:13, Nick Coghlan wrote: > On Wed, Jan 5, 2011 at 5:55 AM, brett.cannon > wrote: > > brett.cannon pushed 72a286c3452d to devguide: > > > > http://hg.python.org/devguide/rev/72a286c3452d > > changeset: 13:72a286c3452d > > user: Brett Cannon > > date: Tue Jan 04 11:48:38 2011 -0800 > > summary: > > Strip out all generic svn instructions from the FAQ. It's not only > > silly to duplicate instructions that can be found all over the > > internet that are maintained by the creators of the tools under > > discussion, but it's a maintenance burden that is unneeded. > > Your call as the author, but please reconsider this one. I've found it > *hugely* convenient over the years to have these task oriented answers > in the FAQ. The problem with the answers all over the internet is that > I (or someone new to our source control tool) may not know enough to > ask the right question, and hence those answers may as well not exist. > Even if these FAQ answers don't always provide everything needed, they > usually provide enough information to let me search for the full > answers. > > I agree with Nick here. I also found these instructions useful in the past, although I'm quite familiar with SVN. New devs interested in contributing to Python but not too familiar with the source-control tool it's using at the time will benefit even more from this. As for maintenance nightmare, I'm sure it's simple enough to attract contributors. For example, I can volunteer to maintain it. Eli
From phil.le.bienheureux at gmail.com Wed Jan 5 08:16:11 2011 From: phil.le.bienheureux at gmail.com (Phil Le Bienheureux) Date: Wed, 5 Jan 2011 08:16:11 +0100 Subject: [Python-Dev] [issue8033] sqlite: broken long integer handling for arguments to user-defined functions Message-ID: Hello, I am quite new to development in Python, and as a first contribution to the community, I have provided a patch for issue 8033 (quite trivial). I then ran the test suite and everything was ok. However, the status has not changed, and nobody has answered so far (patch provided one month ago). So my question: have I missed something in the procedure that I read carefully, to deliver a patch, or something else? Thank you for your help, and for taking care of Python. Philippe From hsoft at hardcoded.net Wed Jan 5 09:25:13 2011 From: hsoft at hardcoded.net (Virgil Dupras) Date: Wed, 5 Jan 2011 09:25:13 +0100 Subject: [Python-Dev] [issue8033] sqlite: broken long integer handling for arguments to user-defined functions In-Reply-To: References: Message-ID: <87470E62-5F47-4637-B3DC-991623DCE990@hardcoded.net> On 2011-01-05, at 8:16 AM, Phil Le Bienheureux wrote: > Hello, > > I am quite new to development in Python, and as a first contribution to the community, I have provided a patch for issue 8033 (quite trivial). I then ran the test suite and everything was ok. However, the status has not changed, and nobody has answered so far (patch provided one month ago). So my question: have I missed something in the procedure that I read carefully, to deliver a patch, or something else? > > I'm not a core developer, but there are two reasons I can think of: 1. Your diff doesn't include tests. 2. Core developers are busy; these things take time. I don't think any bugfix gets checked in without a regression test to go with it.
A core developer coming by your issue could maybe do it himself, but since he's likely very busy, he won't have time for this. So your best bet for this fix to be checked in is to add a test, but even then, sometimes, patches fade into oblivion and you might have to regularly "freshen" your diff to match with the trunk so it applies cleanly. Virgil Dupras From tjreedy at udel.edu Wed Jan 5 09:56:10 2011 From: tjreedy at udel.edu (Terry Reedy) Date: Wed, 05 Jan 2011 03:56:10 -0500 Subject: [Python-Dev] Possible optimization for LOAD_FAST ? In-Reply-To: References: <1294077146.26496.26.camel@radiator.bos.redhat.com> <4D22FB20.9010808@voidspace.org.uk> <4D239634.3000209@pearwood.info> <4D23A99B.8040007@pearwood.info> <4D23AF00.1010808@voidspace.org.uk> Message-ID: On 1/4/2011 6:39 PM, Guido van Rossum wrote: > So, that significantly weakens the argument that this optimization > will break unit tests, since I am happy to promise never to optimize > these builtins, and any other builtins intended for I/O. This is one comprehensible rule rather than a list of exceptions, so it is easier to remember. It has two rationales: such builtins often need to be overridden for testing, possibly in hidden ways; and they are inherently 'slow', so optimizing the dict lookup away hardly makes sense. -- Terry Jan Reedy From tjreedy at udel.edu Wed Jan 5 10:08:44 2011 From: tjreedy at udel.edu (Terry Reedy) Date: Wed, 05 Jan 2011 04:08:44 -0500 Subject: [Python-Dev] [Python-checkins] devguide: Strip out all generic svn instructions from the FAQ. It's not only In-Reply-To: References: Message-ID: On 1/5/2011 1:18 AM, Eli Bendersky wrote: > On Wed, Jan 5, 2011 at 04:13, Nick Coghlan wrote: > Your call as the author, but please reconsider this one. I've found it > *hugely* convenient over the years to have these task oriented answers > in the FAQ.
The problem with the answers all over the internet is that > I (or someone new to our source control tool) may not know enough to > ask the right question, and hence those answers may as well not exist. > Even if these FAQ answers don't always provide everything needed, they > usually provide enough information to let me search for the full > answers. > > > I agree with Nick here. I also found these instructions useful in the > past, although I'm quite familiar with SVN. New devs interested in > contributing to Python but not too familiar with the source-control tool > it's using at the time will benefit even more from this. > > As for maintenance nightmare, I'm sure it's simple enough to attract > contributors. For example, I can volunteer to maintain it. As a complete neophyte at actually using a source code system, I found the stripped-down step-by-step instructions useful even though I am using TortoiseSVN. Even the TortoiseSVN help doc is a bit overwhelming because it includes so much that I do not need to read. It would be a bit like a beginning programmer trying to learn Python from the Language Reference without having the Tutorial to read. (And even as an experienced C programmer, I started with the latter.) -- Terry Jan Reedy From swamiyeswanth at hotmail.com Wed Jan 5 12:48:00 2011 From: swamiyeswanth at hotmail.com (yeswanth) Date: Wed, 5 Jan 2011 17:18:00 +0530 Subject: [Python-Dev] Hello everyone Message-ID: Hello everyone, My name is Yeswanth. I am doing my third year BTech in Computer Science in India. My desire is to get into GSoC 2011. I have been looking over the projects of last year to see where I would fit in. And I found Python interesting, something I can contribute to. I don't know if the Python Software Foundation will apply for GSoC this year; I just hope it will. So here I am, planning to make an entry in contributing to this open source project. I can program in Python, but have never contributed anything to it, at least as of now.
So I have read the development links provided on the python.org site. Can anyone suggest some areas where I can actually start with developing for this proje From ncoghlan at gmail.com Wed Jan 5 13:31:11 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 5 Jan 2011 22:31:11 +1000 Subject: [Python-Dev] Omit Py_buffer struct from Stable ABI for Python 3.2? Message-ID: Currently [1], the implementation and the documentation for PEP 3118's Py_buffer struct don't line up (there's an extra field in the implementation that the PEP doesn't mention). Accordingly, Mark and I think it may be a good idea to leave this structure (and possibly related APIs) out of the stable ABI for the 3.2 release. I don't *think* it needs changing, but I'm not 100% certain until we finish working through the problem and realign the implementation and documentation. Applications and extension modules that use this interface would still work - they would just have to wait until 3.3 before they could consider migrating to the stable ABI. Regards, Nick. [1] http://bugs.python.org/issue10181 -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From dickinsm at gmail.com Wed Jan 5 13:55:55 2011 From: dickinsm at gmail.com (Mark Dickinson) Date: Wed, 5 Jan 2011 12:55:55 +0000 Subject: [Python-Dev] Omit Py_buffer struct from Stable ABI for Python 3.2? In-Reply-To: References: Message-ID: On Wed, Jan 5, 2011 at 12:31 PM, Nick Coghlan wrote: > Currently [1], the implementation and the documentation for PEP 3118's > Py_buffer struct don't line up (there's an extra field in the > implementation that the PEP doesn't mention). I think there are actually two such fields: smalltable and obj. The need for obj is a little ugly: as far as I can tell, it's meaningless for a 3rd-party object that wants to export buffers---it's only really used by the memoryview object and by internal Python types.
Mark From solipsis at pitrou.net Wed Jan 5 14:13:23 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 5 Jan 2011 14:13:23 +0100 Subject: [Python-Dev] Omit Py_buffer struct from Stable ABI for Python 3.2? References: Message-ID: <20110105141323.5eaed4f9@pitrou.net> On Wed, 5 Jan 2011 12:55:55 +0000 Mark Dickinson wrote: > On Wed, Jan 5, 2011 at 12:31 PM, Nick Coghlan wrote: > > Currently [1], the implementation and the documentation for PEP 3118's > > Py_buffer struct don't line up (there's an extra field in the > > implementation that the PEP doesn't mention). > > I think there are actually two such fields: smalltable and obj. > > The need for obj is a little ugly: as far as I can tell, it's > meaningless for a 3rd-party object that wants to export buffers---it's > only really used by the memoryview object and by internal Python > types. I don't think it's ugly. It's the only way to know which object exported a Py_buffer. Otherwise you have to track the information separately, which is quite a bit uglier (especially when in conjunction with PyArg_ParseTuple and friends). Regards Antoine. From fijall at gmail.com Wed Jan 5 14:27:31 2011 From: fijall at gmail.com (Maciej Fijalkowski) Date: Wed, 5 Jan 2011 15:27:31 +0200 Subject: [Python-Dev] Possible optimization for LOAD_FAST ? In-Reply-To: References: <1294077146.26496.26.camel@radiator.bos.redhat.com> <4D22FB20.9010808@voidspace.org.uk> <4D239634.3000209@pearwood.info> <4D23A99B.8040007@pearwood.info> <4D23AF00.1010808@voidspace.org.uk> Message-ID: How about not changing semantics and still making this optimization possible? PyPy already has CALL_LIKELY_BUILTIN which checks whether builtins has been altered (by keeping a flag on the module dictionary) and if not, loads a specific builtin on top of value stack. From my current experience, I would make a bet that someone is altering pretty much every builtin for some dark reasons. 
One that comes to mind is to test something using an external library which is not playing along too well. That, however, only lets one avoid dictionary lookups; it doesn't give potential for other optimizations (which in my opinion are limited until we hit something dynamic like an instance, but let's ignore it). How about creating two copies of the bytecode (that's not an arbitrary number, just 2) and a way to go from the more optimized to the less optimized one in case *any* of the promises is invalidated? That gives the ability to preserve semantics while allowing optimizations. That said, I think CPython should stay as a simple VM and refrain from doing things that are much easier in the presence of a JIT (and give a lot more speedups), but who am I to judge. Cheers, fijal From dickinsm at gmail.com Wed Jan 5 15:04:16 2011 From: dickinsm at gmail.com (Mark Dickinson) Date: Wed, 5 Jan 2011 14:04:16 +0000 Subject: [Python-Dev] Omit Py_buffer struct from Stable ABI for Python 3.2? In-Reply-To: References: <20110105141323.5eaed4f9@pitrou.net> Message-ID: On Wed, Jan 5, 2011 at 2:03 PM, Mark Dickinson wrote: > Maybe I'm misunderstanding. What's the responsibility of a buffer > export w.r.t. the obj field---i.e., what should 3rd party code be Grr. *buffer exporter*, not *buffer export*. Mark From dickinsm at gmail.com Wed Jan 5 15:03:41 2011 From: dickinsm at gmail.com (Mark Dickinson) Date: Wed, 5 Jan 2011 14:03:41 +0000 Subject: [Python-Dev] Omit Py_buffer struct from Stable ABI for Python 3.2? In-Reply-To: <20110105141323.5eaed4f9@pitrou.net> References: <20110105141323.5eaed4f9@pitrou.net> Message-ID: On Wed, Jan 5, 2011 at 1:13 PM, Antoine Pitrou wrote: > On Wed, 5 Jan 2011 12:55:55 +0000 > Mark Dickinson wrote: >> The need for obj is a little ugly: as far as I can tell, it's >> meaningless for a 3rd-party object that wants to export buffers---it's >> only really used by the memoryview object and by internal Python >> types. > > I don't think it's ugly.
It's the only way to know which object exported > a Py_buffer. Otherwise you have to track the information separately, > which is quite a bit uglier (especially when in conjunction with > PyArg_ParseTuple and friends). Maybe I'm misunderstanding. What's the responsibility of a buffer export w.r.t. the obj field---i.e., what should 3rd party code be filling that obj field with in a call to getbuffer? It looks to me as though it's really the memoryview object that needs this information; that it doesn't belong in the Py_buffer struct. Isn't that what the 'base' field in PyMemoryViewObject in PEP 3118 was supposed to be for? Though I notice that that field is unused in the actual PyMemoryViewObject in Include/memoryobject.h. Mark From solipsis at pitrou.net Wed Jan 5 15:24:29 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 5 Jan 2011 15:24:29 +0100 Subject: [Python-Dev] Omit Py_buffer struct from Stable ABI for Python 3.2? In-Reply-To: References: <20110105141323.5eaed4f9@pitrou.net> Message-ID: <20110105152429.170d4196@pitrou.net> On Wed, 5 Jan 2011 14:03:41 +0000 Mark Dickinson wrote: > On Wed, Jan 5, 2011 at 1:13 PM, Antoine Pitrou wrote: > > On Wed, 5 Jan 2011 12:55:55 +0000 > > Mark Dickinson wrote: > >> The need for obj is a little ugly: as far as I can tell, it's > >> meaningless for a 3rd-party object that wants to export buffers---it's > >> only really used by the memoryview object and by internal Python > >> types. > > > > I don't think it's ugly. It's the only way to know which object exported > > a Py_buffer. Otherwise you have to track the information separately, > > which is quite a bit uglier (especially when in conjunction with > > PyArg_ParseTuple and friends). > > Maybe I'm misunderstanding. What's the responsibility of a buffer > export w.r.t. the obj field---i.e., what should 3rd party code be > filling that obj field with in a call to getbuffer? It would let PyBuffer_FillInfo() do the job.
If it doesn't want to, it must put itself in that field, and increment its reference count. > It looks to me as though it's really the memoryview object that needs > this information; No, anyone wanting to release a buffer without keeping separate track of the original object needs it. As I said, this also applies to users of PyArg_ParseTuple and friends ("s*", "y*" etc. format codes). > Isn't that what the 'base' field in PyMemoryViewObject in PEP 3118 was > supposed to be for? Perhaps, but practice (implementing "s*" etc.) suggested it was useful in other cases. That field ('base') is removed in 3.2. Regards Antoine. From ncoghlan at gmail.com Wed Jan 5 17:07:37 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 6 Jan 2011 02:07:37 +1000 Subject: [Python-Dev] Omit Py_buffer struct from Stable ABI for Python 3.2? In-Reply-To: References: <20110105141323.5eaed4f9@pitrou.net> Message-ID: On Thu, Jan 6, 2011 at 12:03 AM, Mark Dickinson wrote: > Maybe I'm misunderstanding. What's the responsibility of a buffer > export w.r.t. the obj field---i.e., what should 3rd party code be > filling that obj field with in a call to getbuffer? It should be a pointer to the object (with the reference count incremented appropriately). GetBuffer/ReleaseBuffer should actually manage it automatically, but I'd have to look at the code to make sure that is the case (and, if it isn't, there may be backwards compatibility implications in fixing it). > It looks to me as though it's really the memoryview object that needs > this information; that it doesn't belong in the Py_buffer struct. > Isn't that what the 'base' field in PyMemoryViewObject in PEP 3118 was > supposed to be for? Though I notice that that field is unused in the > actual PyMemoryViewObject in Include/memoryobject.h.
If nothing else, PyObject_ReleaseBuffer needs it - otherwise the function signature would need to include a separate argument to tell it who the buffer belongs to (so it can find the appropriate function pointer to call). The implementation makes sense (since every call to GetBuffer needs to be paired with a corresponding call to ReleaseBuffer, it makes sense to keep the object reference inside the Py_buffer struct), but the fact the documentation was never corrected suggests there are going to be plenty of broken implementations of the protocol kicking around, potentially even in the standard library. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From alexander.belopolsky at gmail.com Wed Jan 5 18:33:55 2011 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Wed, 5 Jan 2011 12:33:55 -0500 Subject: [Python-Dev] Checking input range in time.asctime and time.ctime In-Reply-To: References: Message-ID: On Mon, Jan 3, 2011 at 7:47 PM, Guido van Rossum wrote: > Given the rule garbage in -> garbage out, I'd do the most useful > thing, which would be to produce a longer output string (and update > the docs). This would match the behavior of e.g. '%04d' % y when y > > 9999. If that means the platform libc asctime/ctime can't be used, too > bad. I've committed code that does not use platform libc asctime/ctime anymore. Now it seems odd that we support years > 9999 but not years < 1900. A commonly given explanation for rejecting years < 1900 is that Python has to support POSIX standard for 2-digit years. However, this support is conditional on the value of time.accept2dyear and several people argued that when it is set to false, full range of years should be supported. Furthermore, in order to support 2-digit years, there is no need to reject years < 1900.
It may be confusing to map 99 to 1999 while accepting 100 as is, but I don't see much of the problem in accepting 4-digit years from 1000 through 1899 while mapping [0 - 99] to present times according to POSIX standard. See http://bugs.python.org/issue10827 for more. From solipsis at pitrou.net Wed Jan 5 18:48:55 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 5 Jan 2011 18:48:55 +0100 Subject: [Python-Dev] Checking input range in time.asctime and time.ctime References: Message-ID: <20110105184855.6b06c9ae@pitrou.net> On Wed, 5 Jan 2011 12:33:55 -0500 Alexander Belopolsky wrote: > On Mon, Jan 3, 2011 at 7:47 PM, Guido van Rossum wrote: > > Given the rule garbage in -> garbage out, I'd do the most useful > > thing, which would be to produce a longer output string (and update > > the docs). This would match the behavior of e.g. '%04d' % y when y > > > 9999. If that means the platform libc asctime/ctime can't be used, too > > bad. > > I've committed code that does not use platform libc asctime/ctime > anymore. Now it seems odd that we support years > 9999 but not years > < 1900. A commonly given explanation for rejecting years < 1900 is > that Python has to support POSIX standard for 2-digit years. However, > this support is conditional on the value of time.accept2dyear and > several people argued that when it is set to false, full range of > years should be supported. Couldn't we deprecate and remove time.accept2dyear? It has been there for "backward compatibility" since Python 1.5.2. Not to mention that global settings affecting behaviour are generally bad, since multiple libraries could have conflicting expectations about it. And parsing times and dates is the kind of thing that a library will often rely on. Regards Antoine. 
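The 2-digit year handling debated in this thread can be sketched in a few lines. This is only an illustration, assuming the POSIX convention that 69-99 mean 1969-1999 and 0-68 mean 2000-2068; the function name expand_year and its accept2dyear parameter are hypothetical and are not the time module's API.

```python
def expand_year(year, accept2dyear=True):
    """Return a full year, mapping 0-99 per POSIX when accept2dyear is true."""
    if accept2dyear and 0 <= year <= 99:
        # POSIX convention: 69-99 -> 1969-1999, 0-68 -> 2000-2068.
        return year + (1900 if year >= 69 else 2000)
    # Everything else, including 1000 through 1899, is accepted as is.
    return year


for y in (0, 68, 69, 99, 100, 1000, 1899):
    print(y, "->", expand_year(y))
```

Note that nothing in the mapping requires rejecting years below 1900: only the range 0-99 is reinterpreted, and 100 is passed through unchanged, which is the possibly confusing edge mentioned above.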
From alexander.belopolsky at gmail.com Wed Jan 5 19:12:38 2011 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Wed, 5 Jan 2011 13:12:38 -0500 Subject: [Python-Dev] Checking input range in time.asctime and time.ctime In-Reply-To: <20110105184855.6b06c9ae@pitrou.net> References: <20110105184855.6b06c9ae@pitrou.net> Message-ID: On Wed, Jan 5, 2011 at 12:48 PM, Antoine Pitrou wrote: .. > Couldn't we deprecate and remove time.accept2dyear? It has been there > for "backward compatibility" since Python 1.5.2. > It will be useful for another 50 years or so. (POSIX 2-digit years cover 1969 - 2068.) In any case, this is not an option for 3.2 while extending accepted range is a borderline case IMO. > Not to mention that global settings affecting behaviour are generally > bad, since multiple libraries could have conflicting expectations about > it. And parsing times and dates is the kind of thing that a library > will often rely on. Yes, for 3.3 I am going to propose an optional accept2dyear argument to time.{asctime, strftime} in addition to or instead of a global variable. This is also necessary to implement a pure python version of datetime.strftime that would support full range of datetime. See http://bugs.python.org/issue1777412 . From stefan_ml at behnel.de Wed Jan 5 19:21:15 2011 From: stefan_ml at behnel.de (Stefan Behnel) Date: Wed, 05 Jan 2011 19:21:15 +0100 Subject: [Python-Dev] Omit Py_buffer struct from Stable ABI for Python 3.2? In-Reply-To: References: Message-ID: Mark Dickinson, 05.01.2011 13:55: > On Wed, Jan 5, 2011 at 12:31 PM, Nick Coghlan wrote: >> Currently [1], the implementation and the documentation for PEP 3118's >> Py_buffer struct don't line up (there's an extra field in the >> implementation that the PEP doesn't mention). > > I think there are actually two such fields: smalltable and obj. 
> > The need for obj is a little ugly: as far as I can tell, it's > meaningless for a 3rd-party object that wants to export buffers---it's > only really used by the memoryview object and by internal Python > types. Not at all. It's the reason why some of the buffer API functions could be changed to a simpler signature after earlier versions of the PEP had been written. Stefan From guido at python.org Wed Jan 5 19:35:58 2011 From: guido at python.org (Guido van Rossum) Date: Wed, 5 Jan 2011 10:35:58 -0800 Subject: [Python-Dev] Possible optimization for LOAD_FAST ? In-Reply-To: References: <1294077146.26496.26.camel@radiator.bos.redhat.com> <4D22FB20.9010808@voidspace.org.uk> <4D239634.3000209@pearwood.info> <4D23A99B.8040007@pearwood.info> <4D23AF00.1010808@voidspace.org.uk> Message-ID: On Wed, Jan 5, 2011 at 5:27 AM, Maciej Fijalkowski wrote: > How about not changing semantics and still making this optimization possible? > > PyPy already has CALL_LIKELY_BUILTIN which checks whether builtins has > been altered (by keeping a flag on the module dictionary) and if not, > loads a specific builtin on top of value stack. I can only repeat what I said before. That's what everybody proposes, and if you have the infrastructure, it's a fine solution. But to me, those semantics aren't sacred, and I want to at least explore an alternative. Putting a hook on two dicts (the module globals and builtins.__dict__) is a lot of work in CPython, and has the risk of slowing everything down (just a tad, but still -- AFAIK dicts currently are not hookable). Checking whether there's a global named 'len' is much simpler in the current CPython compiler. -- --Guido van Rossum (python.org/~guido) From brett at python.org Wed Jan 5 19:37:28 2011 From: brett at python.org (Brett Cannon) Date: Wed, 5 Jan 2011 10:37:28 -0800 Subject: [Python-Dev] [Python-checkins] devguide: Strip out all generic svn instructions from the FAQ. 
It's not only In-Reply-To: References: Message-ID: To those that want to keep those steps in the dev FAQ, go ahead but I recuse myself from maintaining it. Having had so many instances of people asking "how do I do this?" and me almost always able to go "read the dev FAQ" has basically made me feel like it is not worth the effort if people are not going to bother to check it and just simply ask how to do things. The copy of the dev FAQ on the website has not been touched, so me cutting this stuff out so I know what has and has not been covered has no permanent impact. Plus having the devguide on hg.python.org and not the website means anyone with commit rights can modify the devguide, including adding/maintaining a dev FAQ on common VCS/SSH/whatever tools. On Wed, Jan 5, 2011 at 01:08, Terry Reedy wrote: > On 1/5/2011 1:18 AM, Eli Bendersky wrote: >> >> On Wed, Jan 5, 2011 at 04:13, Nick Coghlan >> ? ?Your call as the author, but please reconsider this one. I've found it >> ? ?*hugely* convenient over the years to have these task oriented answers >> ? ?in the FAQ. The problem with the answers all over the internet is that >> ? ?I (or someone new to our source control tool) may not know enough to >> ? ?ask the right question, and hence those answers may as well not exist. >> ? ?Even if these FAQ answers don't always provide everything needed, they >> ? ?usually provide enough information to let me search for the full >> ? ?answers. >> >> >> I agree with Nick here. I also found these instructions useful in the >> past, although I'm quite familiar with SVN. New devs interested in >> contributing to Python but not too familiar with the source-control tool >> it's using at the time will benefit even more from this. >> >> As for maintenance nightmare, I'm sure it's simple enough to attract >> contributors. For example, I can volunteer to maintain it. 
> > As a complete neophyte at actually using a source code system, I found the > stripped-down step-by-step instructions useful even though I am using > TortoiseSVN. Even the TortoiseSVN help doc is a bit overwhelming because it > includes so much that I do not need to read. It would be a bit like a > beginning programmer trying to learn Python from the Langauge Reference > without having the Tutorial to read. (And even as an experienced C > programmer, I started with the latter.) > > -- > Terry Jan Reedy > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > http://mail.python.org/mailman/options/python-dev/brett%40python.org > From rdmurray at bitdance.com Wed Jan 5 19:31:34 2011 From: rdmurray at bitdance.com (R. David Murray) Date: Wed, 05 Jan 2011 13:31:34 -0500 Subject: [Python-Dev] Hello everyone In-Reply-To: References: Message-ID: <20110105183134.E641623BFAA@kimball.webabinitio.net> On Wed, 05 Jan 2011 17:18:00 +0530, yeswanth wrote: > My name is Yeswanth . I am doing my third year Btech in Computer Science [...] > Can anyone suggest me some areas where I can actually start with > developing for this proje Welcome, Yeswanth. Great idea to get involved early :) I'm guessing the PSF will apply to GSoC in 2011, but I'm not involved in that decision so I don't really know anything. The best way to start out helping is to do what you've done, read the developer docs (which Brett Cannon is currently updating, by the way). Next you could take a look at the bug tracker at bugs.python.org. There are plenty of open issues there that need to be reviewed (anyone can do reviews). Try out patches for issues that have existing patches, note anything missing (tests, doc updates, etc) (supply them if you like), and report your experiences with testing the patch, and any comments you may have on it. 
When you feel ready to try your hand at writing patches, click on the 'easy issues' button on the left. That tag is assigned to bugs where the reviewer thought the patch could be written in a day or less of work (of course, if you are still relatively new to Python coding it may take longer to do the necessary research to be able to write the patch). If you like you can also come hang out on the #python-dev IRC channel on freenode, where a number of the core developers and other folks hang out and discuss issues (among other things :) -- R. David Murray www.bitdance.com From brian.curtin at gmail.com Wed Jan 5 20:11:52 2011 From: brian.curtin at gmail.com (Brian Curtin) Date: Wed, 5 Jan 2011 13:11:52 -0600 Subject: [Python-Dev] Hello everyone In-Reply-To: References: Message-ID: On Wed, Jan 5, 2011 at 05:48, yeswanth wrote: > Hello everyone, > My name is Yeswanth . I am doing my third year Btech in Computer Science in > India. My desire is to get into gsoc 2011 . I have been looking over the > projects of last year to see where I would fit in. And I found python to be > interesting, something I can contribute. I dont know if Python Software > Foundation will apply for Gsoc this year, I just hope it will . So here I am > , planning to make an entry in contributing to this open source project. I > can just program in Python ,never contributed anything to it , atleast of > now. So I have read the development links provided in the python.org site. > > Can anyone suggest me some areas where I can actually start with developing > for this proje Yeswanth, http://docs.pythonsprints.com/core_development/beginners.html might be helpful for getting started, and to supplement David's suggestions. It was written for users like yourself to go from zero to successful contribution as quick as possible. Brian -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From martin at v.loewis.de Wed Jan 5 20:18:46 2011 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 05 Jan 2011 20:18:46 +0100 Subject: [Python-Dev] Hello everyone In-Reply-To: References: Message-ID: <4D24C416.9000306@v.loewis.de> Am 05.01.2011 12:48, schrieb yeswanth: > Hello everyone, > My name is Yeswanth . I am doing my third year Btech in Computer Science > in India. My desire is to get into gsoc 2011 . I have been looking over > the projects of last year to see where I would fit in. And I found > python to be interesting, something I can contribute. I dont know if > Python Software Foundation will apply for Gsoc this year, I just hope it > will . So here I am , planning to make an entry in contributing to this > open source project. I can just program in Python ,never contributed > anything to it , atleast of now. So I have read the development links > provided in the python.org site. > > Can anyone suggest me some areas where I can actually start with > developing for this proje PSF GSoC applicants will be asked to submit a patch to the Python(ic) project they are going to contribute to, as a proof that they actually know how to write code. So I suggest you browse through the bug tracker, find an open issue with no patch, and write a patch. You may want to focus on issues marked as "easy". Regards, Martin From guido at python.org Wed Jan 5 20:19:34 2011 From: guido at python.org (Guido van Rossum) Date: Wed, 5 Jan 2011 11:19:34 -0800 Subject: [Python-Dev] Checking input range in time.asctime and time.ctime In-Reply-To: References: <20110105184855.6b06c9ae@pitrou.net> Message-ID: On Wed, Jan 5, 2011 at 10:12 AM, Alexander Belopolsky wrote: > On Wed, Jan 5, 2011 at 12:48 PM, Antoine Pitrou wrote: > .. >> Couldn't we deprecate and remove time.accept2dyear? It has been there >> for "backward compatibility" since Python 1.5.2. >> > > It will be useful for another 50 years or so. ?(POSIX 2-digit years > cover 1969 - 2068.) 
In any case, this is not an option for 3.2 while > extending accepted range is a borderline case IMO. I like accepting all years >= 1 when accept2dyear is False. In 3.3 we should switch its default value to False (in addition to the keyword arg you are proposing below, maybe). Maybe we can add a deprecation warning in 3.2 when a 2d year is actually received? The posix standard notwithstanding they should be rare, and it would be better to make this the app's responsibility if we could. >> Not to mention that global settings affecting behaviour are generally >> bad, since multiple libraries could have conflicting expectations about >> it. And parsing times and dates is the kind of thing that a library >> will often rely on. > > Yes, for 3.3 I am going to propose an optional accept2dyear argument > to time.{asctime, strftime} in addition to or instead of a global > variable. This is also necessary to implement a pure python version > of datetime.strftime that would support full range of datetime. See > http://bugs.python.org/issue1777412 . I wish we didn't have to do that -- isn't it easy enough for the app to do the 2d -> 4d conversion itself before calling the library function? The only exception would be when parsing a string -- but strptime can tell whether a 2d or 4d year is requested by the format code (%y or %Y). -- --Guido van Rossum (python.org/~guido) From alexander.belopolsky at gmail.com Wed Jan 5 21:58:07 2011 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Wed, 5 Jan 2011 15:58:07 -0500 Subject: [Python-Dev] Checking input range in time.asctime and time.ctime In-Reply-To: References: <20110105184855.6b06c9ae@pitrou.net> Message-ID: On Wed, Jan 5, 2011 at 2:19 PM, Guido van Rossum wrote: .. >> extending accepted range is a borderline case IMO. > > I like accepting all years >= 1 when accept2dyear is False. > Why >= 1? Shouldn't it be >= 1900 - maxint? Also, what is your take on always accepting [1000 - 1899]?
Now, to play the devil's advocate a little, with the new logic accept2dyear would actually mean "map 2-digit year" because 2-digit years will be accepted when accept2dyear is False, just not mapped to reasonable range. I don't have much of a problem with having a deprecated setting that does not have the meaning that its name suggests. (At the moment accept2dyear = True is actually treated as accept2dyear = 0!) I am mentioning this because I think the logic should be if accept2dyear: if 0 <= y < 69: y += 2000 elif 69 <= y < 100: y += 1900 elif 100 <= y < 1000: raise ValueError("3-digit year in map 2-digit year mode") and even the last elif may not be necessary. > In 3.3 we should switch its default value to False (in addition to the > keyword arg you are proposing below, maybe). > Note that time.accept2dyear is controlled by PYTHONY2K environment variable. If we switch the default, we may need to add a variable with the opposite meaning. > Maybe we can add a deprecation warning in 3.2 when a 2d year is > actually received? +1, but only when with accept2dyear = 1. When accept2dyear = 0, any year should just pass through and this should eventually become the only behavior. > The posix standard notwithstanding they should be > rare, and it would be better to make this the app's responsibility if > we could. > .. > I wish we didn't have to do that -- isn't it easy enough for the app > to do the 2d -> 4d conversion itself before calling the library > function? Note that this is already done at least in two places in stdlib: in email package parsedate_tz and in _strptime.py. Given that the POSIX convention is arbitrary and unintuitive, maybe we should provide time.posix2dyear() function for this purpose. > The only exception would be when parsing a string -- but > strptime can tell whether a 2d or 4d year is requested by the format > code (%y or %Y). > Existing stdlib date parsing code already does that and ignores accept2dyear setting. 
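A minimal sketch of the 2-digit year mapping described in the message above, assuming the POSIX convention it cites (00-68 map to 2000-2068, 69-99 map to 1969-1999). The name posix2dyear is borrowed from the suggestion in this message and is hypothetical here; no such function is known to exist in the time module. Treating 3-digit years as an error follows the proposed logic, and the message itself notes that this last check may not be necessary.

```python
def posix2dyear(y):
    """Map a 2-digit year to a 4-digit year using the POSIX convention.

    Hypothetical helper for illustration only; not part of the time module.
    """
    if 0 <= y < 69:
        return y + 2000  # 00-68 -> 2000-2068
    if 69 <= y < 100:
        return y + 1900  # 69-99 -> 1969-1999
    if 100 <= y < 1000:
        # Assumption taken from the proposed logic: 3-digit years are
        # rejected when 2-digit mapping is enabled.
        raise ValueError("3-digit year in map 2-digit year mode")
    return y  # any other year passes through unchanged
```

An application could call such a helper itself before handing a time tuple to time.asctime() or time.strftime(), which would keep the century-guessing policy in the application rather than in a global setting.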
From guido at python.org Wed Jan 5 22:33:50 2011 From: guido at python.org (Guido van Rossum) Date: Wed, 5 Jan 2011 13:33:50 -0800 Subject: [Python-Dev] Checking input range in time.asctime and time.ctime In-Reply-To: References: <20110105184855.6b06c9ae@pitrou.net> Message-ID: On Wed, Jan 5, 2011 at 12:58 PM, Alexander Belopolsky wrote: > On Wed, Jan 5, 2011 at 2:19 PM, Guido van Rossum wrote: > .. >>> extending accepted range is a borderline case IMO. >> >> I like accepting all years >= 1 when accept2dyear is False. >> > > Why >= 1? Because that's what the datetime module accepts. > Shouldn't it be >= 1900 - maxint? Also, what is your take > on always accepting [1000 - 1899]? > > Now, to play the devil's advocate a little, with the new logic > accept2dyear would actually mean "map 2-digit year" because 2-digit > years will be accepted when accept2dyear is False, just not mapped to > reasonable range. I don't have much of a problem with having a > deprecated setting that does not have the meaning that its name > suggests. (At the moment accept2dyear = True is actually treated as > accept2dyear = 0!) I am mentioning this because I think the logic > should be > > if accept2dyear: >    if 0 <= y < 69: >       y += 2000 >    elif 69 <= y < 100: >       y += 1900 >    elif 100 <= y < 1000: >       raise ValueError("3-digit year in map 2-digit year mode") > > and even the last elif may not be necessary. Shouldn't the logic be to take the current year into account? By the time 2070 comes around, I'd expect "70" to refer to 2070, not to 1970. In fact, I'd expect it to refer to 2070 long before 2070 comes around. All of which makes me think that this is better left to the app, which can decide for itself whether it is more important to represent dates in the future or dates in the past. >> In 3.3 we should switch its default value to False (in addition to the >> keyword arg you are proposing below, maybe).
> > Note that time.accept2dyear is controlled by PYTHONY2K environment > variable. If we switch the default, we may need to add a variable with > the opposite meaning. Yeah, but who sets that variable? Couldn't we make it so that if PYTHONY2K is set (even to the empty string) it wins, but if it's not set (at all) we can make the default adjust over time? >> Maybe we can add a deprecation warning in 3.2 when a 2d year is >> actually received? > > +1, but only when with accept2dyear = 1. When accept2dyear = 0, any > year should just pass through and this should eventually become the > only behavior. > >> The posix standard notwithstanding they should be >> rare, and it would be better to make this the app's responsibility if >> we could. >> > > .. >> I wish we didn't have to do that -- isn't it easy enough for the app >> to do the 2d -> 4d conversion itself before calling the library >> function? > > Note that this is already done at least in two places in stdlib: in > email package parsedate_tz and in _strptime.py. Given that the POSIX > convention is arbitrary and unintuitive, maybe we should provide > time.posix2dyear() function for this purpose. > >> The only exception would be when parsing a string -- but >> strptime can tell whether a 2d or 4d year is requested by the format >> code (%y or %Y). >> > > Existing stdlib date parsing code already does that and ignores > accept2dyear setting. > -- --Guido van Rossum (python.org/~guido) From glyph at twistedmatrix.com Wed Jan 5 23:14:18 2011 From: glyph at twistedmatrix.com (Glyph Lefkowitz) Date: Wed, 5 Jan 2011 17:14:18 -0500 Subject: [Python-Dev] Checking input range in time.asctime and time.ctime In-Reply-To: References: <20110105184855.6b06c9ae@pitrou.net> Message-ID: <6847568F-D3CD-4DCF-8C5C-A3FA752F484F@twistedmatrix.com> On Jan 5, 2011, at 4:33 PM, Guido van Rossum wrote: > Shouldn't the logic be to take the current year into account?
By the > time 2070 comes around, I'd expect "70" to refer to 2070, not to 1970. > In fact, I'd expect it to refer to 2070 long before 2070 comes around. > > All of which makes me think that this is better left to the app, which > can decide for itself whether it is more important to represent dates > in the future or dates in the past. The point of this somewhat silly flag (as I understood its description earlier in the thread) is to provide compatibility with POSIX 2-year dates. As per http://pubs.opengroup.org/onlinepubs/007908799/xsh/strptime.html - %y is the year within century. When a century is not otherwise specified, values in the range 69-99 refer to years in the twentieth century (1969 to 1999 inclusive); values in the range 00-68 refer to years in the twenty-first century (2000 to 2068 inclusive). Leading zeros are permitted but not required. So, "70" means "1970", forever, in programs that care about this nonsense. Personally, by the time 2070 comes around, I hope that "70" will just refer to 70 A.D., and get you odd looks if you use it in a written date - you might as well just write '0' :). -------------- next part -------------- An HTML attachment was scrubbed... URL: From tjreedy at udel.edu Wed Jan 5 23:21:23 2011 From: tjreedy at udel.edu (Terry Reedy) Date: Wed, 05 Jan 2011 17:21:23 -0500 Subject: [Python-Dev] [Python-checkins] r87768 - in python/branches/py3k: Lib/socket.py Lib/test/test_socket.py Misc/NEWS In-Reply-To: <20110105210348.5015AEE98A@mail.python.org> References: <20110105210348.5015AEE98A@mail.python.org> Message-ID: <4D24EEE3.4060202@udel.edu> > Issue #7995: When calling accept() on a socket with a timeout, the returned > socket is now always non-blocking, regardless of the operating system. Seems clear enough > + # Issue #7995: if no default timeout is set and the listening > + # socket had a (non-zero) timeout, force the new socket in blocking > + # mode to override platform-specific socket flags inheritance. 
Slightly confusing > + # Issue #7995: when calling accept() on a listening socket with a > + # timeout, the resulting socket should not be non-blocking. Seems to contradict the first. 'should not be non-blocking' to me means 'should be blocking', as opposed to 'is now ... non-blocking'. Terry From tjreedy at udel.edu Wed Jan 5 23:43:32 2011 From: tjreedy at udel.edu (Terry Reedy) Date: Wed, 05 Jan 2011 17:43:32 -0500 Subject: [Python-Dev] [Python-checkins] devguide: Start a doc on running and writing unit tests. In-Reply-To: References: Message-ID: <4D24F414.9080103@udel.edu> > +The shortest, simplest way of running the test suite is:: > + > + ./python -m test Not on Windows. C:\Programs\Python32>./python -m test '.' is not recognized as an internal or external command, operable program or batch file. python -m test works (until it failed, separate issue). I would like to know, insofar as possible, how to run tests from the interpreter prompt (or IDLE simulation thereof) from whatmod import whatfunc; whatfunc() # ?? ditto for such remaining alternatives you give as can be made from prompt. Besides the convenience for Windows users (for whom the Command Prompt window is hidden away and possibly unknown), I think we should know if any tests are incompatible with interactive mode. --- Terry Jan Reedy From solipsis at pitrou.net Wed Jan 5 23:43:53 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 5 Jan 2011 23:43:53 +0100 Subject: [Python-Dev] [Python-checkins] r87768 - in python/branches/py3k: Lib/socket.py Lib/test/test_socket.py Misc/NEWS References: <20110105210348.5015AEE98A@mail.python.org> <4D24EEE3.4060202@udel.edu> Message-ID: <20110105234353.7a793784@pitrou.net> On Wed, 05 Jan 2011 17:21:23 -0500 Terry Reedy wrote: > > > Issue #7995: When calling accept() on a socket with a timeout, the returned > > socket is now always non-blocking, regardless of the operating system.
> > Seems clear enough > > > + # Issue #7995: if no default timeout is set and the listening > > + # socket had a (non-zero) timeout, force the new socket in blocking > > + # mode to override platform-specific socket flags inheritance. > > Slightly confusing > > > + # Issue #7995: when calling accept() on a listening socket with a > > + # timeout, the resulting socket should not be non-blocking. > > Seems to contradict the first. 'sould not be non-blocking' to me means > 'should be blocking', as opposed to 'is now ... non-blocking'. Thank you for spotting the contradiction; this is now fixed. Regards Antoine. From alexander.belopolsky at gmail.com Wed Jan 5 23:55:19 2011 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Wed, 5 Jan 2011 17:55:19 -0500 Subject: [Python-Dev] Checking input range in time.asctime and time.ctime In-Reply-To: References: <20110105184855.6b06c9ae@pitrou.net> Message-ID: On Wed, Jan 5, 2011 at 4:33 PM, Guido van Rossum wrote: .. >> Why >= 1? > > Because that's what the datetime module accepts. What the datetime module accepts is irrelevant here. Note that functions affected by accept2dyear are: time.mktime(), time.asctime(), time.strftime() and indirectly time.ctime(). Neither of them produces result that is directly usable by the datetime module. Furthermore, this thread started with me arguing that year > 9999 should raise ValueError and if we wanted to restrict time module functions to datetime-supported year range, that would be the right thing to do. If I understand your "garbage in garbage out" principle correctly, time-processig functions should not introduce arbitrary limits unless there is a specific reason for them. In datetime module, calendar calculations would be too complicated if we had to support date range that does not fit in 32-bit integer. There is no such consideration in the time module, so we should support whatever the underlying system can. 
This said, I would be perfectly happy with just changing y >= 1900 to y >= 1000. Doing so will spare us from making a choice between '0012', '12' and ' 12' in time.asctime(). Time-series that extend back to 19th century are not unheard of and in many places modern calendar was already in use back then. Anything prior to year 1000 would certainly require a custom calendar module anyways. From solipsis at pitrou.net Wed Jan 5 23:57:34 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 5 Jan 2011 23:57:34 +0100 Subject: [Python-Dev] [Python-checkins] devguide: Start a doc on running and writing unit tests. References: <4D24F414.9080103@udel.edu> Message-ID: <20110105235734.5d609e7b@pitrou.net> On Wed, 05 Jan 2011 17:43:32 -0500 Terry Reedy wrote: > > Not on Windows. > C:\Programs\Python32>./python -m test > '.' is not recognized as an internal or external command, > operable program or batch file. > > python -m test > works (until it failed, separate issue). This will not run the right interpreter, unless this is an installed build. You must use: - "PCbuild\python_d.exe" in debug mode - "PCbuild\python.exe" in release mode - "PCbuild\amd64\python_d.exe" in 64-bit debug mode - "PCbuild\amd64\python.exe" in 64-bit release mode > I would like to know, insofar as possible, how to run tests from the > interpreter prompt (or IDLE simulation thereof) You can't. There is no such supported thing. Regards Antoine. From tjreedy at udel.edu Thu Jan 6 00:00:18 2011 From: tjreedy at udel.edu (Terry Reedy) Date: Wed, 05 Jan 2011 18:00:18 -0500 Subject: [Python-Dev] [Python-checkins] devguide: Start a doc on running and writing unit tests. In-Reply-To: References: Message-ID: <4D24F802.2040005@udel.edu> > +Running > +------- Is there a way to skip a particular test, such as one that crashes the test process? 
Terry From guido at python.org Thu Jan 6 00:12:04 2011 From: guido at python.org (Guido van Rossum) Date: Wed, 5 Jan 2011 15:12:04 -0800 Subject: [Python-Dev] Checking input range in time.asctime and time.ctime In-Reply-To: References: <20110105184855.6b06c9ae@pitrou.net> Message-ID: On Wed, Jan 5, 2011 at 2:55 PM, Alexander Belopolsky wrote: > On Wed, Jan 5, 2011 at 4:33 PM, Guido van Rossum wrote: > .. >>> Why >= 1? >> >> Because that's what the datetime module accepts. > > What the datetime module accepts is irrelevant here. Not completely -- they are both about dates and times, there are some links between them (time tuples are used by both), both have a strftime() method. If they both impose some arbitrary limits, it would be easier for users to remember the limits if they were the same for both modules. (In fact datetime.strftime() is currently limited by what time.strftime() can handle -- more linkage.) > Note that > functions affected by accept2dyear are: time.mktime(), time.asctime(), > time.strftime() and indirectly time.ctime(). Neither of them produces > result that is directly usable by the datetime module. But the latter calls strftime() -- although never with a 2d year of course. > Furthermore, > this thread started with me arguing that year > 9999 should raise > ValueError and if we wanted to restrict time module functions to > datetime-supported year range, that would be the right thing to do. I'd be fine with a ValueError too, if that's what it takes to align the two modules better. > If I understand your "garbage in garbage out" principle correctly, > time-processing functions should not introduce arbitrary limits unless > there is a specific reason for them. In datetime module, calendar > calculations would be too complicated if we had to support date range > that does not fit in 32-bit integer. There is no such consideration > in the time module, so we should support whatever the underlying > system can.
(Except that the *originally* underlying system, libc, was too poorly standardized and too buggy on some platforms, so we have ended up reimplementing more and more of it.) > This said, I would be perfectly happy with just changing y >= 1900 to > y >= 1000. Doing so will spare us from making a choice between > '0012', '12' and '  12' in time.asctime(). Time-series that extend > back to 19th century are not unheard of and in many places modern > calendar was already in use back then. Anything prior to year 1000 > would certainly require a custom calendar module anyways. Yeah, but datetime takes a position here (arbitrarily extending the Gregorian calendar all the way back to the year 1, and forward to the year 9999). I'd be happiest if time took the same position. For example it would fix the problem that datetime accepts years < 1900 but then you cannot call strftime() on those. -- --Guido van Rossum (python.org/~guido) From brian.curtin at gmail.com Thu Jan 6 00:15:36 2011 From: brian.curtin at gmail.com (Brian Curtin) Date: Wed, 5 Jan 2011 17:15:36 -0600 Subject: [Python-Dev] [Python-checkins] devguide: Start a doc on running and writing unit tests. In-Reply-To: <4D24F802.2040005@udel.edu> References: <4D24F802.2040005@udel.edu> Message-ID: On Wed, Jan 5, 2011 at 17:00, Terry Reedy wrote: > > +Running >> +------- >> > > Is there a way to skip a particular test, such as one that crashes the test > process? -x {list of tests to skip} -------------- next part -------------- An HTML attachment was scrubbed... URL: From solipsis at pitrou.net Thu Jan 6 00:17:33 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 6 Jan 2011 00:17:33 +0100 Subject: [Python-Dev] [Python-checkins] devguide: Start a doc on running and writing unit tests.
References: <4D24F802.2040005@udel.edu> Message-ID: <20110106001733.3ac31c62@pitrou.net> On Wed, 05 Jan 2011 18:00:18 -0500 Terry Reedy wrote: > > > +Running > > +------- > > Is there a way to skip a particular test, such as one that crashes the > test process? -x test_foo From brian.curtin at gmail.com Thu Jan 6 00:18:53 2011 From: brian.curtin at gmail.com (Brian Curtin) Date: Wed, 5 Jan 2011 17:18:53 -0600 Subject: [Python-Dev] [Python-checkins] devguide: Start a doc on running and writing unit tests. In-Reply-To: <20110105235734.5d609e7b@pitrou.net> References: <4D24F414.9080103@udel.edu> <20110105235734.5d609e7b@pitrou.net> Message-ID: On Jan 5, 2011 4:45 PM, "Terry Reedy" wrote: > > >> +The shortest, simplest way of running the test suite is:: >> + >> + ./python -m test > > > Not on Windows. > C:\Programs\Python32>./python -m test > '.' is not recognized as an internal or external command, > operable program or batch file. > > python -m test > works (until it failed, separate issue). > > I would like to know, insofar as possible, how to run tests from the interpreter prompt (or IDLE simulation thereof) > > from whatmod import whatfunc; whatfunc() # ?? > > ditto for such remaining alternatives you give as can be made from prompt. > > Besides the convenience for Windows users (for whom the Command Prompt window is hidden away and possibly unknown), I think we should know if any tests are incompatible with interactive mode. > > --- > Terry Jan Reedy The command prompt on Windows is no more hidden than it is on any other OS. In fact it's easier to find than on OS X (IMO) :) I think we do need to make *some* assumptions in the developer docs that the reader is actually a developer (who would know where cmd is) and not a first-time user of the OS, otherwise it becomes a computer users guide and not a development guide. -------------- next part -------------- An HTML attachment was scrubbed... 
From tjreedy at udel.edu Thu Jan 6 00:47:00 2011
From: tjreedy at udel.edu (Terry Reedy)
Date: Wed, 05 Jan 2011 18:47:00 -0500
Subject: [Python-Dev] 3.2b2 fails test suite on (my) Windows XP
Message-ID:

To test Brett's test running instruction, I ran
python -m test # not ./Python!
in a Command Prompt window
---
Microsoft Windows XP [Version 5.1.2600]

== CPython 3.2b2 (r32b2:87398, Dec 19 2010, 22:51:00) [MSC v.1500 32 bit (Intel)]
== Windows-XP-5.1.2600-SP3 little-endian
== c:\docume~1\terry\locals~1\temp\test_python_3528
[ 1/351] test_grammar
...
[ 10/351] test___all__
Warning -- os.environ was modified by test___all__
[ 11/351] test___future__
...
[ 37/351] test_capi

Window hangs, can only close.
Error popup says "python.exe has encountered a problem..."
at 000a03f7 in python32.dll

RUN 2, same command, I get
[ 37/351] test_capi
test test_capi failed -- Traceback (most recent call last):
  File "C:\Programs\Python32\lib\test\test_capi.py", line 50, in test_no_FatalError_infinite_loop
    b'Fatal Python error:'
AssertionError: b"Fatal Python error: PyThreadState_Get: no current thread\r\n\r\nThis application has requested the Runtime to terminate it in an unusual way.\nPlease contact the application's support team for more information." != b'Fatal Python error: PyThreadState_Get: no current thread'

and it continued on with test_cfgparser, etc, so crashing rather than mere failure is intermittent.

BUT process then stopped (hung, no error popup) at
[ 67/351] test_concurrent_futures
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\Programs\Python32\lib\multiprocessing\forking.py", line 369, in main
    prepare(preparation_data)
  File "C:\Programs\Python32\lib\multiprocessing\forking.py", line 477, in prepare
    assert main_name not in sys.modules, main_name
AssertionError: __main__

RUN 3
python -m test -x test_capi test_concurrent_futures

went further, more failed tests, then process started repeatedly (hundreds of times) outputting

    assert main_name not in sys.modules, main_name
AssertionError: __main__
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\Programs\Python32\lib\multiprocessing\forking.py", line 369, in main
    prepare(preparation_data)
  File "C:\Programs\Python32\lib\multiprocessing\forking.py", line 477,

Occasionally a new test would start in between this stuff. It ended with test_sax. I cannot say when it began because the volume overfilled the output buffer.

[306/349] test_ttk_guionly # and test_tk
test_ttk_guionly skipped -- ttk not available: Can't find a usable init.tcl in the following directories:
    C:/Programs/Python32/lib/tcl8.5 C:/Programs/lib/tcl8.5 C:/lib/tcl8.5 C:/Programs/library C:/library C:/tcl8.5.9/library C:/tcl8.5.9/library
This probably means that Tcl wasn't installed properly.

Funny, IDLE works fine. In any case, I did a standard install from the distributed installer.

Something is definitely not ready for final release. The final mishmash:

[349/349] test_zlib
295 tests OK.
11 tests failed:
    test_datetime test_difflib.bak test_ftplib test_lib2to3
    test_multiprocessing test_os.bak test_pep277 test_pkgutil
    test_posixpath test_runpy test_tcl
2 tests altered the execution environment:
    test___all__ test_site
41 tests skipped:
    test_codecmaps_cn test_codecmaps_hk test_codecmaps_jp
    test_codecmaps_kr test_codecmaps_tw test_crypt test_curses
    test_dbm_gnu test_dbm_ndbm test_epoll test_fcntl test_fork1
    test_gdb test_grp test_ioctl test_kqueue test_largefile test_nis
    test_openpty test_ossaudiodev test_pipes test_poll test_posix
    test_pty test_pwd test_readline test_resource test_smtpnet
    test_socketserver test_syslog test_threadsignals test_timeout
    test_tk test_ttk_guionly test_urllib2net test_urllibnet test_wait3
    test_wait4 test_winsound test_xmlrpc_net test_zipfile64
4 skips unexpected on win32:
    test_gdb test_readline test_tk test_ttk_guionly
Traceback (most recent call last):
  File "C:\Programs\Python32\lib\test\support.py", line 468, in temp_cwd
    yield os.getcwd()
  File "C:\Programs\Python32\lib\test\__main__.py", line 13, in <module>
    regrtest.main()
  File "C:\Programs\Python32\lib\test\regrtest.py", line 704, in main
    sys.exit(len(bad) > 0 or interrupted)
SystemExit: True

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Programs\Python32\lib\runpy.py", line 160, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "C:\Programs\Python32\lib\runpy.py", line 73, in _run_code
    exec(code, run_globals)
  File "C:\Programs\Python32\lib\test\__main__.py", line 13, in <module>
    regrtest.main()
  File "C:\Programs\Python32\lib\contextlib.py", line 46, in __exit__
    self.gen.throw(type, value, traceback)
  File "C:\Programs\Python32\lib\test\support.py", line 472, in temp_cwd
    rmtree(name)
  File "C:\Programs\Python32\lib\test\support.py", line 198, in rmtree
    shutil.rmtree(path)
  File "C:\Programs\Python32\lib\shutil.py", line 287, in rmtree
    onerror(os.rmdir, path, sys.exc_info())
  File "C:\Programs\Python32\lib\shutil.py", line 285, in rmtree
    os.rmdir(path)
WindowsError: [Error 32] The process cannot access the file because it is being used by another process: 'c:\\docume~1\\terry\\locals~1\\temp\\test_python_2372'

Traceback (most recent call last):
  File "C:\Programs\Python32\lib\multiprocessing\util.py", line 261, in _run_finalizers
    finalizer()
  File "C:\Programs\Python32\lib\multiprocessing\util.py", line 200, in __call__
    res = self._callback(*self._args, **self._kwargs)
  File "C:\Programs\Python32\lib\multiprocessing\pool.py", line 492, in _terminate_pool
    p.terminate()
  File "C:\Programs\Python32\lib\multiprocessing\process.py", line 137, in terminate
    self._popen.terminate()
AttributeError: 'NoneType' object has no attribute 'terminate'

C:\Programs\Python32>Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\Programs\Python32\lib\multiprocessing\forking.py", line 368, in main
    preparation_data = load(from_parent)
EOFError

--
Terry Jan Reedy

From brian.curtin at gmail.com Thu Jan 6 00:56:53 2011
From: brian.curtin at gmail.com (Brian Curtin)
Date: Wed, 5 Jan 2011 17:56:53 -0600
Subject: [Python-Dev] 3.2b2 fails test suite on (my) Windows XP
In-Reply-To:
References:
Message-ID:

On Wed, Jan 5, 2011 at 17:47, Terry Reedy wrote:
> To test Brett's test running instruction, I ran
> python -m test # not ./Python!
> in a Command Prompt window
> [snip]

http://bugs.python.org/issue9116 covers this issue.

The reason it doesn't fail on any of the build slaves is because they modify a registry value for Windows Error Reporting to not display the pop-up window, or at least mine does.
I think I got the idea from one of the other Windows build slave maintainers.

From brian.curtin at gmail.com Thu Jan 6 00:58:17 2011
From: brian.curtin at gmail.com (Brian Curtin)
Date: Wed, 5 Jan 2011 17:58:17 -0600
Subject: [Python-Dev] 3.2b2 fails test suite on (my) Windows XP
In-Reply-To:
References:
Message-ID:

On Wed, Jan 5, 2011 at 17:56, Brian Curtin wrote:
> On Wed, Jan 5, 2011 at 17:47, Terry Reedy wrote:
>> [snip]
>
> http://bugs.python.org/issue9116 covers this issue.
>
> The reason it doesn't fail on any of the build slaves is because they
> modify a registry value for Windows Error Reporting to not display the
> pop-up window, or at least mine does. I think I got the idea from one of the
> other Windows build slave maintainers.
>

Sorry, should have specified -- that issue only covers the test_capi failure.
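The registry value Brian refers to is not named in this thread, so it is not reproduced here. A different, per-process technique that is sometimes used for the same purpose (keeping a crashing test from blocking on a pop-up) is the Win32 SetErrorMode call. The following is a minimal sketch under that assumption, not a description of what the build slaves do; suppress_crash_dialog is a hypothetical helper name.

```python
import sys

# Win32 SetErrorMode flag: do not display the general-protection-fault
# error box when the process crashes.
SEM_NOGPFAULTERRORBOX = 0x0002


def suppress_crash_dialog():
    """Ask Windows not to show the crash pop-up for this process.

    Returns the previous error mode on Windows, or None on other
    platforms, where there is nothing to do.
    """
    if not sys.platform.startswith("win"):
        return None
    import ctypes  # imported lazily; ctypes.windll exists only on Windows
    kernel32 = ctypes.windll.kernel32
    # SetErrorMode returns the previous mode, so set once to read it,
    # then set again with our flag added to whatever was already there.
    previous = kernel32.SetErrorMode(SEM_NOGPFAULTERRORBOX)
    kernel32.SetErrorMode(previous | SEM_NOGPFAULTERRORBOX)
    return previous
```

Child processes normally inherit the error mode of their parent, so calling this once in the process that launches the tests is usually enough. Unlike a registry change it affects only that process tree, not the whole machine.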
From alexander.belopolsky at gmail.com Thu Jan 6 01:02:32 2011
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Wed, 5 Jan 2011 19:02:32 -0500
Subject: [Python-Dev] Checking input range in time.asctime and time.ctime
In-Reply-To:
References: <20110105184855.6b06c9ae@pitrou.net>
Message-ID:

On Wed, Jan 5, 2011 at 6:12 PM, Guido van Rossum wrote:
..
> If they both impose some arbitrary limits, it would be easier for
> users to remember the limits if they were the same for both modules.
>

Unfortunately, that is not possible on 32-bit systems where the range supported by, say, time.ctime() is limited by the range of time_t.

> (In fact datetime.strftime() is currently limited by what
> time.strftime() can handle -- more linkage.)
>

Not really. There is a patch at http://bugs.python.org/issue1777412 that removes this limit for datetime.strftime. There is an issue for the pure Python implementation, which does depend on time.strftime(), but that can be addressed in several ways, including ignoring it until the time module is fixed.

..
>> Furthermore,
>> this thread started with me arguing that year > 9999 should raise
>> ValueError and if we wanted to restrict time module functions to
>> datetime-supported year range, that would be the right thing to do.
>
> I'd be fine with a ValueError too, if that's what it takes to align
> the two modules better.
>

Do you want to *add* year range checks to, say, time.localtime(t) so that it would not produce a time tuple with an out of range year? IMO, range checks are justified when they allow a simpler implementation. As far as users are concerned, I don't think anyone would care about precise limits if they are wider than [1000 - 9999].

..
>> This said, I would be perfectly happy with just changing y >= 1900 to
>> y >= 1000.  Doing so will spare us from making a choice between
>> '0012', '12' and '  12' in time.asctime().  Time-series that extend
>> back to 19th century are not unheard of and in many places modern
>> calendar was already in use back then.  Anything prior to year 1000
>> would certainly require a custom calendar module anyways.
>
> Yeah, but datetime takes a position here (arbitrarily extending the
> Gregorian calendar all the way back to the year 1, and forward to the
> year 9999). I'd be happiest if time took the same position.

Doesn't it already? On my system,

$ cal 9 1752
   September 1752
Su Mo Tu We Th Fr Sa
       1  2 14 15 16
17 18 19 20 21 22 23
24 25 26 27 28 29 30

but

>>> (datetime(1752, 9, 2) - datetime(1970,1,1))//timedelta(0, 1)
-6858259200
>>> time.gmtime(-6858259200)[:3]
(1752, 9, 2)
>>> datetime(1752, 9, 2).weekday()
5
>>> time.gmtime(-6858259200).tm_wday
5

From ncoghlan at gmail.com Thu Jan 6 02:59:05 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 6 Jan 2011 11:59:05 +1000
Subject: [Python-Dev] 3.2b2 fails test suite on (my) Windows XP
In-Reply-To:
References:
Message-ID:

On Thu, Jan 6, 2011 at 9:47 AM, Terry Reedy wrote:
> To test Brett's test running instruction, I ran
> python -m test # not ./Python!
> in a Command Prompt window

Does it behave itself if you add "-x test_capi" to the command line?

Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From guido at python.org Thu Jan 6 03:18:44 2011
From: guido at python.org (Guido van Rossum)
Date: Wed, 5 Jan 2011 18:18:44 -0800
Subject: [Python-Dev] Checking input range in time.asctime and time.ctime
In-Reply-To:
References: <20110105184855.6b06c9ae@pitrou.net>
Message-ID:

I'm sorry, but at this point I'm totally confused about what you're asking or proposing. You keep referring to various implementation details and behaviors.
Maybe if you summarized how the latest implementation (say python 3.2) works and what you propose to change that would be quicker than this back-and-forth about whether or not datetime and time behave the same or should behave the same or whatever.

--
--Guido van Rossum (python.org/~guido)

From ncoghlan at gmail.com Thu Jan 6 03:45:13 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 6 Jan 2011 12:45:13 +1000
Subject: [Python-Dev] [Python-checkins] devguide: Add a note about a possible starter project.
In-Reply-To:
References:
Message-ID:

On Thu, Jan 6, 2011 at 8:11 AM, brett.cannon wrote:
> +.. todo::
> +    See if tempfile or test.support has a context manager that creates and
> +    deletes a temp file so as to move off of test.support.TESTFN.

Yeah, tempfile.TemporaryFile and friends all support the CM protocol. There's also the tempfile.TemporaryDirectory CM (as of 3.2).

Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From alexander.belopolsky at gmail.com Thu Jan 6 03:46:49 2011
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Wed, 5 Jan 2011 21:46:49 -0500
Subject: [Python-Dev] Checking input range in time.asctime and time.ctime
In-Reply-To:
References: <20110105184855.6b06c9ae@pitrou.net>
Message-ID:

On Wed, Jan 5, 2011 at 9:18 PM, Guido van Rossum wrote:
> I'm sorry, but at this point I'm totally confused about what you're
> asking or proposing. You keep referring to various implementation
> details and behaviors. Maybe if you summarized how the latest
> implementation (say python 3.2) works and what you propose to change

I'll try.
The current implementation of time.asctime and time.strftime is roughly

if y < 1900:
    if accept2dyear:
        if 69 <= y <= 99:
            y += 1900
        elif 0 <= y <= 68:
            y += 2000
        else:
            raise ValueError("year out of range")
    else:
        raise ValueError("year out of range")
# call system function with tm_year = y - 1900

I propose to change that to

if y < 1000:
    if accept2dyear:
        if 69 <= y <= 99:
            y += 1900
        elif 0 <= y <= 68:
            y += 2000
        else:
            raise ValueError("year out of range")
# call system function with tm_year = y - 1900

From guido at python.org Thu Jan 6 04:50:10 2011
From: guido at python.org (Guido van Rossum)
Date: Wed, 5 Jan 2011 19:50:10 -0800
Subject: [Python-Dev] Checking input range in time.asctime and time.ctime
In-Reply-To:
References: <20110105184855.6b06c9ae@pitrou.net>
Message-ID:

On Wed, Jan 5, 2011 at 6:46 PM, Alexander Belopolsky wrote:
> On Wed, Jan 5, 2011 at 9:18 PM, Guido van Rossum wrote:
>> I'm sorry, but at this point I'm totally confused about what you're
>> asking or proposing. You keep referring to various implementation
>> details and behaviors. Maybe if you summarized how the latest
>> implementation (say python 3.2) works and what you propose to change
>
> I'll try.  The current implementation is of time.asctime and
> time.strftime is roughly
>
> if y < 1900:
>     if accept2dyear:
>         if 69 <= y <= 99:
>             y += 1900
>         elif 0 <= y <= 68:
>             y += 2000
>         else:
>             raise ValueError("year out of range")
>     else:
>         raise ValueError("year out of range")
> # call system function with tm_year = y - 1900
>
> I propose to change that to
>
> if y < 1000:
>     if accept2dyear:
>         if 69 <= y <= 99:
>             y += 1900
>         elif 0 <= y <= 68:
>             y += 2000
>         else:
>             raise ValueError("year out of range")
> # call system function with tm_year = y - 1900

The new logic doesn't look right, am I right that this is what you meant?

if accept2dyear and 0 <= y < 100:
  (convert to year >= 1970)
if y < 1000:
  raise ...

But what guarantees do we have that the system functions accept negative values for tm_year on all relevant platforms?

The 1000 limit still seems pretty arbitrary to me -- if it's only because you don't want to decide whether to use leading spaces or zeros for numbers shorter than 4 digits, let me propose leading zeros since we use those uniformly for months, days, hours, minutes and seconds < 10, and then you can make the year range accepted the same for these as for datetime (i.e. 1 <= y <= 9999). Tim Peters picked those at least in part because they are right round numbers...

--
--Guido van Rossum (python.org/~guido)

From alexander.belopolsky at gmail.com Thu Jan 6 05:48:38 2011
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Wed, 5 Jan 2011 23:48:38 -0500
Subject: [Python-Dev] Checking input range in time.asctime and time.ctime
In-Reply-To:
References: <20110105184855.6b06c9ae@pitrou.net>
Message-ID:

On Wed, Jan 5, 2011 at 10:50 PM, Guido van Rossum wrote:
..
>> I propose to change that to
>>
>> if y < 1000:
>>     if accept2dyear:
>>         if 69 <= y <= 99:
>>             y += 1900
>>         elif 0 <= y <= 68:
>>             y += 2000
>>         else:
>>             raise ValueError("year out of range")
>> # call system function with tm_year = y - 1900
>
> The new logic doesn't look right, am I right that this is what you meant?
>
> if accept2dyear and 0 <= y < 100:
>   (convert to year >= 1970)
> if y < 1000:
>   raise ...
>

Not quite. My proposed logic would not do any range checking if accept2dyear == 0.

> But what guarantees do we have that the system functions accept
> negative values for tm_year on all relevant platforms?
>

I've already committed an implementation of asctime, so time.asctime and time.ctime don't call system functions anymore. This leaves time.mktime and time.strftime.
The latter caused Tim Peters to limit year range to >= 1900 eight years ago:

http://svn.python.org/view?view=rev&revision=30224

For these functions, range checks are necessary only when system functions may crash on out of range values. If we can detect error return and convert it to an exception, there is no need to look before you leap. (Note that asctime was different because the relevant standards specifically allowed it to have undefined behavior for out of range values.)

I cannot rule out that there are systems out there with buggy strftime, but the situation has improved in the last eight years and we have buildbots and unittests to check behavior on relevant platforms. If we do find a platform with buggy strftime which crashes or produces nonsense with negative tm_year, we can add a platform specific range check to work around platform bug, or just ask users to bug their OS vendor. :-)

> The 1000 limit still seems pretty arbitrary to me -- if it's only
> because you don't want to decide whether to use leading spaces or
> zeros for numbers shorter than 4 digits, let me propose leading zeros
> since we use those uniformly for months, days, hours, minutes and
> seconds < 10,

Except we don't:

>>> time.asctime((2000, 1, 1, 0, 0, 0, 0, 0, -1))
'Sat Jan  1 00:00:00 2000'

(note that day is space-filled.)

I am not sure, however, what you are proposing here. Are you arguing for a wider or a narrower year range? I would be happy with just

if accept2dyear:
    if 69 <= y <= 99:
        y += 1900
    elif 0 <= y <= 68:
        y += 2000
# call system function with tm_year = y - 1900

but I thought that would be too radical.

From alexander.belopolsky at gmail.com Thu Jan 6 06:10:41 2011
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Thu, 6 Jan 2011 00:10:41 -0500
Subject: [Python-Dev] Checking input range in time.asctime and time.ctime
In-Reply-To:
References: <20110105184855.6b06c9ae@pitrou.net>
Message-ID:

On Wed, Jan 5, 2011 at 10:50 PM, Guido van Rossum wrote:
..
> But what guarantees do we have that the system functions accept
> negative values for tm_year on all relevant platforms?
>

Also note that the subject of this thread is limited to "time.asctime and time.ctime." The other functions came into discussion only because the year range checking code is shared inside the time module. If calling specific system functions such as strftime with tm_year < 0 is deemed unsafe, we can move the check to where the system function is called. No system function is called from time.asctime anymore and time.ctime(t) is now time.asctime(localtime(t)).

From tjreedy at udel.edu Thu Jan 6 06:34:11 2011
From: tjreedy at udel.edu (Terry Reedy)
Date: Thu, 06 Jan 2011 00:34:11 -0500
Subject: [Python-Dev] [Python-checkins] devguide: Start a doc on running and writing unit tests.
In-Reply-To: <20110105235734.5d609e7b@pitrou.net>
References: <4D24F414.9080103@udel.edu> <20110105235734.5d609e7b@pitrou.net>
Message-ID:

On 1/5/2011 5:57 PM, Antoine Pitrou wrote:
> On Wed, 05 Jan 2011 17:43:32 -0500
> Terry Reedy wrote:
>>
>> Not on Windows.
>> C:\Programs\Python32>./python -m test

Installation, not checkout.

>> '.' is not recognized as an internal or external command,
>> operable program or batch file.
>>
>> python -m test
>> works (until it failed, separate issue).
>
> This will not run the right interpreter, unless this is an installed
> build.

It is, from 32b2.msi. I have no compiler ;-).
--
Terry Jan Reedy

From tjreedy at udel.edu Thu Jan 6 06:36:31 2011
From: tjreedy at udel.edu (Terry Reedy)
Date: Thu, 06 Jan 2011 00:36:31 -0500
Subject: [Python-Dev] [Python-checkins] r87768 - in python/branches/py3k: Lib/socket.py Lib/test/test_socket.py Misc/NEWS
In-Reply-To: <20110105234353.7a793784@pitrou.net>
References: <20110105210348.5015AEE98A@mail.python.org> <4D24EEE3.4060202@udel.edu> <20110105234353.7a793784@pitrou.net>
Message-ID:

On 1/5/2011 5:43 PM, Antoine Pitrou wrote:
> On Wed, 05 Jan 2011 17:21:23 -0500
> Terry Reedy wrote:

> Thank you for spotting the contradiction; this is now fixed.

I am following your example of looking at checkins.

--
Terry Jan Reedy

From tjreedy at udel.edu Thu Jan 6 07:00:45 2011
From: tjreedy at udel.edu (Terry Reedy)
Date: Thu, 06 Jan 2011 01:00:45 -0500
Subject: [Python-Dev] 3.2b2 fails test suite on (my) Windows XP
In-Reply-To:
References:
Message-ID:

On 1/5/2011 8:59 PM, Nick Coghlan wrote:
> On Thu, Jan 6, 2011 at 9:47 AM, Terry Reedy wrote:
>> To test Brett's test running instruction, I ran
>> python -m test # not ./Python!
>> in a Command Prompt window
>
> Does it behave itself if you add "-x test_capi" to the command line?

No, it gets worse. Really.
Let me summarize a long post.

Run 1: normal (as above)
Process stops at capi test with Windows error message.
Close command prompt window with [x] button (ctrl-whatever had no effect).

Run 2: normal (as before)
Process reported capi test failure (supposedly fatal) but continued.
Process just stopped ('hung') at concurrent futures.
Close as before.

Run 3: -x test_capi test_concurrent_futures
Instead of the normal output I expected, I got some of the craziest stuff I have ever seen. Things like
"
    assert main_name not in sys.modules, main_name
AssertionError: __main__
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\Programs\Python32\lib\multiprocessing\forking.py", line 369, in main
    prepare(preparation_data)
  File "C:\Programs\Python32\lib\multiprocessing\forking.py", line 477,
"
were printed 100s of times intermixed with the normal sequential test startup lines. They stopped after test_sax started and output became normal through the end of the report.

295 tests OK.
11 tests failed:
    test_datetime test_difflib.bak test_ftplib test_lib2to3
    test_multiprocessing test_os.bak test_pep277 test_pkgutil
    test_posixpath test_runpy test_tcl
2 tests altered the execution environment:
    test___all__ test_site
41 tests skipped:
    [snip]
4 skips unexpected on win32:
    test_gdb test_readline test_tk test_ttk_guionly

(It previously said it could not find tk (or ttk), even though IDLE does just fine.)

Then chained error craziness during shutdown: SystemExit, WindowsError, AttributeError, EOFError (details in original post).

I forgot to mention before that test_ftplib runs into Windows security and pops up a window (which I closed).

If I did not know better, I might have thought python to be a buggy piece of junk, but my well-tested package-in-progress runs fine (from IDLE edit window) in 3.2b2, unchanged from 3.1. I think fixing test regressions should happen before a 'release candidate'.

On same machine (again, installed from Martin's .msi)
C:\Programs\Python31>python -m test.regrtest
seems to run 'normally' (same security popup), no craziness (except for blocked ftplib test), with results

298 tests OK.
3 tests failed:
    test_ftplib test_lib2to3 test_tcl
39 tests skipped:
    [snip]
2 skips unexpected on win32:
    test_tk test_ttk_guionly

test_tcl had multiple errors; the tk and ttk skips are from not finding a usable init.tcl.

Similar result with 2.7 with addition of test_distutils failure and 'unexpected skips' of test_gdb and test_readline (but I presume these really should be expected).

--
Terry Jan Reedy

From solipsis at pitrou.net Thu Jan 6 08:26:00 2011
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Thu, 6 Jan 2011 08:26:00 +0100
Subject: [Python-Dev] [Python-checkins] devguide: Start a doc on running and writing unit tests.
References: <4D24F414.9080103@udel.edu> <20110105235734.5d609e7b@pitrou.net>
Message-ID: <20110106082600.53e794ee@pitrou.net>

On Thu, 06 Jan 2011 00:34:11 -0500
> >>
> >> python -m test
> >> works (until it failed, separate issue).
> >
> > This will not run the right interpreter, unless this is an installed
> > build.
>
> It is, from 32b2.msi. I have no compiler ;-).

Ah, sorry. For the devguide, however, I recommend assuming an uninstalled up-to-date build, since keeping fresh sources is the most productive (or the least frustrating) way of contributing.

Note that, if you're willing to give it a try, Microsoft Visual Studio Express is a free download (free as in beer).

Regards

Antoine.

From solipsis at pitrou.net Thu Jan 6 10:59:52 2011
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Thu, 6 Jan 2011 10:59:52 +0100
Subject: [Python-Dev] 3.2b2 fails test suite on (my) Windows XP
References:
Message-ID: <20110106105952.26039d64@pitrou.net>

On Wed, 5 Jan 2011 17:56:53 -0600
Brian Curtin wrote:
>
> http://bugs.python.org/issue9116 covers this issue.
>
> The reason it doesn't fail on any of the build slaves is because they modify
> a registry value for Windows Error Reporting to not display the pop-up
> window, or at least mine does. I think I got the idea from one of the other
> Windows build slave maintainers.
How about simply using the -n flag to regrtest?

Regards

Antoine.

From victor.stinner at haypocalc.com Thu Jan 6 12:47:22 2011
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Thu, 06 Jan 2011 12:47:22 +0100
Subject: [Python-Dev] Checking input range in time.asctime and time.ctime
In-Reply-To:
References: <20110105184855.6b06c9ae@pitrou.net>
Message-ID: <1294314442.9643.3.camel@marge>

On Wednesday 5 January 2011 at 23:48 -0500, Alexander Belopolsky wrote:
> I would be happy with just
>
>     if accept2dyear:
>         if 69 <= y <= 99:
>             y += 1900
>         elif 0 <= y <= 68:
>             y += 2000
>     # call system function with tm_year = y - 1900

Perfect. That's what I expect from a "2 digits" option: it should not
touch 3-digit (100..999) or 4-digit (>= 1000) years.

Remember that the "2 digit option" is a hack to work around the Y2K bug.
Maybe it is time to remove the workaround: disable accept2dyear by
default and remove the PYTHONY2K env var.

> but I thought that would be too radical.

Why?

Victor

From victor.stinner at haypocalc.com Thu Jan 6 12:55:24 2011
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Thu, 06 Jan 2011 12:55:24 +0100
Subject: [Python-Dev] Checking input range in time.asctime and time.ctime
In-Reply-To:
References: <20110105184855.6b06c9ae@pitrou.net>
Message-ID: <1294314924.18575.4.camel@marge>

On Thursday 6 January 2011 at 00:10 -0500, Alexander Belopolsky wrote:
> If calling specific system functions such as strftime with tm_year <
> 0 is deemed unsafe, we can move the check to where the system function
> is called.

What do you mean by "unsafe"? Does it crash? On my Linux box,
strftime("%Y") is able to format integers in [-2^31-1900; 2^31-1-1900]
(full range of the int type).

Can't we add a test in the configure script to check for a "broken"
strftime() implementation?

Victor From fuzzyman at voidspace.org.uk Thu Jan 6 13:24:18 2011 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Thu, 06 Jan 2011 12:24:18 +0000 Subject: [Python-Dev] [Python-checkins] devguide: Strip out all generic svn instructions from the FAQ. It's not only In-Reply-To: References: Message-ID: <4D25B472.2090801@voidspace.org.uk> On 05/01/2011 18:37, Brett Cannon wrote: > To those that want to keep those steps in the dev FAQ, go ahead but I > recuse myself from maintaining it. Having had so many instances of > people asking "how do I do this?" and me almost always able to go > "read the dev FAQ" has basically made me feel like it is not worth the > effort if people are not going to bother to check it and just simply > ask how to do things. I think you have it backwards. The benefit of having a FAQ is not that people read it first (they will almost never do that) but that you have a single place to send them when they ask the questions. It sounds like it's working! :-) All the best, Michael Foord > The copy of the dev FAQ on the website has not been touched, so me > cutting this stuff out so I know what has and has not been covered has > no permanent impact. Plus having the devguide on hg.python.org and not > the website means anyone with commit rights can modify the devguide, > including adding/maintaining a dev FAQ on common VCS/SSH/whatever > tools. > > On Wed, Jan 5, 2011 at 01:08, Terry Reedy wrote: >> On 1/5/2011 1:18 AM, Eli Bendersky wrote: >>> On Wed, Jan 5, 2011 at 04:13, Nick Coghlan>> Your call as the author, but please reconsider this one. I've found it >>> *hugely* convenient over the years to have these task oriented answers >>> in the FAQ. The problem with the answers all over the internet is that >>> I (or someone new to our source control tool) may not know enough to >>> ask the right question, and hence those answers may as well not exist. 
>>> Even if these FAQ answers don't always provide everything needed, they >>> usually provide enough information to let me search for the full >>> answers. >>> >>> >>> I agree with Nick here. I also found these instructions useful in the >>> past, although I'm quite familiar with SVN. New devs interested in >>> contributing to Python but not too familiar with the source-control tool >>> it's using at the time will benefit even more from this. >>> >>> As for maintenance nightmare, I'm sure it's simple enough to attract >>> contributors. For example, I can volunteer to maintain it. >> As a complete neophyte at actually using a source code system, I found the >> stripped-down step-by-step instructions useful even though I am using >> TortoiseSVN. Even the TortoiseSVN help doc is a bit overwhelming because it >> includes so much that I do not need to read. It would be a bit like a >> beginning programmer trying to learn Python from the Langauge Reference >> without having the Tutorial to read. (And even as an experienced C >> programmer, I started with the latter.) >> >> -- >> Terry Jan Reedy >> >> _______________________________________________ >> Python-Dev mailing list >> Python-Dev at python.org >> http://mail.python.org/mailman/listinfo/python-dev >> Unsubscribe: >> http://mail.python.org/mailman/options/python-dev/brett%40python.org >> > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/fuzzyman%40voidspace.org.uk -- http://www.voidspace.org.uk/ May you do good and not evil May you find forgiveness for yourself and forgive others May you share freely, never taking more than you give. -- the sqlite blessing http://www.sqlite.org/different.html From rdmurray at bitdance.com Thu Jan 6 16:47:25 2011 From: rdmurray at bitdance.com (R. 
David Murray)
Date: Thu, 06 Jan 2011 10:47:25 -0500
Subject: [Python-Dev] Checking input range in time.asctime and time.ctime
In-Reply-To: <1294314924.18575.4.camel@marge>
References: <20110105184855.6b06c9ae@pitrou.net> <1294314924.18575.4.camel@marge>
Message-ID: <20110106154725.29D841FDB65@kimball.webabinitio.net>

On Thu, 06 Jan 2011 12:55:24 +0100, Victor Stinner wrote:
>On Thursday 6 January 2011 at 00:10 -0500, Alexander Belopolsky wrote:
>> If calling specific system functions such as strftime with tm_year <
>> 0 is deemed unsafe, we can move the check to where the system function
>> is called.
>
>What do you mean by "unsafe"? Does it crash? On my Linux box,
>strftime("%Y") is able to format integers in [-2^31-1900; 2^31-1-1900]
>(full range of the int type).

I believe that we have had several cases where Windows "crashed" when
out-of-range values were passed to the CRT that other platforms
accepted.

--
R. David Murray www.bitdance.com

From victor.stinner at haypocalc.com Thu Jan 6 17:08:09 2011
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Thu, 06 Jan 2011 17:08:09 +0100
Subject: [Python-Dev] Checking input range in time.asctime and time.ctime
In-Reply-To: <20110106154725.29D841FDB65@kimball.webabinitio.net>
References: <20110105184855.6b06c9ae@pitrou.net> <1294314924.18575.4.camel@marge> <20110106154725.29D841FDB65@kimball.webabinitio.net>
Message-ID: <1294330089.23192.2.camel@marge>

On Thursday 6 January 2011 at 10:47 -0500, R. David Murray wrote:
> On Thu, 06 Jan 2011 12:55:24 +0100, Victor Stinner wrote:
> >On Thursday 6 January 2011 at 00:10 -0500, Alexander Belopolsky wrote:
> >> If calling specific system functions such as strftime with tm_year <
> >> 0 is deemed unsafe, we can move the check to where the system function
> >> is called.
> >
> >What do you mean by "unsafe"? Does it crash? On my Linux box,
> >strftime("%Y") is able to format integers in [-2^31-1900; 2^31-1-1900]
> >(full range of the int type).
> > I believe that we have had several cases where Windows "crashed" when > out-of-range values were passed to the CRT that other platforms > accepted. If there are only issues on Windows, we can add a #ifdef _MSC_VER and raise a ValueError("Stupid OS, install Linux or recompile with Cygwin") for year < 1900. Does Cygwin and MinGW have the same issues? Victor From eric at trueblade.com Thu Jan 6 17:30:17 2011 From: eric at trueblade.com (Eric Smith) Date: Thu, 06 Jan 2011 11:30:17 -0500 Subject: [Python-Dev] Checking input range in time.asctime and time.ctime In-Reply-To: <1294330089.23192.2.camel@marge> References: <20110105184855.6b06c9ae@pitrou.net> <1294314924.18575.4.camel@marge> <20110106154725.29D841FDB65@kimball.webabinitio.net> <1294330089.23192.2.camel@marge> Message-ID: <4D25EE19.2030606@trueblade.com> On 01/06/2011 11:08 AM, Victor Stinner wrote: > Le jeudi 06 janvier 2011 ? 10:47 -0500, R. David Murray a ?crit : >> On Thu, 06 Jan 2011 12:55:24 +0100, Victor Stinner wrote: >>> Le jeudi 06 janvier 2011 ? 00:10 -0500, Alexander Belopolsky a ?crit : >>>> If calling specific system functions such as strftime with tm_year< >>>> 0 is deemed unsafe, we can move the check to where the system function >>>> is called. >>> >>> What do you mean by "unsafe"? Does it crash? On my Linux box, >>> strftime("%Y") is able to format integers in [-2^31-1900; 2^31-1-1900] >>> (full range of the int type). >> >> I believe that we have had several cases where Windows "crashed" when >> out-of-range values were passed to the CRT that other platforms >> accepted. > > If there are only issues on Windows, we can add a #ifdef _MSC_VER and > raise a ValueError("Stupid OS, install Linux or recompile with Cygwin") > for year< 1900. Is strftime really so complex that we shouldn't just write our own? I'd be willing to do it. Over the years the platform strftime has caused any number of problems. 
The last time I looked at it we already have to do some work pre-parsing the format string and passing it off to platform strftime, so it's not like it's not already a maintenance hassle. I understand strptime is probably more complex and there's some value to having strptime/strftime coming from the same library. But I'd be willing to look at it, too. Eric. From alexander.belopolsky at gmail.com Thu Jan 6 17:55:35 2011 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Thu, 6 Jan 2011 11:55:35 -0500 Subject: [Python-Dev] Implementing strftime Was: Checking input range in time.asctime and time.ctime Message-ID: On Thu, Jan 6, 2011 at 11:30 AM, Eric Smith wrote: .. > Is strftime really so complex that we shouldn't just write our own? I'd be > willing to do it. Over the years the platform strftime has caused any number > of problems. The last time I looked at it we already have to do some work > pre-parsing the format string and passing it off to platform strftime, so > it's not like it's not already a maintenance hassle. > This is the subject of issue 3173: http://bugs.python.org/issue3173 As far as I can tell, the main problem with implementing strftime is that it has to be locale aware and locale API is as inconsistent/buggy across platforms as strftime itself. > I understand strptime is probably more complex and there's some value to > having strptime/strftime coming from the same library. But I'd be willing to > look at it, too. strptime is already implemented (in pure python) by stdlib, but it piggybacks on strftime for locale information. See Lib/_strptime.py. 
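
A rough illustration of what "piggybacks on strftime for locale information" can mean in practice. This is only a sketch and is not the code in Lib/_strptime.py; the helper name locale_month_names is made up for the example. The idea is that a pure-Python parser can learn the current locale's month names by formatting one date per month with time.strftime and recording the result.

```python
import time


def locale_month_names():
    """Return the 12 full month names as the platform strftime renders them.

    Hypothetical helper for illustration only: it asks strftime to format
    the first day of each month and collects the "%B" output.
    """
    names = []
    for month in range(1, 13):
        # 9-field time tuple: year, month, day, hour, minute, second,
        # weekday, day of year, DST flag. Only the month matters for "%B".
        fields = (1999, month, 1, 0, 0, 0, 0, 1, 0)
        names.append(time.strftime("%B", fields))
    return names


if __name__ == "__main__":
    for number, name in enumerate(locale_month_names(), start=1):
        print(number, name)
```

A parser built this way inherits whatever the platform strftime produces, which is why locale bugs in strftime also surface in strptime.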
From alexander.belopolsky at gmail.com Thu Jan 6 19:33:42 2011
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Thu, 6 Jan 2011 13:33:42 -0500
Subject: [Python-Dev] Checking input range in time.asctime and time.ctime
In-Reply-To: <1294314442.9643.3.camel@marge>
References: <20110105184855.6b06c9ae@pitrou.net> <1294314442.9643.3.camel@marge>
Message-ID:

On Thu, Jan 6, 2011 at 6:47 AM, Victor Stinner wrote:
> On Wednesday 5 January 2011 at 23:48 -0500, Alexander Belopolsky
> wrote:
>> I would be happy with just
>>
>>     if accept2dyear:
>>         if 69 <= y <= 99:
>>             y += 1900
>>         elif 0 <= y <= 68:
>>             y += 2000
>>     # call system function with tm_year = y - 1900
..
>> but I thought that would be too radical.
>
> Why?

ISTM that time.asctime() called with a 3-digit year, particularly a
low 3-digit one, is much more likely to be a manifestation of a
lingering Y2K bug than a real intent to print an ancient date. I do
remember that many devices were showing Jan 1, 100 back in early 2000.

The same logic does not apply to programs that run with PYTHONY2K set,
because presumably users who bothered to set PYTHONY2K know that their
program does not use 2-digit years.

That's why I think for 3.2 we should do the following:

1. Keep PYTHONY2K logic and accept2dyear = 1 default.

2. With default accept2dyear = 1:
   - for 0 <= year < 100 issue a deprecation warning and supply century
     according to POSIX rules
   - for 100 <= year < 1000 raise ValueError
   - for year >= 1000 leave year unchanged

3. With accept2dyear = 0 leave year unchanged regardless of value.

For 3.3, remove PYTHONY2K and accept2dyear and leave year unchanged
regardless of value.

Can we agree that this is reasonable for time.asctime()? In
time/datetime.strftime we can impose stricter limits if necessary to
work around platform bugs.

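
For readers following along, the 3.2 rules proposed above can be sketched in Python roughly as follows. This is an illustration of the proposal, not the C code in the time module; the function name normalize_year is made up, and the handling of negative years with accept2dyear enabled is an assumption, since the proposal does not mention them.

```python
import warnings


def normalize_year(year, accept2dyear=True):
    """Apply the year rules proposed in this thread (illustrative sketch)."""
    if not accept2dyear:
        # Rule 3: leave the year unchanged regardless of value.
        return year
    if 0 <= year < 100:
        # Rule 2a: warn, then supply the century per POSIX rules
        # (69..99 -> 1969..1999, 0..68 -> 2000..2068).
        warnings.warn("2-digit years are deprecated",
                      DeprecationWarning, stacklevel=2)
        return year + (1900 if year >= 69 else 2000)
    if 100 <= year < 1000:
        # Rule 2b: a 3-digit year is more likely a Y2K bug than intent.
        raise ValueError("year out of range: %d" % year)
    # Rule 2c: year >= 1000 is left unchanged. Negative years are not
    # covered by the proposal; returning them unchanged is an assumption.
    return year
```

With these rules, normalize_year(2011) is returned as is, normalize_year(99) becomes 1999 with a DeprecationWarning, and normalize_year(150) raises ValueError unless accept2dyear is false.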
From nad at acm.org Thu Jan 6 22:04:42 2011
From: nad at acm.org (Ned Deily)
Date: Thu, 06 Jan 2011 13:04:42 -0800
Subject: [Python-Dev] devguide: Point out that OS X users need to change examples to use python.exe instead of
References:
Message-ID:

In article , brett.cannon wrote:
[...]
> summary:
> Point out that OS X users need to change examples to use python.exe instead
> of python.

> Once Python is done building you will then have a working build of Python
> that can be run in-place; ``./python`` on most machines, ``./python.exe``
> -on OS X.
> +on OS X (all examples throughout this documentation say ``./python`` but
> +implies you choose the proper name based on your OS).

That's true on OS X if you are using a case-insensitive file system.
But with the newer, case-sensitive HFS+, for example, you get ./python.

--
Ned Deily, nad at acm.org

From and-dev at doxdesk.com Fri Jan 7 00:50:06 2011
From: and-dev at doxdesk.com (And Clover)
Date: Thu, 06 Jan 2011 23:50:06 +0000
Subject: [Python-Dev] PEP 3333: wsgi_string() function
In-Reply-To: <1294109093.14661.4.camel@marge>
References: <1294109093.14661.4.camel@marge>
Message-ID: <1294357806.2970.33.camel@stalk>

On Tue, 2011-01-04 at 03:44 +0100, Victor Stinner wrote:
> What is this horrible encoding "bytes-as-unicode"?

It is a unicode string decoded from bytes using ISO-8859-1. ISO-8859-1
is the encoding specified by the HTTP RFC, as well as having the happy
property of preserving every input byte. PEP 3333 requires it.

> os.environ is supposed to be correctly decoded and contain valid
> unicode characters.

It is not possible to "correctly" decode to unicode for os.environ
because that decoding happens long before the web application (the
only party that knows what encoding should be in use) gets a look in.

Maybe the web application is using UTF-8, maybe it's using cp1252, but if we let the server/gateway decide and do that decoding before the application can do anything about it, we will get the wrong encoding in *many* cases and the result will be permanent, unrecoverable mangling of non-ASCII characters in submitted headers. > If WSGI uses another encoding than the locale encoding (which is a bad idea), It's an absolutely necessary idea. The locale encoding is nothing to do with the web application's encoding. Windows applications need to be able to use UTF-8 (which is never the ANSI code page), and web applications in general need to be deployable to any server without having to worry about the server's locale. The locale-dependent status quo is that non-ASCII characters in URL paths and other HTTP headers don't work for Python apps. The recoding dances present in wsgiref's CGIHandler for 3.2 are distasteful but completely necessary to normalise differences in encodings used by various servers and platforms to generate their CGI environment. > it should use os.environb and decodes keys and values using its > own encoding. Well yes, but: (a) os.environb doesn't exist in previous Python 3.1, making it impossible to implement WSGI before 3.2; (b) a byte environment on Windows would have to be encoded from the Unicode environment, with a server-specific encoding, and then what encoding are you going to choose for the variables that contain non-HTTP-sourced native Unicode strings (such as, very commonly, Windows pathnames)? The bytes-or-bytes-in-Unicode argument is something that has been bounced around Web-SIG for literally *years*; this is what we ended up with. Although I personally like bytes, frankly, a re-run of this argument *again* whilst WSGI remains in perpetual stalemate does not appeal. WSGI and wsgiref in Python 3.0-3.1 simply does not work. This has long been an embarrassing situation for what is supposed to be a leading web language. 
Let us not perpetuate this sorry story to 3.2 as well. -- And Clover mailto:and at doxdesk.com http://www.doxdesk.com skype:uknrbobince gtalk:chat?jid=bobince at gmail.com From raymond.hettinger at gmail.com Fri Jan 7 01:00:27 2011 From: raymond.hettinger at gmail.com (Raymond Hettinger) Date: Thu, 6 Jan 2011 16:00:27 -0800 Subject: [Python-Dev] PEP 3333: wsgi_string() function In-Reply-To: <1294357806.2970.33.camel@stalk> References: <1294109093.14661.4.camel@marge> <1294357806.2970.33.camel@stalk> Message-ID: <4BE3FAB4-9E11-496C-BC77-8CC7AC8B0254@gmail.com> Can you please take a look at http://docs.python.org/dev/whatsnew/3.2.html#pep-3333-python-web-server-gateway-interface-v1-0-1 to see if it accurately recaps the resolution of the WSGI text/bytes issues. I would appreciate any feedback, as it is likely that the whatsnew document will be most people's first chance to hear the outcome of the multi-year discussion. Thanks, Raymond On Jan 6, 2011, at 3:50 PM, And Clover wrote: > On Tue, 2011-01-04 at 03:44 +0100, Victor Stinner wrote: >> What is this horrible encoding "bytes-as-unicode"? > > It is a unicode string decoded from bytes using ISO-8859-1. ISO-8859-1 > is the encoding specified by the HTTP RFC, as well as having the happy > property of preserving every input byte. PEP 3333 requires it. > >> os.environ is supposed to be correctly decoded and contain valid > unicode characters. > > It is not possible to ?correctly? decode to unicode for os.environ > because that decoding happens long before the web application (the > only party that knows what encoding should be in use) gets a look in. > > Maybe the web application is using UTF-8, maybe it's using cp1252, > but if we let the server/gateway decide and do that decoding > before the application can do anything about it, we will get the wrong > encoding in *many* cases and the result will be permanent, unrecoverable > mangling of non-ASCII characters in submitted headers. 
> >> If WSGI uses another encoding than the locale encoding (which is a bad > idea), > > It's an absolutely necessary idea. The locale encoding is nothing to do > with the web application's encoding. Windows applications need to be > able to use UTF-8 (which is never the ANSI code page), and web > applications in general need to be deployable to any server without > having to worry about the server's locale. > > The locale-dependent status quo is that non-ASCII characters in URL > paths and other HTTP headers don't work for Python apps. > > The recoding dances present in wsgiref's CGIHandler for 3.2 are > distasteful but completely necessary to normalise differences in > encodings used by various servers and platforms to generate their CGI > environment. > >> it should use os.environb and decodes keys and values using its >> own encoding. > > Well yes, but: > > (a) os.environb doesn't exist in previous Python 3.1, making it > impossible to implement WSGI before 3.2; > (b) a byte environment on Windows would have to be encoded > from the Unicode environment, with a server-specific encoding, > and then what encoding are you going to choose for the variables > that contain non-HTTP-sourced native Unicode strings (such as, > very commonly, Windows pathnames)? > > The bytes-or-bytes-in-Unicode argument is something that has been > bounced around Web-SIG for literally *years*; this is what we ended up > with. Although I personally like bytes, frankly, a re-run of this > argument *again* whilst WSGI remains in perpetual stalemate does not > appeal. WSGI and wsgiref in Python 3.0-3.1 simply does not work. This > has long been an embarrassing situation for what is supposed to be a > leading > web language. Let us not perpetuate this sorry story to 3.2 as well. 
> > -- > And Clover > mailto:and at doxdesk.com http://www.doxdesk.com > skype:uknrbobince gtalk:chat?jid=bobince at gmail.com > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/raymond.hettinger%40gmail.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From v+python at g.nevcal.com Fri Jan 7 02:16:19 2011 From: v+python at g.nevcal.com (Glenn Linderman) Date: Thu, 06 Jan 2011 17:16:19 -0800 Subject: [Python-Dev] PEP 3333: wsgi_string() function In-Reply-To: <1294357806.2970.33.camel@stalk> References: <1294109093.14661.4.camel@marge> <1294357806.2970.33.camel@stalk> Message-ID: <4D266963.1000700@g.nevcal.com> On 1/6/2011 3:50 PM, And Clover wrote: > ISO-8859-1 is the encoding specified by the HTTP RFC Please could I have the reference to that specification? I only recall ASCII and UTF-8 in my readings of various things HTTP and HTML, for headers, and form data. Naturally data pages can have any encoding they please, as there are headers and tags to describe their encodings. -------------- next part -------------- An HTML attachment was scrubbed... URL: From digitalxero at gmail.com Fri Jan 7 02:55:47 2011 From: digitalxero at gmail.com (Dj Gilcrease) Date: Thu, 6 Jan 2011 20:55:47 -0500 Subject: [Python-Dev] 3.2b2 fails test suite on (my) Windows XP In-Reply-To: References: Message-ID: On Thu, Jan 6, 2011 at 1:00 AM, Terry Reedy wrote: > On 1/5/2011 8:59 PM, Nick Coghlan wrote: > Run 3: -x test_capi test_concurrent_futures > Instead of the normal output I expected, I got some of the craziest stuff I > have ever seen. Things like > " > ? ?assert main_name not in sys.modules, main_name > AssertionError: __main__ > Traceback (most recent call last): > ?File "", line 1, in > ?File "C:\Programs\Python32\lib\multiprocessing\forking.py", line 369, in > main > ? 
prepare(preparation_data)
>   File "C:\Programs\Python32\lib\multiprocessing\forking.py", line 477,
> "
> were printed 100s of times intermixed with the normal sequential test
> startup lines. They stopped after text_sax started and output became normal
> through the end of the report.

The hundreds or thousands of processes popping up are caused by a test
that uses multiprocessing and fails to put the
if __name__ == '__main__' check around the code that creates the
processes.

From stephen at xemacs.org Fri Jan 7 03:25:47 2011
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Fri, 07 Jan 2011 11:25:47 +0900
Subject: [Python-Dev] Checking input range in time.asctime and time.ctime
In-Reply-To: <20110106154725.29D841FDB65@kimball.webabinitio.net>
References: <20110105184855.6b06c9ae@pitrou.net> <1294314924.18575.4.camel@marge> <20110106154725.29D841FDB65@kimball.webabinitio.net>
Message-ID: <87bp3tbgb8.fsf@uwakimon.sk.tsukuba.ac.jp>

R. David Murray writes:

 > I believe that we have had several cases where Windows "crashed" when
 > out-of-range values were passed to the CRT that other platforms
 > accepted.

XEmacs had crashes due to strftime on Windows native with VC++. It never
went so far as to BSOD, but a couple of users lost recently entered
data. :-( IIRC Cygwin was OK (their libc uses a different code base).
Dunno about mingw; almost all of our users either want the full Cygwin
environment or they don't care about GCC, but since mingw uses the MSFT
runtime, it's probably vulnerable too.

From foom at fuhm.net Fri Jan 7 04:31:01 2011 From: foom at fuhm.net (James Y Knight) Date: Thu, 6 Jan 2011 22:31:01 -0500 Subject: [Python-Dev] PEP 3333: wsgi_string() function In-Reply-To: <4D266963.1000700@g.nevcal.com> References: <1294109093.14661.4.camel@marge> <1294357806.2970.33.camel@stalk> <4D266963.1000700@g.nevcal.com> Message-ID: <1E1515A0-4F48-46F9-B4F8-07256C67BC48@fuhm.net> On Jan 6, 2011, at 8:16 PM, Glenn Linderman wrote: > On 1/6/2011 3:50 PM, And Clover wrote: >> >> ISO-8859-1 is the encoding specified by the HTTP RFC > > Please could I have the reference to that specification? I only recall ASCII and UTF-8 in my readings of various things HTTP and HTML, for headers, and form data. Naturally data pages can have any encoding they please, as there are headers and tags to describe their encodings. Did you try google? http://www.google.com/search?http+rfc James From stephen at xemacs.org Fri Jan 7 04:37:19 2011 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Fri, 07 Jan 2011 12:37:19 +0900 Subject: [Python-Dev] PEP 3333: wsgi_string() function In-Reply-To: <4D266963.1000700@g.nevcal.com> References: <1294109093.14661.4.camel@marge> <1294357806.2970.33.camel@stalk> <4D266963.1000700@g.nevcal.com> Message-ID: <87tyhl9yfk.fsf@uwakimon.sk.tsukuba.ac.jp> Glenn Linderman writes: > On 1/6/2011 3:50 PM, And Clover wrote: > > ISO-8859-1 is the encoding specified by the HTTP RFC > > Please could I have the reference to that specification? RFC 2616 (probably obsolete by now, but IRC ISO 8859/1 is already there IIRC), and I don't think UTF-8 is the default for anything until you get to XHTML (and maybe HTML5). From ncoghlan at gmail.com Fri Jan 7 05:49:37 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 7 Jan 2011 14:49:37 +1000 Subject: [Python-Dev] [Python-checkins] devguide: Strip out all generic svn instructions from the FAQ. 
It's not only
In-Reply-To: <4D25B472.2090801@voidspace.org.uk>
References: <4D25B472.2090801@voidspace.org.uk>
Message-ID:

On Thu, Jan 6, 2011 at 10:24 PM, Michael Foord wrote:
> I think you have it backwards. The benefit of having a FAQ is not that
> people read it first (they will almost never do that) but that you have a
> single place to send them when they ask the questions. It sounds like it's
> working! :-)

I believe it's also the case that *any given person* usually only
needs to be sent to the dev FAQ once :)

Still, Brett's right that any of us will be able to add the source
control cheat sheet once he has finished with the basic structure of
the dev guide.

Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ncoghlan at gmail.com Fri Jan 7 05:54:50 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Fri, 7 Jan 2011 14:54:50 +1000
Subject: [Python-Dev] 3.2b2 fails test suite on (my) Windows XP
In-Reply-To:
References:
Message-ID:

On Thu, Jan 6, 2011 at 4:00 PM, Terry Reedy wrote:
>> Does it behave itself if you add "-x test_capi" to the command line?
>
> No, it gets worse. Really.
> Let me summarize a long post.
>
> Run 1: normal (as above)
> Process stops at capi test with Windows error message.
> Close command prompt window with [x] buttom (crtl-whatever had no effect).
>
> Run 2: normal (as before)
> Process reported capi test failure (supposedly fatal) but continued.
> Process just stopped ('hung') at concurrent futures. Close as before.
>
> Run 3: -x test_capi test_concurrent_futures
> Instead of the normal output I expected, I got some of the craziest stuff I
> have ever seen. Things like

Does it all go back to normal if you use "python -m test.regrtest" instead?

Antoine discovered that multiprocessing on Windows gets thoroughly
confused if __file__ in the main module ends with "__main__.py" (see
http://bugs.python.org/issue10845)

Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan at gmail.com   |
Brisbane, Australia From v+python at g.nevcal.com Fri Jan 7 05:59:55 2011 From: v+python at g.nevcal.com (Glenn Linderman) Date: Thu, 06 Jan 2011 20:59:55 -0800 Subject: [Python-Dev] PEP 3333: wsgi_string() function In-Reply-To: <87tyhl9yfk.fsf@uwakimon.sk.tsukuba.ac.jp> References: <1294109093.14661.4.camel@marge> <1294357806.2970.33.camel@stalk> <4D266963.1000700@g.nevcal.com> <87tyhl9yfk.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <4D269DCB.2020604@g.nevcal.com> On 1/6/2011 7:37 PM, Stephen J. Turnbull wrote: > Glenn Linderman writes: > > On 1/6/2011 3:50 PM, And Clover wrote: > > > ISO-8859-1 is the encoding specified by the HTTP RFC > > > > Please could I have the reference to that specification? > > RFC 2616 (probably obsolete by now, but IRC ISO 8859/1 is already > there IIRC), and I don't think UTF-8 is the default for anything until > you get to XHTML (and maybe HTML5). Thanks. Looking back, it is 2068 and 1945 also, I just had a mental blind spot, thinking I understood the header formats from email-land, where they are more required to be ASCII, as mentioned in my reply to James. UTF-8 is the default for FORM DATA when using multipart/form-data encoding, using the POST method. Otherwise, it FORM DATA is limited to ASCII. Per http://www.w3.org/TR/html401/interact/forms.html#h-17.13.1 which is HTML 4.01 (and maybe earlier, but I didn't go back further). Nice to quote chapter and verse (or link) when declaring that something is in a standard. -------------- next part -------------- An HTML attachment was scrubbed... URL: From pje at telecommunity.com Fri Jan 7 06:12:16 2011 From: pje at telecommunity.com (P.J. 
Eby) Date: Fri, 07 Jan 2011 00:12:16 -0500 Subject: [Python-Dev] PEP 3333: wsgi_string() function In-Reply-To: <4BE3FAB4-9E11-496C-BC77-8CC7AC8B0254@gmail.com> References: <1294109093.14661.4.camel@marge> <1294357806.2970.33.camel@stalk> <4BE3FAB4-9E11-496C-BC77-8CC7AC8B0254@gmail.com> Message-ID: <20110107051217.BDC143A40A8@sparrow.telecommunity.com> At 04:00 PM 1/6/2011 -0800, Raymond Hettinger wrote: >Can you please take a look at >http://docs.python.org/dev/whatsnew/3.2.html#pep-3333-python-web-server-gateway-interface-v1-0-1 >to see if it accurately recaps the resolution of the WSGI text/bytes issues. >I would appreciate any feedback, as it is likely that the whatsnew >document will be most people's first chance to hear the outcome >of the multi-year discussion. Hi Raymond -- nice work there. A few minor suggestions: 1. Native strings are used as the keys and values of the environ dictionary, not just as headers for start_response. 2. The read_environ() method is strictly for use with CGI-to-WSGI gateways, or for bridging other CGI-like protocols (e.g. FastCGI) to WSGI. It is ONLY for server implementers, in other words, and the typical app developer is doing something terribly wrong if they are even bothering to read its documentation. ;-) 3. The primary relevance of the "native string" type to an app developer is that when porting code from Python 2 to 3, they must still decode environment variable values, even though they are "already" Unicode. If their code was previously dealing only in Python 2 'str' objects, then nothing really changes. If they were previously decoding from environ str's to unicode, then they must replace their prior .decode('whatever') with .encode('latin1').decode('whatever'). That's basically it for porting from Python 2. IOW, this design choice allows most HTTP header manipulating code (whether input or output) to be ported to Python 3 with a very mechanical change pattern. 
Most such code is working with ASCII anyway, since normally both input
and output headers are, and there are few headers that an application
would be likely to convert to actual unicode anyway.

On output via start_response(), if an application is currently encoding
an output header -- why they would be, I have no idea, but if they
are -- they need to add a re-encode to latin1. (i.e.,
.encode('whatever').decode('latin1'))

IOW, a short 2-to-3 porting guide for WSGI:

* If you just used strings for headers before, that part of your code
doesn't change. (And if it was broken before, it's still broken in
exactly the same way. No new breakage is introduced. ;-) )

* If you encoded any output headers or decoded any input headers, you
must take into account the extra latin1 step. This is expected to be
rare, since it's usually only SCRIPT_NAME and PATH_INFO that anybody
would ever care about on input, and almost never anything on output.

* Values yielded by an application or sent via a write() call MUST be
byte strings; the environ and start_response() MUST be native
strings. No mixing and matching.

From tjreedy at udel.edu Fri Jan 7 09:02:54 2011
From: tjreedy at udel.edu (Terry Reedy)
Date: Fri, 07 Jan 2011 03:02:54 -0500
Subject: [Python-Dev] 3.2b2 fails test suite on (my) Windows XP
In-Reply-To:
References:
Message-ID:

On 1/6/2011 11:54 PM, Nick Coghlan wrote:
> On Thu, Jan 6, 2011 at 4:00 PM, Terry Reedy wrote:
>>> Does it behave itself if you add "-x test_capi" to the command line?
>>
>> No, it gets worse. Really.
>> Let me summarize a long post.
>>
>> Run 1: normal (as above)
>> Process stops at capi test with Windows error message.
>> Close command prompt window with [x] button (ctrl-whatever had no effect).
>>
>> Run 2: normal (as before)
>> Process reported capi test failure (supposedly fatal) but continued.
>> Process just stopped ('hung') at concurrent futures. Close as before.
>>
>> Run 3: -x test_capi test_concurrent_futures
>> Instead of the normal output I expected, I got some of the craziest stuff I
>> have ever seen. Things like
>
> Does it all go back to normal if you use "python -m test.regrtest"
> instead? Antoine discovered that multiprocessing on Windows gets
> thoroughly confused if __file__ in the main module ends with
> "__main__.py" (see http://bugs.python.org/issue10845)

Yes. As I reported on the issue, only 'normal' test failure output.
Later, I will try to see if there are already issues for all of them.

-- 
Terry Jan Reedy

From and-py at doxdesk.com Fri Jan 7 00:35:59 2011
From: and-py at doxdesk.com (And Clover)
Date: Thu, 06 Jan 2011 23:35:59 +0000
Subject: [Python-Dev] PEP 3333: wsgi_string() function
In-Reply-To: <1294109093.14661.4.camel@marge>
References: <1294109093.14661.4.camel@marge>
Message-ID: <1294356959.2970.23.camel@stalk>

On Tue, 2011-01-04 at 03:44 +0100, Victor Stinner wrote:
> What is this horrible encoding "bytes-as-unicode"?

It is a unicode string decoded from bytes using ISO-8859-1. ISO-8859-1
is the encoding specified by the HTTP RFC, as well as having the happy
property of preserving every input byte.

> os.environ is supposed to be correctly decoded and contain valid
> unicode characters.

Nope. It is not possible to "correctly" decode to unicode for
os.environ because that decoding happens long before the web
application gets a look in. Maybe the web application is using UTF-8,
maybe it's using cp1252, but if we let the server/gateway decide and
do that decoding before the application can do anything about it, we
will get the wrong encoding in *many* cases and the result will be
permanent, unrecoverable mangling of non-ASCII characters in submitted
headers.

> If WSGI uses another encoding than the locale encoding (which is a
> bad idea),

It's an absolutely necessary idea. The locale encoding is nothing to do
with the web application's encoding.
Windows applications need to be able to use UTF-8 (which is never the
ANSI code page), and web applications in general need to be deployable
to any server without having to worry about the server's locale.

The locale-dependent status quo is that non-ASCII characters in URL
paths and other HTTP headers don't work for Python apps. The recoding
dances present in wsgiref's CGIHandler for 3.2 are distasteful but
completely necessary to normalise differences in encodings used by
various servers and platforms to generate their CGI environment.

> it should use os.environb and decodes keys and values using its
> own encoding.

Well yes, but:

(a) os.environb doesn't exist in previous Python 3.1, making it
impossible to implement WSGI before 3.2;

(b) there are also non-HTTP-related environment variables, which may
contain native Unicode strings (eg, very commonly, Windows pathnames),
so you have to have both environ *and* environb.

The bytes-or-bytes-in-Unicode argument is something that has been
bounced around Web-SIG for literally *years*; this is what we ended up
with. Although I personally like bytes, frankly, a re-run of this
argument *again* whilst WSGI remains in perpetual stalemate does not
appeal.

WSGI and wsgiref in Python 3.0-3.1 simply do not work at all. This has
been an embarrassing situation for what is supposed to be a leading web
language. Let's not perpetuate this sorry story to 3.2 as well.

-- 
And Clover
mailto:and at doxdesk.com
http://www.doxdesk.com
skype:uknrbobince gtalk:chat?jid=bobince at gmail.com

From victor.stinner at haypocalc.com Fri Jan 7 12:51:01 2011
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Fri, 07 Jan 2011 12:51:01 +0100
Subject: [Python-Dev] PEP 3333: wsgi_string() function
In-Reply-To: <1294357806.2970.33.camel@stalk>
References: <1294109093.14661.4.camel@marge> <1294357806.2970.33.camel@stalk>
Message-ID: <1294401061.14078.30.camel@marge>

Le jeudi 06 janvier 2011 à
23:50 +0000, And Clover a écrit :
> On Tue, 2011-01-04 at 03:44 +0100, Victor Stinner wrote:
> > What is this horrible encoding "bytes-as-unicode"?
>
> It is a unicode string decoded from bytes using ISO-8859-1. ISO-8859-1
> is the encoding specified by the HTTP RFC, as well as having the happy
> property of preserving every input byte. PEP 3333 requires it.

ISO-8859-1 for all fields: SERVER_NAME, PATH_INFO, the URL, form data,
...?

> > os.environ is supposed to be correctly decoded and contain valid
> > unicode characters.
>
> It is not possible to "correctly" decode to unicode for os.environ
> because that decoding happens long before the web application (the
> only party that knows what encoding should be in use) gets a look in.

Agreed.

> Maybe the web application is using UTF-8, maybe it's using cp1252,
> but if we let the server/gateway decide and do that decoding (...)
> It's an absolutely necessary idea. The locale encoding is nothing
> to do with the web application's encoding. (...)

Ok, so you must pass byte strings to the server/gateway. If you pass
unicode, how does the server/gateway know that it has to redecode a
value? Should it redecode all values? Anyway, it is stupid to use a
temporary useless pseudo-encoding (bytes-in-unicode).

> The recoding dances present in wsgiref's CGIHandler for 3.2 are
> distasteful but completely necessary to normalise differences in
> encodings used by various servers and platforms to generate their CGI
> environment.

I don't understand why read_environ() gives unicode values: as you
explained, the server/gateway will have to encode the values again, and
then finally to decode them from the correct encoding.
On POSIX, the current code looks like that:

 a) the OS passes a bytes environ to the program
 b) Python decodes environ from the locale encoding
 c) wsgi.read_environ() encodes environ to the locale encoding to get
back the original bytes environ: this step can be skipped if os.environb
is available
 d) wsgi.read_environ() decodes environ from ISO-8859-1
 e) the server/gateway encodes environ to ISO-8859-1
 f) the server/gateway decodes environ from the right encoding

Hey! Don't you think that there are useless encode/decode steps here?
Especially (d)-(e) is useless and introduces a confusion: the environ
contains other keys that don't come from os.environ and are already
correctly decoded, how does the server/gateway know that they are
already correctly decoded?

I propose simply (for Python 3.2):

 a) the OS passes a bytes environ to the program: wsgi.read_environ()
uses it
 b) the server/gateway decodes environ from the right encoding

and...

> (a) os.environb doesn't exist in previous Python 3.1, making it
> impossible to implement WSGI before 3.2;

For Python 3.1, add a step between (a) and (b): encode environ to the
locale encoding (with surrogateescape) to get back the original bytes
environ.

> (b) a byte environment on Windows would have to be encoded
> from the Unicode environment, with a server-specific encoding,
> and then what encoding are you going to choose for the variables
> that contain non-HTTP-sourced native Unicode strings (such as,
> very commonly, Windows pathnames)?

The variables coming from the HTTP server should be encoded again to
the server-specific encoding. Other variables should be kept unchanged.

The server/gateway can simply test the type of the variable: if it's
unicode, nothing to do, if it's bytes: decode it from the correct
encoding.

> The bytes-or-bytes-in-Unicode argument is something that has been
> bounced around Web-SIG for literally *years*; (...) WSGI and wsgiref
> in Python 3.0-3.1 simply does not work.
I don't understand why you are attached to this horrible hack
(bytes-in-unicode). It introduces more work and more confusion than
using raw bytes unchanged.

It doesn't work and so something has to be changed.

Victor

From stephen at xemacs.org Fri Jan 7 14:02:04 2011
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Fri, 07 Jan 2011 22:02:04 +0900
Subject: [Python-Dev] PEP 3333: wsgi_string() function
In-Reply-To: <1294401061.14078.30.camel@marge>
References: <1294109093.14661.4.camel@marge> <1294357806.2970.33.camel@stalk> <1294401061.14078.30.camel@marge>
Message-ID: <87ei8oamur.fsf@uwakimon.sk.tsukuba.ac.jp>

Victor Stinner writes:

 > It doesn't work and so something has to be changed.

What specific bug have you observed?

Everybody hates this hack, or at the very least is somewhat embarrassed
by it, but the working group clearly believes that it works and
something like it is necessary. They've studied it for years. To get
rid of it, "somebody" needs to demonstrate a bug, and propose something
better, plus implement it in code, plus fix any tests that expect
Unicode and now get bytes, plus create any additional tests that may be
necessitated by changing from a Unicode representation to a bytes
representation.

I hate it too, but not enough to ask anybody to do any of the above
without a real bug.
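
For readers following the bytes-in-unicode argument, the latin-1 round trip being debated can be made concrete. This is a minimal sketch, not code from wsgiref or PEP 3333: the helper names `to_native` and `from_native` are hypothetical, and UTF-8 is only an assumed application encoding.

```python
def to_native(raw: bytes) -> str:
    """Server/gateway side: expose raw bytes as a 'bytes-as-unicode' str.

    ISO-8859-1 maps each byte 0-255 to the code point with the same
    number, so this step never fails and never loses information.
    """
    return raw.decode("iso-8859-1")


def from_native(value: str, encoding: str = "utf-8") -> str:
    """Application side: undo the latin-1 step, then decode for real.

    `encoding` is whatever the application expects; UTF-8 is only an
    assumption made for this sketch.
    """
    return value.encode("iso-8859-1").decode(encoding)


# A path containing a non-ASCII character, as UTF-8 bytes on the wire.
wire_bytes = "/caf\u00e9".encode("utf-8")

path_info = to_native(wire_bytes)    # what environ['PATH_INFO'] would hold
real_path = from_native(path_info)   # what the application actually wants

# Every possible byte value survives the latin-1 round trip.
every_byte = bytes(range(256))
round_tripped = to_native(every_byte).encode("iso-8859-1")
```

The intermediate `path_info` is deliberately "wrong" as text (two code points where the user typed one character); only the application, which knows its own encoding, turns it back into the intended string.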
From ncoghlan at gmail.com Fri Jan 7 15:18:44 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 8 Jan 2011 00:18:44 +1000
Subject: [Python-Dev] PEP 3333: wsgi_string() function
In-Reply-To: <1294401061.14078.30.camel@marge>
References: <1294109093.14661.4.camel@marge> <1294357806.2970.33.camel@stalk> <1294401061.14078.30.camel@marge>
Message-ID:

On Fri, Jan 7, 2011 at 9:51 PM, Victor Stinner wrote:
> On POSIX, the current code looks like that:
>
>  a) the OS passes a bytes environ to the program
>  b) Python decodes environ from the locale encoding
>  c) wsgi.read_environ() encodes environ to the locale encoding to get
> back the original bytes environ: this step can be skipped if os.environb
> is available
>  d) wsgi.read_environ() decodes environ from ISO-8859-1
>  e) the server/gateway encodes environ to ISO-8859-1
>  f) the server/gateway decodes environ from the right encoding
>
> Hey! Don't you think that there are useless encode/decode steps here?
> Especially (d)-(e) is useless and introduces a confusion: the environ
> contains other keys that don't come from os.environ and are already
> correctly decoded, how does the server/gateway know that they are
> already correctly decoded?

Because WSGI is platform neutral. WSGI apps have no idea if they're
running on Windows or POSIX. The type used to communicate between the
WSGI engine and the WSGI app must be either bytes *or* unicode, and
either choice causes problems depending on the underlying OS.

bytes-as-unicode is not a great choice, it is merely the least bad
choice of the available options.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From foom at fuhm.net Fri Jan 7 15:43:25 2011
From: foom at fuhm.net (James Y Knight)
Date: Fri, 7 Jan 2011 09:43:25 -0500
Subject: [Python-Dev] PEP 3333: wsgi_string() function
In-Reply-To: <1294401061.14078.30.camel@marge>
References: <1294109093.14661.4.camel@marge> <1294357806.2970.33.camel@stalk> <1294401061.14078.30.camel@marge>
Message-ID: <925E2041-7B6F-4868-8E9D-856A15784C25@fuhm.net>

On Jan 7, 2011, at 6:51 AM, Victor Stinner wrote:
> I don't understand why you are attached to this horrible hack
> (bytes-in-unicode). It introduces more work and more confusion than
> using raw bytes unchanged.
>
> It doesn't work and so something has to be changed.

It's gross but it does work. This has been discussed ad nauseam on
web-sig over a period of years.

I'd like to reiterate that it is only even a potential issue for the
PATH_INFO/SCRIPT_NAME keys. Those two keys are required to have been
urldecoded already, into byte-data in some encoding. For all the other
keys (including the ones from os.environ), they are either *properly*
decoded in 8859-1 or are just ascii (possibly still urlencoded, so the
app needs to urldecode and decode into a string with the correct
encoding).

James

From guido at python.org Fri Jan 7 16:52:01 2011
From: guido at python.org (Guido van Rossum)
Date: Fri, 7 Jan 2011 07:52:01 -0800
Subject: [Python-Dev] Checking input range in time.asctime and time.ctime
In-Reply-To:
References: <20110105184855.6b06c9ae@pitrou.net>
Message-ID:

I think I've said all I can say in this thread; I'm sure you will come
up with a satisfactory solution.
-- 
--Guido van Rossum (python.org/~guido)

From victor.stinner at haypocalc.com Fri Jan 7 17:18:38 2011
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Fri, 07 Jan 2011 17:18:38 +0100
Subject: [Python-Dev] [Python-checkins] r87815 - peps/trunk/pep-3333.txt
In-Reply-To: <20110107153928.3CE34EE988@mail.python.org>
References: <20110107153928.3CE34EE988@mail.python.org>
Message-ID: <1294417118.22838.0.camel@marge>

Le vendredi 07 janvier 2011 à 16:39 +0100, phillip.eby a écrit :
> Author: phillip.eby
> Date: Fri Jan 7 16:39:27 2011
> New Revision: 87815
>
> Log:
> More bytes I/O fixes
>
>
> Modified:
>    peps/trunk/pep-3333.txt
>
> Modified: peps/trunk/pep-3333.txt
> ==============================================================================
> --- peps/trunk/pep-3333.txt (original)
> +++ peps/trunk/pep-3333.txt Fri Jan 7 16:39:27 2011
> @@ -310,9 +310,9 @@
>          elif not headers_sent:
>              # Before the first output, send the stored headers
>              status, response_headers = headers_sent[:] = headers_set
> -            sys.stdout.write('Status: %s\r\n' % status)
> +            sys.stdout.buffer.write('Status: %s\r\n' % status)
>              for header in response_headers:
> -                sys.stdout.write('%s: %s\r\n' % header)
> +                sys.stdout.buffer.write('%s: %s\r\n' % header)
>              sys.stdout.write('\r\n')

Are ('Status: %s\r\n' % status) and ('%s: %s\r\n' % header) byte strings
or unicode strings?
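
To make the question concrete, here is a minimal sketch (an illustration only, not code from the PEP or from this thread). In Python 3, `%` applied to a str literal yields a str, and a binary stream such as `sys.stdout.buffer` accepts only bytes, so the formatted header has to be encoded before it is written. The helper name `cgi_header_line` is hypothetical, and `io.BytesIO` stands in for `sys.stdout.buffer` so that the sketch prints nothing.

```python
import io


def cgi_header_line(name: str, value: str) -> bytes:
    """Format one header as str, then encode it for a binary stream."""
    text = "%s: %s\r\n" % (name, value)   # '%' on a str yields a str
    return text.encode("iso-8859-1")      # bytes, as the binary stream needs


# io.BytesIO is a stand-in for sys.stdout.buffer in this sketch.
binary_out = io.BytesIO()

# Writing the formatted str directly is rejected by a binary stream.
try:
    binary_out.write("Status: %s\r\n" % "200 OK")
    str_write_rejected = False
except TypeError:
    str_write_rejected = True

# Encoding first works.
binary_out.write(cgi_header_line("Status", "200 OK"))
written = binary_out.getvalue()
```

ISO-8859-1 is used for the encode step because it matches the native-string convention discussed elsewhere in the thread; that pairing is an assumption of the sketch rather than something this message states.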
Victor

From solipsis at pitrou.net Fri Jan 7 17:32:58 2011
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Fri, 7 Jan 2011 17:32:58 +0100
Subject: [Python-Dev] r87816 - peps/trunk/pep-3333.txt
References: <20110107154526.4B42CEEA31@mail.python.org>
Message-ID: <20110107173258.1bddae4f@pitrou.net>

On Fri, 7 Jan 2011 16:45:26 +0100 (CET)
phillip.eby wrote:
> Author: phillip.eby
> Date: Fri Jan 7 16:45:26 2011
> New Revision: 87816
>
> Log:
> Fix re-raise syntax for Python 3
>
>
> Modified:
>    peps/trunk/pep-3333.txt
>
> Modified: peps/trunk/pep-3333.txt
> ==============================================================================
> --- peps/trunk/pep-3333.txt (original)
> +++ peps/trunk/pep-3333.txt Fri Jan 7 16:45:26 2011
> @@ -323,7 +323,7 @@
>          try:
>              if headers_sent:
>                  # Re-raise original exception if headers sent
> -                raise exc_info[0], exc_info[1], exc_info[2]
> +                raise exc_info[1].with_traceback(exc_info[2])

You shouldn't need that. Just "raise exc_info[1]".

Regards

Antoine.

From pje at telecommunity.com Fri Jan 7 18:04:46 2011
From: pje at telecommunity.com (P.J. Eby)
Date: Fri, 07 Jan 2011 12:04:46 -0500
Subject: [Python-Dev] PEP 3333: wsgi_string() function
In-Reply-To: <925E2041-7B6F-4868-8E9D-856A15784C25@fuhm.net>
References: <1294109093.14661.4.camel@marge> <1294357806.2970.33.camel@stalk> <1294401061.14078.30.camel@marge> <925E2041-7B6F-4868-8E9D-856A15784C25@fuhm.net>
Message-ID: <20110107170449.ECC643A411A@sparrow.telecommunity.com>

At 09:43 AM 1/7/2011 -0500, James Y Knight wrote:
>On Jan 7, 2011, at 6:51 AM, Victor Stinner wrote:
> > I don't understand why you are attached to this horrible hack
> > (bytes-in-unicode). It introduces more work and more confusion than
> > using raw bytes unchanged.
> >
> > It doesn't work and so something has to be changed.
>
>It's gross but it does work. This has been discussed ad nauseam on
>web-sig over a period of years.
>
>I'd like to reiterate that it is only even a potential issue for the
>PATH_INFO/SCRIPT_NAME keys. Those two keys are required to have been
>urldecoded already, into byte-data in some encoding. For all the
>other keys (including the ones from os.environ), they are either
>*properly* decoded in 8859-1 or are just ascii (possibly still
>urlencoded, so the app needs to urldecode and decode into a string
>with the correct encoding).

Right. Also, it should be mentioned that none of this would be
necessary if we could've gotten a "bytes of a known encoding" type.
If you look back to the last big Python-Dev discussion on
bytes/unicode and stdlib API breakage, this was the holdup for getting
a sane WSGI spec. Since we couldn't change the language to fix the
problem (due to the moratorium), we had to use this less-pleasant way
of dealing with things, in order to get a final WSGI spec for Python 3.

(If anybody is wondering about the specifics of the language change
that was needed, it'd be having a "bytes with known encoding" type,
that when combined in any polymorphic operation with a unicode string,
would result in bytes-with-encoding output, and would raise an error
if the resulting value could not be encoded in the target encoding.
Then we would simply do all WSGI header operations with this type,
using latin-1 as the target encoding.)

From status at bugs.python.org Fri Jan 7 18:07:04 2011
From: status at bugs.python.org (Python tracker)
Date: Fri, 7 Jan 2011 18:07:04 +0100 (CET)
Subject: [Python-Dev] Summary of Python tracker Issues
Message-ID: <20110107170704.EB1FE1CCA4@psf.upfronthosting.co.za>

ACTIVITY SUMMARY (2010-12-31 - 2011-01-07)
Python tracker at http://bugs.python.org/

To view or respond to any of the issues listed below, click on the issue.
Do NOT respond to this message.
Issues counts and deltas: open 2501 (-24) closed 20138 (+80) total 22639 (+56) Open issues with patches: 1045 Issues opened (40) ================== #4188: test_threading hang when running as verbose http://bugs.python.org/issue4188 reopened by r.david.murray #8109: Server-side support for TLS Server Name Indication extension http://bugs.python.org/issue8109 reopened by pitrou #10789: Lock.acquire documentation is misleading http://bugs.python.org/issue10789 reopened by terry.reedy #10803: ctypes: better support of bytearray objects http://bugs.python.org/issue10803 opened by mfxmfx #10805: traceback.print_exception throws AttributeError when exception http://bugs.python.org/issue10805 opened by abingham #10808: ssl unwrap fails with Error 0 http://bugs.python.org/issue10808 opened by apollo13 #10811: sqlite segfault with generators http://bugs.python.org/issue10811 opened by Erick.Tryzelaar #10812: Add some posix functions http://bugs.python.org/issue10812 opened by rosslagerwall #10813: Suppress adding decimal point for places=0 in moneyfmt() http://bugs.python.org/issue10813 opened by cgrohmann #10817: urllib.request.urlretrieve never raises ContentTooShortError i http://bugs.python.org/issue10817 opened by RC #10818: pydoc: Remove old server and tk panel http://bugs.python.org/issue10818 opened by haypo #10820: 3.2 Makefile changes for versioned scripts break OS X framewor http://bugs.python.org/issue10820 opened by ned.deily #10822: test_getgroups failure under Solaris http://bugs.python.org/issue10822 opened by pitrou #10826: pass_fds sometimes fails http://bugs.python.org/issue10826 opened by pitrou #10827: Functions in time module should support year < 1900 when accep http://bugs.python.org/issue10827 opened by belopolsky #10829: PyUnicode_FromFormatV() bugs with "%" and "%%" format strings http://bugs.python.org/issue10829 opened by haypo #10830: PyUnicode_FromFormatV("%c") doesn't support non-BMP characters http://bugs.python.org/issue10830 opened by haypo 
#10831: PyUnicode_FromFormatV() doesn't support %li, %lli, %zi http://bugs.python.org/issue10831 opened by haypo #10832: Add support of bytes objects in PyBytes_FromFormatV() http://bugs.python.org/issue10832 opened by haypo #10833: Replace %.100s by %s in PyErr_Format(): the arbitrary limit of http://bugs.python.org/issue10833 opened by haypo #10834: Python 2.7 x86 fails to run in Windows 7 http://bugs.python.org/issue10834 opened by excubated #10835: sys.executable default and altinstall http://bugs.python.org/issue10835 opened by allan #10836: TypeError during exception handling in urllib.request.urlretri http://bugs.python.org/issue10836 opened by Alexandru.Mo??oi #10837: Issue catching KeyboardInterrupt while reading stdin http://bugs.python.org/issue10837 opened by Josh.Hanson #10838: subprocess __all__ is incomplete http://bugs.python.org/issue10838 opened by a.badger #10839: email module should not allow some header field repetitions http://bugs.python.org/issue10839 opened by adrien-saladin #10841: binary stdio http://bugs.python.org/issue10841 opened by v+python #10842: Update third-party libraries for OS X installer builds http://bugs.python.org/issue10842 opened by ned.deily #10843: OS X installer: install the Tools source directory http://bugs.python.org/issue10843 opened by ned.deily #10845: test_multiprocessing failure under Windows http://bugs.python.org/issue10845 opened by pitrou #10847: Distutils drops -fno-strict-aliasing when CFLAGS are set http://bugs.python.org/issue10847 opened by skrah #10848: Move test.regrtest from getopt to argparse http://bugs.python.org/issue10848 opened by brett.cannon #10849: Backport test/__main__ http://bugs.python.org/issue10849 opened by belopolsky #10850: inconsistent behavior concerning multiprocessing.manager.BaseM http://bugs.python.org/issue10850 opened by chrysn #10851: further extend ssl SNI and ciphers API http://bugs.python.org/issue10851 opened by grooverdan #10852: SSL/TLS sni use in 
smtp,pop,imap,nntp,ftp client libs by defau http://bugs.python.org/issue10852 opened by grooverdan #10854: Output DLL name in error message of ImportError when DLL is mi http://bugs.python.org/issue10854 opened by techtonik #10855: wave.Wave_read.close() doesn't release file http://bugs.python.org/issue10855 opened by pjcreath #10856: documentation for ImportError parameters and attributes http://bugs.python.org/issue10856 opened by techtonik #10828: Cannot use nonascii utf8 in names of files imported from http://bugs.python.org/issue10828 opened by ingemar Most recent 15 issues with no replies (15) ========================================== #10856: documentation for ImportError parameters and attributes http://bugs.python.org/issue10856 #10855: wave.Wave_read.close() doesn't release file http://bugs.python.org/issue10855 #10850: inconsistent behavior concerning multiprocessing.manager.BaseM http://bugs.python.org/issue10850 #10847: Distutils drops -fno-strict-aliasing when CFLAGS are set http://bugs.python.org/issue10847 #10837: Issue catching KeyboardInterrupt while reading stdin http://bugs.python.org/issue10837 #10836: TypeError during exception handling in urllib.request.urlretri http://bugs.python.org/issue10836 #10831: PyUnicode_FromFormatV() doesn't support %li, %lli, %zi http://bugs.python.org/issue10831 #10830: PyUnicode_FromFormatV("%c") doesn't support non-BMP characters http://bugs.python.org/issue10830 #10822: test_getgroups failure under Solaris http://bugs.python.org/issue10822 #10820: 3.2 Makefile changes for versioned scripts break OS X framewor http://bugs.python.org/issue10820 #10817: urllib.request.urlretrieve never raises ContentTooShortError i http://bugs.python.org/issue10817 #10811: sqlite segfault with generators http://bugs.python.org/issue10811 #10808: ssl unwrap fails with Error 0 http://bugs.python.org/issue10808 #10803: ctypes: better support of bytearray objects http://bugs.python.org/issue10803 #10799: Improve webbrowser.open doc 
(and, someday, behavior?) http://bugs.python.org/issue10799 Most recent 15 issues waiting for review (15) ============================================= #10852: SSL/TLS sni use in smtp,pop,imap,nntp,ftp client libs by defau http://bugs.python.org/issue10852 #10851: further extend ssl SNI and ciphers API http://bugs.python.org/issue10851 #10843: OS X installer: install the Tools source directory http://bugs.python.org/issue10843 #10842: Update third-party libraries for OS X installer builds http://bugs.python.org/issue10842 #10841: binary stdio http://bugs.python.org/issue10841 #10833: Replace %.100s by %s in PyErr_Format(): the arbitrary limit of http://bugs.python.org/issue10833 #10832: Add support of bytes objects in PyBytes_FromFormatV() http://bugs.python.org/issue10832 #10831: PyUnicode_FromFormatV() doesn't support %li, %lli, %zi http://bugs.python.org/issue10831 #10830: PyUnicode_FromFormatV("%c") doesn't support non-BMP characters http://bugs.python.org/issue10830 #10829: PyUnicode_FromFormatV() bugs with "%" and "%%" format strings http://bugs.python.org/issue10829 #10827: Functions in time module should support year < 1900 when accep http://bugs.python.org/issue10827 #10820: 3.2 Makefile changes for versioned scripts break OS X framewor http://bugs.python.org/issue10820 #10818: pydoc: Remove old server and tk panel http://bugs.python.org/issue10818 #10812: Add some posix functions http://bugs.python.org/issue10812 #10798: test_concurrent_futures fails on FreeBSD http://bugs.python.org/issue10798 Top 10 most discussed issues (10) ================================= #4953: cgi module cannot handle POST with multipart/form-data in 3.0 http://bugs.python.org/issue4953 43 msgs #10841: binary stdio http://bugs.python.org/issue10841 24 msgs #10181: Problems with Py_buffer management in memoryobject.c (and else http://bugs.python.org/issue10181 21 msgs #10512: regrtest ResourceWarning - unclosed sockets and files http://bugs.python.org/issue10512 15 msgs #5945: 
PyMapping_Check returns 1 for lists http://bugs.python.org/issue5945 14 msgs #9566: Compilation warnings under x64 Windows http://bugs.python.org/issue9566 10 msgs #10834: Python 2.7 x86 fails to run in Windows 7 http://bugs.python.org/issue10834 10 msgs #2193: Cookie Colon Name Bug http://bugs.python.org/issue2193 8 msgs #10812: Add some posix functions http://bugs.python.org/issue10812 8 msgs #1674555: sys.path in tests contains system directories http://bugs.python.org/issue1674555 8 msgs Issues closed (76) ================== #1187: pipe fd handling issues in subprocess.py on POSIX http://bugs.python.org/issue1187 closed by pitrou #1452: subprocess's popen.stdout.seek(0) doesn't raise an error http://bugs.python.org/issue1452 closed by pitrou #3466: urllib2 should support HTTPS connections with client keys http://bugs.python.org/issue3466 closed by pitrou #3839: wsgi.simple_server resets 'Content-Length' header on empty con http://bugs.python.org/issue3839 closed by pitrou #4662: posix module lacks several DeprecationWarning's http://bugs.python.org/issue4662 closed by pitrou #5369: __ppc__ macro checking is incorrect http://bugs.python.org/issue5369 closed by pitrou #5485: pyexpat has no unit tests for UseForeignDTD functionality http://bugs.python.org/issue5485 closed by pitrou #6269: threading documentation makes no mention of the GIL http://bugs.python.org/issue6269 closed by pitrou #6285: Silent abort on XP help document display http://bugs.python.org/issue6285 closed by terry.reedy #6293: Have regrtest.py echo back sys.flags http://bugs.python.org/issue6293 closed by pitrou #6610: Subprocess descriptor debacle http://bugs.python.org/issue6610 closed by georg.brandl #6643: Throw away more radioactive locks that could be held across a http://bugs.python.org/issue6643 closed by gregory.p.smith #6664: readlines should understand Line Separator and Paragraph Separ http://bugs.python.org/issue6664 closed by pitrou #6800: os.exec* raises "OSError: [Errno 45] 
Operation not supported" http://bugs.python.org/issue6800 closed by pitrou #7716: IPv6 detection, don't assume existence of /usr/xpg4/bin/grep http://bugs.python.org/issue7716 closed by pitrou #7858: os.utime(file, (0,0,)) fails on on vfat, but doesn't fail imme http://bugs.python.org/issue7858 closed by pitrou #7995: On Mac / BSD sockets returned by accept inherit the parent's F http://bugs.python.org/issue7995 closed by pitrou #8013: time.asctime segfaults when given a time in the far future http://bugs.python.org/issue8013 closed by georg.brandl #8278: os.utime doesn't allow a atime (Last Access) which is 27 years http://bugs.python.org/issue8278 closed by amaury.forgeotdarc #8458: buildbot: test_cmd_line failure on Tiger: [Errno 9] Bad file d http://bugs.python.org/issue8458 closed by pitrou #8499: Set a timeout in test_urllibnet http://bugs.python.org/issue8499 closed by sandro.tosi #8626: TypeError: rsplit() takes no keyword arguments http://bugs.python.org/issue8626 closed by eric.araujo #8719: buildbot: segfault on FreeBSD (signal 11) http://bugs.python.org/issue8719 closed by haypo #8731: BeOS for 2.7 - add to unsupported http://bugs.python.org/issue8731 closed by pitrou #8992: convertsimple() doesn't need to call converterr() if an except http://bugs.python.org/issue8992 closed by haypo #9074: subprocess closes standard file descriptors when it should not http://bugs.python.org/issue9074 closed by georg.brandl #9115: test_site: support for systems without unsetenv http://bugs.python.org/issue9115 closed by eric.araujo #9332: Document requirements for os.symlink usage on Windows http://bugs.python.org/issue9332 closed by brian.curtin #9361: Tests for leapdays in calendar.py module http://bugs.python.org/issue9361 closed by r.david.murray #9370: Add reader redirect from test package docs to unittest module http://bugs.python.org/issue9370 closed by ncoghlan #9671: test_executable_without_cwd fails: AssertionError: 1 != 47 http://bugs.python.org/issue9671 
closed by sandro.tosi #9854: SocketIO should return None on EWOULDBLOCK http://bugs.python.org/issue9854 closed by pitrou #9905: subprocess.Popen fails with stdout=PIPE, stderr=PIPE if standa http://bugs.python.org/issue9905 closed by pitrou #9977: TestCase.assertItemsEqual's description of differences http://bugs.python.org/issue9977 closed by michael.foord #10001: ~Py_buffer.obj field is undocumented, though not hidden http://bugs.python.org/issue10001 closed by ncoghlan #10028: test_concurrent_futures fails on Windows Server 2003 http://bugs.python.org/issue10028 closed by bquinlan #10104: test_socket failures on Debian unstable http://bugs.python.org/issue10104 closed by pitrou #10130: Create epub format docs and offer them on the download page http://bugs.python.org/issue10130 closed by georg.brandl #10267: test_ttk_guionly leaks many references http://bugs.python.org/issue10267 closed by pitrou #10270: Fix resource warnings in test_threading http://bugs.python.org/issue10270 closed by sandro.tosi #10333: Remove ancient backwards compatibility GC API http://bugs.python.org/issue10333 closed by pitrou #10475: hardcoded compilers for LDSHARED/LDCXXSHARED on NetBSD http://bugs.python.org/issue10475 closed by pitrou #10492: test_doctest fails with iso-8859-15 locale http://bugs.python.org/issue10492 closed by haypo #10502: Add unittestguirunner to Tools/ http://bugs.python.org/issue10502 closed by michael.foord #10563: Spurious newline in time.ctime http://bugs.python.org/issue10563 closed by georg.brandl #10619: Failed module loading in test discovery loses traceback http://bugs.python.org/issue10619 closed by michael.foord #10620: `python -m uniittest` should work with file paths as well as t http://bugs.python.org/issue10620 closed by michael.foord #10655: Wrong powerpc define in Python/ceval.c http://bugs.python.org/issue10655 closed by dmalcolm #10737: test_concurrent_futures failure on Windows http://bugs.python.org/issue10737 closed by bquinlan #10751: 
REMOTE_USER and Remote-User collision in wsgiref http://bugs.python.org/issue10751 closed by Alex.Raitz #10786: unittest.TextTextRunner does not respect redirected stderr http://bugs.python.org/issue10786 closed by michael.foord #10788: test_logging failure http://bugs.python.org/issue10788 closed by bquinlan #10790: Header.append's charset logic is bogus, 'shift_jis' and "euc_j http://bugs.python.org/issue10790 closed by r.david.murray #10801: zipfile.ZipFile().extractall() header mismatch for non-ASCII c http://bugs.python.org/issue10801 closed by georg.brandl #10802: python3.2 AFTER b2 release has subprocess.Popen broken under c http://bugs.python.org/issue10802 closed by georg.brandl #10804: Copy and paste error in _json.c http://bugs.python.org/issue10804 closed by georg.brandl #10806: Subprocess error if fds 0,1,2 are closed http://bugs.python.org/issue10806 closed by pitrou #10807: `b'dGVzdA==\n'.decode('base64')` raise exception http://bugs.python.org/issue10807 closed by haypo #10809: complex() comments wrongly say it supports NaN and inf http://bugs.python.org/issue10809 closed by dalke #10810: logging.handlers.TimedRotatingFileHandler.__init__(): ST_MTIME http://bugs.python.org/issue10810 closed by georg.brandl #10814: assertion failed on Windows buildbots http://bugs.python.org/issue10814 closed by belopolsky #10815: Write to /dev/full does not raise IOError http://bugs.python.org/issue10815 closed by haypo #10816: test_multiprocessing: unclosed sockets http://bugs.python.org/issue10816 closed by haypo #10819: ValueError on repr(closed_socket_file) http://bugs.python.org/issue10819 closed by haypo #10821: gethostbyname(gethostname()) is wrong when IP is changed http://bugs.python.org/issue10821 closed by georg.brandl #10823: "conversion from 'Py_ssize_t' to 'int', possible loss of data" http://bugs.python.org/issue10823 closed by pitrou #10824: urandom should not block http://bugs.python.org/issue10824 closed by pitrou #10825: use assertIsNone(...) 
instead of assertEquals(None, ...) http://bugs.python.org/issue10825 closed by rhettinger #10840: pyarg_parsetuple docs and py_buffer http://bugs.python.org/issue10840 closed by pitrou #10844: OS X installer: update copyright dates in app bundles http://bugs.python.org/issue10844 closed by georg.brandl #10846: typo in threading doc's: "size of the resource size" http://bugs.python.org/issue10846 closed by georg.brandl #10853: SSL/TLS sni use in smtp,pop,imap,nntp,ftp client libs by defau http://bugs.python.org/issue10853 closed by pitrou #10857: ImportError module attribute http://bugs.python.org/issue10857 closed by brian.curtin #976613: socket timeout problems on Solaris http://bugs.python.org/issue976613 closed by pitrou #1665333: Documentation missing for OptionGroup class in optparse http://bugs.python.org/issue1665333 closed by georg.brandl #1677694: test_timeout refactoring http://bugs.python.org/issue1677694 closed by pitrou From techtonik at gmail.com Fri Jan 7 18:20:14 2011 From: techtonik at gmail.com (anatoly techtonik) Date: Fri, 7 Jan 2011 19:20:14 +0200 Subject: [Python-Dev] API refactoring tracker field for Python4 Message-ID: There are many API changes and proposals that were forgotten and didn't get into Python 3, although they should be, because it was the only chance to change things with backwards compatibility break. For example http://bugs.python.org/issue1559549 This happened, because of poor bug management, where community doesn't play any role in determining which issues are desired. This mostly because of limitation of our tracker and desire of people to extend it to get damn "stars", module split, sorting, digging and tagging options. 
I won't be surprised if things won't change in the next couple of years, that's why I'd like to propose a very small change, so that when time will come to create Python4 (and standard library won't be separated from interpreter by this time), everybody can get quickly get a list of proposed API enhancements and filter which are eligible for the next BC API break. This change is a simple "api-refactoring" flag that could be added to corresponding issues by tracker users. -- anatoly t. From brian.curtin at gmail.com Fri Jan 7 18:41:56 2011 From: brian.curtin at gmail.com (Brian Curtin) Date: Fri, 7 Jan 2011 11:41:56 -0600 Subject: [Python-Dev] API refactoring tracker field for Python4 In-Reply-To: References: Message-ID: On Fri, Jan 7, 2011 at 11:20, anatoly techtonik wrote: > There are many API changes and proposals that were forgotten and > didn't get into Python 3, although they should be, because it was the > only chance to change things with backwards compatibility break. For > example http://bugs.python.org/issue1559549 That can be added in 3.3. To answer your comment on the issue: no investigation is needed. It didn't make it in yet because there was no code written for it. It's really not a big deal, it happens all the time. > This happened, because of poor bug management, where community doesn't > play any role in determining which issues are desired. > The community absolutely plays a role in determining which issues are desired. They do this by action when they want something. A patch says a whole lot about desire. > This mostly because of limitation of our tracker and desire of people > to extend it to get damn "stars", module split, sorting, digging and > tagging options. > I have no idea what any of this means. 
I won't be surprised if things won't change in the next couple of > years, that's why I'd like to propose a very small change, so that > when time will come to create Python4 (and standard library won't be > separated from interpreter by this time), everybody can get quickly > get a list of proposed API enhancements and filter which are eligible > for the next BC API break. This change is a simple "api-refactoring" > flag that could be added to corresponding issues by tracker users. I'm not sure I see the need for such a flag, as there are probably too few cases for this in the first place. From fuzzyman at voidspace.org.uk Fri Jan 7 18:55:17 2011 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Fri, 07 Jan 2011 17:55:17 +0000 Subject: [Python-Dev] Summary of Python tracker Issues In-Reply-To: <20110107170704.EB1FE1CCA4@psf.upfronthosting.co.za> References: <20110107170704.EB1FE1CCA4@psf.upfronthosting.co.za> Message-ID: <4D275385.4030607@voidspace.org.uk> On 07/01/2011 17:07, Python tracker wrote: > ACTIVITY SUMMARY (2010-12-31 - 2011-01-07) > Python tracker at http://bugs.python.org/ > > To view or respond to any of the issues listed below, click on the issue. > Do NOT respond to this message. > > Issues counts and deltas: > open 2501 (-24) > closed 20138 (+80) > total 22639 (+56) > Nice work everyone. :-) At this rate we'll be down to zero open issues in only 2 years. ;-) Michael -- http://www.voidspace.org.uk/ May you do good and not evil May you find forgiveness for yourself and forgive others May you share freely, never taking more than you give.
-- the sqlite blessing http://www.sqlite.org/different.html From techtonik at gmail.com Fri Jan 7 19:14:55 2011 From: techtonik at gmail.com (anatoly techtonik) Date: Fri, 7 Jan 2011 20:14:55 +0200 Subject: [Python-Dev] API refactoring tracker field for Python4 In-Reply-To: References: Message-ID: On Fri, Jan 7, 2011 at 7:41 PM, Brian Curtin wrote: >> >> There are many API changes and proposals that were forgotten and >> didn't get into Python 3, although they should be, because it was the >> only chance to change things with backwards compatibility break. For >> example http://bugs.python.org/issue1559549 > > That can be added in 3.3. > To answer your comment on the issue: no investigation is needed. It didn't > make it in yet because there was no code written for it. It's really not a > big deal, it happens all the time. Don't you think that if more people were aware of this issue, the patch could be made faster? >> This happened, because of poor bug management, where community doesn't >> play any role in determining which issues are desired. > > The community absolutely plays a role in determining which issues are > desired. They do this by action when they want something. A patch says a > whole lot about desire. > Don't you think that if people could review issues and "star" them then such minor issues could be scheduled for release not only by "severity" status as decided be release manager and several core developers, but also by community vote? Patch requires time, experience and approved contribution agreement, which you've sent using ground mail beforehand. Voting doesn't require any of this, but helps core developers see what user community wants. With the list of desired features Jesse Noller sponsored sprints will have more value for all of us. >> >> This mostly because of limitation of our tracker and desire of people >> to extend it to get damn "stars", module split, sorting, digging and >> tagging options. > > I have no idea what any of this means.
Stars: go http://code.google.com/p/support/issues/list find Stars column guess Module split: try to get all issues for 'os' module try to subscribe to all commits for 'CGIHTTPServer' Sorting: click on column titles in bug tracker search results Tagging: as a tracker user, try to add tag 'easy' to some easy issue >> >> I won't be surprised if things won't change in the next couple of >> years, that's why I'd like to propose a very small change, so that >> when time will come to create Python4 (and standard library won't be >> separated from interpreter by this time), everybody can get quickly >> get a list of proposed API enhancements and filter which are eligible >> for the next BC API break. This change is a simple "api-refactoring" >> flag that could be added to corresponding issues by tracker users. > > I'm not sure I see the need for such a flag, as there are probably too few > cases for this in the first place. I haven't started using Python 3 yet, but I already know some annoying API issues that are not fixed there. Unfortunately, I don't remember them to give you a list. That's why I asked for a flag. -- anatoly t. From techtonik at gmail.com Fri Jan 7 19:17:40 2011 From: techtonik at gmail.com (anatoly techtonik) Date: Fri, 7 Jan 2011 20:17:40 +0200 Subject: [Python-Dev] Summary of Python tracker Issues In-Reply-To: <4D275385.4030607@voidspace.org.uk> References: <20110107170704.EB1FE1CCA4@psf.upfronthosting.co.za> <4D275385.4030607@voidspace.org.uk> Message-ID: On Fri, Jan 7, 2011 at 7:55 PM, Michael Foord wrote: > On 07/01/2011 17:07, Python tracker wrote: >> >> ACTIVITY SUMMARY (2010-12-31 - 2011-01-07) >> Python tracker at http://bugs.python.org/ >> >> To view or respond to any of the issues listed below, click on the issue. >> Do NOT respond to this message. >> >> Issues counts and deltas: >> open 2501 (-24) >> closed 20138 (+80) >> total 22639 (+56) >> > > Nice work everyone.
:-) > > At this rate we'll be down to zero open issues in only 2 years. ;-) Less users -> less issues. It's always easy to speedup the process by leaving the most irritating ones. ;) -- anatoly t. From brian.curtin at gmail.com Fri Jan 7 19:29:34 2011 From: brian.curtin at gmail.com (Brian Curtin) Date: Fri, 7 Jan 2011 12:29:34 -0600 Subject: [Python-Dev] API refactoring tracker field for Python4 In-Reply-To: References: Message-ID: On Fri, Jan 7, 2011 at 12:14, anatoly techtonik wrote: > On Fri, Jan 7, 2011 at 7:41 PM, Brian Curtin > wrote: > >> > >> There are many API changes and proposals that were forgotten and > >> didn't get into Python 3, although they should be, because it was the > >> only chance to change things with backwards compatibility break. For > >> example http://bugs.python.org/issue1559549 > > > > That can be added in 3.3. > > To answer your comment on the issue: no investigation is needed. It > didn't > > make it in yet because there was no code written for it. It's really not > a > > big deal, it happens all the time. > > Don't you think that if more people were aware of this issue, the > patch could be made faster? > Maybe, but someone still has to write the code. You could start a facebook group for the issue and it could have 10,000 "likes", but it still doesn't solve the problem. I'm reminded of the saying "9 women can't have a baby in 1 month"... I do think it would be great if more people were involved in the issue tracker. I don't know what it will take to get more people involved, but I know it involves a lot more than modifying the tracker itself. > > >> This happened, because of poor bug management, where community doesn't > >> play any role in determining which issues are desired. > > > > The community absolutely plays a role in determining which issues are > > desired. They do this by action when they want something. A patch says a > > whole lot about desire. 
> > > Don't you think that if people could review issues and "star" them > then such minor issues could be scheduled for release not only by > "severity" status as decided be release manager and several core > developers, but also by community vote? > I'm not sure that's the right answer here. I'd rather people "star" or vote on issues by completing a step of the process rather than just clicking a thumbs up button. Writing a test case or checking that a patch applies on a particular branch is a vote to me. > Patch requires time, experience and approved contribution agreement, > which you've sent using ground mail beforehand. Voting doesn't require > any of this, but helps core developers see what user community wants. > I think the fact that it requires no "skin in the game" is a negative point. I don't show up at government meetings and vote on things -- I don't have that power. If I want something voted on, I go through a representative and I tell them my story, my side of things, and show them what I want and why I want it. If we just let people vote on things, the first issue that would be created would be "Remove the GIL" and it would have 10,000 votes and zero patches. From alexander.belopolsky at gmail.com Fri Jan 7 19:36:26 2011 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Fri, 7 Jan 2011 13:36:26 -0500 Subject: [Python-Dev] API refactoring tracker field for Python4 In-Reply-To: References: Message-ID: On Fri, Jan 7, 2011 at 1:14 PM, anatoly techtonik wrote: .. > Don't you think that if people could review issues and "star" them > then such minor issues could be scheduled for release not only by > "severity" status as decided be release manager and several core > developers, but also by community vote? > Anyone can already cast his or her vote by posting a comment with +1 or -1 in it.
Doing so brings the issue to the top of the default view and gets an e-mail into many developers' mailboxes. Number of votes is never a deciding factor on any issue, so tallying them automatically is rather pointless. A vote that is accompanied by a rationale or a patch will always carry greater weight than just a +1. -1 on the "star system" for the tracker (Note that some kind of vote/star system is contemplated for the community documentation.) From hsoft at hardcoded.net Fri Jan 7 19:37:01 2011 From: hsoft at hardcoded.net (Virgil Dupras) Date: Fri, 7 Jan 2011 19:37:01 +0100 Subject: [Python-Dev] API refactoring tracker field for Python4 In-Reply-To: References: Message-ID: On 2011-01-07, at 7:14 PM, anatoly techtonik wrote: > Don't you think that if people could review issues and "star" them > then such minor issues could be scheduled for release not only by > "severity" status as decided be release manager and several core > developers, but also by community vote? > > Patch requires time, experience and approved contribution agreement, > which you've sent using ground mail beforehand. Voting doesn't require > any of this, but helps core developers see what user community wants. > With the list of desired features Jesse Noller sponsored sprints will > have more value for all of us. Two things. First, technically, the bug tracker already has "stars". It's the nosy list. You can even run a search by nosy count. Second, I'm not sure starring matters that much. Ultimately, for something to be done, you need a patch. Sure, sometimes, the patch is going to be made by someone who has no interest in it, but I think most of the time the patch is submitted by someone wanting the patch to be applied. I don't think the number of stars affect the likeliness of a patch being created very much. Maybe you can point to a google code project for which starring is used intensively and to observably good results? 
Virgil Dupras From fumanchu at aminus.org Fri Jan 7 19:36:52 2011 From: fumanchu at aminus.org (Robert Brewer) Date: Fri, 7 Jan 2011 10:36:52 -0800 Subject: [Python-Dev] PEP 3333: wsgi_string() function In-Reply-To: <20110107170449.ECC643A411A@sparrow.telecommunity.com> References: <1294109093.14661.4.camel@marge> <1294357806.2970.33.camel@stalk><1294401061.14078.30.camel@marge><925E2041-7B6F-4868-8E9D-856A15784C25@fuhm.net> <20110107170449.ECC643A411A@sparrow.telecommunity.com> Message-ID: P.J. Eby wrote: > At 09:43 AM 1/7/2011 -0500, James Y Knight wrote: > >On Jan 7, 2011, at 6:51 AM, Victor Stinner wrote: > > > I don't understand why you are attached to this horrible hack > > > (bytes-in-unicode). It introduces more work and more confusing than > > > using raw bytes unchanged. > > > > > > It doesn't work and so something has to be changed. > > > >It's gross but it does work. This has been discussed ad-nausium on > >web-sig over a period of years. > > > >I'd like to reiterate that it is only even a potential issue for the > >PATH_INFO/SCRIPT_NAME keys. Those two keys are required to have been > >urldecoded already, into byte-data in some encoding. For all the > >other keys (including the ones from os.environ), they are either > >*properly* decoded in 8859-1 or are just ascii (possibly still > >urlencoded, so the app needs to urldecode and decode into a string > >with the correct encoding). > > Right. Also, it should be mentioned that none of this would be > necessary if we could've gotten a "bytes of a known encoding" > type. If you look back to the last big Python-Dev discussion on > bytes/unicode and stdlib API breakage, this was the holdup for > getting a sane WSGI spec. > > Since we couldn't change the language to fix the problem (due to the > moratorium), we had to use this less-pleasant way of dealing with > things, in order to get a final WSGI spec for Python 3. 
> > (If anybody is wondering about the specifics of the language change > that was needed, it'd be having a "bytes with known encoding" type, > that when combined in any polymorphic operation with a unicode > string, would result in bytes-with-encoding output, and would raise > an error if the resulting value could not be encoded in the target > encoding. Then we would simply do all WSGI header operations with > this type, using latin-1 as the target encoding.) Still looking forward to the day when that moratorium is lifted. Anyone have any idea when that will be? Bob From brian.curtin at gmail.com Fri Jan 7 19:39:34 2011 From: brian.curtin at gmail.com (Brian Curtin) Date: Fri, 7 Jan 2011 12:39:34 -0600 Subject: [Python-Dev] API refactoring tracker field for Python4 In-Reply-To: References: Message-ID: On Fri, Jan 7, 2011 at 12:14, anatoly techtonik wrote: > On Fri, Jan 7, 2011 at 7:41 PM, Brian Curtin > wrote: > >> > >> This mostly because of limitation of our tracker and desire of people > >> to extend it to get damn "stars", module split, sorting, digging and > >> tagging options. > > > > I have no idea what any of this means. > > Stars: > go http://code.google.com/p/support/issues/list > find Stars column > guess > This reminds me of my inbox, where I star emails all the time and do absolutely nothing different to them compared to non-starred emails. I personally don't see the need for that, so that's a -1 for me. > Module split: > try to get all issues for 'os' module > No solution for this right now, but people have suggested that we add drop-downs for each module. I'm -0 on that. > try to subscribe to all commits for 'CGIHTTPServer' > You can subscribe to the python-checkins mailing list and create a filter that looks for whatever you want. > > Sorting: > click on column titles in bug tracker search results > This could probably be solved with a patch to our Roundup instance. 
> > Tagging: > as a tracker user, try to add tag 'easy' to some easy issue > You probably need escalated privileges for this. If you can't change it, you can always request on the issue that a field be changed. > >> > >> I won't be surprised if things won't change in the next couple of > >> years, that's why I'd like to propose a very small change, so that > >> when time will come to create Python4 (and standard library won't be > >> separated from interpreter by this time), everybody can get quickly > >> get a list of proposed API enhancements and filter which are eligible > >> for the next BC API break. This change is a simple "api-refactoring" > >> flag that could be added to corresponding issues by tracker users. > > > > I'm not sure I see the need for such a flag, as there are probably too few > > cases for this in the first place. > > I haven't started using Python 3 yet, but I already know some annoying > API issues that are not fixed there. Unfortunately, I don't remember > them to give you a list. That's why I asked for a flag. If you haven't used it yet, then how are you already annoyed...? From alexander.belopolsky at gmail.com Fri Jan 7 19:43:00 2011 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Fri, 7 Jan 2011 13:43:00 -0500 Subject: [Python-Dev] Summary of Python tracker Issues In-Reply-To: References: <20110107170704.EB1FE1CCA4@psf.upfronthosting.co.za> <4D275385.4030607@voidspace.org.uk> Message-ID: On Fri, Jan 7, 2011 at 1:17 PM, anatoly techtonik wrote: .. >>> Issues counts and deltas: >>> open 2501 (-24) >>> closed 20138 (+80) >>> total 22639 (+56) .. > Less users -> less issues. It's always easy to speedup the process by > leaving the most irritating ones. ;) You should read the summary more carefully before leaving a witty comment like this. Hint: "total 22639 (+56)".
From solipsis at pitrou.net Fri Jan 7 19:44:09 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 7 Jan 2011 19:44:09 +0100 Subject: [Python-Dev] API refactoring tracker field for Python4 References: Message-ID: <20110107194409.348f3bf8@pitrou.net> On Fri, 7 Jan 2011 13:36:26 -0500 Alexander Belopolsky wrote: > On Fri, Jan 7, 2011 at 1:14 PM, anatoly techtonik wrote: > .. > > Don't you think that if people could review issues and "star" them > > then such minor issues could be scheduled for release not only by > > "severity" status as decided be release manager and several core > > developers, but also by community vote? > > > > Anyone can already cast his or her vote by posting a comment with +1 > or -1 in it. Doing so brings the issue to the top of the default view > and gets an e-mail into many developers' mailboxes. I certainly hope casual users don't start posting lots of +1s and -1s around, though. Regards Antoine. From solipsis at pitrou.net Fri Jan 7 19:51:48 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 7 Jan 2011 19:51:48 +0100 Subject: [Python-Dev] API refactoring tracker field for Python4 References: Message-ID: <20110107195148.5e7b8d2f@pitrou.net> On Fri, 7 Jan 2011 12:39:34 -0600 Brian Curtin wrote: > > > > I haven't started using Python 3 yet, but I already know some annoying > > API issues that are not fixed there. Unfortunately, I don't remember > > them to give you a list. That's why I asked for a flag. > > If you haven't used it yet, then how are you already annoyed...? Anatoly is apparently annoyed by a lot of things he never participates in (such as contributing to Python, for example). Regards Antoine. 
From g.rodola at gmail.com Fri Jan 7 20:07:53 2011 From: g.rodola at gmail.com (Giampaolo Rodolà) Date: Fri, 7 Jan 2011 20:07:53 +0100 Subject: [Python-Dev] API refactoring tracker field for Python4 In-Reply-To: References: Message-ID: > Module split: > try to get all issues for 'os' module > try to subscribe to all commits for 'CGIHTTPServer' +1 I've been thinking about such a thing as well and I think it would be useful. Every now and then I go to the bug tracker to see whether the modules I usually maintain (mainly ftplib, asyncore, asynchat) need some attention. I do this by using the plain text search but a select box containing all the module names would be better. --- Giampaolo http://code.google.com/p/pyftpdlib/ http://code.google.com/p/psutil/
From guido at python.org Fri Jan 7 20:11:47 2011 From: guido at python.org (Guido van Rossum) Date: Fri, 7 Jan 2011 11:11:47 -0800 Subject: [Python-Dev] API refactoring tracker field for Python4 In-Reply-To: References: Message-ID: On Fri, Jan 7, 2011 at 10:36 AM, Alexander Belopolsky wrote: > -1 on the "star system" for the tracker The tracker on Google Code uses stars. We use this tracker to track external App Engine issues. It works very well to measure how widespread a particular issue or need is (even if we don't always fix the highest-star issues first -- the top issues are "unfixable" like PHP support :-). Maybe it works because in that tracker, a star means you get emailed when the issue is updated; this makes people think twice before frivolously adding a star. This is not quite the same as the "nosy" list: adding a star is less work in the UI, you don't have to think up something meaningful to say, and no email is generated merely because someone adds or removes a star. -- --Guido van Rossum (python.org/~guido) From g.brandl at gmx.net Fri Jan 7 20:14:24 2011 From: g.brandl at gmx.net (Georg Brandl) Date: Fri, 07 Jan 2011 20:14:24 +0100 Subject: [Python-Dev] API refactoring tracker field for Python4 In-Reply-To: References: Message-ID: Am 07.01.2011 19:39, schrieb Brian Curtin: > Tagging: > as a tracker user, try to add tag 'easy' to some easy issue > > > You probably need escalated privileges for this. If you can't change it, you can > always request on the issue that a field be changed. He *could* also behave reasonable for a while, leading to him being granted tracker maintenance privileges.
Georg From fuzzyman at voidspace.org.uk Fri Jan 7 20:15:35 2011 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Fri, 07 Jan 2011 19:15:35 +0000 Subject: [Python-Dev] API refactoring tracker field for Python4 In-Reply-To: References: Message-ID: <4D276657.7040009@voidspace.org.uk> On 07/01/2011 19:11, Guido van Rossum wrote: > On Fri, Jan 7, 2011 at 10:36 AM, Alexander Belopolsky > wrote: >> -1 on the "star system" for the tracker > The tracker on Google Code uses stars. We use this tracker to track > external App Engine issues. It works very well to measure how > widespread a particular issue or need is (even if we don't always fix > the highest-star issues first -- the top issues are "unfixable" like > PHP support :-). > > Maybe it works because in that tracker, a star means you get emailed > when the issue is updated; this makes people think twice before > frivolously adding a star. This is not quite the same as the "nosy" > list: adding a star is less work in the UI, you don't have to think up > something meaningful to say, and no email is generated merely because > someone adds or removes a star. > In our issue tracker it is more or less the same. Adding yourself as nosy sends you emails when it is updated and there is a convenient button for adding yourself as nosy without having to think up a meaningful comment. The only (sometimes annoying but sometimes useful or interesting) difference is that you also get emailed when someone else adds themselves as nosy. Michael -- http://www.voidspace.org.uk/ May you do good and not evil May you find forgiveness for yourself and forgive others May you share freely, never taking more than you give. 
-- the sqlite blessing http://www.sqlite.org/different.html From solipsis at pitrou.net Fri Jan 7 20:18:11 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 7 Jan 2011 20:18:11 +0100 Subject: [Python-Dev] API refactoring tracker field for Python4 References: Message-ID: <20110107201811.005ae1e1@pitrou.net> On Fri, 7 Jan 2011 11:11:47 -0800 Guido van Rossum wrote: > On Fri, Jan 7, 2011 at 10:36 AM, Alexander Belopolsky > wrote: > > -1 on the "star system" for the tracker > > The tracker on Google Code uses stars. We use this tracker to track > external App Engine issues. It works very well to measure how > widespread a particular issue or need is (even if we don't always fix > the highest-star issues first -- the top issues are "unfixable" like > PHP support :-). > > Maybe it works because in that tracker, a star means you get emailed > when the issue is updated; this makes people think twice before > frivolously adding a star. This is not quite the same as the "nosy" > list: adding a star is less work in the UI, you don't have to think up > something meaningful to say, and no email is generated merely because > someone adds or removes a star. I'd also mention that many bugzilla installs have a "voting" facility where people can vote for a limited number of issues of their choice (I think the number of votes also depends on the user's number of contributions, although I'm not sure). Regards Antoine. 
From guido at python.org Fri Jan 7 20:22:04 2011 From: guido at python.org (Guido van Rossum) Date: Fri, 7 Jan 2011 11:22:04 -0800 Subject: [Python-Dev] API refactoring tracker field for Python4 In-Reply-To: <4D276657.7040009@voidspace.org.uk> References: <4D276657.7040009@voidspace.org.uk> Message-ID: On Fri, Jan 7, 2011 at 11:15 AM, Michael Foord wrote: > On 07/01/2011 19:11, Guido van Rossum wrote: >> >> On Fri, Jan 7, 2011 at 10:36 AM, Alexander Belopolsky >> wrote: >>> >>> -1 on the "star system" for the tracker >> >> The tracker on Google Code uses stars. We use this tracker to track >> external App Engine issues. It works very well to measure how >> widespread a particular issue or need is (even if we don't always fix >> the highest-star issues first -- the top issues are "unfixable" like >> PHP support :-). >> >> Maybe it works because in that tracker, a star means you get emailed >> when the issue is updated; this makes people think twice before >> frivolously adding a star. This is not quite the same as the "nosy" >> list: adding a star is less work in the UI, you don't have to think up >> something meaningful to say, and no email is generated merely because >> someone adds or removes a star. >> > In our issue tracker it is more or less the same. Adding yourself as nosy > sends you emails when it is updated and there is a convenient button for > adding yourself as nosy without having to think up a meaningful comment. Ah, that must be new -- I didn't realize that. Nice. Now I also want a button to *remove* myself from the nosy list. (Of course, a better UI for adding/removing yourself could be a star. Clicking the star changes your nosy status. It should work immediately, unlike the existing [+] button.) > The only (sometimes annoying but sometimes useful or interesting) difference > is that you also get emailed when someone else adds themselves as nosy. Maybe that could be fixed?
Then the remaining feature would be a way to sort issue lists by number of nosy people, and to display the length of the nosy list. -- --Guido van Rossum (python.org/~guido) From guido at python.org Fri Jan 7 20:22:44 2011 From: guido at python.org (Guido van Rossum) Date: Fri, 7 Jan 2011 11:22:44 -0800 Subject: [Python-Dev] API refactoring tracker field for Python4 In-Reply-To: <20110107201811.005ae1e1@pitrou.net> References: <20110107201811.005ae1e1@pitrou.net> Message-ID: On Fri, Jan 7, 2011 at 11:18 AM, Antoine Pitrou wrote: > I'd also mention that many bugzilla installs have a "voting" facility > where people can vote for a limited number of issues of their choice (I > think the number of votes also depends on the user's number of > contributions, although I'm not sure). The latter part sounds like overengineering by geeks too worried about abuse. -- --Guido van Rossum (python.org/~guido) From fuzzyman at voidspace.org.uk Fri Jan 7 20:23:06 2011 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Fri, 07 Jan 2011 19:23:06 +0000 Subject: [Python-Dev] API refactoring tracker field for Python4 In-Reply-To: <20110107194409.348f3bf8@pitrou.net> References: <20110107194409.348f3bf8@pitrou.net> Message-ID: <4D27681A.9050207@voidspace.org.uk> On 07/01/2011 18:44, Antoine Pitrou wrote: > On Fri, 7 Jan 2011 13:36:26 -0500 > Alexander Belopolsky wrote: > >> On Fri, Jan 7, 2011 at 1:14 PM, anatoly techtonik wrote: >> .. >>> Don't you think that if people could review issues and "star" them >>> then such minor issues could be scheduled for release not only by >>> "severity" status as decided be release manager and several core >>> developers, but also by community vote? >>> >> Anyone can already cast his or her vote by posting a comment with +1 >> or -1 in it. Doing so brings the issue to the top of the default view >> and gets an e-mail into many developers' mailboxes. > I certainly hope casual users don't start posting lots of +1s and -1s > around, though. 
Well, some indication of how many users this affects may be useful when looking at issues to work on. Launchpad has a button for "this affects me" and you can see how many users are affected by an issue (or have declared that at least). Not sure if this sends you email, but I'm pretty sure it is different to subscribing to an issue - which is nice. Sometimes I care about an issue but can neither fix it myself nor want to receive every email from the discussion on the issue tracker. Michael > Regards > > Antoine. > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/fuzzyman%40voidspace.org.uk -- http://www.voidspace.org.uk/ May you do good and not evil May you find forgiveness for yourself and forgive others May you share freely, never taking more than you give. -- the sqlite blessing http://www.sqlite.org/different.html From fuzzyman at voidspace.org.uk Fri Jan 7 20:25:15 2011 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Fri, 07 Jan 2011 19:25:15 +0000 Subject: [Python-Dev] API refactoring tracker field for Python4 In-Reply-To: References: <4D276657.7040009@voidspace.org.uk> Message-ID: <4D27689B.3050801@voidspace.org.uk> On 07/01/2011 19:22, Guido van Rossum wrote: > On Fri, Jan 7, 2011 at 11:15 AM, Michael Foord > wrote: >> On 07/01/2011 19:11, Guido van Rossum wrote: >>> On Fri, Jan 7, 2011 at 10:36 AM, Alexander Belopolsky >>> wrote: >>>> -1 on the "star system" for the tracker >>> The tracker on Google Code uses stars. We use this tracker to track >>> external App Engine issues. It works very well to measure how >>> widespread a particular issue or need is (even if we don't always fix >>> the highest-star issues first -- the top issues are "unfixable" like >>> PHP support :-). 
>>> >>> Maybe it works because in that tracker, a star means you get emailed >>> when the issue is updated; this makes people think twice before >>> frivolously adding a star. This is not quite the same as the "nosy" >>> list: adding a star is less work in the UI, you don't have to think up >>> something meaningful to say, and no email is generated merely because >>> someone adds or removes a star. >>> >> In our issue tracker it is more or less the same. Adding yourself as nosy >> sends you emails when it is updated and there is a convenient button for >> adding yourself as nosy without having to think up a meaningful comment. > Ah, that must be new -- I didn't realize that. Nice. It is. Sorry I should have made that clearer. > Now I also want a > button to *remove* myself from the nosy list. > Me too - but it was considered unnecessary clutter in the UI. I > (Of course, a better UI for adding/removing yourself could be a star. > Clicking the star changes your nosy status. It should work > immediately, unlike the existing [+] button.) Right, you still need to submit after clicking [+] at the moment. >> The only (sometimes annoying but sometimes useful or interesting) difference >> is that you also get emailed when someone else adds themselves as nosy. > Maybe that could be fixed? Then the remaining feature would be a way > to sort issue lists by number of nosy people, and to display the > length of the nosy list. Sounds good too me. Michael -- http://www.voidspace.org.uk/ May you do good and not evil May you find forgiveness for yourself and forgive others May you share freely, never taking more than you give. 
-- the sqlite blessing http://www.sqlite.org/different.html From p.f.moore at gmail.com Fri Jan 7 20:28:14 2011 From: p.f.moore at gmail.com (Paul Moore) Date: Fri, 7 Jan 2011 19:28:14 +0000 Subject: [Python-Dev] PEP 3333: wsgi_string() function In-Reply-To: References: <1294109093.14661.4.camel@marge> <1294357806.2970.33.camel@stalk> <1294401061.14078.30.camel@marge> <925E2041-7B6F-4868-8E9D-856A15784C25@fuhm.net> <20110107170449.ECC643A411A@sparrow.telecommunity.com> Message-ID: On 7 January 2011 18:36, Robert Brewer wrote: > Still looking forward to the day when that moratorium is lifted. Anyone > have any idea when that will be? See PEP 3003 (http://www.python.org/dev/peps/pep-3003/) - Python 3.3 is expected to be post-moratorium. Paul. From jnoller at gmail.com Fri Jan 7 21:01:16 2011 From: jnoller at gmail.com (Jesse Noller) Date: Fri, 7 Jan 2011 15:01:16 -0500 Subject: [Python-Dev] PEP 3333: wsgi_string() function In-Reply-To: <20110107170449.ECC643A411A@sparrow.telecommunity.com> References: <1294109093.14661.4.camel@marge> <1294357806.2970.33.camel@stalk> <1294401061.14078.30.camel@marge> <925E2041-7B6F-4868-8E9D-856A15784C25@fuhm.net> <20110107170449.ECC643A411A@sparrow.telecommunity.com> Message-ID: On Fri, Jan 7, 2011 at 12:04 PM, P.J. Eby wrote: > At 09:43 AM 1/7/2011 -0500, James Y Knight wrote: >> >> On Jan 7, 2011, at 6:51 AM, Victor Stinner wrote: >> > I don't understand why you are attached to this horrible hack >> > (bytes-in-unicode). It introduces more work and more confusing than >> > using raw bytes unchanged. >> > >> > It doesn't work and so something has to be changed. >> >> It's gross but it does work. This has been discussed ad-nausium on web-sig >> over a period of years. >> >> I'd like to reiterate that it is only even a potential issue for the >> PATH_INFO/SCRIPT_NAME keys. Those two keys are required to have been >> urldecoded already, into byte-data in some encoding. 
For all the other keys >> (including the ones from os.environ), they are either *properly* decoded in >> 8859-1 or are just ascii (possibly still urlencoded, so the app needs to >> urldecode and decode into a string with the correct encoding). > > Right. Also, it should be mentioned that none of this would be necessary if > we could've gotten a "bytes of a known encoding" type. If you look back to > the last big Python-Dev discussion on bytes/unicode and stdlib API breakage, > this was the holdup for getting a sane WSGI spec. > > Since we couldn't change the language to fix the problem (due to the > moratorium), we had to use this less-pleasant way of dealing with things, in > order to get a final WSGI spec for Python 3. If the fix was that critical, exceptions should be made. From janssen at parc.com Fri Jan 7 21:13:04 2011 From: janssen at parc.com (Bill Janssen) Date: Fri, 7 Jan 2011 12:13:04 PST Subject: [Python-Dev] PEP 3333: wsgi_string() function In-Reply-To: <20110107170449.ECC643A411A@sparrow.telecommunity.com> References: <1294109093.14661.4.camel@marge> <1294357806.2970.33.camel@stalk> <1294401061.14078.30.camel@marge> <925E2041-7B6F-4868-8E9D-856A15784C25@fuhm.net> <20110107170449.ECC643A411A@sparrow.telecommunity.com> Message-ID: <64146.1294431184@parc.com> P.J. Eby wrote: > Right. Also, it should be mentioned that none of this would be > necessary if we could've gotten a "bytes of a known encoding" type. Indeed! Or even "string using a known encoding"... > If you look back to the last big Python-Dev discussion on > bytes/unicode and stdlib API breakage, this was the holdup for getting > a sane WSGI spec. Yep.
Bill From fumanchu at aminus.org Fri Jan 7 21:16:07 2011 From: fumanchu at aminus.org (Robert Brewer) Date: Fri, 7 Jan 2011 12:16:07 -0800 Subject: [Python-Dev] PEP 3333: wsgi_string() function In-Reply-To: References: <1294109093.14661.4.camel@marge><1294357806.2970.33.camel@stalk><1294401061.14078.30.camel@marge><925E2041-7B6F-4868-8E9D-856A15784C25@fuhm.net><20110107170449.ECC643A411A@sparrow.telecommunity.com> Message-ID: Paul Moore wrote: > Robert Brewer wrote: > > P.J. Eby wrote: > > > Also, it should be mentioned that none of this would be > > > necessary if we could've gotten a "bytes of a known encoding" > > > type. > > > > Still looking forward to the day when that moratorium is lifted. > > Anyone have any idea when that will be? > > See PEP 3003 (http://www.python.org/dev/peps/pep-3003/) - Python 3.3 > is expected to be post-moratorium. "This PEP proposes a temporary moratorium (suspension) of all changes to the Python language syntax, semantics, and built-ins for a period of at least two years from the release of Python 3.1." Python 3.1 was released June 27th, 2009. We're coming up faster on the two-year period than we seem to be on a revised WSGI spec. Maybe we should shoot for a "bytes of a known encoding" type first. Robert Brewer fumanchu at aminus.org From solipsis at pitrou.net Fri Jan 7 23:18:39 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 7 Jan 2011 23:18:39 +0100 Subject: [Python-Dev] r87838 - python/branches/py3k/Doc/library/threading.rst References: <20110107215418.5D942EEA6B@mail.python.org> Message-ID: <20110107231839.2c87415d@pitrou.net> On Fri, 7 Jan 2011 22:54:18 +0100 (CET) raymond.hettinger wrote: > Author: raymond.hettinger > Date: Fri Jan 7 22:54:18 2011 > New Revision: 87838 > > Log: > Revert r87821 which moved the source link to the wrong section (from the module intro covering the module to a section on thread imports). Well, I insist, Raymond. 
The threading module's source code is less important than the threading module API, so can you please leave that link at the bottom? *Especially* given you're not even involved in maintenance of that module. Thank you Antoine. From ncoghlan at gmail.com Sat Jan 8 08:07:32 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 8 Jan 2011 17:07:32 +1000 Subject: [Python-Dev] PEP 3333: wsgi_string() function In-Reply-To: References: <1294109093.14661.4.camel@marge> <1294357806.2970.33.camel@stalk> <1294401061.14078.30.camel@marge> <925E2041-7B6F-4868-8E9D-856A15784C25@fuhm.net> <20110107170449.ECC643A411A@sparrow.telecommunity.com> Message-ID: On Sat, Jan 8, 2011 at 6:16 AM, Robert Brewer wrote: > Python 3.1 was released June 27th, 2009. We're coming up faster on the > two-year period than we seem to be on a revised WSGI spec. Maybe we > should shoot for a "bytes of a known encoding" type first. There were a few minor* practical issues in getting agreement on how such a type would actually behave. Instead, the approach WSGI adopted (or the stricter, 7-bit ASCII only approach used internally by urllib.parse to handle bytes in 3.2) was deemed sufficient, since it could be done right now without having to agree on how many different bikesheds were needed and what colours they should all be. Cheers, Nick. *i.e. major :) -- Nick Coghlan | ncoghlan at gmail.com |
Brisbane, Australia From solipsis at pitrou.net Sat Jan 8 11:03:11 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sat, 8 Jan 2011 11:03:11 +0100 Subject: [Python-Dev] r87849 - python/branches/py3k/Lib/test/test_ssl.py References: <20110108031605.A039BEEA40@mail.python.org> Message-ID: <20110108110311.0595e1fb@pitrou.net> On Sat, 8 Jan 2011 04:16:05 +0100 (CET) victor.stinner wrote: > Author: victor.stinner > Date: Sat Jan 8 04:16:05 2011 > New Revision: 87849 > > Log: > test_ssl: test SHA256 using sha256.tbs-internet.com instead of sha2.hboeck.de > > Modified: > python/branches/py3k/Lib/test/test_ssl.py > > Modified: python/branches/py3k/Lib/test/test_ssl.py > ============================================================================== > --- python/branches/py3k/Lib/test/test_ssl.py (original) > +++ python/branches/py3k/Lib/test/test_ssl.py Sat Jan 8 04:16:05 2011 > @@ -599,8 +599,8 @@ > # SHA256 was added in OpenSSL 0.9.8 > if ssl.OPENSSL_VERSION_INFO < (0, 9, 8, 0, 15): > self.skipTest("SHA256 not available on %r" % ssl.OPENSSL_VERSION) > - # NOTE: https://sha256.tbs-internet.com is another possible test host > - remote = ("sha2.hboeck.de", 443) > + # https://sha2.hboeck.de/ was used until 2011-01-08 (no route to host) > + remote = ("sha256.tbs-internet.com", 443) > sha256_cert = os.path.join(os.path.dirname(__file__), "sha256.pem") > with support.transient_internet("sha2.hboeck.de"): You obviously need to update the certificate file and also the host name above. From rwgk at yahoo.com Sat Jan 8 21:03:35 2011 From: rwgk at yahoo.com (Ralf W. Grosse-Kunstleve) Date: Sat, 8 Jan 2011 12:03:35 -0800 (PST) Subject: [Python-Dev] FYI: Python 2.7.1 + gcc 4.6 (experimental) probable optimizer problem Message-ID: <107884.6325.qm@web111405.mail.gq1.yahoo.com> I just wanted to share an observation in case Python developers are interested: Python 2.7.1 doesn't build with the current gcc 4.6 svn. Note that gcc 4.6 is now in "bug-fix only" mode. 
Some details: Fedora 14 64-bit. The first time I noticed the problem was in Nov or early Dec 2010; I'm pretty sure it worked in Oct, maybe still in early Nov. Python configured simply with ./configure g++ (GCC) 4.6.0 20101206 (experimental) % make /bin/sh: line 1: 41686 Segmentation fault (core dumped) CC='gcc -pthread' LDSHARED='gcc -pthread -shared ' OPT='-DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes' ./python -E ./setup.py build make: *** [sharedmods] Error 139 g++ (GCC) 4.6.0 20110108 (experimental) % make XXX lineno: 743, opcode: 0 Traceback (most recent call last): File "/net/theta/raid1/rwgk/junk/Python-2.7.1/Lib/site.py", line 62, in import os File "/net/theta/raid1/rwgk/junk/Python-2.7.1/Lib/os.py", line 743, in def urandom(n): SystemError: unknown opcode make: *** [sharedmods] Error 1 make finishes OK if I configure --with-pydebug. Therefore my guess is that there is an optimizer bug in the current gcc 4.6 that's only triggered by a specific construct in Python. (A lot of other stuff builds and runs fine.) BTW: I've been doing gcc pre-release testing regularly for many years, starting with gcc 3.3. This is the first time I have seen the Python build fail persistently for several weeks. From brett at python.org Sat Jan 8 21:37:02 2011 From: brett at python.org (Brett Cannon) Date: Sat, 8 Jan 2011 12:37:02 -0800 Subject: [Python-Dev] devguide: Point out that OS X users need to change examples to use python.exe instead of In-Reply-To: References: Message-ID: On Thu, Jan 6, 2011 at 13:04, Ned Deily wrote: > In article , > brett.cannon wrote: > [...] >> summary: >> Point out that OS X users need to change examples to use python.exe instead >> of python. >> Once Python is done building you will then have a working build of Python >> that can be run in-place; ``./python`` on most machines, ``./python.exe`` >> -on OS X. >> +on OS X (all examples throughout this documentation say ``./python`` but >> +implies you choose the proper name based on your OS).
> > That's true on OS X if you are using a case-insensitive file system. > But with the newer, case-sensitive HFS+, for example, you get ./python. Are you thinking of UFS, because I am running HFS+ and I still get python.exe since it's case-preserving. Regardless, I will add a note about the case-sensitivity. > > -- > Ned Deily, > nad at acm.org From nad at acm.org Sat Jan 8 22:13:38 2011 From: nad at acm.org (Ned Deily) Date: Sat, 08 Jan 2011 13:13:38 -0800 Subject: [Python-Dev] devguide: Point out that OS X users need to change examples to use python.exe instead of References: Message-ID: In article , Brett Cannon wrote: > On Thu, Jan 6, 2011 at 13:04, Ned Deily wrote: > > In article , > > brett.cannon wrote: > > [...] > >> summary: > >> Point out that OS X users need to change examples to use python.exe > >> instead > >> of python. > >> Once Python is done building you will then have a working build of Python > >> that can be run in-place; ``./python`` on most machines, ``./python.exe`` > >> -on OS X. > >> +on OS X (all examples throughout this documentation say ``./python`` but > >> +implies you choose the proper name based on your OS). > > > > That's true on OS X if you are using a case-insensitive file system. > > But with the newer, case-sensitive HFS+, for example, you get ./python. > > Are you thinking of UFS, because I am running HFS+ and I still get > python.exe since it's case-preserving. No, not UFS. Since at least 10.4, OS X has supported the creation of at least four variants of HFS+ via Disk Utility.app or diskutil(8). The 10.6 version of diskutil added a handy way to list all available file systems: $ diskutil listFileSystems Formattable filesystems [...]
--------------------------------------------------------------------
PERSONALITY                     USER VISIBLE NAME
--------------------------------------------------------------------
[...]
HFS+                            Mac OS Extended
Case-sensitive HFS+             Mac OS Extended (Case-sensitive)
  (or) hfsx
Case-sensitive Journaled HFS+   Mac OS Extended (Case-sensitive, Journaled)
  (or) jhfsx
Journaled HFS+                  Mac OS Extended (Journaled)
  (or) jhfs+

These days, one of the latter two is used to format the primary file system where OS X resides: I believe journaled is a requirement from at least 10.5 on, case-sensitive is optional. I've been using "jhfsx" for my primary development machine since 10.5 was released a few years ago. Since it is a file system type, AFAIK it is necessary to re-initialize the partition and reload files on it. > Regardless, I will add a note about the case-sensitivity. Thanks! -- Ned Deily, nad at acm.org From solipsis at pitrou.net Sat Jan 8 22:15:33 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sat, 8 Jan 2011 22:15:33 +0100 Subject: [Python-Dev] FYI: Python 2.7.1 + gcc 4.6 (experimental) probable optimizer problem References: <107884.6325.qm@web111405.mail.gq1.yahoo.com> Message-ID: <20110108221533.3ff7f4bb@pitrou.net> On Sat, 8 Jan 2011 12:03:35 -0800 (PST) "Ralf W. Grosse-Kunstleve" wrote: > I just wanted to share an observation in case Python developers are > interested: > Python 2.7.1 doesn't build with the current gcc 4.6 svn. > Note that gcc 4.6 is now in "bug-fix only" mode. You should report a bug with the gcc developers. By the way, can you try to build Python 3.2 too?
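As a rough illustration of the case-sensitivity point in the devguide thread above (Ned Deily's reply to Brett Cannon), the sketch below probes whether a directory appears to sit on a case-sensitive file system and, from that, guesses whether an in-place build there would be called ./python or ./python.exe. The function names is_case_sensitive and in_place_interpreter are hypothetical and not part of CPython or its build tooling; the naming rule is taken only from what the thread states, so treat it as an assumption rather than a description of the actual Makefile logic.

```python
import os
import sys
import tempfile


def is_case_sensitive(directory="."):
    """Return True if `directory` appears to be on a case-sensitive file system."""
    # The fixed lowercase prefix guarantees the probe name contains
    # lowercase letters, so its upper-cased form is a different string.
    with tempfile.NamedTemporaryFile(prefix="casetest_", dir=directory) as probe:
        head, tail = os.path.split(probe.name)
        # If the upper-cased name also resolves, lookups ignore case.
        return not os.path.exists(os.path.join(head, tail.upper()))


def in_place_interpreter(build_dir="."):
    """Guess the name of an in-place CPython build (hypothetical helper)."""
    # Assumption from the thread: on OS X the binary is python.exe only
    # when the file system is case-insensitive; elsewhere it is python.
    if sys.platform == "darwin" and not is_case_sensitive(build_dir):
        return "./python.exe"
    return "./python"


if __name__ == "__main__":
    print(is_case_sensitive("."), in_place_interpreter("."))
```

The probe relies on a temporary file rather than on the file system type name, so it should behave the same for any of the HFS+ variants listed above without having to parse diskutil output.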
From martin at v.loewis.de Sat Jan 8 22:16:09 2011 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat, 08 Jan 2011 22:16:09 +0100 Subject: [Python-Dev] FYI: Python 2.7.1 + gcc 4.6 (experimental) probable optimizer problem In-Reply-To: <107884.6325.qm@web111405.mail.gq1.yahoo.com> References: <107884.6325.qm@web111405.mail.gq1.yahoo.com> Message-ID: <4D28D419.8070608@v.loewis.de> > BTW: I've been doing gcc pre-release testing regularly for many year, starting > with gcc 3.3. This is the first time I see the Python build fail persistently > for several weeks. Wild guess: did configure detect that it needs to use -fno-strict-aliasing? Regards, Martin From stefan at bytereef.org Sat Jan 8 22:58:51 2011 From: stefan at bytereef.org (Stefan Krah) Date: Sat, 8 Jan 2011 22:58:51 +0100 Subject: [Python-Dev] FYI: Python 2.7.1 + gcc 4.6 (experimental) probable optimizer problem In-Reply-To: <20110108221533.3ff7f4bb@pitrou.net> References: <107884.6325.qm@web111405.mail.gq1.yahoo.com> <20110108221533.3ff7f4bb@pitrou.net> Message-ID: <20110108215851.GA25382@yoda.bytereef.org> Antoine Pitrou wrote: > On Sat, 8 Jan 2011 12:03:35 -0800 (PST) > "Ralf W. Grosse-Kunstleve" wrote: > > I just wanted to share an observation in case Python developers are > > interested: > > Python 2.7.1 doesn't build with the current gcc 4.6 svn. > > Note that gcc 4.6 is now in "bug-fix only" mode. > > You should report a bug with the gcc developers. > By the way, can you try to build Python 3.2 too? I can reproduce this with release27-maint on Fedora-14/amd64/gcc-4.6. -fno-strict-aliasing is enabled. py3k is fine. Hard to tell if it's a gcc bug or not. 
gcc-4.6 increased the ANSI compliance requirements yet again, exposing third party bugs like this one: http://sourceware.org/ml/libc-alpha/2010-12/msg00009.html There is an issue for this: http://bugs.python.org/issue9880 Stefan Krah From solipsis at pitrou.net Sat Jan 8 23:22:35 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sat, 8 Jan 2011 23:22:35 +0100 Subject: [Python-Dev] devguide: Add an intermediate task of helping triage issues (not to be confused with the References: Message-ID: <20110108232235.735fea7c@pitrou.net> On Sat, 08 Jan 2011 23:05:06 +0100 brett.cannon wrote: > +For bugs, an issue needs to: > + > +* Clearly explain the bug so it can be reproduced > +* All relevant platform details are included > +* What version(s) of Python are affected by the bug are fully known > +* Is there a proper unit test that can reproduce the bug? > + > +These are things anyone can help with. FWIW, I'm really not fond of handing out triage tasks to beginners. First because the claim that it doesn't require any specific knowledge is wrong (in the case of Python, because it is a highly technical product; it might be right for office suites, who knows). Second because a newbie triager gets to interact with other newbies who might be very confused if they are given misleading comments or asked misleading (or completely irrelevant) questions. Things may be different when the person in question has been a long-time community member, or has specific expertise, and is therefore able to communicate meaningful advice. But for true beginners, I think it would be much better to let them write a patch or a doc fix. Regards Antoine. 
From solipsis at pitrou.net Sat Jan 8 23:26:03 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sat, 8 Jan 2011 23:26:03 +0100 Subject: [Python-Dev] FYI: Python 2.7.1 + gcc 4.6 (experimental) probable optimizer problem References: <107884.6325.qm@web111405.mail.gq1.yahoo.com> <20110108221533.3ff7f4bb@pitrou.net> <20110108215851.GA25382@yoda.bytereef.org> Message-ID: <20110108232603.63fbbc1c@pitrou.net> On Sat, 8 Jan 2011 22:58:51 +0100 Stefan Krah wrote: > Antoine Pitrou wrote: > > On Sat, 8 Jan 2011 12:03:35 -0800 (PST) > > "Ralf W. Grosse-Kunstleve" wrote: > > > I just wanted to share an observation in case Python developers are > > > interested: > > > Python 2.7.1 doesn't build with the current gcc 4.6 svn. > > > Note that gcc 4.6 is now in "bug-fix only" mode. > > > > You should report a bug with the gcc developers. > > By the way, can you try to build Python 3.2 too? > > I can reproduce this with release27-maint on Fedora-14/amd64/gcc-4.6. > -fno-strict-aliasing is enabled. It might be interesting to have a buildbot with a bleeding edge toolchain. Although in this case nobody rushed to diagnose the three-month old issue anyway: > There is an issue for this: > > http://bugs.python.org/issue9880 Regards Antoine. From stefan at bytereef.org Sun Jan 9 00:44:37 2011 From: stefan at bytereef.org (Stefan Krah) Date: Sun, 9 Jan 2011 00:44:37 +0100 Subject: [Python-Dev] FYI: Python 2.7.1 + gcc 4.6 (experimental) probable optimizer problem In-Reply-To: <20110108232603.63fbbc1c@pitrou.net> References: <107884.6325.qm@web111405.mail.gq1.yahoo.com> <20110108221533.3ff7f4bb@pitrou.net> <20110108215851.GA25382@yoda.bytereef.org> <20110108232603.63fbbc1c@pitrou.net> Message-ID: <20110108234437.GA26020@yoda.bytereef.org> Antoine Pitrou wrote: > > I can reproduce this with release27-maint on Fedora-14/amd64/gcc-4.6. > > -fno-strict-aliasing is enabled. > > It might be interesting to have a buildbot with a bleeding edge > toolchain. 
Although in this case nobody rushed to diagnose the > three-month old issue anyway: I narrowed the issue down to -ftree-vectorize, which is part of -O3. Searching briefly for 'ftree-vectorize + bug' makes me think that we should wait for the stable gcc-4.6. Stefan Krah From stephen at xemacs.org Sun Jan 9 08:47:45 2011 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Sun, 09 Jan 2011 16:47:45 +0900 Subject: [Python-Dev] PEP 3333: wsgi_string() function In-Reply-To: References: <1294109093.14661.4.camel@marge> <1294357806.2970.33.camel@stalk> <1294401061.14078.30.camel@marge> <925E2041-7B6F-4868-8E9D-856A15784C25@fuhm.net> <20110107170449.ECC643A411A@sparrow.telecommunity.com> Message-ID: <87d3o6a57i.fsf@uwakimon.sk.tsukuba.ac.jp> Robert Brewer writes: > Python 3.1 was released June 27th, 2009. We're coming up faster on the > two-year period than we seem to be on a revised WSGI spec. Maybe we > should shoot for a "bytes of a known encoding" type first. You have one. It's called "ISO 2022: Information processing -- ISO 7-bit and 8-bit coded character sets -- Code extension techniques". The popularity of that standard speaks for itself. From g.brandl at gmx.net Sun Jan 9 09:26:37 2011 From: g.brandl at gmx.net (Georg Brandl) Date: Sun, 09 Jan 2011 09:26:37 +0100 Subject: [Python-Dev] devguide: Add an intermediate task of helping triage issues (not to be confused with the In-Reply-To: <20110108232235.735fea7c@pitrou.net> References: <20110108232235.735fea7c@pitrou.net> Message-ID: Am 08.01.2011 23:22, schrieb Antoine Pitrou: > On Sat, 08 Jan 2011 23:05:06 +0100 > brett.cannon wrote: >> +For bugs, an issue needs to: >> + >> +* Clearly explain the bug so it can be reproduced >> +* All relevant platform details are included >> +* What version(s) of Python are affected by the bug are fully known >> +* Is there a proper unit test that can reproduce the bug? >> + >> +These are things anyone can help with. 
> > FWIW, I'm really not fond of handing out triage tasks to beginners. > First because the claim that it doesn't require any specific knowledge > is wrong (in the case of Python, because it is a highly technical > product; it might be right for office suites, who knows). > Second because a newbie triager gets to interact with other newbies who > might be very confused if they are given misleading comments or asked > misleading (or completely irrelevant) questions. +1. Remember, this is not a purely hypothetical statement. > Things may be different when the person in question has been a long-time > community member, or has specific expertise, and is therefore able to > communicate meaningful advice. But for true beginners, I think it would > be much better to let them write a patch or a doc fix. Yep. Georg From barry at python.org Sun Jan 9 13:57:04 2011 From: barry at python.org (Barry Warsaw) Date: Sun, 9 Jan 2011 07:57:04 -0500 Subject: [Python-Dev] RELEASED python-mode.el 5.2.0 Message-ID: <20110109075704.40ca6a04@python.org> On behalf of the python-mode developers I'm happy to announce the release of python-mode.el 5.2.0. A summary of the changes since 5.1.0 is included below. python-mode.el is a major mode for editing Python code in Emacs and XEmacs. This version has been supported and developed by core Python developers since 1992, and predates by many years the python.el mode that comes with GNU Emacs. It provides many useful features, including Ken Manheimer's awesome pdbtrack for command line debugging. Many thanks to Andreas Roehler and Georg Brandl for the majority of the work on this version. You can download the python-mode.el file or the full tarball from: https://launchpad.net/python-mode Bugs can be filed at: https://bugs.launchpad.net/python-mode Enjoy, -Barry New in version 5.2.0 -------------------- - Fixed filling of triple-quoted strings. - Add new font-lock faces for class names and exception names. 
- Do not fill when calling fill-paragraph with point in a region of code. - Fixed font-locking of exception names in parenthesized lists. - Fixed font-locking of decorators with arguments. - Fixed font-locking of triple-quoted strings; single quotes appearing in triple-quoted strings no longer upset font-locking. - Fixed the stack-entry regexp used by pdbtrack so that it now works with module-level frames. - Do not bind C-c C-h; `py-help-at-point' is now on C-c C-e by default. - hide-show mode is now supported. - When shifting regions right and left, keep the region active in Emacs. -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: not available URL: From fuzzyman at voidspace.org.uk Fri Jan 7 18:47:39 2011 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Fri, 07 Jan 2011 17:47:39 +0000 Subject: [Python-Dev] Summary of Python tracker Issues In-Reply-To: <20110107170704.EB1FE1CCA4@psf.upfronthosting.co.za> References: <20110107170704.EB1FE1CCA4@psf.upfronthosting.co.za> Message-ID: <4D2751BB.4070300@voidspace.org.uk> On 07/01/2011 17:07, Python tracker wrote: > ACTIVITY SUMMARY (2010-12-31 - 2011-01-07) > Python tracker at http://bugs.python.org/ > > To view or respond to any of the issues listed below, click on the issue. > Do NOT respond to this message. > > Issues counts and deltas: > open 2501 (-24) > closed 20138 (+80) > total 22639 (+56) > Nice work everyone. :-) At this rate we'll be down to zero open issues in only 2 years. 
;-) Michael > Open issues with patches: 1045 > > > Issues opened (40) > ================== > > #4188: test_threading hang when running as verbose > http://bugs.python.org/issue4188 reopened by r.david.murray > > #8109: Server-side support for TLS Server Name Indication extension > http://bugs.python.org/issue8109 reopened by pitrou > > #10789: Lock.acquire documentation is misleading > http://bugs.python.org/issue10789 reopened by terry.reedy > > #10803: ctypes: better support of bytearray objects > http://bugs.python.org/issue10803 opened by mfxmfx > > #10805: traceback.print_exception throws AttributeError when exception > http://bugs.python.org/issue10805 opened by abingham > > #10808: ssl unwrap fails with Error 0 > http://bugs.python.org/issue10808 opened by apollo13 > > #10811: sqlite segfault with generators > http://bugs.python.org/issue10811 opened by Erick.Tryzelaar > > #10812: Add some posix functions > http://bugs.python.org/issue10812 opened by rosslagerwall > > #10813: Suppress adding decimal point for places=0 in moneyfmt() > http://bugs.python.org/issue10813 opened by cgrohmann > > #10817: urllib.request.urlretrieve never raises ContentTooShortError i > http://bugs.python.org/issue10817 opened by RC > > #10818: pydoc: Remove old server and tk panel > http://bugs.python.org/issue10818 opened by haypo > > #10820: 3.2 Makefile changes for versioned scripts break OS X framewor > http://bugs.python.org/issue10820 opened by ned.deily > > #10822: test_getgroups failure under Solaris > http://bugs.python.org/issue10822 opened by pitrou > > #10826: pass_fds sometimes fails > http://bugs.python.org/issue10826 opened by pitrou > > #10827: Functions in time module should support year< 1900 when accep > http://bugs.python.org/issue10827 opened by belopolsky > > #10829: PyUnicode_FromFormatV() bugs with "%" and "%%" format strings > http://bugs.python.org/issue10829 opened by haypo > > #10830: PyUnicode_FromFormatV("%c") doesn't support non-BMP characters > 
http://bugs.python.org/issue10830 opened by haypo > > #10831: PyUnicode_FromFormatV() doesn't support %li, %lli, %zi > http://bugs.python.org/issue10831 opened by haypo > > #10832: Add support of bytes objects in PyBytes_FromFormatV() > http://bugs.python.org/issue10832 opened by haypo > > #10833: Replace %.100s by %s in PyErr_Format(): the arbitrary limit of > http://bugs.python.org/issue10833 opened by haypo > > #10834: Python 2.7 x86 fails to run in Windows 7 > http://bugs.python.org/issue10834 opened by excubated > > #10835: sys.executable default and altinstall > http://bugs.python.org/issue10835 opened by allan > > #10836: TypeError during exception handling in urllib.request.urlretri > http://bugs.python.org/issue10836 opened by Alexandru.Mo?^(TM)oi > > #10837: Issue catching KeyboardInterrupt while reading stdin > http://bugs.python.org/issue10837 opened by Josh.Hanson > > #10838: subprocess __all__ is incomplete > http://bugs.python.org/issue10838 opened by a.badger > > #10839: email module should not allow some header field repetitions > http://bugs.python.org/issue10839 opened by adrien-saladin > > #10841: binary stdio > http://bugs.python.org/issue10841 opened by v+python > > #10842: Update third-party libraries for OS X installer builds > http://bugs.python.org/issue10842 opened by ned.deily > > #10843: OS X installer: install the Tools source directory > http://bugs.python.org/issue10843 opened by ned.deily > > #10845: test_multiprocessing failure under Windows > http://bugs.python.org/issue10845 opened by pitrou > > #10847: Distutils drops -fno-strict-aliasing when CFLAGS are set > http://bugs.python.org/issue10847 opened by skrah > > #10848: Move test.regrtest from getopt to argparse > http://bugs.python.org/issue10848 opened by brett.cannon > > #10849: Backport test/__main__ > http://bugs.python.org/issue10849 opened by belopolsky > > #10850: inconsistent behavior concerning multiprocessing.manager.BaseM > http://bugs.python.org/issue10850 opened 
by chrysn > > #10851: further extend ssl SNI and ciphers API > http://bugs.python.org/issue10851 opened by grooverdan > > #10852: SSL/TLS sni use in smtp,pop,imap,nntp,ftp client libs by defau > http://bugs.python.org/issue10852 opened by grooverdan > > #10854: Output DLL name in error message of ImportError when DLL is mi > http://bugs.python.org/issue10854 opened by techtonik > > #10855: wave.Wave_read.close() doesn't release file > http://bugs.python.org/issue10855 opened by pjcreath > > #10856: documentation for ImportError parameters and attributes > http://bugs.python.org/issue10856 opened by techtonik > > #10828: Cannot use nonascii utf8 in names of files imported from > http://bugs.python.org/issue10828 opened by ingemar > > > > Most recent 15 issues with no replies (15) > ========================================== > > #10856: documentation for ImportError parameters and attributes > http://bugs.python.org/issue10856 > > #10855: wave.Wave_read.close() doesn't release file > http://bugs.python.org/issue10855 > > #10850: inconsistent behavior concerning multiprocessing.manager.BaseM > http://bugs.python.org/issue10850 > > #10847: Distutils drops -fno-strict-aliasing when CFLAGS are set > http://bugs.python.org/issue10847 > > #10837: Issue catching KeyboardInterrupt while reading stdin > http://bugs.python.org/issue10837 > > #10836: TypeError during exception handling in urllib.request.urlretri > http://bugs.python.org/issue10836 > > #10831: PyUnicode_FromFormatV() doesn't support %li, %lli, %zi > http://bugs.python.org/issue10831 > > #10830: PyUnicode_FromFormatV("%c") doesn't support non-BMP characters > http://bugs.python.org/issue10830 > > #10822: test_getgroups failure under Solaris > http://bugs.python.org/issue10822 > > #10820: 3.2 Makefile changes for versioned scripts break OS X framewor > http://bugs.python.org/issue10820 > > #10817: urllib.request.urlretrieve never raises ContentTooShortError i > http://bugs.python.org/issue10817 > > #10811: sqlite 
segfault with generators > http://bugs.python.org/issue10811 > > #10808: ssl unwrap fails with Error 0 > http://bugs.python.org/issue10808 > > #10803: ctypes: better support of bytearray objects > http://bugs.python.org/issue10803 > > #10799: Improve webbrowser.open doc (and, someday, behavior?) > http://bugs.python.org/issue10799 > > > > Most recent 15 issues waiting for review (15) > ============================================= > > #10852: SSL/TLS sni use in smtp,pop,imap,nntp,ftp client libs by defau > http://bugs.python.org/issue10852 > > #10851: further extend ssl SNI and ciphers API > http://bugs.python.org/issue10851 > > #10843: OS X installer: install the Tools source directory > http://bugs.python.org/issue10843 > > #10842: Update third-party libraries for OS X installer builds > http://bugs.python.org/issue10842 > > #10841: binary stdio > http://bugs.python.org/issue10841 > > #10833: Replace %.100s by %s in PyErr_Format(): the arbitrary limit of > http://bugs.python.org/issue10833 > > #10832: Add support of bytes objects in PyBytes_FromFormatV() > http://bugs.python.org/issue10832 > > #10831: PyUnicode_FromFormatV() doesn't support %li, %lli, %zi > http://bugs.python.org/issue10831 > > #10830: PyUnicode_FromFormatV("%c") doesn't support non-BMP characters > http://bugs.python.org/issue10830 > > #10829: PyUnicode_FromFormatV() bugs with "%" and "%%" format strings > http://bugs.python.org/issue10829 > > #10827: Functions in time module should support year< 1900 when accep > http://bugs.python.org/issue10827 > > #10820: 3.2 Makefile changes for versioned scripts break OS X framewor > http://bugs.python.org/issue10820 > > #10818: pydoc: Remove old server and tk panel > http://bugs.python.org/issue10818 > > #10812: Add some posix functions > http://bugs.python.org/issue10812 > > #10798: test_concurrent_futures fails on FreeBSD > http://bugs.python.org/issue10798 > > > > Top 10 most discussed issues (10) > ================================= > > #4953: cgi 
module cannot handle POST with multipart/form-data in 3.0 > http://bugs.python.org/issue4953 43 msgs > > #10841: binary stdio > http://bugs.python.org/issue10841 24 msgs > > #10181: Problems with Py_buffer management in memoryobject.c (and else > http://bugs.python.org/issue10181 21 msgs > > #10512: regrtest ResourceWarning - unclosed sockets and files > http://bugs.python.org/issue10512 15 msgs > > #5945: PyMapping_Check returns 1 for lists > http://bugs.python.org/issue5945 14 msgs > > #9566: Compilation warnings under x64 Windows > http://bugs.python.org/issue9566 10 msgs > > #10834: Python 2.7 x86 fails to run in Windows 7 > http://bugs.python.org/issue10834 10 msgs > > #2193: Cookie Colon Name Bug > http://bugs.python.org/issue2193 8 msgs > > #10812: Add some posix functions > http://bugs.python.org/issue10812 8 msgs > > #1674555: sys.path in tests contains system directories > http://bugs.python.org/issue1674555 8 msgs > > > > Issues closed (76) > ================== > > #1187: pipe fd handling issues in subprocess.py on POSIX > http://bugs.python.org/issue1187 closed by pitrou > > #1452: subprocess's popen.stdout.seek(0) doesn't raise an error > http://bugs.python.org/issue1452 closed by pitrou > > #3466: urllib2 should support HTTPS connections with client keys > http://bugs.python.org/issue3466 closed by pitrou > > #3839: wsgi.simple_server resets 'Content-Length' header on empty con > http://bugs.python.org/issue3839 closed by pitrou > > #4662: posix module lacks several DeprecationWarning's > http://bugs.python.org/issue4662 closed by pitrou > > #5369: __ppc__ macro checking is incorrect > http://bugs.python.org/issue5369 closed by pitrou > > #5485: pyexpat has no unit tests for UseForeignDTD functionality > http://bugs.python.org/issue5485 closed by pitrou > > #6269: threading documentation makes no mention of the GIL > http://bugs.python.org/issue6269 closed by pitrou > > #6285: Silent abort on XP help document display > http://bugs.python.org/issue6285 
closed by terry.reedy > > #6293: Have regrtest.py echo back sys.flags > http://bugs.python.org/issue6293 closed by pitrou > > #6610: Subprocess descriptor debacle > http://bugs.python.org/issue6610 closed by georg.brandl > > #6643: Throw away more radioactive locks that could be held across a > http://bugs.python.org/issue6643 closed by gregory.p.smith > > #6664: readlines should understand Line Separator and Paragraph Separ > http://bugs.python.org/issue6664 closed by pitrou > > #6800: os.exec* raises "OSError: [Errno 45] Operation not supported" > http://bugs.python.org/issue6800 closed by pitrou > > #7716: IPv6 detection, don't assume existence of /usr/xpg4/bin/grep > http://bugs.python.org/issue7716 closed by pitrou > > #7858: os.utime(file, (0,0,)) fails on on vfat, but doesn't fail imme > http://bugs.python.org/issue7858 closed by pitrou > > #7995: On Mac / BSD sockets returned by accept inherit the parent's F > http://bugs.python.org/issue7995 closed by pitrou > > #8013: time.asctime segfaults when given a time in the far future > http://bugs.python.org/issue8013 closed by georg.brandl > > #8278: os.utime doesn't allow a atime (Last Access) which is 27 years > http://bugs.python.org/issue8278 closed by amaury.forgeotdarc > > #8458: buildbot: test_cmd_line failure on Tiger: [Errno 9] Bad file d > http://bugs.python.org/issue8458 closed by pitrou > > #8499: Set a timeout in test_urllibnet > http://bugs.python.org/issue8499 closed by sandro.tosi > > #8626: TypeError: rsplit() takes no keyword arguments > http://bugs.python.org/issue8626 closed by eric.araujo > > #8719: buildbot: segfault on FreeBSD (signal 11) > http://bugs.python.org/issue8719 closed by haypo > > #8731: BeOS for 2.7 - add to unsupported > http://bugs.python.org/issue8731 closed by pitrou > > #8992: convertsimple() doesn't need to call converterr() if an except > http://bugs.python.org/issue8992 closed by haypo > > #9074: subprocess closes standard file descriptors when it should not > 
http://bugs.python.org/issue9074 closed by georg.brandl > > #9115: test_site: support for systems without unsetenv > http://bugs.python.org/issue9115 closed by eric.araujo > > #9332: Document requirements for os.symlink usage on Windows > http://bugs.python.org/issue9332 closed by brian.curtin > > #9361: Tests for leapdays in calendar.py module > http://bugs.python.org/issue9361 closed by r.david.murray > > #9370: Add reader redirect from test package docs to unittest module > http://bugs.python.org/issue9370 closed by ncoghlan > > #9671: test_executable_without_cwd fails: AssertionError: 1 != 47 > http://bugs.python.org/issue9671 closed by sandro.tosi > > #9854: SocketIO should return None on EWOULDBLOCK > http://bugs.python.org/issue9854 closed by pitrou > > #9905: subprocess.Popen fails with stdout=PIPE, stderr=PIPE if standa > http://bugs.python.org/issue9905 closed by pitrou > > #9977: TestCase.assertItemsEqual's description of differences > http://bugs.python.org/issue9977 closed by michael.foord > > #10001: ~Py_buffer.obj field is undocumented, though not hidden > http://bugs.python.org/issue10001 closed by ncoghlan > > #10028: test_concurrent_futures fails on Windows Server 2003 > http://bugs.python.org/issue10028 closed by bquinlan > > #10104: test_socket failures on Debian unstable > http://bugs.python.org/issue10104 closed by pitrou > > #10130: Create epub format docs and offer them on the download page > http://bugs.python.org/issue10130 closed by georg.brandl > > #10267: test_ttk_guionly leaks many references > http://bugs.python.org/issue10267 closed by pitrou > > #10270: Fix resource warnings in test_threading > http://bugs.python.org/issue10270 closed by sandro.tosi > > #10333: Remove ancient backwards compatibility GC API > http://bugs.python.org/issue10333 closed by pitrou > > #10475: hardcoded compilers for LDSHARED/LDCXXSHARED on NetBSD > http://bugs.python.org/issue10475 closed by pitrou > > #10492: test_doctest fails with iso-8859-15 locale > 
http://bugs.python.org/issue10492 closed by haypo > > #10502: Add unittestguirunner to Tools/ > http://bugs.python.org/issue10502 closed by michael.foord > > #10563: Spurious newline in time.ctime > http://bugs.python.org/issue10563 closed by georg.brandl > > #10619: Failed module loading in test discovery loses traceback > http://bugs.python.org/issue10619 closed by michael.foord > > #10620: `python -m uniittest` should work with file paths as well as t > http://bugs.python.org/issue10620 closed by michael.foord > > #10655: Wrong powerpc define in Python/ceval.c > http://bugs.python.org/issue10655 closed by dmalcolm > > #10737: test_concurrent_futures failure on Windows > http://bugs.python.org/issue10737 closed by bquinlan > > #10751: REMOTE_USER and Remote-User collision in wsgiref > http://bugs.python.org/issue10751 closed by Alex.Raitz > > #10786: unittest.TextTextRunner does not respect redirected stderr > http://bugs.python.org/issue10786 closed by michael.foord > > #10788: test_logging failure > http://bugs.python.org/issue10788 closed by bquinlan > > #10790: Header.append's charset logic is bogus, 'shift_jis' and "euc_j > http://bugs.python.org/issue10790 closed by r.david.murray > > #10801: zipfile.ZipFile().extractall() header mismatch for non-ASCII c > http://bugs.python.org/issue10801 closed by georg.brandl > > #10802: python3.2 AFTER b2 release has subprocess.Popen broken under c > http://bugs.python.org/issue10802 closed by georg.brandl > > #10804: Copy and paste error in _json.c > http://bugs.python.org/issue10804 closed by georg.brandl > > #10806: Subprocess error if fds 0,1,2 are closed > http://bugs.python.org/issue10806 closed by pitrou > > #10807: `b'dGVzdA==\n'.decode('base64')` raise exception > http://bugs.python.org/issue10807 closed by haypo > > #10809: complex() comments wrongly say it supports NaN and inf > http://bugs.python.org/issue10809 closed by dalke > > #10810: logging.handlers.TimedRotatingFileHandler.__init__(): ST_MTIME > 
http://bugs.python.org/issue10810 closed by georg.brandl > > #10814: assertion failed on Windows buildbots > http://bugs.python.org/issue10814 closed by belopolsky > > #10815: Write to /dev/full does not raise IOError > http://bugs.python.org/issue10815 closed by haypo > > #10816: test_multiprocessing: unclosed sockets > http://bugs.python.org/issue10816 closed by haypo > > #10819: ValueError on repr(closed_socket_file) > http://bugs.python.org/issue10819 closed by haypo > > #10821: gethostbyname(gethostname()) is wrong when IP is changed > http://bugs.python.org/issue10821 closed by georg.brandl > > #10823: "conversion from 'Py_ssize_t' to 'int', possible loss of data" > http://bugs.python.org/issue10823 closed by pitrou > > #10824: urandom should not block > http://bugs.python.org/issue10824 closed by pitrou > > #10825: use assertIsNone(...) instead of assertEquals(None, ...) > http://bugs.python.org/issue10825 closed by rhettinger > > #10840: pyarg_parsetuple docs and py_buffer > http://bugs.python.org/issue10840 closed by pitrou > > #10844: OS X installer: update copyright dates in app bundles > http://bugs.python.org/issue10844 closed by georg.brandl > > #10846: typo in threading doc's: "size of the resource size" > http://bugs.python.org/issue10846 closed by georg.brandl > > #10853: SSL/TLS sni use in smtp,pop,imap,nntp,ftp client libs by defau > http://bugs.python.org/issue10853 closed by pitrou > > #10857: ImportError module attribute > http://bugs.python.org/issue10857 closed by brian.curtin > > #976613: socket timeout problems on Solaris > http://bugs.python.org/issue976613 closed by pitrou > > #1665333: Documentation missing for OptionGroup class in optparse > http://bugs.python.org/issue1665333 closed by georg.brandl > > #1677694: test_timeout refactoring > http://bugs.python.org/issue1677694 closed by pitrou > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > 
http://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: http://mail.python.org/mailman/options/python-dev/fuzzyman%40voidspace.org.uk

--
http://www.voidspace.org.uk/

May you do good and not evil
May you find forgiveness for yourself and forgive others
May you share freely, never taking more than you give.
-- the sqlite blessing http://www.sqlite.org/different.html
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From ulidtko at gmail.com Sat Jan 8 08:55:19 2011
From: ulidtko at gmail.com (max ulidtko)
Date: Sat, 08 Jan 2011 09:55:19 +0200
Subject: [Python-Dev] Add sendfile() to core?
Message-ID: <1294473319.9834.48.camel@ulidtko>

On Wed, 20 Mar 2002 14:53:58 -0500, Andrew Kuchling wrote:
| sendfile() is used when writing really high-performance Web servers,
| in order to save an unnecessary memory-to-memory copy. Question:
| should I make up a patch to add a sendfile() wrapper to Python?

So, was this proposal rejected? If so, for what reasons?

Wrapper of such a useful call would be of great convenience, especially
considering its availability on Win32 (see Thomas Heller's notice, [1]).

Anyway, when one needs to send (arbitrarily large) files to a socket and
back, ugly and slow workarounds are born, like this one:

def copy_file(file1, file2, length, blocksize=40960):
    """ Transfer exactly length bytes from one file-like object to another """
    sofar = 0
    while sofar < length:
        amount = blocksize if sofar + blocksize <= length \
                           else length - sofar
        file2.write(file1.read(amount))
        sofar += amount

Using hypothetical os.sendfile() would be so much better!

The only difficulty I can see is the choice of name for the wrapper.
IMO, using "sendfile" from Linux and FreeBSD is pretty much okay; but
objections may arise.

[1] http://mail.python.org/pipermail/python-dev/2002-March/021543.html

------
Sincerely,
max ulidtko

From solipsis at pitrou.net Sun Jan 9 20:11:44 2011
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Sun, 9 Jan 2011 20:11:44 +0100
Subject: [Python-Dev] Add sendfile() to core?
References: <1294473319.9834.48.camel@ulidtko>
Message-ID: <20110109201144.7ff6674f@pitrou.net>

On Sat, 08 Jan 2011 09:55:19 +0200
max ulidtko wrote:
> On Wed, 20 Mar 2002 14:53:58 -0500, Andrew Kuchling wrote:
> | sendfile() is used when writing really high-performance Web servers,
> | in order to save an unnecessary memory-to-memory copy. Question:
> | should I make up a patch to add a sendfile() wrapper to Python?
>
> So, was this proposal rejected? If so, for what reasons?

I saw no patch for it, so Andrew probably didn't get to it.

Regards

Antoine.

From guido at python.org Sun Jan 9 20:17:40 2011
From: guido at python.org (Guido van Rossum)
Date: Sun, 9 Jan 2011 11:17:40 -0800
Subject: [Python-Dev] Add sendfile() to core?
In-Reply-To: <1294473319.9834.48.camel@ulidtko>
References: <1294473319.9834.48.camel@ulidtko>
Message-ID:

Isn't that just shutil.copyfileobj()?

On Fri, Jan 7, 2011 at 11:55 PM, max ulidtko wrote:
> On Wed, 20 Mar 2002 14:53:58 -0500, Andrew Kuchling wrote:
> | sendfile() is used when writing really high-performance Web servers,
> | in order to save an unnecessary memory-to-memory copy. Question:
> | should I make up a patch to add a sendfile() wrapper to Python?
>
> So, was this proposal rejected? If so, for what reasons?
>
> Wrapper of such a useful call would be of great convenience, especially
> considering its availability on Win32 (see Thomas Heller's notice, [1]).
>
> Anyway, when one needs to send (arbitrarily large) files to a socket and
> back, ugly and slow workarounds are born, like this one:
>
> def copy_file(file1, file2, length, blocksize=40960):
>     """ Transfer exactly length bytes from one file-like object to another """
>     sofar = 0
>     while sofar < length:
>         amount = blocksize if sofar + blocksize <= length \
>                            else length - sofar
>         file2.write(file1.read(amount))
>         sofar += amount
>
>
> Using hypothetical os.sendfile() would be so much better!
>
> The only difficulty I can see is the choice of name for the wrapper.
> IMO, using "sendfile" from Linux and FreeBSD is pretty much okay; but
> objections may arise.
>
> [1] http://mail.python.org/pipermail/python-dev/2002-March/021543.html
>
> ------
> Sincerely,
> max ulidtko
>
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> http://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: http://mail.python.org/mailman/options/python-dev/guido%40python.org
>

--
--Guido van Rossum (python.org/~guido)

From g.rodola at gmail.com Sun Jan 9 21:09:40 2011
From: g.rodola at gmail.com (=?ISO-8859-1?Q?Giampaolo_Rodol=E0?=)
Date: Sun, 9 Jan 2011 21:09:40 +0100
Subject: [Python-Dev] Add sendfile() to core?
In-Reply-To: <1294473319.9834.48.camel@ulidtko>
References: <1294473319.9834.48.camel@ulidtko>
Message-ID:

A strong +1.
Projects such as Twisted would certainly benefit from such an addition.
I'm not sure the os module is the right place for sendfile() to land though.
Implementation between different platforms tends to vary quite a bit.
A good resource is the samba source code which contains an
implementation for all major UNIX systems.

--- Giampaolo
http://code.google.com/p/pyftpdlib/
http://code.google.com/p/psutil/

2011/1/8 max ulidtko :
> On Wed, 20 Mar 2002 14:53:58 -0500, Andrew Kuchling wrote:
> | sendfile() is used when writing really high-performance Web servers,
> | in order to save an unnecessary memory-to-memory copy. Question:
> | should I make up a patch to add a sendfile() wrapper to Python?
>
> So, was this proposal rejected? If so, for what reasons?
>
> Wrapper of such a useful call would be of great convenience, especially
> considering its availability on Win32 (see Thomas Heller's notice, [1]).
>
> Anyway, when one needs to send (arbitrarily large) files to a socket and
> back, ugly and slow workarounds are born, like this one:
>
> def copy_file(file1, file2, length, blocksize=40960):
>     """ Transfer exactly length bytes from one file-like object to another """
>     sofar = 0
>     while sofar < length:
>         amount = blocksize if sofar + blocksize <= length \
>                            else length - sofar
>         file2.write(file1.read(amount))
>         sofar += amount
>
>
> Using hypothetical os.sendfile() would be so much better!
>
> The only difficulty I can see is the choice of name for the wrapper.
> IMO, using "sendfile" from Linux and FreeBSD is pretty much okay; but
> objections may arise.
>
> [1] http://mail.python.org/pipermail/python-dev/2002-March/021543.html
>
> ------
> Sincerely,
> max ulidtko
>
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> http://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: http://mail.python.org/mailman/options/python-dev/g.rodola%40gmail.com
>

From solipsis at pitrou.net Sun Jan 9 21:31:12 2011
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Sun, 9 Jan 2011 21:31:12 +0100
Subject: [Python-Dev] Add sendfile() to core?
References: <1294473319.9834.48.camel@ulidtko>
Message-ID: <20110109213112.34a2fc55@pitrou.net>

On Sun, 9 Jan 2011 11:17:40 -0800
Guido van Rossum wrote:
> Isn't that just shutil.copyfileobj()?

copyfileobj() still uses an user-space buffer (the Python bytes
object used in the loop). The advantage of sendfile() is to bypass
user-space logic and do the transfer entirely in kernel. How much it
allows to gain *in practice* on a modern capable OS such as Linux I
don't know.

Regards

Antoine.

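[Editorial sketch, not part of the original thread.] The distinction
discussed above, a user-space copy loop versus a kernel-side transfer,
can be illustrated roughly as follows. This is only a sketch under stated
assumptions: copy_exactly and sendfile_exactly are hypothetical names
chosen for illustration, and os.sendfile() is assumed to exist, which was
not the case when this thread was written (it was added in Python 3.3, and
only on platforms that provide the system call). Which kinds of file
descriptor are accepted also varies by platform.

```python
import io
import os


def copy_exactly(src, dst, length, blocksize=64 * 1024):
    """Copy exactly `length` bytes from src to dst via a user-space buffer.

    Every chunk becomes a Python bytes object before it is written out.
    Raises EOFError if src is exhausted before `length` bytes were copied.
    """
    remaining = length
    while remaining > 0:
        chunk = src.read(min(blocksize, remaining))
        if not chunk:
            raise EOFError("source ended with %d bytes left" % remaining)
        dst.write(chunk)
        # Count what was actually read; read() may return fewer bytes.
        remaining -= len(chunk)
    return length


def sendfile_exactly(out_fd, in_fd, length, offset=0):
    """Hypothetical helper: the same job delegated to the kernel.

    Assumes os.sendfile(out_fd, in_fd, offset, count) is available on
    this platform; no Python-level buffer is created for the data.
    """
    sent = 0
    while sent < length:
        n = os.sendfile(out_fd, in_fd, offset + sent, length - sent)
        if n == 0:
            raise EOFError("source ended with %d bytes left" % (length - sent))
        sent += n
    return sent


if __name__ == "__main__":
    source = io.BytesIO(b"0123456789")
    target = io.BytesIO()
    copy_exactly(source, target, 4, blocksize=3)
    print(target.getvalue())
```

The first function works with any file-like objects, including sockets
wrapped with makefile(); the second needs real file descriptors, which is
one reason a wrapper cannot simply replace the generic loop.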
From meadori at gmail.com Sun Jan 9 20:21:11 2011
From: meadori at gmail.com (Meador Inge)
Date: Sun, 9 Jan 2011 13:21:11 -0600
Subject: [Python-Dev] API refactoring tracker field for Python4
In-Reply-To:
References:
Message-ID:

On Fri, Jan 7, 2011 at 11:20 AM, anatoly techtonik wrote:

> This happened, because of poor bug management, where community doesn't
> play any role in determining which issues are desired.
> This mostly because of limitation of our tracker and desire of people
> to extend it to get damn "stars", module split, sorting, digging and
> tagging options.

Adding a few new features to the issue tracker isn't going to make the
forgotten-changes problem that you mentioned (assuming that it is, indeed,
a problem) magically go away. Tools alone don't fix problems; there are
people using the tools too, and getting people to use tools effectively is
much more difficult. Adding more features to a tool that is not being used
effectively just makes it be used even less effectively.

I speak from recent experience of helping roll out JIRA to a 50-man
engineering team. The one regret that I have is that we turned too many
stars, bells, and whistles on instead of helping people create good issue
reports. Sometimes there is a very good reason to add such features, but a
significant amount of data should be there backing that decision up. It is
better to wait until the data is there pointing to the problem.

I grabbed the following descriptions from a reply from another part of
this thread:

> Stars:
> go http://code.google.com/p/support/issues/list
> find Stars column
> guess

JIRA has voting, which I have used. However, it boils down to the tools
vs. people problem. Enabling voting is useless if no one honors the
votes. I have seen this happen. You must have community support.

> Module split:
> try to get all issues for 'os' module
> try to subscribe to all commits for 'CGIHTTPServer'

I have myself wanted this as well before.
However, the downside is that having more options to select from will
inevitably increase the number of incorrect selections that are made.
Fewer choices, better data. I would rather have better data.

> Sorting:
> click on column titles in bug tracker search results

You can just do sorted searches, right?

> Tagging:
> as a tracker user, try to add tag 'easy' to some easy issue

Are you suggesting that *any* tracker user be allowed to place arbitrary
tags on an issue? If so, then I think that would be more confusing as
there would be no uniformity to the entries. I like the keywords in use
on the tracker today better.

-- Meador

From merwok at netwok.org Sun Jan 9 22:52:47 2011
From: merwok at netwok.org (=?UTF-8?B?w4lyaWMgQXJhdWpv?=)
Date: Sun, 09 Jan 2011 22:52:47 +0100
Subject: [Python-Dev] API refactoring tracker field for Python4
In-Reply-To:
References:
Message-ID: <4D2A2E2F.7040007@netwok.org>

Le 07/01/2011 19:39, Brian Curtin a écrit :
> On Fri, Jan 7, 2011 at 12:14, anatoly techtonik wrote:
>> Module split:
>> try to get all issues for 'os' module
> No solution for this right now, but people have suggested that we add
> drop-downs for each module. I'm -0 on that.

I proposed that on
http://wiki.python.org/moin/DesiredTrackerFeatures#new-field-for-module-package
and R. David Murray replied that this had already been brought up and
shot down numerous times on python-dev. I've been unable to find those
threads.

Regards

From martin at v.loewis.de Sun Jan 9 23:06:22 2011
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 09 Jan 2011 23:06:22 +0100
Subject: [Python-Dev] Add sendfile() to core?
In-Reply-To: <20110109213112.34a2fc55@pitrou.net>
References: <1294473319.9834.48.camel@ulidtko> <20110109213112.34a2fc55@pitrou.net>
Message-ID: <4D2A315E.2090903@v.loewis.de>

Am 09.01.2011 21:31, schrieb Antoine Pitrou:
> On Sun, 9 Jan 2011 11:17:40 -0800
> Guido van Rossum wrote:
>> Isn't that just shutil.copyfileobj()?
>
> copyfileobj() still uses an user-space buffer (the Python bytes
> object used in the loop). The advantage of sendfile() is to bypass
> user-space logic and do the transfer entirely in kernel. How much it
> allows to gain *in practice* on a modern capable OS such as Linux I
> don't know.

There would be at least two layers of savings:
a) no Python objects would be created, and no bytecode loop would
   run the copying
b) the data are not even copied into userspace at all

My guess is that the savings of doing a) are larger than the savings
of doing b).

Regards,
Martin

From martin at v.loewis.de Sun Jan 9 23:31:24 2011
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 09 Jan 2011 23:31:24 +0100
Subject: [Python-Dev] API refactoring tracker field for Python4
In-Reply-To:
References: <4D276657.7040009@voidspace.org.uk>
Message-ID: <4D2A373C.7040701@v.loewis.de>

> Maybe that could be fixed? Then the remaining feature would be a way
> to sort issue lists by number of nosy people, and to display the
> length of the nosy list.

http://bugs.python.org/issue?@action=search&@columns=title,id,nosy_count&status=1&@sort=-nosy_count

You can create an URL like this through the search form.

Regards,
Martin

From solipsis at pitrou.net Sun Jan 9 23:33:04 2011
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Sun, 9 Jan 2011 23:33:04 +0100
Subject: [Python-Dev] API refactoring tracker field for Python4
References: <4D2A2E2F.7040007@netwok.org>
Message-ID: <20110109233304.00366008@pitrou.net>

On Sun, 09 Jan 2011 22:52:47 +0100
Éric Araujo wrote:
> Le 07/01/2011 19:39, Brian Curtin a écrit :
> > On Fri, Jan 7, 2011 at 12:14, anatoly techtonik wrote:
> >> Module split:
> >> try to get all issues for 'os' module
> > No solution for this right now, but people have suggested that we add
> > drop-downs for each module. I'm -0 on that.
>
> I proposed that on
> http://wiki.python.org/moin/DesiredTrackerFeatures#new-field-for-module-package
> and R.
David Murray replied that this had already been brought up and > shot down numerous times on python-dev. I?ve been unable to find those > threads. A drop-down would be terribly cumbersome. An input field with realtime completion would be probably better. Regards Antoine. From brett at python.org Mon Jan 10 00:18:12 2011 From: brett at python.org (Brett Cannon) Date: Sun, 9 Jan 2011 15:18:12 -0800 Subject: [Python-Dev] devguide: Add an intermediate task of helping triage issues (not to be confused with the In-Reply-To: References: <20110108232235.735fea7c@pitrou.net> Message-ID: On Sun, Jan 9, 2011 at 00:26, Georg Brandl wrote: > Am 08.01.2011 23:22, schrieb Antoine Pitrou: >> On Sat, 08 Jan 2011 23:05:06 +0100 >> brett.cannon wrote: >>> +For bugs, an issue needs to: >>> + >>> +* Clearly explain the bug so it can be reproduced >>> +* All relevant platform details are included >>> +* What version(s) of Python are affected by the bug are fully known >>> +* Is there a proper unit test that can reproduce the bug? >>> + >>> +These are things anyone can help with. >> >> FWIW, I'm really not fond of handing out triage tasks to beginners. >> First because the claim that it doesn't require any specific knowledge >> is wrong (in the case of Python, because it is a highly technical >> product; it might be right for office suites, who knows). >> Second because a newbie triager gets to interact with other newbies who >> might be very confused if they are given misleading comments or asked >> misleading (or completely irrelevant) questions. > > +1. ?Remember, this is not a purely hypothetical statement. OK, so the sentence is poorly phrased, but in the list of tasks it is labeled explicitly as intermediate when one is comfortable with the process, not a newbie. Does that alleviate the worry you both have? 
-Brett > >> Things may be different when the person in question has been a long-time >> community member, or has specific expertise, and is therefore able to >> communicate meaningful advice. But for true beginners, I think it would >> be much better to let them write a patch or a doc fix. > > Yep. > > Georg > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/brett%40python.org > From foom at fuhm.net Mon Jan 10 01:01:28 2011 From: foom at fuhm.net (James Y Knight) Date: Sun, 9 Jan 2011 19:01:28 -0500 Subject: [Python-Dev] Add sendfile() to core? In-Reply-To: <1294473319.9834.48.camel@ulidtko> References: <1294473319.9834.48.camel@ulidtko> Message-ID: If you're gonna wrap sendfile, it might be nice to also wrap the splice, tee, and vmsplice syscalls on linux, since they're a lot more flexible. Also note that sendfile on BSD has a completely different signature to sendfile on linux. The BSD one has the rather odd functionality of a built-in writev() before and after the sending of the file itself, with an extra struct argument to specify that, while on linux, if you want to write some other buffers, you're just expected to call writev yourself. James From ulidtko at gmail.com Mon Jan 10 01:00:06 2011 From: ulidtko at gmail.com (max ulidtko) Date: Mon, 10 Jan 2011 02:00:06 +0200 Subject: [Python-Dev] Add sendfile() to core? In-Reply-To: References: <1294473319.9834.48.camel@ulidtko> Message-ID: <1294617606.9834.184.camel@ulidtko> On Sun, 9 Jan 2011 11:17:40 -0800, Guido van Rossum wrote: | Isn't that just shutil.copyfileobj()? | This function has two drawbacks. First, it copies until EOF, and has no possibility to copy exactly N bytes from source fd (say, opened socket). This is the reason why I (re)wrote my custom copying function - to allow that. 
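For illustration only, a bounded copy of the kind described here (stop after exactly N bytes instead of running to EOF as shutil.copyfileobj() does) might look like the sketch below. The name copy_exactly, its parameters, and the default buffer size are hypothetical; this is not the poster's actual function, and it still goes through a user-space buffer.

```python
import io


def copy_exactly(src, dst, count, bufsize=64 * 1024):
    """Copy at most `count` bytes from src to dst.

    Hypothetical helper: returns the number of bytes actually copied,
    which is smaller than `count` if src reaches EOF early.
    """
    remaining = count
    while remaining > 0:
        # Never ask for more than what is still owed to the caller.
        chunk = src.read(min(bufsize, remaining))
        if not chunk:  # EOF before `count` bytes were available
            break
        dst.write(chunk)
        remaining -= len(chunk)
    return count - remaining


# Usage with in-memory streams standing in for a socket and a file.
source = io.BytesIO(b"abcdefghij")
target = io.BytesIO()
copied = copy_exactly(source, target, 4, bufsize=3)
```

After this runs, target holds the first four bytes and source is left positioned at the fifth, so the rest of the stream remains available to the caller.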
The second drawback is not using huge performance bonus achievable with sendfile(2) syscall on some unices and TransmitFile() on win32. It allows for sending and receiving files at ethernet speeds with no CPU load at all, because no memory copying occurs (and no Python wrapper objects would be created for the buffers, no heap allocation, etc). All needed I/O requests are handled inside the kernel without a single copying of data. Thus for e.g. FTP servers using this syscall became a standard. It would be good if Python supported it. P.S. There seems to be a package which enables the sendfile support for Linux, FreeBSD and AIX, . Though it's available only for 2.x. Hopefully I'll end up with a patch soon. ------- Regards, max ulidtko From exarkun at twistedmatrix.com Mon Jan 10 01:17:24 2011 From: exarkun at twistedmatrix.com (exarkun at twistedmatrix.com) Date: Mon, 10 Jan 2011 00:17:24 -0000 Subject: [Python-Dev] Add sendfile() to core? In-Reply-To: References: <1294473319.9834.48.camel@ulidtko> Message-ID: <20110110001724.1821.1469348533.divmod.xquotient.24@localhost.localdomain> On 9 Jan, 08:09 pm, g.rodola at gmail.com wrote: >A strong +1. >Projects such as Twisted would certainly benefit from such an >addiction. Eh. There would probably be some benefits, but I don't think they would be very large in the majority of cases. Also, since adding it to 2.x would be prohibited, it will be at least several years before Twisted actually benefits from its addition to the standard library. Plus, Pavel Pergamenshchik wrapped sendfile for Twisted already many years ago, but no one was interested enough to actually land the change in trunk. However, if it would help, I'm sure Pavel's code can be contributed to CPython. 
If anyone would like to take a look: http://twistedmatrix.com/trac/browser/branches/sendfile-585-4/twisted/python/test/test_sendfile.py http://twistedmatrix.com/trac/browser/branches/sendfile-585-4/twisted/python/_sendfile.c Jean-Paul From tjreedy at udel.edu Mon Jan 10 01:23:40 2011 From: tjreedy at udel.edu (Terry Reedy) Date: Sun, 09 Jan 2011 19:23:40 -0500 Subject: [Python-Dev] Add sendfile() to core? In-Reply-To: <1294473319.9834.48.camel@ulidtko> References: <1294473319.9834.48.camel@ulidtko> Message-ID: On 1/8/2011 2:55 AM, max ulidtko wrote: > On Wed, 20 Mar 2002 14:53:58 -0500, Andrew Kuchling wrote: > | sendfile() is used when writing really high-performance Web servers, > | in order to save an unnecessary memory-to-memory copy. Question: > | should I make up a patch to add a sendfile() wrapper to Python? There is no issue on the tracker and he apparently never did. There is a general policy of lightly wrapping useful os calls in os. Martin said this already in what was essentially a 'go ahead'. Problems include os differences, > The only difficulty I can see is the choice of name for the wrapper. > IMO, using "sendfile" from Linux and FreeBSD is pretty much okay; but > objections may arise. > > [1] http://mail.python.org/pipermail/python-dev/2002-March/021543.html such as name differences (but I think *nix generally wins ;-), and the need for 'someone' to write the patches for the appropriate C-coded os files: posix, nt, os2, ce. Patch write makes initial decision on ironing out differences. The above was the second and last substantive answer to Andrew, as there was nothing much more to say. The tracker awaits ;-). Specify, if you can, whether you think the windows TransmitFile or modern equivalent is sufficiently compatible with the *nix sendfile to be wrapped with the same API or whether you propose Availability: Unix only. -- Terry Jan Reedy From stephen at xemacs.org Mon Jan 10 02:54:10 2011 From: stephen at xemacs.org (Stephen J. 
Turnbull) Date: Mon, 10 Jan 2011 10:54:10 +0900 Subject: [Python-Dev] API refactoring tracker field for Python4 In-Reply-To: <20110109233304.00366008@pitrou.net> References: <4D2A2E2F.7040007@netwok.org> <20110109233304.00366008@pitrou.net> Message-ID: <8739p1a5h9.fsf@uwakimon.sk.tsukuba.ac.jp> Antoine Pitrou writes: > A drop-down [list of modules] would be terribly cumbersome. On the XEmacs tracker, we use a multilink with a checkbox list for the modules field. This allows you to type in the text field, to check multiple boxes, and provides input checking. In my typical usage, I don't find this cumbersome at all; it's my preferred UI for that field. (OTOH, I set it up, so my favorite components are all at the top. :-) The main problem is format of the checkbox page. By default it only allows a limited number of entries per page, the limit is pretty low because its formatted as a single column, and if you switch pages it "forgets" the entries you've already checked. This shouldn't be hard to improve, but I haven't bothered as I have yet to hear a complaint about it. > An input field with realtime completion would be probably better. Maybe. I've often been unable to remember the initial letter of a package, which makes completion difficult. ;-) From ianb at colorstudy.com Mon Jan 10 18:24:53 2011 From: ianb at colorstudy.com (Ian Bicking) Date: Mon, 10 Jan 2011 11:24:53 -0600 Subject: [Python-Dev] PEP 3333: wsgi_string() function In-Reply-To: <87d3o6a57i.fsf@uwakimon.sk.tsukuba.ac.jp> References: <1294109093.14661.4.camel@marge> <1294357806.2970.33.camel@stalk> <1294401061.14078.30.camel@marge> <925E2041-7B6F-4868-8E9D-856A15784C25@fuhm.net> <20110107170449.ECC643A411A@sparrow.telecommunity.com> <87d3o6a57i.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: On Sun, Jan 9, 2011 at 1:47 AM, Stephen J. Turnbull wrote: > Robert Brewer writes: > > > Python 3.1 was released June 27th, 2009. 
We're coming up faster on the > > two-year period than we seem to be on a revised WSGI spec. Maybe we > > should shoot for a "bytes of a known encoding" type first. > > You have one. It's called "ISO 2022: Information processing -- ISO > 7-bit and 8-bit coded character sets -- Code extension techniques". > The popularity of that standard speaks for itself. > The kind of object PJE was referring to is more like Ruby's strings, which do not embed the encoding inside the bytes themselves but have the encoding as a kind of annotation on the bytes, and do lazy transcoding when combining strings of different encodings. The goal with respect to WSGI is that you could annotate bytes with an encoding but also change or fix that encoding if other out-of-band information implied that you got the encoding wrong (e.g., some data is submitted with the encoding of the page the browser was on, and so nothing inside the request itself will indicate the encoding of the data). Latin1 is kind of the poor man's version of this -- it's a good guess at an encoding, that at worst requires transcoding that can be done in a predictable way. (Personally I think Latin1 gets us 99% of the way there, and so bytes-of-a-known-encoding are not really that important to the WSGI case.) Ian -------------- next part -------------- An HTML attachment was scrubbed... URL: From fuzzyman at voidspace.org.uk Mon Jan 10 18:49:25 2011 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Mon, 10 Jan 2011 17:49:25 +0000 Subject: [Python-Dev] PEP 3333: wsgi_string() function In-Reply-To: References: <1294109093.14661.4.camel@marge> <1294357806.2970.33.camel@stalk> <1294401061.14078.30.camel@marge> <925E2041-7B6F-4868-8E9D-856A15784C25@fuhm.net> <20110107170449.ECC643A411A@sparrow.telecommunity.com> <87d3o6a57i.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <4D2B46A5.3060000@voidspace.org.uk> On 10/01/2011 17:24, Ian Bicking wrote: > On Sun, Jan 9, 2011 at 1:47 AM, Stephen J. 
Turnbull > > wrote: > > Robert Brewer writes: > > > Python 3.1 was released June 27th, 2009. We're coming up faster > on the > > two-year period than we seem to be on a revised WSGI spec. Maybe we > > should shoot for a "bytes of a known encoding" type first. > > You have one. It's called "ISO 2022: Information processing -- ISO > 7-bit and 8-bit coded character sets -- Code extension techniques". > The popularity of that standard speaks for itself. > > > The kind of object PJE was referring to is more like Ruby's strings, > which do not embed the encoding inside the bytes themselves but have > the encoding as a kind of annotation on the bytes, and do lazy > transcoding when combining strings of different encodings. The goal > with respect to WSGI is that you could annotate bytes with an encoding > but also change or fix that encoding if other out-of-band information > implied that you got the encoding wrong (e.g., some data is submitted > with the encoding of the page the browser was on, and so nothing > inside the request itself will indicate the encoding of the data). > Latin1 is kind of the poor man's version of this -- it's a good guess > at an encoding, that at worst requires transcoding that can be done in > a predictable way. (Personally I think Latin1 gets us 99% of the way > there, and so bytes-of-a-known-encoding are not really that important > to the WSGI case.) > I think the language moratorium was not the only objection to the inclusion of a third string type in Python (the "screwed string" - safe to treat neither as bytes nor as text). I recall objections in principle too from core developers during the EuroPython language summit. 
Michael > Ian > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/fuzzyman%40voidspace.org.uk -- http://www.voidspace.org.uk/ May you do good and not evil May you find forgiveness for yourself and forgive others May you share freely, never taking more than you give. -- the sqlite blessing http://www.sqlite.org/different.html -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Mon Jan 10 18:55:09 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 11 Jan 2011 03:55:09 +1000 Subject: [Python-Dev] PEP 3333: wsgi_string() function In-Reply-To: References: <1294109093.14661.4.camel@marge> <1294357806.2970.33.camel@stalk> <1294401061.14078.30.camel@marge> <925E2041-7B6F-4868-8E9D-856A15784C25@fuhm.net> <20110107170449.ECC643A411A@sparrow.telecommunity.com> <87d3o6a57i.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: On Tue, Jan 11, 2011 at 3:24 AM, Ian Bicking wrote: > > The kind of object PJE was referring to is more like Ruby's strings, which > do not embed the encoding inside the bytes themselves but have the encoding > as a kind of annotation on the bytes, and do lazy transcoding when combining > strings of different encodings.? The goal with respect to WSGI is that you > could annotate bytes with an encoding but also change or fix that encoding > if other out-of-band information implied that you got the encoding wrong > (e.g., some data is submitted with the encoding of the page the browser was > on, and so nothing inside the request itself will indicate the encoding of > the data).? Latin1 is kind of the poor man's version of this -- it's a good > guess at an encoding, that at worst requires transcoding that can be done in > a predictable way.? 
(Personally I think Latin1 gets us 99% of the way there, > and so bytes-of-a-known-encoding are not really that important to the WSGI > case.) Having done the upgrade to urllib to support direct manipulation of byte sequences, I don't think such a type would help as much people hoped anyway. Converting to Unicode, manipulating as text and converting back really *is* the right way to do text manipulation (however, providing bytes-in-bytes-out APIs that do the conversions for you can also be quite convenient). Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From solipsis at pitrou.net Mon Jan 10 19:01:55 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 10 Jan 2011 19:01:55 +0100 Subject: [Python-Dev] "unit test needed" Message-ID: <20110110190155.1b03667c@pitrou.net> Hello, I would like to advocate again for the removal of the "unit test needed" stage on the tracker, which regularly confuses our triagers into thinking it's an actual requirement or expectation from contributors and bug reporters. Regards Antoine. From solipsis at pitrou.net Mon Jan 10 19:24:03 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 10 Jan 2011 19:24:03 +0100 Subject: [Python-Dev] devguide: Add an intermediate task of helping triage issues (not to be confused with the In-Reply-To: References: <20110108232235.735fea7c@pitrou.net> Message-ID: <20110110192403.7a63fb11@pitrou.net> On Sun, 9 Jan 2011 15:18:12 -0800 Brett Cannon wrote: > > OK, so the sentence is poorly phrased, but in the list of tasks it is > labeled explicitly as intermediate when one is comfortable with the > process, not a newbie. Does that alleviate the worry you both have? It does seem to alleviate it :) Sorry for not noticing! However, could the following be removed from the list: ?Is there a proper unit test that can reproduce the bug?? 
We don't need or require unit tests to reproduce bugs; and besides, some things are simply very difficult to write a unit test for. A reporter need not be an experienced Python developer able (or willing) to write an elaborate unit test reproducing, for example, a timing issue involving Unix signals and the IO stack ;) Regards Antoine. From merwok at netwok.org Mon Jan 10 19:21:22 2011 From: merwok at netwok.org (=?UTF-8?B?w4lyaWMgQXJhdWpv?=) Date: Mon, 10 Jan 2011 19:21:22 +0100 Subject: [Python-Dev] "unit test needed" In-Reply-To: <20110110190155.1b03667c@pitrou.net> References: <20110110190155.1b03667c@pitrou.net> Message-ID: <4D2B4E22.9060001@netwok.org> > I would like to advocate again for the removal of the "unit test > needed" stage on the tracker, which regularly confuses our triagers > into thinking it's an actual requirement or expectation from > contributors and bug reporters. Speaking as a bug triager: +1 to rename it "test needed" +1 to remove it Regards From g.brandl at gmx.net Mon Jan 10 20:42:21 2011 From: g.brandl at gmx.net (Georg Brandl) Date: Mon, 10 Jan 2011 20:42:21 +0100 Subject: [Python-Dev] "unit test needed" In-Reply-To: <4D2B4E22.9060001@netwok.org> References: <20110110190155.1b03667c@pitrou.net> <4D2B4E22.9060001@netwok.org> Message-ID: On 10.01.2011 19:21, Éric Araujo wrote: >> I would like to advocate again for the removal of the "unit test >> needed" stage on the tracker, which regularly confuses our triagers >> into thinking it's an actual requirement or expectation from >> contributors and bug reporters. > > Speaking as a bug triager: > +1 to rename it "test needed" > +1 to remove it First rename, then remove?
Georg From merwok at netwok.org Mon Jan 10 19:49:58 2011 From: merwok at netwok.org (=?UTF-8?B?w4lyaWMgQXJhdWpv?=) Date: Mon, 10 Jan 2011 19:49:58 +0100 Subject: [Python-Dev] "unit test needed" In-Reply-To: References: <20110110190155.1b03667c@pitrou.net> <4D2B4E22.9060001@netwok.org> Message-ID: <4D2B54D6.2020001@netwok.org> >> +1 to rename it "test needed" >> +1 to remove it I meant either one would be an improvement. Regards From lukasz at langa.pl Mon Jan 10 19:37:01 2011 From: lukasz at langa.pl (=?utf-8?Q?=C5=81ukasz_Langa?=) Date: Mon, 10 Jan 2011 19:37:01 +0100 Subject: [Python-Dev] devguide: Point out that OS X users need to change examples to use python.exe instead of In-Reply-To: References: Message-ID: <3160B46C-EEC0-4A0A-B4A1-0590678AA6F8@langa.pl> Message written by Ned Deily on 2011-01-08, at 22:13: > In article > , > Brett Cannon wrote: > >> On Thu, Jan 6, 2011 at 13:04, Ned Deily wrote: >>> In article , >>> >>> That's true on OS X if you are using a case-insensitive file system. >>> But with the newer, case-sensitive HFS+, for example, you get ./python. >> >> Are you thinking of UFS, because I am running HFS+ and I still get >> python.exe since it's case-preserving. > > No, not UFS. Since at least 10.4, OS X has supported the creation of at > least four variants of HFS+ via Disk Utility.app or diskutil(8). I've been using the case-sensitive variant of HFS+ since 10.4. It works, I like it and you get ./python with it. -- Best regards, Łukasz Langa tel. +48 791 080 144 WWW http://lukasz.langa.pl/ -------------- next part -------------- An HTML attachment was scrubbed...
URL: From brett at python.org Mon Jan 10 20:05:21 2011 From: brett at python.org (Brett Cannon) Date: Mon, 10 Jan 2011 11:05:21 -0800 Subject: [Python-Dev] devguide: Add an intermediate task of helping triage issues (not to be confused with the In-Reply-To: <20110110192403.7a63fb11@pitrou.net> References: <20110108232235.735fea7c@pitrou.net> <20110110192403.7a63fb11@pitrou.net> Message-ID: On Mon, Jan 10, 2011 at 10:24, Antoine Pitrou wrote: > On Sun, 9 Jan 2011 15:18:12 -0800 > Brett Cannon wrote: > > > > OK, so the sentence is poorly phrased, but in the list of tasks it is > > labeled explicitly as intermediate when one is comfortable with the > > process, not a newbie. Does that alleviate the worry you both have? > > It does seem to alleviate it :) Sorry for not noticing! > However, could the following be removed from the list: > > ?Is there a proper unit test that can reproduce the bug?? > > We don't need or require unit tests to reproduce bugs; and besides, > some things simply are very difficult to write an unit test for. A > reporter need not be an experienced Python developer able (or willing) > to write an elaborate unit test reproducing, for example, a timing > issue involving Unix signals and the IO stack ;) > > Fair enough. I will remove it. -------------- next part -------------- An HTML attachment was scrubbed... URL: From eliben at gmail.com Mon Jan 10 20:11:23 2011 From: eliben at gmail.com (Eli Bendersky) Date: Mon, 10 Jan 2011 21:11:23 +0200 Subject: [Python-Dev] "unit test needed" In-Reply-To: <20110110190155.1b03667c@pitrou.net> References: <20110110190155.1b03667c@pitrou.net> Message-ID: > I would like to advocate again for the removal of the "unit test > needed" stage on the tracker, which regularly confuses our triagers > into thinking it's an actual requirement or expectation from > contributors and bug reporters. > > Perhaps a different wording would be preferred to removal. 
Suppose a reviewer accepts a patch but asks for a test before committing it. If it's hidden in the issue discussion, only those involved in the issue are aware of the situation. If it's in the issue state, then other potential contributors may notice it and provide tests. IMHO tests are simpler and less "scary" for newbies making their first steps in CPython. Eli -------------- next part -------------- An HTML attachment was scrubbed... URL: From fuzzyman at voidspace.org.uk Mon Jan 10 20:26:55 2011 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Mon, 10 Jan 2011 19:26:55 +0000 Subject: [Python-Dev] devguide: Add an intermediate task of helping triage issues (not to be confused with the In-Reply-To: References: <20110108232235.735fea7c@pitrou.net> <20110110192403.7a63fb11@pitrou.net> Message-ID: <4D2B5D7F.8060306@voidspace.org.uk> On 10/01/2011 19:05, Brett Cannon wrote: > > > On Mon, Jan 10, 2011 at 10:24, Antoine Pitrou > wrote: > > On Sun, 9 Jan 2011 15:18:12 -0800 > Brett Cannon > wrote: > > > > OK, so the sentence is poorly phrased, but in the list of tasks > it is > > labeled explicitly as intermediate when one is comfortable with the > > process, not a newbie. Does that alleviate the worry you both have? > > It does seem to alleviate it :) Sorry for not noticing! > However, could the following be removed from the list: > > ?Is there a proper unit test that can reproduce the bug?? > > We don't need or require unit tests to reproduce bugs; and besides, > some things simply are very difficult to write an unit test for. A > reporter need not be an experienced Python developer able (or willing) > to write an elaborate unit test reproducing, for example, a timing > issue involving Unix signals and the IO stack ;) > > > Fair enough. I will remove it. Well, *often* a test that exposes the issue can be written - and if so it is a useful exercise (surely). 
Michael > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/fuzzyman%40voidspace.org.uk -- http://www.voidspace.org.uk/ May you do good and not evil May you find forgiveness for yourself and forgive others May you share freely, never taking more than you give. -- the sqlite blessing http://www.sqlite.org/different.html -------------- next part -------------- An HTML attachment was scrubbed... URL: From merwok at netwok.org Mon Jan 10 20:27:00 2011 From: merwok at netwok.org (=?UTF-8?B?w4lyaWMgQXJhdWpv?=) Date: Mon, 10 Jan 2011 20:27:00 +0100 Subject: [Python-Dev] "unit test needed" In-Reply-To: References: <20110110190155.1b03667c@pitrou.net> Message-ID: <4D2B5D84.9010400@netwok.org> Le 10/01/2011 20:11, Eli Bendersky a ?crit : > Perhaps a different wording would be preferred to removal. Suppose a > reviewer accepts a patch but asks for a test before committing it. Well, we usually forewarn that a patch should include tests and docs, so I think ?patch needed? or ?patch review? is good enough. Regards From solipsis at pitrou.net Mon Jan 10 20:29:30 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 10 Jan 2011 20:29:30 +0100 Subject: [Python-Dev] "unit test needed" References: <20110110190155.1b03667c@pitrou.net> Message-ID: <20110110202930.4991aa32@pitrou.net> On Mon, 10 Jan 2011 21:11:23 +0200 Eli Bendersky wrote: > > I would like to advocate again for the removal of the "unit test > > needed" stage on the tracker, which regularly confuses our triagers > > into thinking it's an actual requirement or expectation from > > contributors and bug reporters. > > > > > Perhaps a different wording would be preferred to removal. Suppose a > reviewer accepts a patch but asks for a test before committing it. 
If it's > hidden in the issue discussion, only those involved in the issue are aware > of the situation. If it's in the issue state, then other potential > contributors may notice it and provide tests. IMHO tests are simpler and > less "scary" for newbies making their first steps in CPython. Then we would need a whole array of checkboxes for things missing in a patch: - missing unit test - missing documentation changes - other things? I don't think it's useful. As for "tests are simpler", it really depends on the issue :) I've worked on many issues where writing the test took much more time than actually fixing the bug (one-line fix vs. careful test setup to exercise the fix). (also, as a matter of principle, I think it's better that the same person who wrote the bugfix is asked to write the tests) Regards Antoine. From solipsis at pitrou.net Mon Jan 10 20:31:33 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 10 Jan 2011 20:31:33 +0100 Subject: [Python-Dev] devguide: Add an intermediate task of helping triage issues (not to be confused with the In-Reply-To: <4D2B5D7F.8060306@voidspace.org.uk> References: <20110108232235.735fea7c@pitrou.net> <20110110192403.7a63fb11@pitrou.net> <4D2B5D7F.8060306@voidspace.org.uk> Message-ID: <1294687893.3694.10.camel@localhost.localdomain> Le lundi 10 janvier 2011 ? 19:26 +0000, Michael Foord a ?crit : > > > > > > Fair enough. I will remove it. > > > > Well, *often* a test that exposes the issue can be written - and if so > it is a useful exercise (surely). Yes, well, that's a matter of "useful exercise for the contributor" vs. "required to advance on the issue". AFAICT the "stage" field aims at conveying the latter piece of information (the current wording says "unit test *needed*"). Regards Antoine. 
From fuzzyman at voidspace.org.uk Mon Jan 10 20:37:18 2011 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Mon, 10 Jan 2011 19:37:18 +0000 Subject: [Python-Dev] devguide: Add an intermediate task of helping triage issues (not to be confused with the In-Reply-To: <1294687893.3694.10.camel@localhost.localdomain> References: <20110108232235.735fea7c@pitrou.net> <20110110192403.7a63fb11@pitrou.net> <4D2B5D7F.8060306@voidspace.org.uk> <1294687893.3694.10.camel@localhost.localdomain> Message-ID: <4D2B5FEE.1040408@voidspace.org.uk> On 10/01/2011 19:31, Antoine Pitrou wrote: > Le lundi 10 janvier 2011 ? 19:26 +0000, Michael Foord a ?crit : >>> >>> Fair enough. I will remove it. >>> >> Well, *often* a test that exposes the issue can be written - and if so >> it is a useful exercise (surely). > Yes, well, that's a matter of "useful exercise for the contributor" vs. > "required to advance on the issue". AFAICT the "stage" field aims at > conveying the latter piece of information (the current wording says > "unit test *needed*"). Aren't we discussing the dev guide? Discussion about tracker field is that away <-----. Michael > Regards > > Antoine. > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/fuzzyman%40voidspace.org.uk -- http://www.voidspace.org.uk/ May you do good and not evil May you find forgiveness for yourself and forgive others May you share freely, never taking more than you give. 
-- the sqlite blessing http://www.sqlite.org/different.html From solipsis at pitrou.net Mon Jan 10 20:44:51 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 10 Jan 2011 20:44:51 +0100 Subject: [Python-Dev] devguide: Add an intermediate task of helping triage issues (not to be confused with the In-Reply-To: <4D2B5FEE.1040408@voidspace.org.uk> References: <20110108232235.735fea7c@pitrou.net> <20110110192403.7a63fb11@pitrou.net> <4D2B5D7F.8060306@voidspace.org.uk> <1294687893.3694.10.camel@localhost.localdomain> <4D2B5FEE.1040408@voidspace.org.uk> Message-ID: <1294688691.3694.12.camel@localhost.localdomain> Le lundi 10 janvier 2011 ? 19:37 +0000, Michael Foord a ?crit : > On 10/01/2011 19:31, Antoine Pitrou wrote: > > Le lundi 10 janvier 2011 ? 19:26 +0000, Michael Foord a ?crit : > >>> > >>> Fair enough. I will remove it. > >>> > >> Well, *often* a test that exposes the issue can be written - and if so > >> it is a useful exercise (surely). > > Yes, well, that's a matter of "useful exercise for the contributor" vs. > > "required to advance on the issue". AFAICT the "stage" field aims at > > conveying the latter piece of information (the current wording says > > "unit test *needed*"). > > Aren't we discussing the dev guide? Discussion about tracker field is > that away <-----. Oh, well. I think we're discussing the directions that a contributor willing to help triage could give so to advance an issue. I hope I'm not mistaken. Regards Antoine. From alexander.belopolsky at gmail.com Mon Jan 10 20:48:16 2011 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Mon, 10 Jan 2011 14:48:16 -0500 Subject: [Python-Dev] "unit test needed" In-Reply-To: <4D2B54D6.2020001@netwok.org> References: <20110110190155.1b03667c@pitrou.net> <4D2B4E22.9060001@netwok.org> <4D2B54D6.2020001@netwok.org> Message-ID: On Mon, Jan 10, 2011 at 1:49 PM, ?ric Araujo wrote: >>> +1 to rename it ?test needed? 
>>> +1 to remove it > > I meant either one would be an improvement. +1 to remove it Let's remove it first, and then decide if another stage is necessary. The problems with "unit test needed" are that 1. It is not clear whether unit tests should be written before or after a patch and thus, once a bug is acknowledged as valid, what an appropriate stage should be. 2. For a bug that needs confirmation as being reproducible, it suggests that familiarity with the unit test framework is necessary to move the issue forward. In fact, in many cases a short stand-alone script is more helpful than a Lib/test patch. I think "patch needed" is a good enough first stage. For bugs it should be set when there is a rough consensus that the behavior is a bug and for RFEs, it should be set when a decision to include cannot be made without an implementation. While there is no agreement on whether the bug is valid or whether an RFE makes any sense, the stage can stay undefined. From fuzzyman at voidspace.org.uk Mon Jan 10 20:52:25 2011 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Mon, 10 Jan 2011 19:52:25 +0000 Subject: [Python-Dev] "unit test needed" In-Reply-To: References: <20110110190155.1b03667c@pitrou.net> <4D2B4E22.9060001@netwok.org> <4D2B54D6.2020001@netwok.org> Message-ID: <4D2B6379.4040508@voidspace.org.uk> On 10/01/2011 19:48, Alexander Belopolsky wrote: > On Mon, Jan 10, 2011 at 1:49 PM, Éric Araujo wrote: >>>> +1 to rename it "test needed" >>>> +1 to remove it >> I meant either one would be an improvement. > +1 to remove it > > Let's remove it first, and then decide if another stage is necessary. > The problems with "unit test needed" are that > > 1. It is not clear whether unit tests should be written before or > after a patch and thus, once a bug is acknowledged as valid, what an > appropriate stage should be. > > 2. For a bug that needs confirmation as being reproducible, it > suggests that familiarity with the unit test framework is necessary to > move the issue forward.
In fact, in many cases a short stand-alone > script is more helpful than a Lib/test patch. > > I think "patch needed" is a good enough first stage. For bugs it > should be set when there is a rough consensus that the behavior is a > bug and for RFEs, it should be set when a decision to include cannot > be made without an implementation. Agree. "Patch needed" applies if the patch is incomplete, and if it lacks tests then it is incomplete. Michael > While there is no agreement on whether the bug is valid or whether an > RFE makes any sense, the stage can stay undefined. > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/fuzzyman%40voidspace.org.uk -- http://www.voidspace.org.uk/ May you do good and not evil May you find forgiveness for yourself and forgive others May you share freely, never taking more than you give. -- the sqlite blessing http://www.sqlite.org/different.html From skip at pobox.com Mon Jan 10 21:37:29 2011 From: skip at pobox.com (skip at pobox.com) Date: Mon, 10 Jan 2011 14:37:29 -0600 Subject: [Python-Dev] "unit test needed" In-Reply-To: <20110110202930.4991aa32@pitrou.net> References: <20110110190155.1b03667c@pitrou.net> <20110110202930.4991aa32@pitrou.net> Message-ID: <19755.28169.681711.848059@montanaro.dyndns.org> Antoine> Then we would need a whole array of checkboxes for things Antoine> missing in a patch: Antoine> - missing unit test Antoine> - missing documentation changes Antoine> - other things? How about replacing all the possibilities with patch incomplete then elaborate in the issue itself how that is the case. Antoine> I don't think it's useful. As for "tests are simpler", it Antoine> really depends on the issue :) I've worked on many issues where Antoine> writing the test took much more time than actually fixing the Antoine> bug (one-line fix vs. 
careful test setup to exercise the fix). I would rather write event-driven code than the test cases for it. ;-) Skip From alexander.belopolsky at gmail.com Mon Jan 10 22:00:34 2011 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Mon, 10 Jan 2011 16:00:34 -0500 Subject: [Python-Dev] [Python-checkins] devguide: TODO about explaining how to fill in every field in an issue for a triager. In-Reply-To: References: Message-ID: On Mon, Jan 10, 2011 at 3:15 PM, brett.cannon wrote: .. > +.. todo:: > +    Figure out where to put instructions for triagers on filling out issue > +    fields properly Some field titles are clickable and linked to descriptions of the field choices. Maybe these could be replaced with links to proper devguide sections. From glyph at twistedmatrix.com Mon Jan 10 22:56:40 2011 From: glyph at twistedmatrix.com (Glyph Lefkowitz) Date: Mon, 10 Jan 2011 16:56:40 -0500 Subject: [Python-Dev] devguide: Point out that OS X users need to change examples to use python.exe instead of In-Reply-To: <3160B46C-EEC0-4A0A-B4A1-0590678AA6F8@langa.pl> References: <3160B46C-EEC0-4A0A-B4A1-0590678AA6F8@langa.pl> Message-ID: On Jan 10, 2011, at 1:37 PM, Łukasz Langa wrote: > I'm using the case-sensitive variant of HFS+ since 10.4. It works, I like it and you get ./python with it. I realize that this isn't a popularity contest for this feature, but I feel like I should pipe up here and mention that it breaks some applications - for example, you can't really install World of Warcraft on a case-sensitive filesystem. Not the filesystem's fault really, but it is a good argument for why users shouldn't choose it.
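For anyone unsure which HFS+ variant a given volume uses, here is a minimal sketch of a check in Python. The helper name `is_case_sensitive` is hypothetical and not part of any Python or OS X tooling, and the sketch assumes the directory being probed is writable.

```python
import os
import tempfile


def is_case_sensitive(directory="."):
    """Return True if `directory` appears to be on a case-sensitive filesystem.

    Hypothetical helper for illustration only.
    """
    # Create a temporary file whose name contains upper-case letters.
    with tempfile.NamedTemporaryFile(prefix="CaseProbe", dir=directory) as probe:
        folder, name = os.path.split(probe.name)
        lowered = os.path.join(folder, name.lower())
        # On a case-insensitive volume the lower-cased name resolves to the
        # same file, so it "exists"; on a case-sensitive one it does not.
        return not os.path.exists(lowered)


if __name__ == "__main__":
    print(is_case_sensitive(tempfile.gettempdir()))
```

On a case-insensitive volume the build produces `python.exe` rather than `python`, which is the situation the devguide note in this thread is about.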
From nad at acm.org Mon Jan 10 23:37:19 2011 From: nad at acm.org (Ned Deily) Date: Mon, 10 Jan 2011 14:37:19 -0800 Subject: [Python-Dev] devguide: Point out that OS X users need to change examples to use python.exe instead of References: <3160B46C-EEC0-4A0A-B4A1-0590678AA6F8@langa.pl> Message-ID: In article , Glyph Lefkowitz wrote: > On Jan 10, 2011, at 1:37 PM, Łukasz Langa wrote: > > I'm using the case-sensitive variant of HFS+ since 10.4. It works, I like > > it and you get ./python with it. > I realize that this isn't a popularity contest for this feature, but I feel > like I should pipe up here and mention that it breaks some applications - for > example, you can't really install World of Warcraft on a case-sensitive > filesystem. Not the filesystem's fault really, but it is a good argument for > why users shouldn't choose it. It's true that there is a bit of risk (and breaking WoW would be a big one for aficionados). Over the past few years, I have run into a few traditional Mac apps that did break when installed in a case-sensitive HFS. In all but one case, the app developers were happy to have the bug report and fix the app. I thought I noticed that Apple was starting to ship new machines formatted as case-insensitive but I may be imagining that. OTOH, there have been Unixy packages that break on case-insensitive systems, also arguably a bug. -- Ned Deily, nad at acm.org From stephen at xemacs.org Tue Jan 11 02:03:12 2011 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Tue, 11 Jan 2011 10:03:12 +0900 Subject: [Python-Dev] PEP 3333: wsgi_string() function In-Reply-To: References: <1294109093.14661.4.camel@marge> <1294357806.2970.33.camel@stalk> <1294401061.14078.30.camel@marge> <925E2041-7B6F-4868-8E9D-856A15784C25@fuhm.net> <20110107170449.ECC643A411A@sparrow.telecommunity.com> <87d3o6a57i.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <87sjx08d67.fsf@uwakimon.sk.tsukuba.ac.jp> Ian Bicking writes: > On Sun, Jan 9, 2011 at 1:47 AM, Stephen J.
Turnbull wrote: > > > Robert Brewer writes: > > > > > Python 3.1 was released June 27th, 2009. We're coming up faster on the > > > two-year period than we seem to be on a revised WSGI spec. Maybe we > > > should shoot for a "bytes of a known encoding" type first. > > > > You have one. It's called "ISO 2022: Information processing -- ISO > > 7-bit and 8-bit coded character sets -- Code extension techniques". > > The popularity of that standard speaks for itself. > > > > The kind of object PJE was referring to is more like Ruby's strings, Notice that Ruby was written by a Japanese, the same culture that brought us Mule, TRON, X Compound Text, and ISO-2022 in the first place. Matsumoto himself probably isn't infected with the "Unicode is going to be the death of all Japanese culture" bug, but that's the attitude that is behind ISO 2022. > which do not embed the encoding inside the bytes themselves but have the encoding > as a kind of annotation on the bytes, My point is that ISO-2022 is basically just a serialization of that. And it sucks; nobody uses it, except in Japanese and Korean email. Maybe Mandarin (but Taiwan and Hong Kong use Big5 or EUC, not an escape-extended representation). > and do lazy transcoding when combining strings of different > encodings. Which buys WSGI nothing, AIUI, since the people who want this claim that translating to Unicode either correctly or as "big bytes" (ie, zero-extension) is inefficient. They're shoveling bits; much of the time, by the time the out-of-band information catches up, it's going to be too late. > The goal with respect to WSGI is that you could annotate bytes with > an encoding but also change or fix that encoding if other > out-of-band information implied that you got the encoding wrong > (e.g., some data is submitted with the encoding of the page the > browser was on, and so nothing inside the request itself will > indicate the encoding of the data). A noble goal, but nobody's gonna bell that cat.
This is all just wishful thinking. 2 decades of experience with Emacs/Mule and similar efforts show that if you provide this facility, people will use it, and that use will include a lot of abuse (ie, throwing the garbage into somebody else's backyard, rather than disposing of it yourself) -- in the end, the garbage gets piled high enough that it's not worth the effort to try to make it work. > Latin1 is kind of the poor man's version of this -- it's a good > guess at an encoding, that at worst requires transcoding that can > be done in a predictable way. (Personally I think Latin1 gets us > 99% of the way there, and so bytes-of-a-known-encoding are not > really that important to the WSGI case.) In particular, it gets PJE 100% of the way there, since he proposes always targeting ISO 8859/1, anyway. And if it's not useful to WSGI, who is it useful to? From ncoghlan at gmail.com Tue Jan 11 06:20:28 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 11 Jan 2011 15:20:28 +1000 Subject: [Python-Dev] "unit test needed" In-Reply-To: <19755.28169.681711.848059@montanaro.dyndns.org> References: <20110110190155.1b03667c@pitrou.net> <20110110202930.4991aa32@pitrou.net> <19755.28169.681711.848059@montanaro.dyndns.org> Message-ID: On Tue, Jan 11, 2011 at 6:37 AM, wrote: > How about replacing all the possibilities with > >     patch incomplete > > then elaborate in the issue itself how that is the case. +1 This is much clearer than lumping incomplete patches in with nonexistent ones. A process that goes "needs patch->patch review->(patch incomplete)->commit review->committed/rejected" (with the possibility of multiple iterations between patch review and patch incomplete) would give a pretty clear idea of where any given issue stands. The "unit test needed" stage is actually more a "confirmation needed" stage - i.e. can the reported bug actually be reproduced by anyone other than the original reporter. Cheers, Nick. -- Nick Coghlan   |   ncoghlan at gmail.com   |
Brisbane, Australia From eliben at gmail.com Tue Jan 11 07:36:25 2011 From: eliben at gmail.com (Eli Bendersky) Date: Tue, 11 Jan 2011 08:36:25 +0200 Subject: [Python-Dev] "unit test needed" In-Reply-To: <19755.28169.681711.848059@montanaro.dyndns.org> References: <20110110190155.1b03667c@pitrou.net> <20110110202930.4991aa32@pitrou.net> <19755.28169.681711.848059@montanaro.dyndns.org> Message-ID: On Mon, Jan 10, 2011 at 22:37, wrote: > > Antoine> Then we would need a whole array of checkboxes for things > Antoine> missing in a patch: > Antoine> - missing unit test > Antoine> - missing documentation changes > Antoine> - other things? > > How about replacing all the possibilities with > > patch incomplete > > then elaborate in the issue itself how that is the case. > > +1 -------------- next part -------------- An HTML attachment was scrubbed... URL: From lukasz at langa.pl Tue Jan 11 13:46:27 2011 From: lukasz at langa.pl (Łukasz Langa) Date: Tue, 11 Jan 2011 13:46:27 +0100 Subject: [Python-Dev] devguide: Point out that OS X users need to change examples to use python.exe instead of In-Reply-To: References: <3160B46C-EEC0-4A0A-B4A1-0590678AA6F8@langa.pl> Message-ID: <97FD10AA-C816-4A72-954D-7DCAF4EA9CB5@langa.pl> Message written by Glyph Lefkowitz on 2011-01-10, at 22:56: > On Jan 10, 2011, at 1:37 PM, Łukasz Langa wrote: > >> I'm using the case-sensitive variant of HFS+ since 10.4. It works, I like it and you get ./python with it. > > I realize that this isn't a popularity contest for this feature, but I feel like I should pipe up here and mention that it breaks some applications. Yes, it unfortunately does. Vendors should test their software on both filesystem variants but they don't. So it's probably safer to go with the case-insensitive option.
I myself noticed three happy examples of bugs related to that: * Pro Applications Update 2008-05 reappears in Software Update and there's no fix for that * in MacVim, you can't at the same time open two files for which the insensitive path is the same (MacVim matches the first opened file and assumes the other is the same) * sometimes with multiple case variants of a single album, iTunes shows them as separate (but most of the time it doesn't) OTOH, I remember having strange problems with Fink and later MacPorts on a case-insensitive system. With case-sensitivity it works okay. Best part is, once you have a case-sensitive file system, it's quite difficult to switch to the other option. The first problem is changing the format, the second is ensuring you have no clashing data. I just wrote a script (Python 3.2 required) that lists all clashes: https://github.com/LangaCore/casecheck Funny it found quite a few on my drive I wasn't aware of. That was awfully off topic for this list. Sorry, I have been provoked ;-) -- Best regards, Łukasz Langa tel. +48 791 080 144 WWW http://lukasz.langa.pl/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From stutzbach at google.com Tue Jan 11 18:25:42 2011 From: stutzbach at google.com (Daniel Stutzbach) Date: Tue, 11 Jan 2011 09:25:42 -0800 Subject: [Python-Dev] [Python-checkins] r87903 - python/branches/py3k/Doc/whatsnew/3.2.rst In-Reply-To: <20110110212650.20750EE98C@mail.python.org> References: <20110110212650.20750EE98C@mail.python.org> Message-ID: Thanks for catching the misspelling of my name! If you have a moment, could you look over my patch for Issue 8743? On Mon, Jan 10, 2011 at 1:26 PM, raymond.hettinger < python-checkins at python.org> wrote: > Author: raymond.hettinger > Date: Mon Jan 10 22:26:49 2011 > New Revision: 87903 > > Log: > Misspelling.
> > > Modified: > python/branches/py3k/Doc/whatsnew/3.2.rst > > Modified: python/branches/py3k/Doc/whatsnew/3.2.rst > > ============================================================================== > --- python/branches/py3k/Doc/whatsnew/3.2.rst (original) > +++ python/branches/py3k/Doc/whatsnew/3.2.rst Mon Jan 10 22:26:49 2011 > @@ -553,7 +553,7 @@ > >>> range(0, 100, 2)[0:5] > range(0, 10, 2) > > - (Contributed by Daniel Stutzback in :issue:`9213` and by Alexander > Belopolsky > + (Contributed by Daniel Stutzbach in :issue:`9213` and by Alexander > Belopolsky > in :issue:`2690`.) > > * The :func:`callable` builtin function from Py2.x was resurrected. It > provides > @@ -1514,7 +1514,7 @@ > and it saves time lost during comparisons which were delegated by the > sort wrappers. > > - (Patch by Daniel Stutzback in :issue:`9915`.) > + (Patch by Daniel Stutzbach in :issue:`9915`.) > > * JSON decoding performance is improved and memory consumption is reduced > whenever the same string is repeated for multiple keys. Also, JSON > encoding > _______________________________________________ > Python-checkins mailing list > Python-checkins at python.org > http://mail.python.org/mailman/listinfo/python-checkins > -- Daniel Stutzbach -------------- next part -------------- An HTML attachment was scrubbed... URL: From stutzbach at google.com Tue Jan 11 19:00:13 2011 From: stutzbach at google.com (Daniel Stutzbach) Date: Tue, 11 Jan 2011 10:00:13 -0800 Subject: [Python-Dev] FYI: Python 2.7.1 + gcc 4.6 (experimental) probable optimizer problem In-Reply-To: <107884.6325.qm@web111405.mail.gq1.yahoo.com> References: <107884.6325.qm@web111405.mail.gq1.yahoo.com> Message-ID: On Sat, Jan 8, 2011 at 12:03 PM, Ralf W. 
Grosse-Kunstleve wrote: > g++ (GCC) 4.6.0 20101206 (experimental) > % make > /bin/sh: line 1: 41686 Segmentation fault (core dumped) CC='gcc > -pthread' > LDSHARED='gcc -pthread -shared ' OPT='-DNDEBUG -g -fwrapv -O3 -Wall > -Wstrict-prototypes' ./python -E ./setup.py build > make: *** [sharedmods] Error 139 Does that version of gcc emit any warnings during compilation? -- Daniel Stutzbach -------------- next part -------------- An HTML attachment was scrubbed... URL: From techtonik at gmail.com Tue Jan 11 20:04:30 2011 From: techtonik at gmail.com (anatoly techtonik) Date: Tue, 11 Jan 2011 21:04:30 +0200 Subject: [Python-Dev] Where are Python 2.5.5 binaries for Windows? Message-ID: I need Python 2.5.5 binaries to run Google AppEngine SDK 1.4.1 on Windows, but can't find them on http://www.python.org/download/releases/2.5.5/ Why are they removed? -- anatoly t. From brian.curtin at gmail.com Tue Jan 11 20:08:50 2011 From: brian.curtin at gmail.com (Brian Curtin) Date: Tue, 11 Jan 2011 13:08:50 -0600 Subject: [Python-Dev] Where are Python 2.5.5 binaries for Windows? In-Reply-To: References: Message-ID: On Tue, Jan 11, 2011 at 13:04, anatoly techtonik wrote: > I need Python 2.5.5 binaries to run Google AppEngine SDK 1.4.1 on > Windows, but can't find them on > http://www.python.org/download/releases/2.5.5/ > > Why are they removed? > -- > anatoly t. Nothing was removed. From that page: "This is a source-only release that only includes security fixes." -------------- next part -------------- An HTML attachment was scrubbed... URL: From brett at python.org Tue Jan 11 20:20:52 2011 From: brett at python.org (Brett Cannon) Date: Tue, 11 Jan 2011 11:20:52 -0800 Subject: [Python-Dev] [Python-checkins] devguide: Add a doc listing the various places Python's development is discussed. In-Reply-To: <4D2B71DF.4050201@udel.edu> References: <4D2B71DF.4050201@udel.edu> Message-ID: Tweaked. Will show up in my next push. 
On Mon, Jan 10, 2011 at 12:53, Terry Reedy wrote: > > +The primary mailing list where discussions about Python's development >> occur is >> +python-dev_ >> > > suggest adding > ", mirrored as newsgroup gmane.comp.python.devel." > > ... > > +python-ideas_. Technical support questions should also not be asked here >> and >> +instead should go to comp.lang.python_ or python-help_. >> > > Suggest replace "comp.lang.python" with "python-list, mirrored as > gmane.comp.python.general". c.l.p is another mirror, but does not get the > spam filtering of p-l and g.c.p.g and is indeed the source or transmitter > (from google groups) of the spam that needs to be filtered. > > > + >> +The python-committers_ mailing list is publicly archived but only open to >> core >> +developers to subscribe to. If something only affect core developers >> (e.g., the >> +tree is frozen for commits, etc.), it is discussed here instead of >> python-dev >> +to keep traffic down on the latter. >> + >> +Python-ideas_ is a mailing list open to the public to discuss ideas on >> changing >> +Python. If a new idea does not start here (or comp.lang.python_), it will >> get >> > > Again, /c.l.p/python-list/ > > > +subscribe to this list and are known to reply to these emails to make >> comments >> +about various issues they catch in the commit. >> > > ;-) > > ... > > +.. _python-help: http://mail.python.org/mailman/listinfo/python-help >> +.. _python-ideas: http://mail.python.org/mailman/listinfo/python-ideas >> > > _python-list: http://mail.python.org/mailman/listinfo/python-list > > +Newsgroups >> +---------- >> > > "The free news site news.gmane.org mirrors (and archives) about 300 > python-related mailing lists, including most or all of those at > mail.python.org. Most are listed as gmane.comp.python.*." > > I would barely, if at all, mentions c.l.p: it is spammy, only mirrors 1 > list rather than 300, and is not an archive in itself. 
> > Terry > > > _______________________________________________ > Python-checkins mailing list > Python-checkins at python.org > > http://mail.python.org/mailman/listinfo/python-checkins > -------------- next part -------------- An HTML attachment was scrubbed... URL: From brett at python.org Tue Jan 11 20:25:43 2011 From: brett at python.org (Brett Cannon) Date: Tue, 11 Jan 2011 11:25:43 -0800 Subject: [Python-Dev] devguide: Add an intermediate task of helping triage issues (not to be confused with the In-Reply-To: <1294688691.3694.12.camel@localhost.localdomain> References: <20110108232235.735fea7c@pitrou.net> <20110110192403.7a63fb11@pitrou.net> <4D2B5D7F.8060306@voidspace.org.uk> <1294687893.3694.10.camel@localhost.localdomain> <4D2B5FEE.1040408@voidspace.org.uk> <1294688691.3694.12.camel@localhost.localdomain> Message-ID: On Mon, Jan 10, 2011 at 11:44, Antoine Pitrou wrote: > On Monday 10 January 2011 at 19:37 +0000, Michael Foord wrote: > > On 10/01/2011 19:31, Antoine Pitrou wrote: > > > On Monday 10 January 2011 at 19:26 +0000, Michael Foord wrote: > > >>> > > >>> Fair enough. I will remove it. > > >>> > > >> Well, *often* a test that exposes the issue can be written - and if so > > >> it is a useful exercise (surely). > > > Yes, well, that's a matter of "useful exercise for the contributor" vs. > > > "required to advance on the issue". AFAICT the "stage" field aims at > > > conveying the latter piece of information (the current wording says > > > "unit test *needed*"). > > > > Aren't we discussing the dev guide? Discussion about tracker field is > > that away <-----. > > Oh, well. I think we're discussing the directions that a contributor > willing to help triage could give so to advance an issue. I hope I'm not > mistaken. The doc has already been tweaked on my machine, so there is no need to continue this discussion. -------------- next part -------------- An HTML attachment was scrubbed...
URL: From techtonik at gmail.com Tue Jan 11 20:56:12 2011 From: techtonik at gmail.com (anatoly techtonik) Date: Tue, 11 Jan 2011 21:56:12 +0200 Subject: [Python-Dev] Where are Python 2.5.5 binaries for Windows? In-Reply-To: References: Message-ID: On Tue, Jan 11, 2011 at 9:08 PM, Brian Curtin wrote: > On Tue, Jan 11, 2011 at 13:04, anatoly techtonik > wrote: >> >> I need Python 2.5.5 binaries to run Google AppEngine SDK 1.4.1 on >> Windows, but can't find them on >> http://www.python.org/download/releases/2.5.5/ >> >> Why are they removed? >> -- >> anatoly t. > > Nothing was removed. From that page: "This is a source-only release that > only includes security fixes." Oh. Thanks. The page should have a more prominent Download section with a direct link to a page with previous release binaries. Not many people know English to figure this out from the text even if they are able to follow AppEngine tutorials in Russian. -- anatoly t. From brian.curtin at gmail.com Tue Jan 11 21:08:06 2011 From: brian.curtin at gmail.com (Brian Curtin) Date: Tue, 11 Jan 2011 14:08:06 -0600 Subject: [Python-Dev] Where are Python 2.5.5 binaries for Windows? In-Reply-To: References: Message-ID: On Tue, Jan 11, 2011 at 13:56, anatoly techtonik wrote: > On Tue, Jan 11, 2011 at 9:08 PM, Brian Curtin > wrote: > > On Tue, Jan 11, 2011 at 13:04, anatoly techtonik > > wrote: > >> > >> I need Python 2.5.5 binaries to run Google AppEngine SDK 1.4.1 on > >> Windows, but can't find them on > >> http://www.python.org/download/releases/2.5.5/ > >> > >> Why are they removed? > >> -- > >> anatoly t. > > > > Nothing was removed. From that page: "This is a source-only release that > > only includes security fixes." > > Oh. Thanks. The page should have a more prominent Download section > with a direct link to a page with previous release binaries. That's right next to the other sentence I mentioned: "The last full bug-fix release of Python 2.5 was Python 2.5.4 ." 
Not many > people know English to figure this out from the text even if they are > able to follow AppEngine tutorials in Russian. > -- > anatoly t. > There hasn't been a problem with this in the past that I know of, so I suspect a lot of people actually do understand the page and English, but I imagine translations of the page might be accepted. -------------- next part -------------- An HTML attachment was scrubbed... URL: From dmalcolm at redhat.com Tue Jan 11 22:11:01 2011 From: dmalcolm at redhat.com (David Malcolm) Date: Tue, 11 Jan 2011 16:11:01 -0500 Subject: [Python-Dev] [Python-checkins] devguide: Add coredev.rst to the index. In-Reply-To: References: Message-ID: <1294780261.16811.27.camel@radiator.bos.redhat.com> On Tue, 2011-01-11 at 21:56 +0100, brett.cannon wrote: > brett.cannon pushed a2d0edc3420e to devguide: > > http://hg.python.org/devguide/rev/a2d0edc3420e > changeset: 83:a2d0edc3420e > tag: tip > user: Brett Cannon > date: Tue Jan 11 12:56:47 2011 -0800 > summary: > Add coredev.rst to the index. 
> > files: > faq.rst > index.rst > > diff --git a/faq.rst b/faq.rst > --- a/faq.rst > +++ b/faq.rst > @@ -41,9 +41,6 @@ > Repository read-only read-write > ----------- -------------------------------------------------------------- -------------------------------------------------------------------------- > PEPs http://svn.python.org/projects/peps/trunk svn+ssh://pythondev at svn.python.org/peps/trunk > -2.7 http://svn.python.org/projects/python/branches/release27-maint svn+ssh://pythondev at svn.python.org/python/branches/release27-maint > -3.1 http://svn.python.org/projects/python/branches/release31-maint svn+ssh://pythondev at svn.python.org/python/branches/release31-maint > -3.2 http://svn.python.org/projects/python/branches/py3k svn+ssh://pythondev at svn.python.org/python/branches/py3k > =========== ============================================================== ========================================================================== > > > diff --git a/index.rst b/index.rst Was this removal of some of the SVN info from faq.rst an accident? [...snip addition of coredev.rst to index.rst...] Hope this is helpful Dave From rrr at ronadam.com Tue Jan 11 22:36:43 2011 From: rrr at ronadam.com (Ron Adam) Date: Tue, 11 Jan 2011 15:36:43 -0600 Subject: [Python-Dev] "unit test needed" In-Reply-To: <20110110190155.1b03667c@pitrou.net> References: <20110110190155.1b03667c@pitrou.net> Message-ID: On 01/10/2011 12:01 PM, Antoine Pitrou wrote: > > Hello, > > I would like to advocate again for the removal of the "unit test > needed" stage on the tracker, which regularly confuses our triagers > into thinking it's an actual requirement or expectation from > contributors and bug reporters. This keeps coming up because the logic of the different things in the tracker are not as clearly defined as they could be. There are differences between a sequential stage, and a non-sequential requirement or status. Here's an example of separating those well. 
Status: (Set as required)
    __ Bug               - Set in New stage.
    __ Feature-request   - Set in New stage.
    __ Commit-approved   - Set in Patch-ready stage.
    __ Closed-committed  - Set in Final stage.
    __ Closed-rejected   - Set in any stage. (Add message for reason.)

Stage: (Set next stage as each stage is completed)
    __ New               - Check validity, set Bug or Feature-request status, and set Requirements as needed.
    __ Patch-development - Until requirements are satisfied.
    __ Patch-ready       - Set Commit-approved if/when accepted.
    __ Final             - Set Closed-committed status after commit.

Requirements: (Set all that are needed, preferably in New stage.)
    __ Code patch
    __ Test patch
    __ Docs patch
    __ PEP needed

User input:
    __ request-review    - Set by tracker user. (Add message for reason.)

Notes:
    + Patch-ready would be a nicer description of the Commit-review stage.
    + Remove "unittest needed" from stage, as it's a requirement, not a stage.
    + Languishing should be a keyword.
    + Pending is too vague! (please remove!)
    + Move feature-request from type to status.
    + Add bug to status.

"Bug" and "Feature-request" are an *issue status* as far as the tracker is concerned. This allows both bugs and features to set *Type*. "Type" refers to something in or about Python itself, rather than something in the tracker. (Something the issue *addresses* in Python.) That description fits well with the items already there.

An open status is the same as (not (closed-committed or closed-rejected)).

The placement of some items could be better. Status and priority would fit better in the classification section. Stage would fit better in the process section.
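The split between sequential stages and non-sequential requirements described above can be sketched roughly as follows. This is an illustration only: the class and function names are hypothetical, and nothing here reflects how the tracker is actually implemented.

```python
from enum import Enum


class Stage(Enum):
    """Sequential stages; an issue moves through them in order."""
    NEW = 1
    PATCH_DEVELOPMENT = 2
    PATCH_READY = 3
    FINAL = 4


class Requirement(Enum):
    """Non-sequential requirements; any subset may be outstanding."""
    CODE_PATCH = "code patch"
    TEST_PATCH = "test patch"
    DOCS_PATCH = "docs patch"
    PEP = "PEP needed"


def next_stage(stage, outstanding):
    """Return the stage an issue should move to next.

    `outstanding` is the set of requirements not yet satisfied. An issue
    stays in patch development until that set is empty, which is why a
    missing test is modelled as a requirement rather than as a stage.
    """
    if stage is Stage.FINAL:
        return stage
    if stage is Stage.PATCH_DEVELOPMENT and outstanding:
        return stage
    return Stage(stage.value + 1)


if __name__ == "__main__":
    missing = {Requirement.TEST_PATCH}
    print(next_stage(Stage.PATCH_DEVELOPMENT, missing))
    print(next_stage(Stage.PATCH_DEVELOPMENT, set()))
```

The point of the sketch is only that "test patch" lives in the requirements set, so it can block progress without being a stage of its own.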
Cheers, Ron From solipsis at pitrou.net Wed Jan 12 00:31:03 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 12 Jan 2011 00:31:03 +0100 Subject: [Python-Dev] r87946 - python/branches/py3k/Doc/howto/sorting.rst References: <20110111230556.5E7BEC99D@mail.python.org> Message-ID: <20110112003103.2dcf0e19@pitrou.net> On Wed, 12 Jan 2011 00:05:56 +0100 (CET) terry.reedy wrote: > + >>> def cmp_to_key(mycmp): http://docs.python.org/dev/library/functools.html#functools.cmp_to_key From techtonik at gmail.com Wed Jan 12 00:54:38 2011 From: techtonik at gmail.com (anatoly techtonik) Date: Wed, 12 Jan 2011 01:54:38 +0200 Subject: [Python-Dev] Where are Python 2.5.5 binaries for Windows? In-Reply-To: References: Message-ID: On Tue, Jan 11, 2011 at 10:08 PM, Brian Curtin wrote: >> >> >> >> I need Python 2.5.5 binaries to run Google AppEngine SDK 1.4.1 on >> >> Windows, but can't find them on >> >> http://www.python.org/download/releases/2.5.5/ >> >> >> >> Why are they removed? >> >> -- >> >> anatoly t. >> > >> > Nothing was removed. From that page: "This is a source-only release that >> > only includes security fixes." >> >> Oh. Thanks. The page should have a more prominent Download section >> with a direct link to a page with previous release binaries. > > That's right next to the other sentence I mentioned: "The last full bug-fix > release of Python 2.5 was Python 2.5.4." That's not consistent with the download section present on all other pages. >> Not many >> people know English to figure this out from the text even if they are >> able to follow AppEngine tutorials in Russian. > There hasn't been a problem with this in the past that I know of, so I > suspect a lot of people actually do understand the page and English, but I > imagine translations of the page might be accepted. Of course you can't know about problems that users complain about in Russian, but ok, the page can be translated.
BTW, the page http://www.python.org/download/releases/2.5.5/ lists the wrong latest release of Python 2.7. -- anatoly t. From rwgk at yahoo.com Wed Jan 12 03:28:25 2011 From: rwgk at yahoo.com (Ralf W. Grosse-Kunstleve) Date: Tue, 11 Jan 2011 18:28:25 -0800 (PST) Subject: [Python-Dev] FYI: Python 2.7.1 + gcc 4.6 (experimental) probable optimizer problem In-Reply-To: References: <107884.6325.qm@web111405.mail.gq1.yahoo.com> Message-ID: <161253.72269.qm@web111416.mail.gq1.yahoo.com> > Does that version of gcc emit any warnings during compilation? Yes, there are a few: http://cci.lbl.gov/~rwgk/tmp/gcc_trunk_168695_fc14_py271/ This is with: g++ (GCC) 4.6.0 20110112 (experimental) Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Wed Jan 12 04:31:05 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 12 Jan 2011 13:31:05 +1000 Subject: [Python-Dev] Where are Python 2.5.5 binaries for Windows? In-Reply-To: References: Message-ID: On Wed, Jan 12, 2011 at 9:54 AM, anatoly techtonik wrote: >>> Oh. Thanks. The page should have a more prominent Download section >>> with a direct link to a page with previous release binaries. >> >> That's right next to the other sentence I mentioned: "The last full bug-fix >> release of Python 2.5 was Python 2.5.4." > > That's not consistent with the download section present on all other pages. Deliberately so - we don't really want people to download those binary releases while naively thinking they're getting all the security fixes from the source-only updates (see also the notice at the top of the 2.5.4 page). Regards, Nick. -- Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia From ncoghlan at gmail.com Wed Jan 12 04:43:20 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 12 Jan 2011 13:43:20 +1000 Subject: [Python-Dev] [Python-checkins] devguide: Start the core dev intro doc.
In-Reply-To: References: Message-ID: On Wed, Jan 12, 2011 at 6:56 AM, brett.cannon wrote: > +Mailing Lists > +''''''''''''' > + > +You are expected to subscribe to python-committers, python-dev, > +python-checkins, and one of new-bugs-announce or python-bugs-list. See > +:ref:`communication` for links to these mailing lists. I'd disagree with those last two - people that want to work on triage or general bug fixing should certainly subscribe to one of the bug lists, but a lot of us just look at the weekly summary and/or rely on the triagers to add us to the nosy list on relevant issues. Cheers, Nick. -- Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia From brett at python.org Wed Jan 12 06:58:50 2011 From: brett at python.org (Brett Cannon) Date: Tue, 11 Jan 2011 21:58:50 -0800 Subject: [Python-Dev] [Python-checkins] devguide: Start the core dev intro doc. In-Reply-To: References: Message-ID: On Tue, Jan 11, 2011 at 19:43, Nick Coghlan wrote: > On Wed, Jan 12, 2011 at 6:56 AM, brett.cannon > wrote: > > +Mailing Lists > > +''''''''''''' > > + > > +You are expected to subscribe to python-committers, python-dev, > > +python-checkins, and one of new-bugs-announce or python-bugs-list. See > > +:ref:`communication` for links to these mailing lists. > > I'd disagree with those last two - people that want to work on triage > or general bug fixing should certainly subscribe to one of the bug > lists, but a lot of us just look at the weekly summary and/or rely on > the triagers to add us to the nosy list on relevant issues. > I would rather have people make the conscious decision not to subscribe to the bug lists than to unconsciously forget. -------------- next part -------------- An HTML attachment was scrubbed...
URL: From db3l.net at gmail.com Thu Jan 13 00:53:38 2011 From: db3l.net at gmail.com (David Bolen) Date: Wed, 12 Jan 2011 18:53:38 -0500 Subject: [Python-Dev] 3.2b2 fails test suite on (my) Windows XP References: Message-ID: Brian Curtin writes: > http://bugs.python.org/issue9116 covers this issue. > > The reason it doesn't fail on any of the build slaves is because they modify > a registry value for Windows Error Reporting to not display the pop-up > window, or at least mine does. I think I got the idea from one of the other > Windows build slave maintainers. (delayed message as I was traveling) Note that the buildbot handling only prevents the pop-up dialogs. The underlying test that would have generated the dialog should still fail if the pop-up would have represented a failure, either through the Win32 API call failure, or the C RTL assertion termination the process with an error exit code. How that happens varies (OS dialogs are prevented from ever occurring, while C RTL dialogs have a scripted "OK" button press). There have been experiments with disabling the C RTL within Python itself in the past but never quite became consistent enough to trust on the buildbot (at least for me). But if somehow this is preventing a test failure from being detected on the buildbots, that's certainly an issue, and might indicate that some parent code to that causing the error isn't properly detecting it, either through an API error result, or process termination code. Of course, if the pop-up is not due to something considered a failure, then yes, the buildbot won't reflect what an actual user may see. I guess 9116 is an error in a child process which under Linux just terminates but under Win32 generates the pop-up, so perhaps aside from the pop-up, the test is assuming such an erroneous termination is acceptable? 
In such a case, then the approach in 9116 to permit control over the RTL is probably the only choice, but I don't think I see any way to accurately test that fact via the buildbots, since permitting such pop-ups is so disastrous to the overall build process. -- David From jimjjewett at gmail.com Thu Jan 13 14:21:27 2011 From: jimjjewett at gmail.com (Jim Jewett) Date: Thu, 13 Jan 2011 08:21:27 -0500 Subject: [Python-Dev] [Python-checkins] r87980 - in python/branches/py3k/Lib/importlib: _bootstrap.py abc.py In-Reply-To: <20110113023125.3B832EEA21@mail.python.org> References: <20110113023125.3B832EEA21@mail.python.org> Message-ID: Why? Are annotations being deprecated in general? Or are these particular annotations no longer accurate? -jJ On Wed, Jan 12, 2011 at 9:31 PM, raymond.hettinger wrote: > Author: raymond.hettinger > Date: Thu Jan 13 03:31:25 2011 > New Revision: 87980 > > Log: > Issue 10899: Remove function type annotations from the stdlib > > Modified: > python/branches/py3k/Lib/importlib/_bootstrap.py > python/branches/py3k/Lib/importlib/abc.py > > Modified: python/branches/py3k/Lib/importlib/_bootstrap.py > ============================================================================== > --- python/branches/py3k/Lib/importlib/_bootstrap.py (original) > +++ python/branches/py3k/Lib/importlib/_bootstrap.py Thu Jan 13 03:31:25 2011 > @@ -345,7 +345,7 @@ > > class SourceLoader(_LoaderBasics): > > - def path_mtime(self, path:str) -> int: > + def path_mtime(self, path): > """Optional method that returns the modification time for the specified > path. > > @@ -354,7 +354,7 @@ > """ > raise NotImplementedError > > - def set_data(self, path:str, data:bytes) -> None: > + def set_data(self, path, data): > """Optional method which writes data to a file path. > > Implementing this method allows for the writing of bytecode files.
> > Modified: python/branches/py3k/Lib/importlib/abc.py > ============================================================================== > --- python/branches/py3k/Lib/importlib/abc.py (original) > +++ python/branches/py3k/Lib/importlib/abc.py Thu Jan 13 03:31:25 2011 > @@ -18,7 +18,7 @@ > """Abstract base class for import loaders.""" > > @abc.abstractmethod > - def load_module(self, fullname:str) -> types.ModuleType: > + def load_module(self, fullname): > """Abstract method which when implemented should load a module.""" > raise NotImplementedError > > @@ -28,7 +28,7 @@ > """Abstract base class for import finders.""" > > @abc.abstractmethod > - def find_module(self, fullname:str, path:[str]=None) -> Loader: > + def find_module(self, fullname, path=None): > """Abstract method which when implemented should find a module.""" > raise NotImplementedError > > @@ -47,7 +47,7 @@ > """ > > @abc.abstractmethod > - def get_data(self, path:str) -> bytes: > + def get_data(self, path): > """Abstract method which when implemented should return the bytes for > the specified path.""" > raise NotImplementedError > @@ -63,19 +63,19 @@ > """ > > @abc.abstractmethod > - def is_package(self, fullname:str) -> bool: > + def is_package(self, fullname): > """Abstract method which when implemented should return whether the > module is a package.""" > raise NotImplementedError > > @abc.abstractmethod > - def get_code(self, fullname:str) -> types.CodeType: > + def get_code(self, fullname): > """Abstract method which when implemented should return the code object > for the module""" > raise NotImplementedError > > @abc.abstractmethod > - def get_source(self, fullname:str) -> str: > + def get_source(self, fullname): > """Abstract method which should return the source code for the > module.""" >
raise NotImplementedError > @@ -94,7 +94,7 @@ > """ > > @abc.abstractmethod > - def get_filename(self, fullname:str) -> str: > + def get_filename(self, fullname): > """Abstract method which should return the value that __file__ is to be > set to.""" > raise NotImplementedError > @@ -117,11 +117,11 @@ > > """ > > - def path_mtime(self, path:str) -> int: > + def path_mtime(self, path): > """Return the modification time for the path.""" > raise NotImplementedError > > - def set_data(self, path:str, data:bytes) -> None: > + def set_data(self, path, data): > """Write the bytes to the path (if possible). > > Any needed intermediary directories are to be created. If for some > @@ -170,7 +170,7 @@ > raise NotImplementedError > > @abc.abstractmethod > - def source_path(self, fullname:str) -> object: > + def source_path(self, fullname): > """Abstract method which when implemented should return the path to the > source code for the module.""" > raise NotImplementedError > @@ -279,19 +279,19 @@ > return code_object > > @abc.abstractmethod > - def source_mtime(self, fullname:str) -> int: > + def source_mtime(self, fullname): > """Abstract method which when implemented should return the > modification time for the source of the module.""" > raise NotImplementedError > > @abc.abstractmethod > - def bytecode_path(self, fullname:str) -> object: > + def bytecode_path(self, fullname): > """Abstract method which when implemented should return the path to the > bytecode for the module.""" > raise NotImplementedError > > @abc.abstractmethod > - def write_bytecode(self, fullname:str, bytecode:bytes) -> bool: > + def write_bytecode(self, fullname, bytecode): > """Abstract method which when implemented should attempt to write the >
bytecode for the module, returning a boolean representing whether the > bytecode was written or not.""" > _______________________________________________ > Python-checkins mailing list > Python-checkins at python.org > http://mail.python.org/mailman/listinfo/python-checkins > From fuzzyman at voidspace.org.uk Thu Jan 13 14:28:32 2011 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Thu, 13 Jan 2011 13:28:32 +0000 Subject: [Python-Dev] [Python-checkins] r87980 - in python/branches/py3k/Lib/importlib: _bootstrap.py abc.py In-Reply-To: References: <20110113023125.3B832EEA21@mail.python.org> Message-ID: <4D2EFE00.7050805@voidspace.org.uk> On 13/01/2011 13:21, Jim Jewett wrote: > Why? > > Are annotations being deprecated in general? Or are these particular > annotations no longer accurate? See issue 10899. http://bugs.python.org/issue10899 Annotations are not deprecated but there is no accepted convention on their use (plus third party developers are free to create whatever use cases they want for annotations) so annotations are being kept out of the standard library. Particularly given that they were untested and unused.
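As an illustration of the point about conventions: the interpreter records function annotations but does not act on them, so their meaning is left to whoever reads them. A minimal sketch, assuming Python 3; `find_module` below is a hypothetical stand-in modelled on the signature removed in r87980, and `describe` is a made-up helper, neither is stdlib code:

```python
def find_module(fullname: str, path: [str] = None) -> object:
    """Hypothetical stand-in; only the annotations matter here."""
    return None


def describe(func):
    """Return a copy of the annotations Python recorded for func."""
    return dict(func.__annotations__)


# Annotations are plain metadata: any expression is accepted (note the
# list literal used for `path`), and nothing checks the arguments
# against them at call time.
print(describe(find_module))
print(find_module(42, path="not a list"))
```

The second call succeeds even though both arguments contradict the annotations, which is consistent with the remark that any use of them is purely a matter of convention.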
All the best, Michael Foord > _______________________________________________ > Python-Dev
mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/fuzzyman%40voidspace.org.uk -- http://www.voidspace.org.uk/ May you do good and not evil May you find forgiveness for yourself and forgive others May you share freely, never taking more than you give. -- the sqlite blessing http://www.sqlite.org/different.html From amusa07 at yahoo.com Thu Jan 13 22:40:31 2011 From: amusa07 at yahoo.com (ali musa) Date: Thu, 13 Jan 2011 13:40:31 -0800 (PST) Subject: [Python-Dev] Approach for constructing Global Variables for Python Message-ID: <152074.20587.qm@web112709.mail.gq1.yahoo.com> Hi All: I would like to propose a solution for the proper use of global variables. My starting point is the fact that most databases use shared memory to construct, in main memory, a complete block of their processes and resources. The proposal constructs a method for using global variables based on this block concept: a common block or sub-block that can be shared by any module or by the main program (i.e. the main application and its modules). The solution implemented in the attached paper may spark this idea and let you elaborate on it to make better use of Python. Best regards. -------------------------------------------------------------------------------- Ali Abdelaziz Musa Saudi Telecom Company (Technical Advisor) Mob: (+9665)50570742, Hm:(+9661) 405-9425, Off:(+9661) 452-5539 Email: amusa07 at yahoo.com -------------- next part -------------- A non-text attachment was scrubbed...
Name: Paper_GlobalVariable.pdf Type: application/pdf Size: 144594 bytes Desc: not available URL: From techtonik at gmail.com Fri Jan 14 01:32:22 2011 From: techtonik at gmail.com (anatoly techtonik) Date: Fri, 14 Jan 2011 02:32:22 +0200 Subject: [Python-Dev] Get current UTC offset in crossplatform way Message-ID: Hello, It is already 2011. I didn't monitor the issue closely, but judging by the fact that http://bugs.python.org/issue9527 is still open, Python still doesn't have a method to extract current timezone information from the system. Can anybody recap what we are going to do with that in Python 3? Probably related - http://bugs.python.org/issue762963 It is very cumbersome to work with distributed time data with plain Python. -- anatoly t. From ben+python at benfinney.id.au Fri Jan 14 01:46:56 2011 From: ben+python at benfinney.id.au (Ben Finney) Date: Fri, 14 Jan 2011 11:46:56 +1100 Subject: [Python-Dev] Approach for constructing Global Variables for Python References: <152074.20587.qm@web112709.mail.gq1.yahoo.com> Message-ID: <877he89urj.fsf@benfinney.id.au> ali musa writes: > [a large non-text document] Please don't paste documents here. If you want to share some information with us, please post a plain text message. -- \ "Of all classes the rich are the most noticed and the least | `\ studied." -- John Kenneth Galbraith, _The Age of Uncertainty_, | _o__) 1977 | Ben Finney From victor.stinner at haypocalc.com Fri Jan 14 01:49:44 2011 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Fri, 14 Jan 2011 01:49:44 +0100 Subject: [Python-Dev] Get current UTC offset in crossplatform way In-Reply-To: References: Message-ID: <1294966184.28561.11.camel@marge> On Friday 14 January 2011 at 02:32 +0200, anatoly techtonik wrote: > It is already 2011. I didn't monitor the issue closely, but judging by > the fact that http://bugs.python.org/issue9527 is still open, Python > still doesn't have a method to extract current timezone information > from the system.
Can anybody recap what we are going to do with that in > Python 3? The status of #9527 is that Alexander is waiting for an initial review. So if you would like to help, you can start with a review. The status of #762963 is that the patch doesn't work: "It changes the behaviour of time.asctime(time.gmtime(time.time()))". "A proper fix would be to use tm_gmtoff explicitly (...)". #1647654 has such a patch (written by Alexander). Alexander is waiting for an approval (and maybe a review?): "The patch needs documentation updates which I will add if the idea is well received." If you would like to help, you can also comment on this issue. Victor From ncoghlan at gmail.com Fri Jan 14 01:50:42 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 14 Jan 2011 10:50:42 +1000 Subject: [Python-Dev] Approach for constructing Global Variables for Python In-Reply-To: <152074.20587.qm@web112709.mail.gq1.yahoo.com> References: <152074.20587.qm@web112709.mail.gq1.yahoo.com> Message-ID: a) This is somewhat off-topic for this list (it is more suitable to python-ideas, at best) b) Defining process global singletons and other heap data structures in C and C++ programs is hardly a new idea c) Defining head data structures isn't the hard part, the hard part is accessing them in a reasonably efficient thread-safe manner. Given point c), an article on process global variables where the letter sequence "thread" appears only twice, and the letter sequence "sync" never appears at all and the letter sequence "lock" appears only inside the word "block" doesn't inspire much confidence. Regards, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia
From ncoghlan at gmail.com Fri Jan 14 01:51:42 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 14 Jan 2011 10:51:42 +1000 Subject: [Python-Dev] Approach for constructing Global Variables for Python In-Reply-To: References: <152074.20587.qm@web112709.mail.gq1.yahoo.com> Message-ID: On Fri, Jan 14, 2011 at 10:50 AM, Nick Coghlan wrote: > c) Defining head data structures isn't the hard part, the hard part is > accessing them in a reasonably efficient thread-safe manner. s/head/heap/ Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From victor.stinner at haypocalc.com Fri Jan 14 14:25:33 2011 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Fri, 14 Jan 2011 14:25:33 +0100 Subject: [Python-Dev] Issue #4953 (cgi) closed: please test it as much as possible Message-ID: <1295011533.1559.4.camel@marge> Hi, I just closed issue #4953 just before Python 3.2 final: the CGI module should now correctly handle binary files and Unicode. I also patched urllib.parse.parse_qs() and urllib.parse.parse_qsl() to add encoding and errors arguments. Please test the CGI module (with Python 3.2) as much as possible, especially with WSGI. http://bugs.python.org/issue4953 Victor From status at bugs.python.org Fri Jan 14 18:07:05 2011 From: status at bugs.python.org (Python tracker) Date: Fri, 14 Jan 2011 18:07:05 +0100 (CET) Subject: [Python-Dev] Summary of Python tracker Issues Message-ID: <20110114170705.5A8E51CFBD@psf.upfronthosting.co.za> ACTIVITY SUMMARY (2011-01-07 - 2011-01-14) Python tracker at http://bugs.python.org/ To view or respond to any of the issues listed below, click on the issue. Do NOT respond to this message.
Issues counts and deltas: open 2498 ( -3) closed 20192 (+54) total 22690 (+51) Open issues with patches: 1049 Issues opened (35) ================== #1602: windows console doesn't print or input Unicode http://bugs.python.org/issue1602 reopened by tim.golden #8844: Condition.wait() doesn't raise KeyboardInterrupt http://bugs.python.org/issue8844 reopened by haypo #8957: strptime(.., '%c') fails to parse output of strftime('%c', ..) http://bugs.python.org/issue8957 reopened by belopolsky #10116: Sporadic failures in test_urllibnet http://bugs.python.org/issue10116 reopened by pitrou #10854: Output DLL name in error message of ImportError when DLL is mi http://bugs.python.org/issue10854 reopened by techtonik #10855: wave.Wave_read.close() doesn't release file http://bugs.python.org/issue10855 reopened by pjcreath #10860: Handle empty port after port delimiter in httplib http://bugs.python.org/issue10860 opened by sligocki #10866: Add sethostname() http://bugs.python.org/issue10866 opened by rosslagerwall #10867: mmap.flush() issue msync() even if mapping was created with p http://bugs.python.org/issue10867 opened by mmarkk #10868: ABCMeta.register() should work as a decorator http://bugs.python.org/issue10868 opened by kerio #10872: Add mode to TextIOWrapper repr http://bugs.python.org/issue10872 opened by georg.brandl #10874: test_urllib2 shouldn't use is operator for comparing strings http://bugs.python.org/issue10874 opened by Trundle #10878: asyncore does not react properly on close() http://bugs.python.org/issue10878 opened by tgeorgiev #10879: cgi memory usage http://bugs.python.org/issue10879 opened by v+python #10880: do_mkvalue and 'boolean' http://bugs.python.org/issue10880 opened by IROV #10882: Add os.sendfile() http://bugs.python.org/issue10882 opened by rosslagerwall #10883: urllib: socket is not closed explicitly http://bugs.python.org/issue10883 opened by haypo #10884: pkgutil EggInfoDistribution requirements for .egg-info metadat 
http://bugs.python.org/issue10884 opened by pumazi #10885: multiprocessing docs http://bugs.python.org/issue10885 opened by rosslagerwall #10886: Unhelpful backtrace for multiprocessing.Queue http://bugs.python.org/issue10886 opened by torsten #10887: Add link to development ML http://bugs.python.org/issue10887 opened by techtonik #10888: os.stat(filepath).st_mode gives wrong 'executable permission' http://bugs.python.org/issue10888 opened by jeroen.dobbelaere #10891: Tweak sorting howto to eliminate redundancy http://bugs.python.org/issue10891 opened by eric.araujo #10894: Making stdlib APIs private http://bugs.python.org/issue10894 opened by SilentGhost #10895: Private stdlib API: getopt, getpass, glob, gzip, genericpath, http://bugs.python.org/issue10895 opened by SilentGhost #10897: UNIX mmap unnecessarily dup() file descriptor http://bugs.python.org/issue10897 opened by lorenz #10898: posixmodule.c redefines FSTAT http://bugs.python.org/issue10898 opened by alanh #10900: bz2 module fails to uncompress large files http://bugs.python.org/issue10900 opened by wrobell #10903: ZipExtFile:_update_crc fails for CRC >= 0x80000000 http://bugs.python.org/issue10903 opened by arindam #10904: PYTHONIOENCODING is not in manpage http://bugs.python.org/issue10904 opened by pebbe #10905: zipfile: fix arcname with leading '///' or '..' 
http://bugs.python.org/issue10905 opened by zhigang #10906: wsgiref should mention that CGI scripts usually expect HTTPS v http://bugs.python.org/issue10906 opened by techtonik #10907: OS X installer: warn users of buggy Tcl/Tk in OS X 10.6 http://bugs.python.org/issue10907 opened by ned.deily #10908: Improvements to trace._Ignore http://bugs.python.org/issue10908 opened by SilentGhost #775321: plistlib error handling http://bugs.python.org/issue775321 reopened by eric.araujo Most recent 15 issues with no replies (15) ========================================== #10908: Improvements to trace._Ignore http://bugs.python.org/issue10908 #10906: wsgiref should mention that CGI scripts usually expect HTTPS v http://bugs.python.org/issue10906 #10904: PYTHONIOENCODING is not in manpage http://bugs.python.org/issue10904 #10903: ZipExtFile:_update_crc fails for CRC >= 0x80000000 http://bugs.python.org/issue10903 #10898: posixmodule.c redefines FSTAT http://bugs.python.org/issue10898 #10891: Tweak sorting howto to eliminate redundancy http://bugs.python.org/issue10891 #10887: Add link to development ML http://bugs.python.org/issue10887 #10886: Unhelpful backtrace for multiprocessing.Queue http://bugs.python.org/issue10886 #10885: multiprocessing docs http://bugs.python.org/issue10885 #10883: urllib: socket is not closed explicitly http://bugs.python.org/issue10883 #10866: Add sethostname() http://bugs.python.org/issue10866 #10850: inconsistent behavior concerning multiprocessing.manager.BaseM http://bugs.python.org/issue10850 #10847: Distutils drops -fno-strict-aliasing when CFLAGS are set http://bugs.python.org/issue10847 #10837: Issue catching KeyboardInterrupt while reading stdin http://bugs.python.org/issue10837 #10836: TypeError during exception handling in urllib.request.urlretri http://bugs.python.org/issue10836 Most recent 15 issues waiting for review (15) ============================================= #10908: Improvements to trace._Ignore 
http://bugs.python.org/issue10908 #10907: OS X installer: warn users of buggy Tcl/Tk in OS X 10.6 http://bugs.python.org/issue10907 #10905: zipfile: fix arcname with leading '///' or '..' http://bugs.python.org/issue10905 #10897: UNIX mmap unnecessarily dup() file descriptor http://bugs.python.org/issue10897 #10895: Private stdlib API: getopt, getpass, glob, gzip, genericpath, http://bugs.python.org/issue10895 #10891: Tweak sorting howto to eliminate redundancy http://bugs.python.org/issue10891 #10885: multiprocessing docs http://bugs.python.org/issue10885 #10884: pkgutil EggInfoDistribution requirements for .egg-info metadat http://bugs.python.org/issue10884 #10882: Add os.sendfile() http://bugs.python.org/issue10882 #10874: test_urllib2 shouldn't use is operator for comparing strings http://bugs.python.org/issue10874 #10872: Add mode to TextIOWrapper repr http://bugs.python.org/issue10872 #10868: ABCMeta.register() should work as a decorator http://bugs.python.org/issue10868 #10866: Add sethostname() http://bugs.python.org/issue10866 #10860: Handle empty port after port delimiter in httplib http://bugs.python.org/issue10860 #10855: wave.Wave_read.close() doesn't release file http://bugs.python.org/issue10855 Top 10 most discussed issues (10) ================================= #1602: windows console doesn't print or input Unicode http://bugs.python.org/issue1602 14 msgs #7229: Manual entry for time.daylight can be misleading http://bugs.python.org/issue7229 13 msgs #2650: re.escape should not escape underscore http://bugs.python.org/issue2650 11 msgs #10868: ABCMeta.register() should work as a decorator http://bugs.python.org/issue10868 11 msgs #10882: Add os.sendfile() http://bugs.python.org/issue10882 11 msgs #10225: Fix doctest runable examples in python manual http://bugs.python.org/issue10225 10 msgs #7322: Socket timeout can cause file-like readline() method to lose d http://bugs.python.org/issue7322 9 msgs #8957: strptime(.., '%c') fails to parse output of 
strftime('%c', ..)
http://bugs.python.org/issue8957  9 msgs

#10828: Cannot use nonascii utf8 in names of files imported from
http://bugs.python.org/issue10828  9 msgs

#9566: Compilation warnings under x64 Windows
http://bugs.python.org/issue9566  8 msgs



Issues closed (54)
==================

#2710: error: (10035, 'The socket operation could not complete withou
http://bugs.python.org/issue2710  closed by terry.reedy

#4953: cgi module cannot handle POST with multipart/form-data in 3.x
http://bugs.python.org/issue4953  closed by haypo

#5109: array.array constructor very slow when passed an array object.
http://bugs.python.org/issue5109  closed by belopolsky

#7662: time.utcoffset()
http://bugs.python.org/issue7662  closed by belopolsky

#8020: Crash in Py_ADDRESS_IN_RANGE macro
http://bugs.python.org/issue8020  closed by pitrou

#8771: Socket freezing under load issue on Mac.
http://bugs.python.org/issue8771  closed by amcgregor

#8871: --user-access-control=auto has no effect
http://bugs.python.org/issue8871  closed by techtonik

#9118: help() on a property descriptor launches interactive help
http://bugs.python.org/issue9118  closed by belopolsky

#9717: operator module - "in place" operators documentation
http://bugs.python.org/issue9717  closed by rhettinger

#9844: calling nonexisting function under __INSURE__
http://bugs.python.org/issue9844  closed by eli.bendersky

#10013: fix `./libpython2.6.so: undefined reference to `_PyParser_Gram
http://bugs.python.org/issue10013  closed by SilentGhost

#10042: total_ordering stack overflow
http://bugs.python.org/issue10042  closed by rhettinger

#10174: multiprocessing expects sys.stdout to have a fileno/close meth
http://bugs.python.org/issue10174  closed by pitrou

#10357: ** and "mapping" are poorly defined in python docs
http://bugs.python.org/issue10357  closed by rhettinger

#10394: subprocess Popen deadlock
http://bugs.python.org/issue10394  closed by pitrou

#10533: Need example of using __missing__
http://bugs.python.org/issue10533  closed by rhettinger

#10556: test_zipimport_support mucks up with modules
http://bugs.python.org/issue10556  closed by ncoghlan

#10577: (Fancy) URL opener stuck when trying to open redirected url
http://bugs.python.org/issue10577  closed by pitrou

#10648: Extend peepholer to reverse loads or stores instead of build/u
http://bugs.python.org/issue10648  closed by rhettinger

#10686: email.Generator should use unknown-8bit encoded words for head
http://bugs.python.org/issue10686  closed by r.david.murray

#10808: ssl unwrap fails with Error 0
http://bugs.python.org/issue10808  closed by pitrou

#10813: Suppress adding decimal point for places=0 in moneyfmt()
http://bugs.python.org/issue10813  closed by rhettinger

#10820: 3.2 Makefile changes for versioned scripts break OS X framewor
http://bugs.python.org/issue10820  closed by ned.deily

#10822: test_getgroups failure under Solaris
http://bugs.python.org/issue10822  closed by pitrou

#10827: Functions in time module should support year < 1900 when accep
http://bugs.python.org/issue10827  closed by belopolsky

#10841: binary stdio
http://bugs.python.org/issue10841  closed by haypo

#10849: Backport test/__main__
http://bugs.python.org/issue10849  closed by georg.brandl

#10851: further extend ssl SNI and ciphers API
http://bugs.python.org/issue10851  closed by grooverdan

#10856: documentation for ImportError parameters and attributes
http://bugs.python.org/issue10856  closed by georg.brandl

#10858: Make source code links less prominent
http://bugs.python.org/issue10858  closed by rhettinger

#10859: Is GeneratorContextManager public?
http://bugs.python.org/issue10859  closed by pitrou

#10861: urllib2 sporadically falsely claims infinite redirect
http://bugs.python.org/issue10861  closed by pitrou

#10862: Complete your registration to Python tracker -- key yq3FVw0zXb
http://bugs.python.org/issue10862  closed by eric.araujo

#10863: zlib.compress() fails with string
http://bugs.python.org/issue10863  closed by georg.brandl

#10864: time.strftime("%Y"): limitation of 4 digits on OpenIndiana (So
http://bugs.python.org/issue10864  closed by haypo

#10865: chroot-ing breaks encodings.idna
http://bugs.python.org/issue10865  closed by georg.brandl

#10869: ast.increment_lineno() increments root node twice
http://bugs.python.org/issue10869  closed by georg.brandl

#10870: Last line of argparse code samples can not be read on Windows
http://bugs.python.org/issue10870  closed by georg.brandl

#10871: argparse example use "file" instead of "open"
http://bugs.python.org/issue10871  closed by georg.brandl

#10873: String formatting example invalid
http://bugs.python.org/issue10873  closed by eric.smith

#10875: Update Regular Expression HOWTO
http://bugs.python.org/issue10875  closed by terry.reedy

#10876: Zipfile crashes when zip password is set to 610/844/numerous o
http://bugs.python.org/issue10876  closed by pitrou

#10877: Make Tools (and subdirs) a package (and subpackages)
http://bugs.python.org/issue10877  closed by georg.brandl

#10881: test_site and macframework builds fails
http://bugs.python.org/issue10881  closed by ixokai

#10889: Fix range slicing and indexing to handle lengths > sys.maxsize
http://bugs.python.org/issue10889  closed by ncoghlan

#10890: IDLE Freezing
http://bugs.python.org/issue10890  closed by ned.deily

#10892: segfault with "del X.__abstractmethods__"
http://bugs.python.org/issue10892  closed by benjamin.peterson

#10893: The docs mark staticmethod as a function
http://bugs.python.org/issue10893  closed by rhettinger

#10896: trace module compares directories as strings (--ignore-dir)
http://bugs.python.org/issue10896  closed by SilentGhost

#10899: No function type annotations in the standard library
http://bugs.python.org/issue10899  closed by rhettinger

#10901: Python 3 MIME generator dies if not given boundary
http://bugs.python.org/issue10901  closed by SilentGhost

#10902: Doc type: "run_*" instead of "run*" on http://docs.python.org/
http://bugs.python.org/issue10902  closed by eli.bendersky

#1488934: file.write + closed pipe = no error
http://bugs.python.org/issue1488934  closed by pitrou

#1777412: datetime.strftime dislikes years before 1900
http://bugs.python.org/issue1777412  closed by haypo
From tjreedy at udel.edu  Sat Jan 15 19:58:02 2011
From: tjreedy at udel.edu (Terry Reedy)
Date: Sat, 15 Jan 2011 13:58:02 -0500
Subject: [Python-Dev] [Python-checkins] r88032 - in python/branches/py3k/Doc: c-api/code.rst howto/logging-cookbook.rst howto/logging.rst library/2to3.rst library/importlib.rst library/stdtypes.rst library/sys.rst reference/expressions.rst reference/simple_stmts.rst whatsnew/2.0.rst whatsnew/2.1.rst whatsnew/2.2.rst whatsnew/2.4.rst whatsnew/3.0.rst
In-Reply-To: <20110115170302.B9F38EE9A7@mail.python.org>
References: <20110115170302.B9F38EE9A7@mail.python.org>
Message-ID: <4D31EE3A.6080708@udel.edu>

On 1/15/2011 12:03 PM, georg.brandl wrote:

> Fix a few doc errors, mostly undefined keywords.

I am not sure what you mean by 'undefined keyword', but

> - integer. If there is no source code, return :keyword:`None`. If the
> + integer. If there is no source code, return ``None``. If the
[etc]

you have seem to have systematically removed the :keyword: role from
None, False, and True. Since Language Reference 2.3.1 Keywords defines
them as keywords, the entry

keyword
    The name of a keyword in Python.

in 4.5. Inline markup, Additional Markup Constructs, should specify
"except for None, False, or True, which should just be marked as code
literal ``None``, etc.".
Or perhaps "The name of a statement keyword (other than None, False, or True) in Python." If your rule is even more nuanced (only sometimes make an exception), please elucidate. --- Terry Jan Reedy From g.brandl at gmx.net Sat Jan 15 23:21:05 2011 From: g.brandl at gmx.net (Georg Brandl) Date: Sat, 15 Jan 2011 23:21:05 +0100 Subject: [Python-Dev] [Python-checkins] r88032 - in python/branches/py3k/Doc: c-api/code.rst howto/logging-cookbook.rst howto/logging.rst library/2to3.rst library/importlib.rst library/stdtypes.rst library/sys.rst reference/expressions.rst reference/simple_stmts.rst whatsnew/2.0.rst whatsnew/2.1.rst whatsnew/2.2.rst whatsnew/2.4.rst whatsnew/3.0.rst In-Reply-To: <4D31EE3A.6080708@udel.edu> References: <20110115170302.B9F38EE9A7@mail.python.org> <4D31EE3A.6080708@udel.edu> Message-ID: Am 15.01.2011 19:58, schrieb Terry Reedy: > On 1/15/2011 12:03 PM, georg.brandl wrote: > >> Fix a few doc errors, mostly undefined keywords. > > I am not sure what you mean by 'undefined keyword', but > >> - integer. If there is no source code, return :keyword:`None`. If the >> + integer. If there is no source code, return ``None``. If the > [etc] > > you have seem to have systematically removed the :keyword: role from > None, False, and True. Since Language Reference 2.3.1 Keywords defines > them as keywords, the entry > > keyword > The name of a keyword in Python. > > in 4.5. Inline markup, Additional Markup Constructs, should specify > "except for None, False, or True, which should just be marked as code > literal ``None``, etc.". Or perhaps "The name of a statement keyword > (other than None, False, or True) in Python." This section of "Documenting Python" should probably be rephrased. > If your rule is even more nuanced (only sometimes make an exception), > please elucidate. The rule is simple: :keyword:`...` generates a link. There is no corresponding link target, and therefore Sphinx generates a warning (which is new in 1.0.7, which fixed that bug.) 
As for why there is no link target: I think any Python programmer knows what None, True or False are. There is absolutely no need to create a link every time one of them is mentioned, which is pretty often, especially in the case of None. In contrast, take for example "the :keyword:`with` statement": this one is pretty new and many programmers might not be entirely certain what it was about; the link goes to the description of that statement. cheers, Georg From georg at python.org Sun Jan 16 08:33:41 2011 From: georg at python.org (Georg Brandl) Date: Sun, 16 Jan 2011 08:33:41 +0100 Subject: [Python-Dev] [RELEASED] Python 3.2 rc 1 Message-ID: <4D329F55.9040903@python.org> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On behalf of the Python development team, I'm very happy to announce the first release candidate of Python 3.2. Python 3.2 is a continuation of the efforts to improve and stabilize the Python 3.x line. Since the final release of Python 2.7, the 2.x line will only receive bugfixes, and new features are developed for 3.x only. Since PEP 3003, the Moratorium on Language Changes, is in effect, there are no changes in Python's syntax and built-in types in Python 3.2. Development efforts concentrated on the standard library and support for porting code to Python 3. 
Highlights are: * numerous improvements to the unittest module * PEP 3147, support for .pyc repository directories * PEP 3149, support for version tagged dynamic libraries * PEP 3148, a new futures library for concurrent programming * PEP 384, a stable ABI for extension modules * PEP 391, dictionary-based logging configuration * an overhauled GIL implementation that reduces contention * an extended email package that handles bytes messages * a much improved ssl module with support for SSL contexts and certificate hostname matching * a sysconfig module to access configuration information * additions to the shutil module, among them archive file support * many enhancements to configparser, among them mapping protocol support * improvements to pdb, the Python debugger * countless fixes regarding bytes/string issues; among them full support for a bytes environment (filenames, environment variables) * many consistency and behavior fixes for numeric operations For a more extensive list of changes in 3.2, see http://docs.python.org/3.2/whatsnew/3.2.html To download Python 3.2 visit: http://www.python.org/download/releases/3.2/ Please consider trying Python 3.2 with your code and reporting any bugs you may notice to: http://bugs.python.org/ Enjoy! - -- Georg Brandl, Release Manager georg at python.org (on behalf of the entire python-dev team and 3.2's contributors) -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.16 (GNU/Linux) iEYEARECAAYFAk0yn1QACgkQN9GcIYhpnLDTdACgqQYW5ZmTLlxmppBZItprSj7I TmAAn13lgnu9TdVy0Jln7VwOt5JW9CwL =VZ3p -----END PGP SIGNATURE----- From bisfik at yahoo.com Sun Jan 16 18:44:27 2011 From: bisfik at yahoo.com (Peter Hall) Date: Sun, 16 Jan 2011 09:44:27 -0800 (PST) Subject: [Python-Dev] I want my sugar... Message-ID: <709598.40331.qm@web130202.mail.mud.yahoo.com> I am a newbie to python, and am curious why the following syntax is not supported: # Boolean test on x... if (x = someFunc(...)): # Do something with x... I've found it convenient in perl. 
Is the syntax actually supported, and I'm ignorant?
Is the usage considered 'unPythonic'?
Can you point me to any existing discussion of the issue (it must have come up
before?)?

Thanks,
Peter
From benjamin at python.org  Sun Jan 16 18:54:15 2011
From: benjamin at python.org (Benjamin Peterson)
Date: Sun, 16 Jan 2011 11:54:15 -0600
Subject: [Python-Dev] I want my sugar...
In-Reply-To: <709598.40331.qm@web130202.mail.mud.yahoo.com>
References: <709598.40331.qm@web130202.mail.mud.yahoo.com>
Message-ID: 

2011/1/16 Peter Hall :
> I am a newbie to python, and am curious why the following syntax is not
> supported:
>
> # Boolean test on x...
> if (x = someFunc(...)):
>     # Do something with x...
>
> I've found it convenient in perl.
>
> Is the syntax actually supported, and I'm ignorant?
> Is the usage considered 'unPythonic'?
> Can you point me to any existing discussion of the issue (it must have come up
> before?)?

See python-ideas.

--
Regards,
Benjamin
From tjreedy at udel.edu  Sun Jan 16 18:54:59 2011
From: tjreedy at udel.edu (Terry Reedy)
Date: Sun, 16 Jan 2011 12:54:59 -0500
Subject: [Python-Dev] I want my sugar...
In-Reply-To: <709598.40331.qm@web130202.mail.mud.yahoo.com>
References: <709598.40331.qm@web130202.mail.mud.yahoo.com>
Message-ID: 

On 1/16/2011 12:44 PM, Peter Hall wrote:
> I am a newbie to python, and am curious why the following syntax is not
> supported:

This list is for development of future releases. Please ask such
questions on python-list or other forums for discussion of current Python.

--
Terry Jan Reedy
From rrr at ronadam.com  Mon Jan 17 00:08:46 2011
From: rrr at ronadam.com (Ron Adam)
Date: Sun, 16 Jan 2011 17:08:46 -0600
Subject: [Python-Dev] [RELEASED] Python 3.2 rc 1
In-Reply-To: <4D329F55.9040903@python.org>
References: <4D329F55.9040903@python.org>
Message-ID: <4D337A7E.2030907@ronadam.com>

:-D Great job Georg!
Ron Adam On 01/16/2011 01:33 AM, Georg Brandl wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > On behalf of the Python development team, I'm very happy to announce the > first release candidate of Python 3.2. > > Python 3.2 is a continuation of the efforts to improve and stabilize the > Python 3.x line. Since the final release of Python 2.7, the 2.x line > will only receive bugfixes, and new features are developed for 3.x only. > > Since PEP 3003, the Moratorium on Language Changes, is in effect, there > are no changes in Python's syntax and built-in types in Python 3.2. > Development efforts concentrated on the standard library and support for > porting code to Python 3. Highlights are: > > * numerous improvements to the unittest module > * PEP 3147, support for .pyc repository directories > * PEP 3149, support for version tagged dynamic libraries > * PEP 3148, a new futures library for concurrent programming > * PEP 384, a stable ABI for extension modules > * PEP 391, dictionary-based logging configuration > * an overhauled GIL implementation that reduces contention > * an extended email package that handles bytes messages > * a much improved ssl module with support for SSL contexts and certificate > hostname matching > * a sysconfig module to access configuration information > * additions to the shutil module, among them archive file support > * many enhancements to configparser, among them mapping protocol support > * improvements to pdb, the Python debugger > * countless fixes regarding bytes/string issues; among them full support > for a bytes environment (filenames, environment variables) > * many consistency and behavior fixes for numeric operations > > For a more extensive list of changes in 3.2, see > > http://docs.python.org/3.2/whatsnew/3.2.html > > To download Python 3.2 visit: > > http://www.python.org/download/releases/3.2/ > > Please consider trying Python 3.2 with your code and reporting any bugs > you may notice to: > > 
http://bugs.python.org/ > > > Enjoy! > > - -- > Georg Brandl, Release Manager > georg at python.org > (on behalf of the entire python-dev team and 3.2's contributors) > > -----BEGIN PGP SIGNATURE----- > Version: GnuPG v2.0.16 (GNU/Linux) > > iEYEARECAAYFAk0yn1QACgkQN9GcIYhpnLDTdACgqQYW5ZmTLlxmppBZItprSj7I > TmAAn13lgnu9TdVy0Jln7VwOt5JW9CwL > =VZ3p > -----END PGP SIGNATURE----- From list at qtrac.plus.com Mon Jan 17 09:33:42 2011 From: list at qtrac.plus.com (Mark Summerfield) Date: Mon, 17 Jan 2011 08:33:42 +0000 Subject: [Python-Dev] [python-committers] [RELEASED] Python 3.2 rc 1 In-Reply-To: <4D329F55.9040903@python.org> References: <4D329F55.9040903@python.org> Message-ID: <20110117083342.719d9ab5@dino> Hi Georg, I can't be sure it is a bug, but there is a definite difference of behavior between 3.0/3.1 and 3.2rc1. Given this directory layout: $ ls -R Graphics/ Graphics/: __init__.py Vector Xpm.py Graphics/Vector: __init__.py Svg.py And these files: $ cat Graphics/__init__.py __all__ = ["Xpm"] $ cat Graphics/Xpm.py #!/usr/bin/env python3 XPM = 0 $ cat Graphics/Vector/__init__.py __all__ = ["Svg"] $ cat Graphics/Vector/Svg.py #!/usr/bin/env python3 from ..Graphics import Xpm SVG = 1 I can do the relative import with Python 3.0 and 3.1 but not with 3.2rc1: $ python30 Python 3.0.1 (r301:69556, Jul 15 2010, 10:31:51) [GCC 4.4.4] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> from Graphics.Vector import * >>> Svg.SVG 1 $ python31 Python 3.1.2 (r312:79147, Jul 15 2010, 10:56:05) [GCC 4.4.4] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> from Graphics.Vector import * >>> Svg.SVG 1 $ ~/opt/python32rc1/bin/python3 Python 3.2rc1 (r32rc1:88035, Jan 16 2011, 08:32:59) [GCC 4.4.5] on linux2 Type "help", "copyright", "credits" or "license" for more information. 
>>> from Graphics.Vector import * Traceback (most recent call last): File "", line 1, in File "Graphics/Vector/Svg.py", line 2, in from ..Graphics import Xpm ImportError: No module named Graphics Should I report it as a bug or is this a planned change of behavior (or was the original behavior wrong?). -- Mark Summerfield, Qtrac Ltd, www.qtrac.eu C++, Python, Qt, PyQt - training and consultancy "Advanced Qt Programming" - ISBN 0321635906 http://www.qtrac.eu/aqpbook.html From orsenthil at gmail.com Mon Jan 17 11:31:42 2011 From: orsenthil at gmail.com (Senthil Kumaran) Date: Mon, 17 Jan 2011 16:01:42 +0530 Subject: [Python-Dev] [python-committers] [RELEASED] Python 3.2 rc 1 In-Reply-To: <20110117083342.719d9ab5@dino> References: <4D329F55.9040903@python.org> <20110117083342.719d9ab5@dino> Message-ID: On Mon, Jan 17, 2011 at 2:03 PM, Mark Summerfield wrote: > Hi Georg, > > I can't be sure it is a bug, but there is a definite difference of > behavior between 3.0/3.1 and 3.2rc1. > > I can do the relative import with Python 3.0 and 3.1 but not with > 3.2rc1: Are you sure that the package that you are trying to import is the PYTHONPATH of your system's Python 3.0 and Python 3.1 and Not in RC1? Looks to me a PYTHONPATH problem than a problem with rc1. - I tried to recreate the directory structure that you mentioned and tried from Graphics.Vector import * It failed with ImportError on python3, 3.1 and rc. - Just to test the relative imports, I created a directory structure as mentioned here: http://www.python.org/dev/peps/pep-0328/ and tried to test the relative import for usecase :- from ..moduleA import foo and works fine in rc1. - I also find that your use case (from ..Graphics import XPM in Graphics/Vector/Svg.py) is not one of the listed ones in PEP-0328. 
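For what it's worth, here is a minimal sketch of the spelling that PEP 328 does cover, assuming the layout from Mark's message rebuilt in a temporary directory. Inside Graphics/Vector/Svg.py the leading ".." already refers to the Graphics package, so "from .. import Xpm" reaches Graphics/Xpm.py, whereas "from ..Graphics import Xpm" asks for a Graphics.Graphics submodule that is not in the layout. The helper names build_demo_package and load_svg are hypothetical and exist only for this illustration.

```python
import importlib
import os
import sys
import tempfile


def build_demo_package(root):
    """Create the Graphics/Vector layout from the report under root (hypothetical helper)."""
    vector_dir = os.path.join(root, "Graphics", "Vector")
    os.makedirs(vector_dir)
    files = {
        os.path.join(root, "Graphics", "__init__.py"): '__all__ = ["Xpm"]\n',
        os.path.join(root, "Graphics", "Xpm.py"): "XPM = 0\n",
        os.path.join(vector_dir, "__init__.py"): '__all__ = ["Svg"]\n',
        # ".." is already the Graphics package, so Xpm is imported from it directly.
        os.path.join(vector_dir, "Svg.py"): "from .. import Xpm\nSVG = 1\n",
    }
    for path, text in files.items():
        with open(path, "w") as handle:
            handle.write(text)


def load_svg():
    """Build the package in a temporary directory and import Graphics.Vector.Svg."""
    with tempfile.TemporaryDirectory() as root:
        build_demo_package(root)
        sys.path.insert(0, root)
        try:
            return importlib.import_module("Graphics.Vector.Svg")
        finally:
            sys.path.remove(root)


svg = load_svg()
print(svg.SVG, svg.Xpm.XPM)
```

The temporary directory is only there to keep the sketch self-contained; with the files on disk as in the report, changing the one import line in Svg.py should be enough.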
--
Senthil
From g.rodola at gmail.com  Mon Jan 17 14:53:19 2011
From: g.rodola at gmail.com (Giampaolo Rodolà)
Date: Mon, 17 Jan 2011 14:53:19 +0100
Subject: [Python-Dev] os.ioprio_get() and os.ioprio_set()
Message-ID: 

I've recently implemented this functionality in psutil:
http://code.google.com/p/psutil/issues/detail?id=147
If desired, I can contribute a patch for the os module, although since
these functions are Linux-only, I'm not sure the os module is the right
place for them to land.
Also, I've been thinking about this for quite a bit: would it make sense
to add system-specific modules such as "linux" (and maybe also a
"win32") to the standard library?


--- Giampaolo
http://code.google.com/p/pyftpdlib/
http://code.google.com/p/psutil/
From solipsis at pitrou.net  Mon Jan 17 15:13:39 2011
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Mon, 17 Jan 2011 15:13:39 +0100
Subject: [Python-Dev] os.ioprio_get() and os.ioprio_set()
References: 
Message-ID: <20110117151339.4ca5dcd5@pitrou.net>

On Mon, 17 Jan 2011 14:53:19 +0100
Giampaolo Rodolà wrote:
> I've recently implemented this functionality in psutil:
> http://code.google.com/p/psutil/issues/detail?id=147
> If desired, I can contribute a patch for the os module, although since
> these functions are Linux-only, I'm not sure the os module is the right
> place for them to land.
> Also, I've been thinking about this for quite a bit: would it make sense
> to add system-specific modules such as "linux" (and maybe also a
> "win32") to the standard library?

The problem with something named "linux" is that when some of these APIs
get ported to other Unix variants, things will get very confusing.
"win32" is different since it's quite unlikely for some Windows-specific
APIs to get ported to other OSes.

Regards

Antoine.
From list at qtrac.plus.com Mon Jan 17 15:33:21 2011 From: list at qtrac.plus.com (Mark Summerfield) Date: Mon, 17 Jan 2011 14:33:21 +0000 Subject: [Python-Dev] [python-committers] [RELEASED] Python 3.2 rc 1 In-Reply-To: <20110117142339.E3D2D24108C@kimball.webabinitio.net> References: <4D329F55.9040903@python.org> <20110117083342.719d9ab5@dino> <20110117142339.E3D2D24108C@kimball.webabinitio.net> Message-ID: <20110117143321.627793c9@dino> On Mon, 17 Jan 2011 09:23:39 -0500 "R. David Murray" wrote: > On Mon, 17 Jan 2011 08:33:42 +0000, Mark Summerfield > wrote: > > from ..Graphics import Xpm > > SVG = 1 > > > > I can do the relative import with Python 3.0 and 3.1 but not with > > 3.2rc1: > > What about 3.1.3? I wonder if it is related to this issue: > > http://bugs.python.org/issue7902 > > -- > R. David Murray www.bitdance.com I'm not sure. Anyway, I have reported it a Georg's suggestion: http://bugs.python.org/issue10926 And mentioned issue7902. -- Mark Summerfield, Qtrac Ltd, www.qtrac.eu C++, Python, Qt, PyQt - training and consultancy "Programming in Python 3" - ISBN 0321680561 http://www.qtrac.eu/py3book.html From rdmurray at bitdance.com Mon Jan 17 15:23:39 2011 From: rdmurray at bitdance.com (R. David Murray) Date: Mon, 17 Jan 2011 09:23:39 -0500 Subject: [Python-Dev] [python-committers] [RELEASED] Python 3.2 rc 1 In-Reply-To: <20110117083342.719d9ab5@dino> References: <4D329F55.9040903@python.org> <20110117083342.719d9ab5@dino> Message-ID: <20110117142339.E3D2D24108C@kimball.webabinitio.net> On Mon, 17 Jan 2011 08:33:42 +0000, Mark Summerfield wrote: > from ..Graphics import Xpm > SVG = 1 > > I can do the relative import with Python 3.0 and 3.1 but not with > 3.2rc1: What about 3.1.3? I wonder if it is related to this issue: http://bugs.python.org/issue7902 -- R. 
David Murray www.bitdance.com From solipsis at pitrou.net Mon Jan 17 21:00:10 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 17 Jan 2011 21:00:10 +0100 Subject: [Python-Dev] devguide: Add a doc outlining how to add something to the stdlib. References: Message-ID: <20110117210010.56b1beea@pitrou.net> On Sun, 16 Jan 2011 21:38:43 +0100 brett.cannon wrote: > + > +Adding to a pre-existing module > +------------------------------- > + > +If you have found that a function, method, or class is useful and you believe > +it would be useful to the general Python community, there are some steps to go > +through in order to see it added to the stdlib. > + > +First is you need to gauge the usefulness of the code. Typically this is done > +by sharing the code publicly. Actually, most feature requests get approved without this intermediate step. So I would suggest directing people to the tracker instead. Only very large or controversial additions usually get refused on these grounds. Regards Antoine. From brett at python.org Mon Jan 17 21:08:03 2011 From: brett at python.org (Brett Cannon) Date: Mon, 17 Jan 2011 12:08:03 -0800 Subject: [Python-Dev] devguide: Add a doc outlining how to add something to the stdlib. In-Reply-To: <20110117210010.56b1beea@pitrou.net> References: <20110117210010.56b1beea@pitrou.net> Message-ID: On Mon, Jan 17, 2011 at 12:00, Antoine Pitrou wrote: > On Sun, 16 Jan 2011 21:38:43 +0100 > brett.cannon wrote: >> + >> +Adding to a pre-existing module >> +------------------------------- >> + >> +If you have found that a function, method, or class is useful and you believe >> +it would be useful to the general Python community, there are some steps to go >> +through in order to see it added to the stdlib. >> + >> +First is you need to gauge the usefulness of the code. Typically this is done >> +by sharing the code publicly. > > Actually, most feature requests get approved without this intermediate > step. 
So I would suggest directing people to the tracker instead. > Only very large or controversial additions usually get refused on these > grounds. I weakened it to a suggestion, but didn't cut it entirely as I still think it's a good step to take. -Brett > > Regards > > Antoine. > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/brett%40python.org > From rrr at ronadam.com Mon Jan 17 21:22:32 2011 From: rrr at ronadam.com (Ron Adam) Date: Mon, 17 Jan 2011 14:22:32 -0600 Subject: [Python-Dev] Exception __name__ missing? Message-ID: Is this on purpose? Python 3.2rc1 (py3k:88040, Jan 15 2011, 18:11:39) [GCC 4.4.5] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> Exception.__name__ 'Exception' >>> e = Exception('has no name') >>> e.__name__ Traceback (most recent call last): File "", line 1, in AttributeError: 'Exception' object has no attribute '__name__' Ron Adam From rdmurray at bitdance.com Mon Jan 17 21:32:18 2011 From: rdmurray at bitdance.com (R. David Murray) Date: Mon, 17 Jan 2011 15:32:18 -0500 Subject: [Python-Dev] devguide: Add a doc outlining how to add something to the stdlib. In-Reply-To: <20110117210010.56b1beea@pitrou.net> References: <20110117210010.56b1beea@pitrou.net> Message-ID: <20110117203218.5CF9A1F9548@kimball.webabinitio.net> On Mon, 17 Jan 2011 21:00:10 +0100, Antoine Pitrou wrote: > On Sun, 16 Jan 2011 21:38:43 +0100 > brett.cannon wrote: > > + > > +Adding to a pre-existing module > > +------------------------------- > > + > > +If you have found that a function, method, or class is useful and you believe > > +it would be useful to the general Python community, there are some steps to go > > +through in order to see it added to the stdlib. > > + > > +First is you need to gauge the usefulness of the code. 
Typically this is done > > +by sharing the code publicly. > > Actually, most feature requests get approved without this intermediate > step. So I would suggest directing people to the tracker instead. > Only very large or controversial additions usually get refused on these > grounds. A new contributor isn't in general going to know when a small change is controversial without asking *somewhere*, be it a mailing list or the tracker. Searching the tracker to make sure it hasn't already been proposed and rejected is, of course, a good idea. Perhaps the 'search the tracker' advice is worth repeating in this specific context. -- R. David Murray www.bitdance.com From brett at python.org Mon Jan 17 21:32:20 2011 From: brett at python.org (Brett Cannon) Date: Mon, 17 Jan 2011 12:32:20 -0800 Subject: [Python-Dev] Moving stuff out of Misc and over to the devguide Message-ID: There is a bunch of stuff in Misc that probably belongs in the devguide (under Resources) instead of in svn. Here are the files I think can be moved (in order of how strongly I think they should be moved): PURIFY.README README.coverty README.klocwork README.valgrind Porting developers.txt maintainers.rst SpecialBuilds.txt Now before anyone yells "that is inconvenient", don't forget that all core developers can check out and edit the devguide, and that almost all of the files listed (SpecialBuilds.txt is the exception) are typically edited and viewed on their own. Anyway, if there is a file listed here you don't think should move out of py3k and into the devguide, speak up. From g.brandl at gmx.net Mon Jan 17 21:27:00 2011 From: g.brandl at gmx.net (Georg Brandl) Date: Mon, 17 Jan 2011 21:27:00 +0100 Subject: [Python-Dev] Exception __name__ missing? In-Reply-To: References: Message-ID: Am 17.01.2011 21:22, schrieb Ron Adam: > > Is this on purpose? > > > Python 3.2rc1 (py3k:88040, Jan 15 2011, 18:11:39) > [GCC 4.4.5] on linux2 > Type "help", "copyright", "credits" or "license" for more information. 
> >>> Exception.__name__ > 'Exception' > >>> e = Exception('has no name') > >>> e.__name__ > Traceback (most recent call last): > File "", line 1, in > AttributeError: 'Exception' object has no attribute '__name__' It's not on purpose in the sense that it's not something special to exceptions. The class __name__ attribute is not accessible from instances of any class. Georg From benjamin at python.org Mon Jan 17 21:35:12 2011 From: benjamin at python.org (Benjamin Peterson) Date: Mon, 17 Jan 2011 14:35:12 -0600 Subject: [Python-Dev] Exception __name__ missing? In-Reply-To: References: Message-ID: 2011/1/17 Ron Adam : > > Is this on purpose? Of course: instances don't have names. -- Regards, Benjamin From g.brandl at gmx.net Mon Jan 17 21:36:04 2011 From: g.brandl at gmx.net (Georg Brandl) Date: Mon, 17 Jan 2011 21:36:04 +0100 Subject: [Python-Dev] Moving stuff out of Misc and over to the devguide In-Reply-To: References: Message-ID: Am 17.01.2011 21:32, schrieb Brett Cannon: > There is a bunch of stuff in Misc that probably belongs in the > devguide (under Resources) instead of in svn. Here are the files I > think can be moved (in order of how strongly I think they should be > moved): > > PURIFY.README > README.coverty > README.klocwork > README.valgrind > Porting > developers.txt > maintainers.rst > SpecialBuilds.txt > > Now before anyone yells "that is inconvenient", don't forget that all > core developers can check out and edit the devguide, and that almost > all of the files listed (SpecialBuilds.txt is the exception) are > typically edited and viewed on their own. > > Anyway, if there is a file listed here you don't think should move out > of py3k and into the devguide, speak up. No objections, +1. While it seems convenient to have (e.g.) 
the list of maintainers directly in the source tree, a) developers already know where to find it, no matter if in Misc/ or devguide/ b) others first have to find it anyway, and it's better to find when embedded in the rest of developer-related docs c) everyone can commit to the devguide as well as to cpython d) people should get used to multiple repos with hg coming Same goes for developers.txt. Georg From guido at python.org Mon Jan 17 21:53:16 2011 From: guido at python.org (Guido van Rossum) Date: Mon, 17 Jan 2011 12:53:16 -0800 Subject: [Python-Dev] Moving stuff out of Misc and over to the devguide In-Reply-To: References: Message-ID: On Mon, Jan 17, 2011 at 12:32 PM, Brett Cannon wrote: > There is a bunch of stuff in Misc that probably belongs in the > devguide (under Resources) instead of in svn. Here are the files I > think can be moved (in order of how strongly I think they should be > moved): > > PURIFY.README > README.coverty > README.klocwork > README.valgrind > Porting > developers.txt > maintainers.rst > SpecialBuilds.txt > > Now before anyone yells "that is inconvenient", don't forget that all > core developers can check out and edit the devguide, and that almost > all of the files listed (SpecialBuilds.txt is the exception) are > typically edited and viewed on their own. > > Anyway, if there is a file listed here you don't think should move out > of py3k and into the devguide, speak up. Wow, that Purify file is really old... Unless anyone can confirm it still works, maybe just toss it? Barry? I would think the best way to decide whether something belongs in the developers guide or in Misc is whether it makes sense for this information to be included in that tar files that people download for specific releases. Especially files that contain stuff that might be useful to copy/paste might still be better off closer to the source code. 
From that POV the files for which the argument to move them over to devguide is weakest are PURIFY.README (though it really should be named README.purify :-), README.valgrind, and SpecialBuilds.txt. There's also something to be said for keeping version-dependent info closer to the source code -- personally, I would expect to be reading the devguide online. -- --Guido van Rossum (python.org/~guido) From solipsis at pitrou.net Mon Jan 17 21:54:40 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 17 Jan 2011 21:54:40 +0100 Subject: [Python-Dev] Moving stuff out of Misc and over to the devguide References: Message-ID: <20110117215440.122bef7b@pitrou.net> On Mon, 17 Jan 2011 12:32:20 -0800 Brett Cannon wrote: > There is a bunch of stuff in Misc that probably belongs in the > devguide (under Resources) instead of in svn. Here are the files I > think can be moved (in order of how strongly I think they should be > moved): > > PURIFY.README > README.coverty > README.klocwork > README.valgrind > Porting > developers.txt > maintainers.rst > SpecialBuilds.txt > > Now before anyone yells "that is inconvenient", don't forget that all > core developers can check out and edit the devguide, and that almost > all of the files listed (SpecialBuilds.txt is the exception) are > typically edited and viewed on their own. Well it *is* inconvenient in the case of maintainers.rst, which is often consulted casually for daily bug tracker work. Grepping Misc/maintainers.rst is much easier than first having to find again where your checkout of the devguide is, and ensuring it is up-to-date. Also, I see no need to put the maintainers list in the dev guide, actually. Regards Antoine. From brett at python.org Mon Jan 17 22:17:26 2011 From: brett at python.org (Brett Cannon) Date: Mon, 17 Jan 2011 13:17:26 -0800 Subject: [Python-Dev] devguide: Add a doc outlining how to add something to the stdlib. 
In-Reply-To: <20110117203218.5CF9A1F9548@kimball.webabinitio.net>
References: <20110117210010.56b1beea@pitrou.net> <20110117203218.5CF9A1F9548@kimball.webabinitio.net>
Message-ID: 

On Mon, Jan 17, 2011 at 12:32, R. David Murray wrote:
> On Mon, 17 Jan 2011 21:00:10 +0100, Antoine Pitrou wrote:
>> On Sun, 16 Jan 2011 21:38:43 +0100
>> brett.cannon wrote:
>> > +
>> > +Adding to a pre-existing module
>> > +-------------------------------
>> > +
>> > +If you have found that a function, method, or class is useful and you believe
>> > +it would be useful to the general Python community, there are some steps to go
>> > +through in order to see it added to the stdlib.
>> > +
>> > +First is you need to gauge the usefulness of the code. Typically this is done
>> > +by sharing the code publicly.
>>
>> Actually, most feature requests get approved without this intermediate
>> step. So I would suggest directing people to the tracker instead.
>> Only very large or controversial additions usually get refused on these
>> grounds.
>
> A new contributor isn't in general going to know when a small change
> is controversial without asking *somewhere*, be it a mailing list or
> the tracker. Searching the tracker to make sure it hasn't already been
> proposed and rejected is, of course, a good idea. Perhaps the
> 'search the tracker' advice is worth repeating in this specific context.

done
From scott+python-dev at scottdial.com  Mon Jan 17 22:09:47 2011
From: scott+python-dev at scottdial.com (Scott Dial)
Date: Mon, 17 Jan 2011 16:09:47 -0500
Subject: [Python-Dev] Exception __name__ missing?
In-Reply-To: 
References: 
Message-ID: <4D34B01B.7010507@scottdial.com>

On 1/17/2011 3:22 PM, Ron Adam wrote:
> Is this on purpose?

This reminds me of something I ran into a few years ago wrt. the attribute on exceptions.
Namely, that instances of built-in exceptions do not have a __module__ attribute, but instance of user exceptions do -- a change which appeared in Python 2.5: http://mail.python.org/pipermail/python-list/2007-November/1088229.html I had a use case, using ZSI to provide a SOAP interface, where being able to get the __module__ and __name__ was needed (to serialize into a SOAP "fault" message). I worked around the issue by referencing the __class__ (as the other replier mentioned). But, I didn't receive any responses then, so I think not a lot of attention was put into these type of attributes on exceptions. -- Scott Dial scott at scottdial.com scodial at cs.indiana.edu From ncoghlan at gmail.com Mon Jan 17 23:41:47 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 18 Jan 2011 08:41:47 +1000 Subject: [Python-Dev] Moving stuff out of Misc and over to the devguide In-Reply-To: <20110117215440.122bef7b@pitrou.net> References: <20110117215440.122bef7b@pitrou.net> Message-ID: On Tue, Jan 18, 2011 at 6:54 AM, Antoine Pitrou wrote: > > Well it *is* inconvenient in the case of maintainers.rst, which is > often consulted casually for daily bug tracker work. Grepping > Misc/maintainers.rst is much easier than first having to find again > where your checkout of the devguide is, and ensuring it is up-to-date. > > Also, I see no need to put the maintainers list in the dev guide, > actually. Every time I see someone syncing the version-independent maintainers list across branches a little alarm bell goes off in my head to say that file should be somewhere other than the main source tree. It's also quite possible that once the maintainer list is part of the dev guide, triagers will start using the official copy on python.org and the search function in their web browser rather than running grep over a source checkout. So moving the version-independent stuff certainly makes sense, but the stuff that is dependent on a particular version's build system is more questionable. 
What may make sense is for the devguide to describe the general contents of the files in Misc, but point out that for any given version, the version specific details in Misc take precedence.

Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ncoghlan at gmail.com Tue Jan 18 00:01:32 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 18 Jan 2011 09:01:32 +1000
Subject: [Python-Dev] Exception __name__ missing?
In-Reply-To: <4D34B01B.7010507@scottdial.com>
References: <4D34B01B.7010507@scottdial.com>
Message-ID:

On Tue, Jan 18, 2011 at 7:09 AM, Scott Dial wrote:
> I worked around the issue by referencing the __class__ (as the other
> replier mentioned). But, I didn't receive any responses then, so I think
> not a lot of attention was put into these type of attributes on exceptions.

That's not a workaround, it is the way you're meant to access __module__ and __name__ on new-style classes (which was the transition that happened for Exception in 2.5).

The fact that user-defined classes get a __module__ attribute on instances while builtin and extension types don't isn't unique to exceptions though:

>>> class C: pass
...
>>> C.__module__
'__main__'
>>> C().__module__
'__main__'
>>> str.__module__
'builtins'
>>> str().__module__
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'str' object has no attribute '__module__'
>>> import datetime
>>> datetime.datetime.__module__
'datetime'
>>> datetime.datetime.now().__module__
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'datetime.datetime' object has no attribute '__module__'

The addition of __module__ to user defined class instances strikes me as a bug. You can see in the language reference [1] that __dict__ and __class__ are the only expected data attributes for class instances.
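
A minimal sketch of the access pattern described above, reading __module__ and __name__ from the class rather than from the instance. It assumes Python 3; the helper name qualified_name is hypothetical and exists only for this illustration.

```python
import datetime


def qualified_name(obj):
    """Return 'module.ClassName' for any object, looked up via its class.

    Going through type(obj) works for builtin and extension types too,
    whose instances do not expose __module__ or __name__ directly.
    (qualified_name is a hypothetical helper, not a stdlib function.)
    """
    cls = type(obj)
    return "{}.{}".format(cls.__module__, cls.__name__)


exc_name = qualified_name(ValueError("boom"))
now_name = qualified_name(datetime.datetime.now())
print(exc_name)
print(now_name)
```

The same helper can be handed an exception instance caught in an except clause, which is the serialization use case mentioned earlier in the thread.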
[1] http://docs.python.org/dev/reference/datamodel.html (search for the entry on "class instances", then scroll back up and contrast with the sections on class objects, functions and methods) Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From solipsis at pitrou.net Tue Jan 18 00:14:11 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 18 Jan 2011 00:14:11 +0100 Subject: [Python-Dev] devguide: Cover how to (un-)apply a patch. References: Message-ID: <20110118001411.6acd4141@pitrou.net> On Mon, 17 Jan 2011 23:37:07 +0100 brett.cannon wrote: > + > +To undo a patch, do:: > + > + patch -R -p0 < patch.diff > + Or, simply and more reliably, use the corresponding VCS incantation ("svn revert -R ." or "hg revert -a"). Regards Antoine. From barry at python.org Tue Jan 18 00:40:39 2011 From: barry at python.org (Barry Warsaw) Date: Mon, 17 Jan 2011 18:40:39 -0500 Subject: [Python-Dev] Moving stuff out of Misc and over to the devguide In-Reply-To: References: Message-ID: <20110117184039.3a9d55c5@limelight.wooz.org> On Jan 17, 2011, at 12:53 PM, Guido van Rossum wrote: >Wow, that Purify file is really old... Unless anyone can confirm it >still works, maybe just toss it? Barry? Wow indeed. The email address in there hasn't worked in, what? a decade? :) Toss it! -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: not available URL: From brett at python.org Tue Jan 18 01:12:49 2011 From: brett at python.org (Brett Cannon) Date: Mon, 17 Jan 2011 16:12:49 -0800 Subject: [Python-Dev] devguide: Cover how to (un-)apply a patch. In-Reply-To: <20110118001411.6acd4141@pitrou.net> References: <20110118001411.6acd4141@pitrou.net> Message-ID: Done On Mon, Jan 17, 2011 at 15:14, Antoine Pitrou wrote: > On Mon, 17 Jan 2011 23:37:07 +0100 > brett.cannon wrote: >> + >> +To undo a patch, do:: >> + >> + ? 
?patch -R -p0 < patch.diff >> + > > Or, simply and more reliably, use the corresponding VCS incantation > ("svn revert -R ." or "hg revert -a"). > > Regards > > Antoine. > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/brett%40python.org > From brett at python.org Tue Jan 18 01:17:36 2011 From: brett at python.org (Brett Cannon) Date: Mon, 17 Jan 2011 16:17:36 -0800 Subject: [Python-Dev] Moving stuff out of Misc and over to the devguide In-Reply-To: References: <20110117215440.122bef7b@pitrou.net> Message-ID: On Mon, Jan 17, 2011 at 14:41, Nick Coghlan wrote: > On Tue, Jan 18, 2011 at 6:54 AM, Antoine Pitrou wrote: >> >> Well it *is* inconvenient in the case of maintainers.rst, which is >> often consulted casually for daily bug tracker work. Grepping >> Misc/maintainers.rst is much easier than first having to find again >> where your checkout of the devguide is, and ensuring it is up-to-date. >> >> Also, I see no need to put the maintainers list in the dev guide, >> actually. > > Every time I see someone syncing the version-independent maintainers > list across branches a little alarm bell goes off in my head to say > that file should be somewhere other than the main source tree. Ditto for me. > > It's also quite possible that once the maintainer list is part of the > dev guide, triagers will start using the official copy on python.org > and the search function in their web browser rather than running grep > over a source checkout. > > So moving the version-independent stuff certainly makes sense, but the > stuff that is dependent on a particular version's build system is more > questionable. What may make sense is for the devguide to describe the > general contents of the files in Misc, but point out that for any > given version, the version specific details in Misc take precedence. 
> I am not describing what is in Misc. It comes down to a question of whether any core dev-specific stuff should be in Misc that is not a configuration file or not. I say no, that the directory should contain stuff that applies to everyone and not specifically to core devs. -Brett -Brett > Cheers, > Nick. > > -- > Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/brett%40python.org > From skip at pobox.com Tue Jan 18 01:19:50 2011 From: skip at pobox.com (skip at pobox.com) Date: Mon, 17 Jan 2011 18:19:50 -0600 Subject: [Python-Dev] devguide: Cover how to (un-)apply a patch. In-Reply-To: <20110118001411.6acd4141@pitrou.net> References: <20110118001411.6acd4141@pitrou.net> Message-ID: <19764.56486.736957.590344@montanaro.dyndns.org> Antoine> On Mon, 17 Jan 2011 23:37:07 +0100 Antoine> brett.cannon wrote: >> + >> +To undo a patch, do:: >> + >> + patch -R -p0 < patch.diff >> + Antoine> Or, simply and more reliably, use the corresponding VCS Antoine> incantation ("svn revert -R ." or "hg revert -a"). I prefer Brett's solution. It's one command instead of one command per VCS. It works with other version control systems and provides me the opportunity to save a copy I can restore later. Skip From rrr at ronadam.com Tue Jan 18 03:41:32 2011 From: rrr at ronadam.com (Ron Adam) Date: Mon, 17 Jan 2011 20:41:32 -0600 Subject: [Python-Dev] Exception __name__ missing? In-Reply-To: References: Message-ID: On 01/17/2011 02:27 PM, Georg Brandl wrote: > Am 17.01.2011 21:22, schrieb Ron Adam: >> >> Is this on purpose? >> >> >> Python 3.2rc1 (py3k:88040, Jan 15 2011, 18:11:39) >> [GCC 4.4.5] on linux2 >> Type "help", "copyright", "credits" or "license" for more information. 
>> >>> Exception.__name__ >> 'Exception' >> >>> e = Exception('has no name') >> >>> e.__name__ >> Traceback (most recent call last): >> File "", line 1, in >> AttributeError: 'Exception' object has no attribute '__name__' > > It's not on purpose in the sense that it's not something special > to exceptions. The class __name__ attribute is not accessible > from instances of any class. Yes, I realised this on the way to an appointment. Oh well. ;-) What I needed was e.__class__.__name__ instead of e.__name__. I should have thought about this a little more before posting. The particular reason I wanted it was to format a nice message for displaying in pydoc browser mode. The server errors, like a missing .css file, and any other server related errors, go the server console, while the content errors get displayed in a web page. ie... object not found, or some other content related reason for not giving what was asked for. Doing repr(e) was giving me too much. UnicodeDecodeError('utf8', b'\x7fELF\x02\x01\x01\x00\x00\x00\x .... With pages of bytes, and I'd rather not truncate it, although that would be ok. str(e) was more useful, but didn't include the exception name. 'utf8' codec can't decode byte 0xe0 in position 24: invalid continuation byte So doing e.__name__ was the obvious next thing... for some reason I expected the __name__ attribute in exception instances to be inherited from the class. Beats me why. Thanks, Ron From g.brandl at gmx.net Tue Jan 18 08:12:36 2011 From: g.brandl at gmx.net (Georg Brandl) Date: Tue, 18 Jan 2011 08:12:36 +0100 Subject: [Python-Dev] devguide: Cover how to (un-)apply a patch. 
In-Reply-To: <19764.56486.736957.590344@montanaro.dyndns.org> References: <20110118001411.6acd4141@pitrou.net> <19764.56486.736957.590344@montanaro.dyndns.org> Message-ID: Am 18.01.2011 01:19, schrieb skip at pobox.com: > > Antoine> On Mon, 17 Jan 2011 23:37:07 +0100 > Antoine> brett.cannon wrote: > >> + > >> +To undo a patch, do:: > >> + > >> + patch -R -p0 < patch.diff > >> + > > Antoine> Or, simply and more reliably, use the corresponding VCS > Antoine> incantation ("svn revert -R ." or "hg revert -a"). > > I prefer Brett's solution. It's one command instead of one command per VCS. > It works with other version control systems and provides me the opportunity > to save a copy I can restore later. It assumes you already have the copy. Georg From g.brandl at gmx.net Tue Jan 18 08:14:15 2011 From: g.brandl at gmx.net (Georg Brandl) Date: Tue, 18 Jan 2011 08:14:15 +0100 Subject: [Python-Dev] Exception __name__ missing? In-Reply-To: References: Message-ID: Am 18.01.2011 03:41, schrieb Ron Adam: > > > On 01/17/2011 02:27 PM, Georg Brandl wrote: >> Am 17.01.2011 21:22, schrieb Ron Adam: >>> >>> Is this on purpose? >>> >>> >>> Python 3.2rc1 (py3k:88040, Jan 15 2011, 18:11:39) >>> [GCC 4.4.5] on linux2 >>> Type "help", "copyright", "credits" or "license" for more information. >>> >>> Exception.__name__ >>> 'Exception' >>> >>> e = Exception('has no name') >>> >>> e.__name__ >>> Traceback (most recent call last): >>> File "", line 1, in >>> AttributeError: 'Exception' object has no attribute '__name__' >> >> It's not on purpose in the sense that it's not something special >> to exceptions. The class __name__ attribute is not accessible >> from instances of any class. > > Yes, I realised this on the way to an appointment. Oh well. ;-) > > > What I needed was e.__class__.__name__ instead of e.__name__. > > I should have thought about this a little more before posting. 
> > > > The particular reason I wanted it was to format a nice message for > displaying in pydoc browser mode. The server errors, like a missing .css > file, and any other server related errors, go the server console, while the > content errors get displayed in a web page. ie... object not found, or > some other content related reason for not giving what was asked for. > > Doing repr(e) was giving me too much. > > UnicodeDecodeError('utf8', b'\x7fELF\x02\x01\x01\x00\x00\x00\x .... > > With pages of bytes, and I'd rather not truncate it, although that would be ok. > > str(e) was more useful, but didn't include the exception name. > > 'utf8' codec can't decode byte 0xe0 in position 24: invalid > continuation byte For these cases, you can use traceback.format_exception_only(). Georg From phd at phdru.name Tue Jan 18 10:31:51 2011 From: phd at phdru.name (Oleg Broytman) Date: Tue, 18 Jan 2011 12:31:51 +0300 Subject: [Python-Dev] devguide: Cover how to (un-)apply a patch. In-Reply-To: <19764.56486.736957.590344@montanaro.dyndns.org> References: <20110118001411.6acd4141@pitrou.net> <19764.56486.736957.590344@montanaro.dyndns.org> Message-ID: <20110118093151.GB6425@iskra.aviel.ru> On Mon, Jan 17, 2011 at 06:19:50PM -0600, skip at pobox.com wrote: > Antoine> On Mon, 17 Jan 2011 23:37:07 +0100 > Antoine> brett.cannon wrote: > >> + > >> +To undo a patch, do:: > >> + > >> + patch -R -p0 < patch.diff > >> + > > Antoine> Or, simply and more reliably, use the corresponding VCS > Antoine> incantation ("svn revert -R ." or "hg revert -a"). > > I prefer Brett's solution. It's one command instead of one command per VCS. > It works with other version control systems and provides me the opportunity > to save a copy I can restore later. "hg revert" saves files before reverting as *.orig. Oleg. -- Oleg Broytman http://phdru.name/ phd at phdru.name Programmers don't die, they just GOSUB without RETURN. 
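
For the "Exception __name__ missing?" thread above, here is a minimal sketch of the two options that came up: reading the name from the exception's class, and using traceback.format_exception_only() to get the name and message together. It assumes Python 3; the helper name describe is hypothetical, and the sample bytes are made up purely to trigger a decode error.

```python
import traceback


def describe(exc):
    """Return a short 'ExceptionName: message' string for an exception.

    format_exception_only() returns a list of lines (each ending in a
    newline), so the pieces are joined and the trailing newline removed.
    (describe is a hypothetical helper for illustration only.)
    """
    lines = traceback.format_exception_only(type(exc), exc)
    return "".join(lines).strip()


try:
    # Arbitrary invalid UTF-8 input, chosen only to raise an error.
    b"\xe0abc".decode("utf8")
except UnicodeDecodeError as e:
    name = type(e).__name__      # the class name, not e.__name__
    short = describe(e)          # name plus the str(e) message
    print(name)
    print(short)
```

Unlike repr(e), this does not include the full bytes object that failed to decode, so the result stays short enough to display in a page.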
From solipsis at pitrou.net Tue Jan 18 13:35:41 2011
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Tue, 18 Jan 2011 13:35:41 +0100
Subject: [Python-Dev] devguide: Write a guide to committing a patch.
References:
Message-ID: <20110118133541.3351d77d@pitrou.net>

On Tue, 18 Jan 2011 07:14:51 +0100
Ezio Melotti wrote:
> > +
> > +Committing Patches
> > +==================
[...]
> > +
> > +    svnmerge.py merge -r 42
> > +
> > +This will try to apply the patch to the current patch and generate a
> > commit

Do we want to spend so much time explaining how to use SVN for core
developers while we're supposed to switch to Mercurial Real Soon Now?
(current core devs already know how to use it, and we don't get many
new ones every month)

Regards

Antoine.

From ncoghlan at gmail.com Tue Jan 18 14:12:08 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 18 Jan 2011 23:12:08 +1000
Subject: [Python-Dev] Moving stuff out of Misc and over to the devguide
In-Reply-To:
References: <20110117215440.122bef7b@pitrou.net>
Message-ID:

On Tue, Jan 18, 2011 at 10:17 AM, Brett Cannon wrote:
> I am not describing what is in Misc.
>
> It comes down to a question of whether any core dev-specific stuff
> should be in Misc that is not a configuration file or not. I say no,
> that the directory should contain stuff that applies to everyone and
> not specifically to core devs.

In that case, I think the memory profiling and debugging info should stay there, as should the build info. Not only can that stuff change between versions, but it can all be useful to people that are merely embedding Python or invoking other code from it, and need to dig deeply into what is going on to figure things out.

Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ncoghlan at gmail.com Tue Jan 18 14:18:54 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 18 Jan 2011 23:18:54 +1000
Subject: [Python-Dev] devguide: Write a guide to committing a patch.
In-Reply-To: <20110118133541.3351d77d@pitrou.net>
References: <20110118133541.3351d77d@pitrou.net>
Message-ID:

On Tue, Jan 18, 2011 at 10:35 PM, Antoine Pitrou wrote:
> On Tue, 18 Jan 2011 07:14:51 +0100
> Ezio Melotti wrote:
>> > +
>> > +Committing Patches
>> > +==================
> [...]
>> > +
>> > +    svnmerge.py merge -r 42
>> > +
>> > +This will try to apply the patch to the current patch and generate a
>> > commit
>
> Do we want to spend so much time explaining how to use SVN for core
> developers while we're supposed to switch to Mercurial Real Soon Now?
> (current core devs already know how to use it, and we don't get many
> new ones every month)

Yes, since it makes it clear where the new hg instructions need to go and the minimum set of operations they have to cover.

Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ncoghlan at gmail.com Tue Jan 18 15:16:21 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 19 Jan 2011 00:16:21 +1000
Subject: [Python-Dev] Tidying up the Meta-PEP and Other Informational PEP sections of PEP 0
Message-ID:

python-checkins watchers would be aware that I just checked in a few changes to the PEP 0 generator (and a few PEP statuses) to start tidying up the first two sections of PEP 0. The old release schedule PEPs and similarly obsolete files have been moved down to a new historical section later in the document.

This was mostly straightforward and non-controversial, but I ran into a problem with the Informational PEPs: we use the same status ("Final") to mean two completely different things for those PEPs. For the release schedule PEPs it means "done and dusted" (similar to the meaning for ordinary PEPs). For the API standardisation PEPs (like WSGI) it instead means the spec has been locked down and any changes will require a new PEP.
This caused a problem for the PEP 0 generator, since the former kind of PEP should be moved to the new historical section, while the latter kind should remain up top.

Would anyone object if I switched all the API definition PEPs to the "Active" state? PEP 1 indicates that is the appropriate state for reference PEPs that are never truly "finished" (in the sense of code being implemented and committed to the source control system).

In addition, I would like to mark PEP 42 as Withdrawn and PEP 3100 as Final (the former is a random grab bag of feature requests better handled via the tracker, while the latter is a similar py3k grab bag, except we either did most of the things it lists or deliberately chose not to)

Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From brett at python.org Tue Jan 18 20:06:16 2011
From: brett at python.org (Brett Cannon)
Date: Tue, 18 Jan 2011 11:06:16 -0800
Subject: [Python-Dev] devguide: Write a guide to committing a patch.
In-Reply-To:
References: <20110118133541.3351d77d@pitrou.net>
Message-ID:

On Tue, Jan 18, 2011 at 05:18, Nick Coghlan wrote:
> On Tue, Jan 18, 2011 at 10:35 PM, Antoine Pitrou wrote:
>> On Tue, 18 Jan 2011 07:14:51 +0100
>> Ezio Melotti wrote:
>>> > +
>>> > +Committing Patches
>>> > +==================
>> [...]
>>> > +
>>> > +    svnmerge.py merge -r 42
>>> > +
>>> > +This will try to apply the patch to the current patch and generate a
>>> > commit
>>
>> Do we want to spend so much time explaining how to use SVN for core
>> developers while we're supposed to switch to Mercurial Real Soon Now?
>> (current core devs already know how to use it, and we don't get many
>> new ones every month)
>
> Yes, since it makes it clear where the new hg instructions need to go
> and the minimum set of operations they have to cover.

Plus it's from the dev FAQ so I didn't have to go and figure something out.
From brett at python.org Tue Jan 18 20:28:29 2011 From: brett at python.org (Brett Cannon) Date: Tue, 18 Jan 2011 11:28:29 -0800 Subject: [Python-Dev] [Python-checkins] devguide: Write a guide to committing a patch. In-Reply-To: References: Message-ID: On Mon, Jan 17, 2011 at 22:14, Ezio Melotti wrote: > Hi, > > On Thu, Jan 13, 2011 at 12:44 AM, brett.cannon > wrote: >> >> brett.cannon pushed 75300a08c6d7 to devguide: >> >> http://hg.python.org/devguide/rev/75300a08c6d7 >> changeset: ? 88:75300a08c6d7 >> tag: ? ? ? ? tip >> user: ? ? ? ?Brett Cannon >> date: ? ? ? ?Wed Jan 12 15:44:04 2011 -0800 >> summary: >> ?Write a guide to committing a patch. >> >> files: >> ?committing.rst >> ?faq.rst >> ?index.rst >> >> diff --git a/committing.rst b/committing.rst >> new file mode 100644 >> --- /dev/null >> +++ b/committing.rst >> @@ -0,0 +1,47 @@ >> +.. _committing: >> + >> +Committing Patches >> +================== >> + >> +As a core developer you will occasionally want to commit a patch created >> by >> +someone else. When doing so you will want to make sure of some things. >> + >> +First, make sure the patch in a good state. Both :ref:`patch` and >> :ref:`triage` >> +explain what is to be expected of a patch. Typically patches that get >> passed by >> +triagers are good to go except maybe lacking ``Misc/ACKS`` and >> ``Misc/NEWS`` >> +entries. >> + >> +Second, make sure the patch does not break backwards-compatibility >> without a >> +good reason. This means :ref:`running the test suite ` to make >> sure >> +everything still passes. It also means that if semantics do change there >> must >> +be a good reason for the the breakage of code the change will cause (and >> it >> +**will** break someone's code). If you are unsure if the breakage is >> worth it, >> +ask on python-dev. >> + >> +Third, backport as necessary. If the patch is a bugfix and it does not >> break >> +backwards-compatibility *at all*, then backport it to the branch(es) in >> +maintenance mode. 
The easiest way to do this is to apply the patch in the >> +development branch, commit, and then use svnmerge.py_ to backport the >> patch. >> + >> +For example, let us assume you just made commit 42 in the development >> branch >> +and you want to backport it to the ``release31-maint`` branch. You would >> change >> +your working directory to the maintenance branch and run the command:: >> + >> + ? ?svnmerge.py merge -r 42 >> + >> +This will try to apply the patch to the current patch and generate a >> commit > > s/current patch/current branch/ Fixed in a previous commit. > >> >> +message. You will need to revert ``Misc/NEWS`` and do a new entry (the >> file >> +changes too much between releases to ever have a merge succeed). Once >> your >> +checkout is ready to be committed, do:: > > Here you could mention that there are two ways to deal with files that have > conflicts (marked with 'C' by svn): > 1) revert them with 'svn revert filename' and then change them manually; > 2) edit the file directly to resolve the conflict and then use 'svn resolved > filename'; > until there are no more files marked with 'C' in 'svn stat'. > It's also a good idea to cat svnmerge-commit-message.txt and double check > that the message is correct. Done. > >> >> + >> + ? ?svn ci -F svnmerge-commit-message.txt >> + >> +This will commit the bacport along with using the commit message created >> by > > s/bacport/backport/ Fixed in a previous commit. > >> >> +``svnmerge.py`` for you. >> + >> +If it turns out you do not have the time to do a backport, then at least >> leave >> +the proper issue open on the tracker with a note specifying that the >> change >> +should be backported so someone else can do it. > > Maybe a short paragraph about "svnmerge block" can be included here. No because I have not seen people bother with svnmerge block in a while. -Brett > >> >> + >> + >> +.. 
_svnmerge.py: >> http://svn.apache.org/repos/asf/subversion/trunk/contrib/client-side/svnmerge/svnmerge.py >> diff --git a/faq.rst b/faq.rst >> --- a/faq.rst >> +++ b/faq.rst >> @@ -44,65 +44,6 @@ >> ?=========== >> ============================================================== >> ========================================================================== >> >> >> -How do I prepare a new branch for merging? >> ------------------------------------------- >> - >> -You need to initialize a new branch by having ``svnmerge.py`` discover >> the >> -revision number that the branch was created with. ?Do this with the >> command:: >> - >> - ? ?svnmerge.py init >> - >> -Then check in the change to the root of the branch. ?This is a one-time >> -operation (i.e. only when the branch is originally created, not when each >> -developer creates a local checkout for the branch). >> - >> - >> -How do I merge between branches? >> --------------------------------- >> - >> -In the current situation for Python there are four branches under >> development, >> -meaning that there are three branches to merge into. Assuming a change is >> -committed into ``trunk`` as revision 0001, you merge into the 2.x >> maintenance >> -by doing:: >> - >> - ? ?# In the 2.x maintenance branch checkout. >> - ? ?svnmerge.py merge -r 0001 >> - ? ?svn commit -F svnmerge-commit-message.txt ?# r0002 >> - >> -To pull into py3k:: >> - >> - ? ?# In a py3k checkout. >> - ? ?svnmerge.py merge -r 0001 >> - ? ?svn commit -F svnmerge-commit-message.txt ?# r0003 >> - >> -The 3.x maintenance branch is a special case as you must pull from the >> py3k >> -branch revision, *not* trunk:: >> - >> - ? ?# In a 3.x maintenance checkout. >> - ? ?svnmerge.py merge -r 0003 ?# Notice the rev is the one from py3k! >> - ? ?svn resolved . >> - ? ?svn commit -F svnmerge-commit-message.txt >> - >> - >> -How do I block a specific revision from being merged into a branch? 
>> -------------------------------------------------------------------- >> - >> -With the revision number that you want to block handy and >> ``svnmerge.py``, go >> -to your checkout of the branch where you want to block the revision and >> run:: >> - >> - ? ?svnmerge.py block -r >> - >> -This will modify the repository's top directory (which should be your >> current >> -directory) and create ``svnmerge-commit-message.txt`` which contains a >> -generated log message. >> - >> -If the command says "no available revisions to block", then it means >> someone >> -already merged the revision. >> - >> -To check in the new metadata, run:: >> - >> - ? ?svn ci -F svnmerge-commit-message.txt >> - >> >> ?SSH >> ?======= >> @@ -158,21 +99,6 @@ >> >> ?.. _Pageant: >> http://www.chiark.greenend.org.uk/~sgtatham/putty/download.html >> >> -Can I make check-ins from machines other than the one I generated the >> keys on? >> >> ------------------------------------------------------------------------------- >> - >> -Yes, all you need is to make sure that the machine you want to check >> -in code from has both the public and private keys in the standard >> -place that ssh will look for them (i.e. ~/.ssh on Unix machines). >> -Please note that although the key file ending in .pub contains your >> -user name and machine name in it, that information is not used by the >> -verification process, therefore these key files can be moved to a >> -different computer and used for verification. ?Please guard your keys >> -and never share your private key with anyone. ?If you lose the media >> -on which your keys are stored or the machine on which your keys are >> -stored, be sure to report this to pydotorg at python.org at the same time >> -that you change your keys. >> - >> >> >> ?Editors and Tools >> diff --git a/index.rst b/index.rst >> --- a/index.rst >> +++ b/index.rst >> @@ -16,6 +16,7 @@ >> ? ?languishing >> ? ?communication >> ? ?coredev >> + ? committing >> >> >> ?.. 
todolist:: >> @@ -48,7 +49,7 @@ >> ? ? * :ref:`languishing` >> ?* :ref:`communication` >> ?* :ref:`coredev` >> - ? ?* `Committing patches `_ >> + ? ?* :ref:`committing` >> >> >> ?Proposing changes to Python itself >> >> -- >> Repository URL: http://hg.python.org/devguide >> _______________________________________________ >> Python-checkins mailing list >> Python-checkins at python.org >> http://mail.python.org/mailman/listinfo/python-checkins >> > > Best Regards, > Ezio Melotti > > _______________________________________________ > Python-checkins mailing list > Python-checkins at python.org > http://mail.python.org/mailman/listinfo/python-checkins > > From skip at pobox.com Tue Jan 18 20:32:42 2011 From: skip at pobox.com (skip at pobox.com) Date: Tue, 18 Jan 2011 13:32:42 -0600 Subject: [Python-Dev] devguide: Cover how to (un-)apply a patch. In-Reply-To: References: <20110118001411.6acd4141@pitrou.net> <19764.56486.736957.590344@montanaro.dyndns.org> Message-ID: <19765.60122.251699.624502@montanaro.dyndns.org> >> I prefer Brett's solution. It's one command instead of one command >> per VCS. It works with other version control systems and provides me >> the opportunity to save a copy I can restore later. Georg> It assumes you already have the copy. Sure, but the way to get the input to the patch command is easy, and is probably almost the same for any version control system: whatever-vcs diff > patch.diff The odds that someone will remember the syntax for the diff command for the VCS are much higher than the revert command. My guess is "diff" is executed more often than any other version control commands except "update" and "commit", and far more often than "revert". Personally, I'm not sure I've ever used "revert" more than a handful of times in my entire professional lifetime. I realize the world is passing me by and that I'm rapidly turning into a dinosaur w.r.t. 
distributed version control, but as you write/update the developer's guide remember that proficiency in Python does not necessarily equate to proficiency in version control systems, especially with the less frequently used commands. I personally would prefer that more general commands and concepts be used where possible so that newcomers not be put off unnecessarily by the complexity of version control. Skip From rrr at ronadam.com Wed Jan 19 00:36:14 2011 From: rrr at ronadam.com (Ron Adam) Date: Tue, 18 Jan 2011 17:36:14 -0600 Subject: [Python-Dev] Exception __name__ missing? In-Reply-To: References: Message-ID: On 01/18/2011 01:14 AM, Georg Brandl wrote: > For these cases, you can use traceback.format_exception_only(). Thanks George, That works nicely. Ron ;-) From kianseong.low at logisticsconsulting.asia Wed Jan 19 06:02:09 2011 From: kianseong.low at logisticsconsulting.asia (low kian seong) Date: Wed, 19 Jan 2011 13:02:09 +0800 Subject: [Python-Dev] [concurrentrotatingfilehandler]: How are the log files split up ? Message-ID: Dear people, I am currently using concurrentrotatingfilehandler to handle my Python logs. The situation is okay when it's only one log, but when it needs to spill over to the next log (I configured to have 2) say test.log.2 then I see that the output is sort of shared between the first log test.log and test.log.2. Am I supposed to concatenate all the logs together to get my logs back ? Google hasn't brought back any results, so I am wondering is it just me using or reading the resultant logs wrong? -------------- next part -------------- An HTML attachment was scrubbed... URL: From phd at phdru.name Wed Jan 19 06:11:27 2011 From: phd at phdru.name (Oleg Broytman) Date: Wed, 19 Jan 2011 08:11:27 +0300 Subject: [Python-Dev] [concurrentrotatingfilehandler]: How are the log files split up ? In-Reply-To: References: Message-ID: <20110119051127.GA7348@iskra.aviel.ru> Hello. We are sorry but we cannot help you. 
This mailing list is to work on developing Python (adding new features to Python itself and fixing bugs); if you're having problems learning, understanding or using Python, please find another forum. Probably python-list/comp.lang.python mailing list/news group is the best place; there are Python developers who participate in it; you may get a faster, and probably more complete, answer there. See http://www.python.org/community/ for other lists/news groups/fora. Thank you for understanding. On Wed, Jan 19, 2011 at 01:02:09PM +0800, low kian seong wrote: > Dear people, > > I am currently using concurrentrotatingfilehandler to handle my Python logs. > The situation is okay when it's only one log, but when it needs to spill > over to the next log (I configured to have 2) say test.log.2 then I see that > the output is sort of shared between the first log test.log and test.log.2. > > Am I supposed to concatenate all the logs together to get my logs back ? > Google hasn't brought back any results, so I am wondering is it just me > using or reading the resultant logs wrong? > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/phd%40phdru.name Oleg. -- Oleg Broytman http://phdru.name/ phd at phdru.name Programmers don't die, they just GOSUB without RETURN. From ncoghlan at gmail.com Wed Jan 19 11:35:26 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 19 Jan 2011 20:35:26 +1000 Subject: [Python-Dev] devguide: Cover how to (un-)apply a patch. 
In-Reply-To: <19765.60122.251699.624502@montanaro.dyndns.org> References: <20110118001411.6acd4141@pitrou.net> <19764.56486.736957.590344@montanaro.dyndns.org> <19765.60122.251699.624502@montanaro.dyndns.org> Message-ID: On Wed, Jan 19, 2011 at 5:32 AM, wrote: > The odds that someone will remember the syntax for the diff command for the > VCS are much higher than the revert command. My guess is "diff" is executed > more often than any other version control commands except "update" and > "commit", and far more often than "revert". Personally, I'm not sure I've > ever used "revert" more than a handful of times in my entire professional > lifetime. > > I realize the world is passing me by and that I'm rapidly turning into a > dinosaur w.r.t. distributed version control, but as you write/update the > developer's guide remember that proficiency in Python does not necessarily > equate to proficiency in version control systems, especially with the less > frequently used commands. I personally would prefer that more general > commands and concepts be used where possible so that newcomers not be put > off unnecessarily by the complexity of version control. Interesting. I almost *never* reverse patches - I always use the SVN revert command. Usually, this is because I will have edited the source tree since applying the patch. Reversion has the advantage of not getting confused by any additional changes. I also usually use "svn diff" to save a copy before I revert in case I change my mind. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From skip at pobox.com Wed Jan 19 12:04:01 2011 From: skip at pobox.com (skip at pobox.com) Date: Wed, 19 Jan 2011 05:04:01 -0600 Subject: [Python-Dev] devguide: Cover how to (un-)apply a patch.
In-Reply-To: References: <20110118001411.6acd4141@pitrou.net> <19764.56486.736957.590344@montanaro.dyndns.org> <19765.60122.251699.624502@montanaro.dyndns.org> Message-ID: <19766.50465.232219.424926@montanaro.dyndns.org> Nick> Usually, this is because I will have edited the source tree since Nick> applying the patch. Reversion has the advantage of not getting Nick> confused by any additional changes. I also usually use "svn diff" Nick> to save a copy before I revert in case I change my mind. I routinely use CVS and Subversion at work, occasionally SCCS (yes, we still have a little of that other dinosaur lying about - our sysadmins, what can I say? they are luddites). Most of my interaction with these tools is mediated through the Emacs vc package, so my direct use of the command line is reduced even from what you might think normal. It's generally only when I need to operate on a group of files that I revert to using the command line. That tends to be to check in a group of files or discard one or more changes before checking in, generally by taking a diff and unapplying it with patch, perhaps first saving it to a file. If I want to revert a change after checking it in, I can just pipe the confirmation email through patch. S From victor.stinner at haypocalc.com Wed Jan 19 13:34:02 2011 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Wed, 19 Jan 2011 13:34:02 +0100 Subject: [Python-Dev] Import and unicode: part two Message-ID: <1295440442.432.18.camel@marge> Hi, I patched Python 3.2 to support modules with non-ASCII paths (*). It works well on all operating systems. But the task is not completely done: (a) Python 3 doesn't support non-ASCII module names (b) Python 3 doesn't support unencodable characters in the module path I would like to know if we need to support that. Terry J. Reedy wrote (issue #10828): "I think bugs in core syntax should have high priority. I appreciate your work toward fixing it." I wrote a patch (issue #3080) fixing both points.
If you agree that both issues should be fixed, I will fix them in Python 3.3. (a) is the issue #10828 reported recently (January 2011): "import gui_j?mf?ra" doesn't work with a locale encoding different than UTF-8 (so it doesn't work on Windows). (b) is specific to Windows: FAT32 and NTFS filesystems store filenames in Unicode, but Python encodes paths to the ANSI code page (which covers only a very small subset of Unicode). If a character cannot be encoded to the code page, you cannot load a module. E.g. add a Japanese character to a directory name on a Windows system using the cp1252 (English) code page. I don't think that (b) has been reported by a user yet; it's more of a theoretical problem. My patch is huge, but it simplifies the code. We don't need to regularly convert from/to UTF-8. And for the functions using PyUnicodeObject objects (and not a Py_UNICODE* buffer): PyUnicodeObject stores the string length (it avoids calls to strlen()) and PyUnicode_FromFormat() doesn't need a buffer size (no risk of buffer overflow). I suppose that it makes Python faster, but I didn't try. (*) Python 3.2 doesn't support non-ASCII in the module *name*, only in the path (sys.path). Victor Stinner From merwok at netwok.org Wed Jan 19 14:03:20 2011 From: merwok at netwok.org (Éric Araujo) Date: Wed, 19 Jan 2011 14:03:20 +0100 Subject: [Python-Dev] Moving stuff out of Misc and over to the devguide In-Reply-To: References: <20110117215440.122bef7b@pitrou.net> Message-ID: <4D36E118.40008@netwok.org> On 17/01/2011 23:41, Nick Coghlan wrote: > On Tue, Jan 18, 2011 at 6:54 AM, Antoine Pitrou wrote: >> [...] >> Also, I see no need to put the maintainers list in the dev guide, >> actually. > > Every time I see someone syncing the version-independent maintainers > list across branches a little alarm bell goes off in my head to say > that file should be somewhere other than the main source tree.
> > It's also quite possible that once the maintainer list is part of the > dev guide, triagers will start using the official copy on python.org > and the search function in their web browser rather than running grep > over a source checkout. +1 to moving maintainers.rst to the devguide, a wiki page (I'm volunteering to monitor that page for vandalism), or making it somehow part of the bug tracker. Let's also take the opportunity to rename it to "experts", following R. David Murray: "Any module without a listed maintainer is maintained by the community as a whole [...] I think perhaps the name chosen for the file was unfortunate. I view it more as the 'experts' file, rather than the maintainers file, though in some cases the expert is indeed the principal maintainer of the module (such as Vinay for logging)." Bonus question: if we remove maintainers.rst from py3k, what do we do in 3.1 and 2.7? I'd favor removing them over keeping outdated versions. Regards From steve at pearwood.info Wed Jan 19 15:23:26 2011 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 20 Jan 2011 01:23:26 +1100 Subject: [Python-Dev] devguide: Cover how to (un-)apply a patch. In-Reply-To: <19765.60122.251699.624502@montanaro.dyndns.org> References: <20110118001411.6acd4141@pitrou.net> <19764.56486.736957.590344@montanaro.dyndns.org> <19765.60122.251699.624502@montanaro.dyndns.org> Message-ID: <4D36F3DE.8080100@pearwood.info> skip at pobox.com wrote: > I realize the world is passing me by and that I'm rapidly turning into a > dinosaur w.r.t. distributed version control, but as you write/update the > developer's guide remember that proficiency in Python does not necessarily > equate to proficiency in version control systems, especially with the less > frequently used commands. I personally would prefer that more general > commands and concepts be used where possible so that newcomers not be put > off unnecessarily by the complexity of version control.
What he said, only bolded and underlined. -- Steven From solipsis at pitrou.net Wed Jan 19 15:28:53 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 19 Jan 2011 15:28:53 +0100 Subject: [Python-Dev] devguide: Cover how to (un-)apply a patch. References: <20110118001411.6acd4141@pitrou.net> <19764.56486.736957.590344@montanaro.dyndns.org> <19765.60122.251699.624502@montanaro.dyndns.org> <4D36F3DE.8080100@pearwood.info> Message-ID: <20110119152853.3d529f01@pitrou.net> On Thu, 20 Jan 2011 01:23:26 +1100 Steven D'Aprano wrote: > skip at pobox.com wrote: > > > I realize the world is passing me by and that I'm rapidly turning into a > > dinosaur w.r.t. distributed version control, but as you write/update the > > developer's guide remember that proficiency in Python does not necessarily > > equate to proficiency in version control systems, especially with the less > > frequently used commands. I personally would prefer that more general > > commands and concepts be used where possible so that newcomers not be put > > off unnecessarily by the complexity of version control. > > What he said, only bolded and underlined. I'm not sure what the issue is. Is there something, concretely, that needs to be fixed? From steve at pearwood.info Wed Jan 19 15:54:37 2011 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 20 Jan 2011 01:54:37 +1100 Subject: [Python-Dev] devguide: Cover how to (un-)apply a patch. In-Reply-To: <20110119152853.3d529f01@pitrou.net> References: <20110118001411.6acd4141@pitrou.net> <19764.56486.736957.590344@montanaro.dyndns.org> <19765.60122.251699.624502@montanaro.dyndns.org> <4D36F3DE.8080100@pearwood.info> <20110119152853.3d529f01@pitrou.net> Message-ID: <4D36FB2D.4060106@pearwood.info> Antoine Pitrou wrote: > On Thu, 20 Jan 2011 01:23:26 +1100 > Steven D'Aprano wrote: >> skip at pobox.com wrote: >> >>> I realize the world is passing me by and that I'm rapidly turning into a >>> dinosaur w.r.t. 
distributed version control, but as you write/update the >>> developer's guide remember that proficiency in Python does not necessarily >>> equate to proficiency in version control systems, especially with the less >>> frequently used commands. I personally would prefer that more general >>> commands and concepts be used where possible so that newcomers not be put >>> off unnecessarily by the complexity of version control. >> What he said, only bolded and underlined. > > I'm not sure what the issue is. Is there something, concretely, that > needs to be fixed? You'll have to ask Skip if he thinks there's a concrete problem. I haven't seen one, but I've only been reading this thread with one eye and it may be I've missed the mother of all problems. The (non-concrete) issue, as I understand it, is simple: be aware that not all Python developers are necessarily expert in DVCSes, and please keep it simple. -- Steven From hodgestar+pythondev at gmail.com Wed Jan 19 16:01:50 2011 From: hodgestar+pythondev at gmail.com (Simon Cross) Date: Wed, 19 Jan 2011 17:01:50 +0200 Subject: [Python-Dev] Import and unicode: part two In-Reply-To: <1295440442.432.18.camel@marge> References: <1295440442.432.18.camel@marge> Message-ID: On Wed, Jan 19, 2011 at 2:34 PM, Victor Stinner wrote: > (a) Python 3 doesn't support non-ASCII module names -0: I'm vaguely against this being supported because I'd rather not have to deal with what happens when the guess regarding the filesystem encoding is wrong. On the other hand, a general encouragement to stick to ASCII module names is probably functionally equivalent without imposing a hard restriction. > (b) Python 3 doesn't support unencodable characters in the module path +1: It'd be nice if Python could import modules regardless of what folder names people happen to have on their module path.
Schiavo Simon From solipsis at pitrou.net Wed Jan 19 16:15:22 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 19 Jan 2011 16:15:22 +0100 Subject: [Python-Dev] devguide: Cover how to (un-)apply a patch. References: <20110118001411.6acd4141@pitrou.net> <19764.56486.736957.590344@montanaro.dyndns.org> <19765.60122.251699.624502@montanaro.dyndns.org> <4D36F3DE.8080100@pearwood.info> <20110119152853.3d529f01@pitrou.net> <4D36FB2D.4060106@pearwood.info> Message-ID: <20110119161522.32893641@pitrou.net> On Thu, 20 Jan 2011 01:54:37 +1100 Steven D'Aprano wrote: > > You'll have to ask Skip if he thinks there's a concrete problem. I > haven't seen one, but I've only been reading this thread with one eye > and it may be I've missed the mother of all problems. > > The (non-concrete) issue, as I understand it, is simple: be aware that > not all Python developers are necessarily expert in DVCSes, and please > keep it simple. Well "svn revert" is one of the basic SVN commands (that I personally use far more often than "patch -R", but YMMV). We're not talking about some advanced use of Mercurial queues. The point is a bit subtler here though: if you use "patch -R" after you have done some changes of your own, the checkout will not be restored to its pristine state, which may bite you later. "svn revert -R ." ensures everything is clean. Arguably, even "patch" isn't familiar to Windows developers. It doesn't come bundled and has to be installed separately, and I've seen some people use the TortoiseSVN GUI for applying patches. Regards Antoine. 
From eric at trueblade.com Wed Jan 19 16:25:30 2011 From: eric at trueblade.com (Eric Smith) Date: Wed, 19 Jan 2011 10:25:30 -0500 (EST) Subject: [Python-Dev] Moving stuff out of Misc and over to the devguide In-Reply-To: <4D36E118.40008@netwok.org> References: <20110117215440.122bef7b@pitrou.net> <4D36E118.40008@netwok.org> Message-ID: <7e9e0831fbf09c304e9f358b406f43b8.squirrel@mail.trueblade.com> > Bonus question: if we remove maintainers.rst from py3k, what do we do in > 3.1 and 2.7? I'd favor removing them over keeping outdated versions. Is there not some advantage to knowing who was the maintainer (or expert) of a given module at the time of a release? Eric. From skip at pobox.com Wed Jan 19 17:36:04 2011 From: skip at pobox.com (skip at pobox.com) Date: Wed, 19 Jan 2011 10:36:04 -0600 Subject: [Python-Dev] devguide: Cover how to (un-)apply a patch. In-Reply-To: <20110119152853.3d529f01@pitrou.net> References: <20110118001411.6acd4141@pitrou.net> <19764.56486.736957.590344@montanaro.dyndns.org> <19765.60122.251699.624502@montanaro.dyndns.org> <4D36F3DE.8080100@pearwood.info> <20110119152853.3d529f01@pitrou.net> Message-ID: <19767.4852.985854.119907@montanaro.dyndns.org> >> What he said, only bolded and underlined. Antoine> I'm not sure what the issue is. Is there something, concretely, Antoine> that needs to be fixed? Strictly speaking, nothing needs to be "fixed" because nothing is broken. Rephrasing my earlier messages: 1. Being a sophisticated Python programmer (and thus being a potential core developer) does not necessarily equate to being a sophisticated user of (especially distributed) version control systems. I have been programming in Python for about 15 years and have made contributions to the core off-and-on for about 10 years. I have never, not even once, been tempted to learn about or use svnmerge.
Even considering the more mundane subcommands of the normal svn and hg commands (not to mention cvs, bzr, git, darcs, etc) there are plenty of different ways to structure the workflow, not all of which will make sense for each of those vcs's, nor will they all make sense to all potential users. 2. There is more than one way to skin many of the cats involved in version control. My preference to use "vcs diff | patch -p0 -R" or "patch -p0 -R < some-email" in preference to "vcs revert " is just one example. I'm sure I will be able to master "svn revert" and "hg revert" if necessary, but that knowledge won't transfer at all to CVS (no revert command) and won't transfer 100% to other vcs's because their revert commands will have semantic differences or use different command line flags to dictate the specifics of the action to perform. 3. Not everyone will use the command line (strange as that may seem coming from a decades-long Unix user). Many Windows users (and probably some Mac users) will have GUIs like TortoiseHg. Smart/lazy/ memory-challenged Emacs and vim users will have version control commands built into their editors precisely to paper over the arcane differences which exist between vcs's even for common operations. Skip From janssen at parc.com Wed Jan 19 17:54:40 2011 From: janssen at parc.com (Bill Janssen) Date: Wed, 19 Jan 2011 08:54:40 PST Subject: [Python-Dev] devguide: Cover how to (un-)apply a patch. In-Reply-To: <4D36F3DE.8080100@pearwood.info> References: <20110118001411.6acd4141@pitrou.net> <19764.56486.736957.590344@montanaro.dyndns.org> <19765.60122.251699.624502@montanaro.dyndns.org> <4D36F3DE.8080100@pearwood.info> Message-ID: <34926.1295456080@parc.com> Steven D'Aprano wrote: > skip at pobox.com wrote: > > > I realize the world is passing me by and that I'm rapidly turning into a > > dinosaur w.r.t. 
distributed version control, but as you write/update the > > developer's guide remember that proficiency in Python does not necessarily > > equate to proficiency in version control systems, especially with the less > > frequently used commands. I personally would prefer that more general > > commands and concepts be used where possible so that newcomers not be put > > off unnecessarily by the complexity of version control. > > What he said, only bolded and underlined. Indeed. I now have to deal with an unholy mix of CVS, Subversion, git, and Mercurial -- a twisty maze of little one-letter options, all so similar, all too powerful. At least with CVS and Subversion you could concentrate your mistakes on a single file :-). Bill From solipsis at pitrou.net Wed Jan 19 18:02:47 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 19 Jan 2011 18:02:47 +0100 Subject: [Python-Dev] devguide: Cover how to (un-)apply a patch. In-Reply-To: <19767.4852.985854.119907@montanaro.dyndns.org> References: <20110118001411.6acd4141@pitrou.net> <19764.56486.736957.590344@montanaro.dyndns.org> <19765.60122.251699.624502@montanaro.dyndns.org> <4D36F3DE.8080100@pearwood.info> <20110119152853.3d529f01@pitrou.net> <19767.4852.985854.119907@montanaro.dyndns.org> Message-ID: <20110119180247.5c549dc4@pitrou.net> On Wed, 19 Jan 2011 10:36:04 -0600 skip at pobox.com wrote: > > >> What he said, only bolded and underlined. > > Antoine> I'm not sure what the issue is. Is there something, concretely, > Antoine> that needs to be fixed? > > Strictly speaking, nothing needs to be "fixed" because nothing is broken. > Rephrasing my earlier messages: > [...] Ok, thank you but... are you suggesting something or not? 
From g.brandl at gmx.net Wed Jan 19 18:04:23 2011 From: g.brandl at gmx.net (Georg Brandl) Date: Wed, 19 Jan 2011 18:04:23 +0100 Subject: [Python-Dev] Moving stuff out of Misc and over to the devguide In-Reply-To: <7e9e0831fbf09c304e9f358b406f43b8.squirrel@mail.trueblade.com> References: <20110117215440.122bef7b@pitrou.net> <4D36E118.40008@netwok.org> <7e9e0831fbf09c304e9f358b406f43b8.squirrel@mail.trueblade.com> Message-ID: On 19.01.2011 16:25, Eric Smith wrote: >> Bonus question: if we remove maintainers.rst from py3k, what do we do in >> 3.1 and 2.7? I'd favor removing them over keeping outdated versions. > > Is there not some advantage to knowing who was the maintainer (or expert) > of a given module at the time of a release? I don't see much advantage. And if you need the version of maintainers.rst in another repo, it's not hard to find the revision that corresponds to the release date. Georg From skip at pobox.com Wed Jan 19 19:10:11 2011 From: skip at pobox.com (skip at pobox.com) Date: Wed, 19 Jan 2011 12:10:11 -0600 Subject: [Python-Dev] devguide: Cover how to (un-)apply a patch. In-Reply-To: <20110119180247.5c549dc4@pitrou.net> References: <20110118001411.6acd4141@pitrou.net> <19764.56486.736957.590344@montanaro.dyndns.org> <19765.60122.251699.624502@montanaro.dyndns.org> <4D36F3DE.8080100@pearwood.info> <20110119152853.3d529f01@pitrou.net> <19767.4852.985854.119907@montanaro.dyndns.org> <20110119180247.5c549dc4@pitrou.net> Message-ID: <19767.10499.387402.885670@montanaro.dyndns.org> Antoine> Ok, thank you but... are you suggesting something or not? Yes. Keep the vcs command recommendations simple. At least mention idioms which are likely to apply across a wider range of version control systems. S From fuzzyman at voidspace.org.uk Wed Jan 19 19:19:08 2011 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Wed, 19 Jan 2011 19:19:08 +0100 Subject: [Python-Dev] devguide: Cover how to (un-)apply a patch.
In-Reply-To: References: <20110118001411.6acd4141@pitrou.net> <19764.56486.736957.590344@montanaro.dyndns.org> <19765.60122.251699.624502@montanaro.dyndns.org> Message-ID: <4D372B1C.8050204@voidspace.org.uk> On 19/01/2011 11:35, Nick Coghlan wrote: > On Wed, Jan 19, 2011 at 5:32 AM, wrote: >> The odds that someone will remember the syntax for the diff command for the >> VCS are much higher than the revert command. My guess is "diff" is executed >> more often than any other version control commands except "update" and >> "commit", and far more often than "revert". Personally, I'm not sure I've >> ever used "revert" more than a handful of times in my entire professional >> lifetime. >> >> I realize the world is passing me by and that I'm rapidly turning into a >> dinosaur w.r.t. distributed version control, but as you write/update the >> developer's guide remember that proficiency in Python does not necessarily >> equate to proficiency in version control systems, especially with the less >> frequently used commands. I personally would prefer that more general >> commands and concepts be used where possible so that newcomers not be put >> off unnecessarily by the complexity of version control. > Interesting. I almost *never* reverse patches - I always use the SVN > revert command. > > Usually, this is because I will have edited the source tree since > applying the patch. Reversion has the advantage of not getting > confused by any additional changes. I also usually use "svn diff" to > save a copy before I revert in case I change my mind. > Ditto, same here. For me (by no stretch of the imagination an "expert" VCS user) the revert commands (of svn, Hg and bzr) are basically straightforward (and cross-platform). To me it is tinkering with the patch command that is arcane... All the best, Michael > Cheers, > Nick. 
> -- http://www.voidspace.org.uk/ May you do good and not evil May you find forgiveness for yourself and forgive others May you share freely, never taking more than you give. -- the sqlite blessing http://www.sqlite.org/different.html From fuzzyman at voidspace.org.uk Wed Jan 19 19:20:01 2011 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Wed, 19 Jan 2011 19:20:01 +0100 Subject: [Python-Dev] devguide: Cover how to (un-)apply a patch. In-Reply-To: <19767.10499.387402.885670@montanaro.dyndns.org> References: <20110118001411.6acd4141@pitrou.net> <19764.56486.736957.590344@montanaro.dyndns.org> <19765.60122.251699.624502@montanaro.dyndns.org> <4D36F3DE.8080100@pearwood.info> <20110119152853.3d529f01@pitrou.net> <19767.4852.985854.119907@montanaro.dyndns.org> <20110119180247.5c549dc4@pitrou.net> <19767.10499.387402.885670@montanaro.dyndns.org> Message-ID: <4D372B51.6010509@voidspace.org.uk> On 19/01/2011 19:10, skip at pobox.com wrote: > Antoine> Ok, thank you but... are you suggesting something or not? > > Yes. Keep the vcs command recommendations simple. At least mention idioms > which likely to apply across a wider range of version control systems. The revert works with svn, hg and bzr. Using patch is not going to work on Windoze unless cygwin has been installed. Michael > S > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/fuzzyman%40voidspace.org.uk -- http://www.voidspace.org.uk/ May you do good and not evil May you find forgiveness for yourself and forgive others May you share freely, never taking more than you give. 
-- the sqlite blessing http://www.sqlite.org/different.html From brett at python.org Wed Jan 19 19:23:02 2011 From: brett at python.org (Brett Cannon) Date: Wed, 19 Jan 2011 10:23:02 -0800 Subject: [Python-Dev] Import and unicode: part two In-Reply-To: References: <1295440442.432.18.camel@marge> Message-ID: On Wed, Jan 19, 2011 at 07:01, Simon Cross wrote: > On Wed, Jan 19, 2011 at 2:34 PM, Victor Stinner > wrote: >> (a) Python 3 doesn't support non-ASCII module names > > -0: I'm vaguely against this being supported because I'd rather not > have to deal with what happens when the guess regarding the filesystem > encoding is wrong. On the other hand, a general encouragement to stick > to ASCII module names is probably functionally equivalent without > imposing a hard restriction. -0 from me (unless the Unicode variable naming PEP says otherwise). > >> (b) Python 3 doesn't support unencodable characters in the module path > > +1: It'd be nice if Python could import modules regardless of what > folder names people happen to have on their module path. +1 from me as well (nervously hoping importlib already supports it =) . From brett at python.org Wed Jan 19 19:25:58 2011 From: brett at python.org (Brett Cannon) Date: Wed, 19 Jan 2011 10:25:58 -0800 Subject: [Python-Dev] devguide: Cover how to (un-)apply a patch. In-Reply-To: <19767.10499.387402.885670@montanaro.dyndns.org> References: <20110118001411.6acd4141@pitrou.net> <19764.56486.736957.590344@montanaro.dyndns.org> <19765.60122.251699.624502@montanaro.dyndns.org> <4D36F3DE.8080100@pearwood.info> <20110119152853.3d529f01@pitrou.net> <19767.4852.985854.119907@montanaro.dyndns.org> <20110119180247.5c549dc4@pitrou.net> <19767.10499.387402.885670@montanaro.dyndns.org> Message-ID: On Wed, Jan 19, 2011 at 10:10, wrote: > > Antoine> Ok, thank you but... are you suggesting something or not? > > Yes. Keep the vcs command recommendations simple.
At least mention idioms > which likely to apply across a wider range of version control systems. I was hoping this would flame out, but two days of discussion suggests otherwise. I am of the opinion that we should always list how to use the VCS to its fullest. It is the thing you will have to interact with the most when doing work on Python, so trying to avoid it is not doing anyone any favours. That being said, I am not opposed to someone (other than me as I am not going to bother) **adding** a note about `patch -R`, but it should not replace the `svn revert` explanation. From alexander.belopolsky at gmail.com Wed Jan 19 19:38:43 2011 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Wed, 19 Jan 2011 13:38:43 -0500 Subject: [Python-Dev] Import and unicode: part two In-Reply-To: References: <1295440442.432.18.camel@marge> Message-ID: On Wed, Jan 19, 2011 at 1:23 PM, Brett Cannon wrote: .. >>> (a) Python 3 doesn't support non-ASCII module names .. > -0 from me (unless the Unicode variable naming PEP says otherwise). > I am not sure what exactly is not supported. On my OSX system: $ ./python.exe Python 3.2b2+ .. >>> import ???? >>> ????.foo 42 >>> from ???? import foo >>> foo 42 PEP 3131 does not distinguish between different types of identifiers, so I think it assumes that non-ASCII module names should be supported. +1 on fixing any remaining bugs From barry at python.org Wed Jan 19 19:40:44 2011 From: barry at python.org (Barry Warsaw) Date: Wed, 19 Jan 2011 13:40:44 -0500 Subject: [Python-Dev] Tidying up the Meta-PEP and Other Informational PEP sections of PEP 0 In-Reply-To: References: Message-ID: <20110119134044.4caebb3f@python.org> On Jan 19, 2011, at 12:16 AM, Nick Coghlan wrote: >For the release schedule PEPs it means "done and dusted" (similar to >the meaning for ordinary PEPs). For the API standardisation PEPs (like >WSGI) it instead means the spec has been locked down and any changes >will require a new PEP.
This caused a problem for the PEP 0 generator, >since the former kind of PEP should be moved to the new historical >section, while the latter kind should remain up top. > >Would anyone object if I switched all the API definition PEPs to the >"Active" state? PEP 1 indicates that is the appropriate state for >reference PEPs that are never truly "finished" (in the sense of code >being implemented and committed to the source control system). Perhaps we need a new type for API PEPs instead? Type: API Type: Consensus ? If not, then I'd rather come up with a different status to describe an API PEP that has been locked down. Re-using Active doesn't seem right to me. -Barry From brett at python.org Wed Jan 19 19:42:02 2011 From: brett at python.org (Brett Cannon) Date: Wed, 19 Jan 2011 10:42:02 -0800 Subject: [Python-Dev] Import and unicode: part two In-Reply-To: References: <1295440442.432.18.camel@marge> Message-ID: On Wed, Jan 19, 2011 at 10:38, Alexander Belopolsky wrote: > On Wed, Jan 19, 2011 at 1:23 PM, Brett Cannon wrote: > .. >>>> (a) Python 3 doesn't support non-ASCII module names > .. >> -0 from me (unless the Unicode variable naming PEP says otherwise). >> > > I am not sure what exactly is not supported. On my OSX system: Victor said this is a Windows-specific issue. -Brett > > $ ./python.exe > Python 3.2b2+ .. > >>>> import ???? >>>> ????.foo > 42 >>>> from ???? import foo >>>> foo > 42 > > > PEP 3131 does not distinguish between different types of identifiers, > so I think it assumes that non-ascii module names should be supported. > > +1 on fixing any remaining bugs > From solipsis at pitrou.net Wed Jan 19 19:47:54 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 19 Jan 2011 19:47:54 +0100 Subject: [Python-Dev] devguide: Cover how to (un-)apply a patch.
In-Reply-To: <4D372B51.6010509@voidspace.org.uk> References: <20110118001411.6acd4141@pitrou.net> <19764.56486.736957.590344@montanaro.dyndns.org> <19765.60122.251699.624502@montanaro.dyndns.org> <4D36F3DE.8080100@pearwood.info> <20110119152853.3d529f01@pitrou.net> <19767.4852.985854.119907@montanaro.dyndns.org> <20110119180247.5c549dc4@pitrou.net> <19767.10499.387402.885670@montanaro.dyndns.org> <4D372B51.6010509@voidspace.org.uk> Message-ID: <20110119194754.4b325b92@pitrou.net> On Wed, 19 Jan 2011 19:20:01 +0100 Michael Foord wrote: > On 19/01/2011 19:10, skip at pobox.com wrote: > > Antoine> Ok, thank you but... are you suggesting something or not? > > > > Yes. Keep the vcs command recommendations simple. At least mention idioms > > which likely to apply across a wider range of version control systems. > > The revert works with svn, hg and bzr. Using patch is not going to work > on Windoze unless cygwin has been installed. You don't need cygwin, just something much smaller with "GNU" in its name: http://gnuwin32.sourceforge.net/packages/patch.htm (yes, the suggestion is already in the dev guide) From alexander.belopolsky at gmail.com Wed Jan 19 19:51:35 2011 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Wed, 19 Jan 2011 13:51:35 -0500 Subject: [Python-Dev] Import and unicode: part two In-Reply-To: References: <1295440442.432.18.camel@marge> Message-ID: On Wed, Jan 19, 2011 at 1:42 PM, Brett Cannon wrote: .. >> I am not sure what exactly is not supported. ?On my OSX system: > > Victor said this is a Windows-specific issue. I missed that part. In this case, I change my vote to +0 to reflect my lack of knowledge or exposure to Windows-only issues. However, if Victor's patch simplifies the code (as many of his changes in this area do), I will be happy to review it. 
From benjamin at python.org Wed Jan 19 20:17:23 2011 From: benjamin at python.org (Benjamin Peterson) Date: Wed, 19 Jan 2011 13:17:23 -0600 Subject: [Python-Dev] devguide: Cover how to (un-)apply a patch. In-Reply-To: <4D372B51.6010509@voidspace.org.uk> References: <20110118001411.6acd4141@pitrou.net> <19764.56486.736957.590344@montanaro.dyndns.org> <19765.60122.251699.624502@montanaro.dyndns.org> <4D36F3DE.8080100@pearwood.info> <20110119152853.3d529f01@pitrou.net> <19767.4852.985854.119907@montanaro.dyndns.org> <20110119180247.5c549dc4@pitrou.net> <19767.10499.387402.885670@montanaro.dyndns.org> <4D372B51.6010509@voidspace.org.uk> Message-ID: 2011/1/19 Michael Foord : > On 19/01/2011 19:10, skip at pobox.com wrote: >> >> Antoine> Ok, thank you but... are you suggesting something or not? >> >> Yes. Keep the vcs command recommendations simple. At least mention >> idioms >> which likely to apply across a wider range of version control systems. > > The revert works with svn, hg and bzr. Using patch is not going to work on > Windoze unless cygwin has been installed. I thought you were supposed to use some variant of "update" on hg instead of revert, though. -- Regards, Benjamin From solipsis at pitrou.net Wed Jan 19 20:22:28 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 19 Jan 2011 20:22:28 +0100 Subject: [Python-Dev] devguide: Cover how to (un-)apply a patch. In-Reply-To: References: <20110118001411.6acd4141@pitrou.net> <19764.56486.736957.590344@montanaro.dyndns.org> <19765.60122.251699.624502@montanaro.dyndns.org> <4D36F3DE.8080100@pearwood.info> <20110119152853.3d529f01@pitrou.net> <19767.4852.985854.119907@montanaro.dyndns.org> <20110119180247.5c549dc4@pitrou.net> <19767.10499.387402.885670@montanaro.dyndns.org> <4D372B51.6010509@voidspace.org.uk> Message-ID: <1295464948.3694.1.camel@localhost.localdomain> > > The revert works with svn, hg and bzr. Using patch is not going to work on > > Windoze unless cygwin has been installed.
> > I thought you were supposed to use some variant of "update" on hg > instead of revert, though. I think what is discouraged is to "hg revert" to a different revision. We are talking about reverting your working copy to its pristine state. Regards Antoine. From victor.stinner at haypocalc.com Wed Jan 19 20:31:08 2011 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Wed, 19 Jan 2011 20:31:08 +0100 Subject: [Python-Dev] Import and unicode: part two In-Reply-To: References: <1295440442.432.18.camel@marge> Message-ID: <1295465468.1248.5.camel@marge> On Wednesday 19 January 2011 at 10:42 -0800, Brett Cannon wrote: > > I am not sure what exactly is not supported. On my OSX system: > > Victor said this is a Windows-specific issue. Autoquote: "(a) (...) doesn't work with a locale encoding different than UTF-8" Hum, it's not exactly the locale encoding, but the Python filesystem encoding. On Mac OS X, this encoding is *hardcoded* to UTF-8, so it is possible to use non-ASCII module names on this OS. It is also possible on other BSD/UNIX systems using a UTF-8 locale encoding. But this issue only concerns any BSD/UNIX using a locale encoding different than UTF-8. Eg. MvL's buildbot (x86 debian parallel 3.x) uses ISO-8859-15 (see #10492, issue fixed 13 days ago). Even if UTF-8 becomes a de facto standard locale encoding, many systems still use something else. And Python 2 users will complain that their script works with Python 2 but not with Python 3 :-) If we decide to reject non-ASCII module names, it should be done on all operating systems, not only on Windows.
Victor From v+python at g.nevcal.com Wed Jan 19 21:19:46 2011 From: v+python at g.nevcal.com (Glenn Linderman) Date: Wed, 19 Jan 2011 12:19:46 -0800 Subject: [Python-Dev] Import and unicode: part two In-Reply-To: <1295465468.1248.5.camel@marge> References: <1295440442.432.18.camel@marge> <1295465468.1248.5.camel@marge> Message-ID: <4D374762.4010505@g.nevcal.com> On 1/19/2011 11:31 AM, Victor Stinner wrote: > If we decide to reject non-ASCII module names, it should be done on any > operating systems, not only on Windows. Since Python allows non-ASCII variable names, I think it should allow non-ASCII module names also, on any platform that supports the appropriate characters in the filesystem. Since some platforms already accept them, dropping them would be incompatible. If Victor already has a patch coded (i.e. the work is basically done, no waiting in line 3), I'm even more in favor of it. If it took lots of future hard work, and no one volunteered to do it, that would perhaps be justification for retaining module name restrictions. I guess that is not the case here, so... +1 on supporting full Unicode module names on all platforms. -------------- next part -------------- An HTML attachment was scrubbed... URL: From tjreedy at udel.edu Wed Jan 19 21:32:06 2011 From: tjreedy at udel.edu (Terry Reedy) Date: Wed, 19 Jan 2011 15:32:06 -0500 Subject: [Python-Dev] Import and unicode: part two In-Reply-To: <1295440442.432.18.camel@marge> References: <1295440442.432.18.camel@marge> Message-ID: On 1/19/2011 7:34 AM, Victor Stinner wrote: > Hi, > > I patched Python 3.2 to support modules with non-ASCII paths (*). It > works well on all operating systems. But the task is not completly > done: > > (a) Python 3 doesn't support non-ASCII module names (b) Python 3 > doesn't support unencodable characters in the module path > > I would like to know if we need to support that. Terry J. Reedy > wrote (issue #10828): "I think bugs in core syntax should have high > priority. 
I appreciate your work toward fixing it." I am a little shocked at the so-far tepid response to (a), so let me defend and explain my claim that it is a bug. In the simplest case (from 6.11. The import statement and 2.3. Identifiers and keywords) import_stmt ::= "import" module module ::= identifier identifier ::= There is nothing, nothing, about any restriction on identifiers. The rest of 6.11 discusses the complex import algorithm but leaves out the simple semantics that cover 99% of cases (import a ???.py file in a directory on sys.path), and never mentions ".py". So let's go to Tutorial 6. Modules which does explain the simple case: "A module is a file containing Python definitions and statements. The file name is the module name with the suffix .py appended." So, if xyz is a legal identifier and xyz.py exists on sys.path, it is reasonable from the docs to expect 'import xyz' to work. (Sys.path is mentioned in the reference.) But we now have the following possibility: Let xyz.py be def double(x): return 2*x if __name__=="__main__": if double(2) == 4: print("test passed") We run the file, get "test passed", and write zyx.py: import xyz ... We run zyx and Python says "No module named xyz". Bad, and quite puzzling to anyone who does not understand the subtle difference between running and importing a file. -- Terry Jan Reedy From victor.stinner at haypocalc.com Wed Jan 19 22:10:35 2011 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Wed, 19 Jan 2011 22:10:35 +0100 Subject: [Python-Dev] Import and unicode: part two In-Reply-To: References: <1295440442.432.18.camel@marge> Message-ID: <1295471435.1248.108.camel@marge> On Wednesday 19 January 2011 at 13:38 -0500, Alexander Belopolsky wrote: > PEP 3131 does not distinguish between different types of identifiers, > so I think it assumes that non-ascii module names should be supported.
My opinion is that we should support non-ASCII module names and unencodable paths if it doesn't introduce an overhead (make Python slower and add a lot of code). My patch adds ~400 lines of code (I think that it is small: the patch adds many functions), but I think that it makes Python as fast, or maybe faster. Victor From hodgestar+pythondev at gmail.com Wed Jan 19 22:05:41 2011 From: hodgestar+pythondev at gmail.com (Simon Cross) Date: Wed, 19 Jan 2011 23:05:41 +0200 Subject: [Python-Dev] Import and unicode: part two In-Reply-To: References: <1295440442.432.18.camel@marge> Message-ID: On Wed, Jan 19, 2011 at 10:32 PM, Terry Reedy wrote: > I am a little shocked at the so-far tepid response to (a), so let me > defend and explain my claim that it is a bug. > > In the simplest case (from 6.11. The import statement and 2.3. Identifiers > and keywords) > > import_stmt ::= "import" module > module ::= identifier > identifier ::= > > There is nothing, nothing, about any restriction on identifiers. I have no problem with non-ASCII module identifiers being valid syntax. It's a question of whether attempting to translate a non-ASCII module name into a file name (so the file can be imported) is a good idea and whether these sorts of files can be safely transferred among diverse filesystems. For similar reasons we tend to avoid capital letters in module names. Schiavo Simon From g.brandl at gmx.net Wed Jan 19 22:07:03 2011 From: g.brandl at gmx.net (Georg Brandl) Date: Wed, 19 Jan 2011 22:07:03 +0100 Subject: [Python-Dev] Import and unicode: part two In-Reply-To: References: <1295440442.432.18.camel@marge> Message-ID: On 19.01.2011 21:32, Terry Reedy wrote: > On 1/19/2011 7:34 AM, Victor Stinner wrote: >> Hi, >> >> I patched Python 3.2 to support modules with non-ASCII paths (*). It >> works well on all operating systems.
But the task is not completly >> done: >> >> (a) Python 3 doesn't support non-ASCII module names (b) Python 3 >> doesn't support unencodable characters in the module path >> >> I would like to know if we need to support that. Terry J. Reedy >> wrote (issue #10828): "I think bugs in core syntax should have high >> priority. I appreciate your work toward fixing it." > > I am a little shocked at the so-far tepid response to (a), so let me > defend and explain my claim that it is a bug. > > In the simplest case (from 6.11. The import statement and 2.3. > Identifiers and keywords) > > import_stmt ::= "import" module > module ::= indentifier > identifier ::= > > There is nothing, nothing, about any restriction on identifiers. +1. The restriction on valid identifiers is very sensible (obviously, since "m" needs to be accessible after "import m"), but a further restriction seems just arbitrary. Georg From tjreedy at udel.edu Wed Jan 19 22:22:46 2011 From: tjreedy at udel.edu (Terry Reedy) Date: Wed, 19 Jan 2011 16:22:46 -0500 Subject: [Python-Dev] devguide: Cover how to (un-)apply a patch. In-Reply-To: References: <20110118001411.6acd4141@pitrou.net> <19764.56486.736957.590344@montanaro.dyndns.org> <19765.60122.251699.624502@montanaro.dyndns.org> <4D36F3DE.8080100@pearwood.info> <20110119152853.3d529f01@pitrou.net> <19767.4852.985854.119907@montanaro.dyndns.org> <20110119180247.5c549dc4@pitrou.net> <19767.10499.387402.885670@montanaro.dyndns.org> Message-ID: On 1/19/2011 1:25 PM, Brett Cannon wrote: > On Wed, Jan 19, 2011 at 10:10, wrote: >> >> Antoine> Ok, thank you but... are you suggesting something or not? >> >> Yes. Keep the vcs command recommendations simple. At least mention idioms >> which likely to apply across a wider range of version control systems. > > I was hoping this would flame out, but two days of discussion suggests > otherwise. > > I am of the opinion of always listing how to use the CVS to its > fullest. 
It is the thing you will have to interact with the most when > doing work on Python, so trying to avoid it is not doing anyone any > favours. > > That being said, I am not opposed to someone (other than me as I am > not going to bother) **adding** a not about `patch -R`, but it should > not replace the `svn revert` explanation. As a neophyte vcs user, I like specific commands that can only do what I want, and not screw up with a wrong flag, so I agree with this. The most important thing is being clear about which data will have which effect on which other data. -- Terry Jan Reedy From fuzzyman at voidspace.org.uk Wed Jan 19 22:27:44 2011 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Wed, 19 Jan 2011 22:27:44 +0100 Subject: [Python-Dev] devguide: Cover how to (un-)apply a patch. In-Reply-To: <20110119194754.4b325b92@pitrou.net> References: <20110118001411.6acd4141@pitrou.net> <19764.56486.736957.590344@montanaro.dyndns.org> <19765.60122.251699.624502@montanaro.dyndns.org> <4D36F3DE.8080100@pearwood.info> <20110119152853.3d529f01@pitrou.net> <19767.4852.985854.119907@montanaro.dyndns.org> <20110119180247.5c549dc4@pitrou.net> <19767.10499.387402.885670@montanaro.dyndns.org> <4D372B51.6010509@voidspace.org.uk> <20110119194754.4b325b92@pitrou.net> Message-ID: <4D375750.6010601@voidspace.org.uk> On 19/01/2011 19:47, Antoine Pitrou wrote: > On Wed, 19 Jan 2011 19:20:01 +0100 > Michael Foord wrote: >> On 19/01/2011 19:10, skip at pobox.com wrote: >>> Antoine> Ok, thank you but... are you suggesting something or not? >>> >>> Yes. Keep the vcs command recommendations simple. At least mention idioms >>> which likely to apply across a wider range of version control systems. >> The revert works with svn, hg and bzr. Using patch is not going to work >> on Windoze unless cygwin has been installed. 
> You don't need cygwin, just something much smaller with "GNU" in its > name: http://gnuwin32.sourceforge.net/packages/patch.htm > > (yes, the suggestion is already in the dev guide) > Unfortunately gnuwin32 patch doesn't play well with Windows 7. I remember giving up on it completely and installing cygwin. This page seems to explain the details: http://math.nist.gov/oommf/software-patchsets/patch_on_Windows7.html Michael > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/fuzzyman%40voidspace.org.uk -- http://www.voidspace.org.uk/ May you do good and not evil May you find forgiveness for yourself and forgive others May you share freely, never taking more than you give. -- the sqlite blessing http://www.sqlite.org/different.html From tjreedy at udel.edu Wed Jan 19 22:40:24 2011 From: tjreedy at udel.edu (Terry Reedy) Date: Wed, 19 Jan 2011 16:40:24 -0500 Subject: [Python-Dev] Import and unicode: part two In-Reply-To: References: <1295440442.432.18.camel@marge> Message-ID: On 1/19/2011 4:05 PM, Simon Cross wrote: > I have no problem with non-ASCII module identifiers being valid > syntax. It's a question of whether attempting to translate a non-ASCII If the names are the same, ie, produced with the same sequence of keystrokes in the save-as box and importing box, then there is no translation, at least from the user's view. > module name into a file name (so the file can be imported) is a good > idea and whether these sorts of files can be safely transferred among > diverse filesystems. I believe we now have the situation that a package that works on *nix could fail on Windows, whereas I believe that patch would *improve* portability. > For similar reasons we tend to avoid capital letters in module names. That is a stdlib style guide followed by many, but intentionally not enforced. 
-- Terry Jan Reedy From victor.stinner at haypocalc.com Wed Jan 19 22:44:40 2011 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Wed, 19 Jan 2011 22:44:40 +0100 Subject: [Python-Dev] Import and unicode: part two In-Reply-To: <4D374762.4010505@g.nevcal.com> References: <1295440442.432.18.camel@marge> <1295465468.1248.5.camel@marge> <4D374762.4010505@g.nevcal.com> Message-ID: <1295473480.1248.117.camel@marge> On Wednesday 19 January 2011 at 12:19 -0800, Glenn Linderman wrote: > Since Python allows non-ASCII variable names, I think it should allow > non-ASCII module names also, on any platform that supports the > appropriate characters in the filesystem. > > Since some platforms already accept them, dropping them would be > incompatible. ok > If Victor already has a patch coded (i.e. the work is basically done, no > waiting in line 3), I'm even more in favor of it. If it took lots of > future hard work, and no one volunteered to do it, that would perhaps be > justification for retaining module name restrictions. I guess that is > not the case here, so... I volunteer to do the work, and I already have a working patch (but it is not yet ready to be committed; it requires a long review). FYI, I have rewritten the patch 4 times over the past year, for different reasons: - the patch is huge, complex, and I was unable to "write it correctly" the first time - I split the work into two big parts: support non-ASCII paths (done in Python 3.2) and the other changes in part two - Updating a huge patchset on the py3k tree is hard, even with git-svn (and git svn rebase) - In my first tries, I didn't patch the import machinery to support non-ASCII module names, I only patched the support of non-ASCII paths But I don't want to apply such a huge patch if Python core developers don't want to support non-ASCII module names and unencodable paths.
Victor From alexander.belopolsky at gmail.com Wed Jan 19 23:23:54 2011 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Wed, 19 Jan 2011 17:23:54 -0500 Subject: [Python-Dev] Import and unicode: part two In-Reply-To: References: <1295440442.432.18.camel@marge> Message-ID: On Wed, Jan 19, 2011 at 4:40 PM, Terry Reedy wrote: .. >> For similar reasons we tend to avoid capital letters in module names. > > That is a stdlib style guide followed by many, but intentionally not > enforced. Indeed. Last time I looked, we still had cProfile in stdlib. From brett at python.org Wed Jan 19 23:47:11 2011 From: brett at python.org (Brett Cannon) Date: Wed, 19 Jan 2011 14:47:11 -0800 Subject: [Python-Dev] Import and unicode: part two In-Reply-To: References: <1295440442.432.18.camel@marge> Message-ID: On Wed, Jan 19, 2011 at 14:23, Alexander Belopolsky wrote: > On Wed, Jan 19, 2011 at 4:40 PM, Terry Reedy wrote: > .. >>> For similar reasons we tend to avoid capital letters in module names. >> >> That is a stdlib style guide followed by many, but intentionally not >> enforced. > > Indeed. ?Last time I looked, we still had cProfile in stdlib. Yes, but that is because no one got around to hiding cProfile behind profile before we released Python 3.0. I would still like to see it (slowly) go away from being directly visible. From alexander.belopolsky at gmail.com Thu Jan 20 00:05:35 2011 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Wed, 19 Jan 2011 18:05:35 -0500 Subject: [Python-Dev] Import and unicode: part two In-Reply-To: References: <1295440442.432.18.camel@marge> Message-ID: On Wed, Jan 19, 2011 at 5:47 PM, Brett Cannon wrote: .. >> Indeed. ?Last time I looked, we still had cProfile in stdlib. > > Yes, but that is because no one got around to hiding cProfile behind > profile before we released Python 3.0. I would still like to see it > (slowly) go away from being directly visible. > Another big offender is the idlelib package. 
Most of the modules there are in mixed case. From ncoghlan at gmail.com Thu Jan 20 00:11:35 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 20 Jan 2011 09:11:35 +1000 Subject: [Python-Dev] Tidying up the Meta-PEP and Other Informational PEP sections of PEP 0 In-Reply-To: <20110119134044.4caebb3f@python.org> References: <20110119134044.4caebb3f@python.org> Message-ID: On Thu, Jan 20, 2011 at 4:40 AM, Barry Warsaw wrote: > On Jan 19, 2011, at 12:16 AM, Nick Coghlan wrote: > >>For the release schedule PEPs it means "done and dusted" (similar to >>the meaning for ordinary PEPs). For the API standardisation PEPs (like >>WSGI) it instead means the spec has been locked down and any changes >>will require a new PEP. This caused a problem for the PEP 0 generator, >>since the former kind of PEP should be moved to the new historical >>section, while the latter kind should remain up top. >> >>Would anyone object if I switched all the API definition PEPs to the >>"Active" state? PEP 1 indicates that is the appropriate state for >>reference PEPs that are never truly "finished" (in the sense of code >>being implemented and committed to the source control system). > > Perhaps we need a new type for API PEPs instead? > > Type: API > Type: Consensus > > ? > > If not, then I'd rather come up with a different status to describe an API PEP > that has been locked down. Re-using Active doesn't seem right to me. Oh, I like "Consensus". I was going to suggest a new state, but I couldn't think of any names I liked. Hmm, guess I'll add "propose a revision to PEP 1" to the to-do list... Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From sandro.tosi at gmail.com Thu Jan 20 00:21:57 2011 From: sandro.tosi at gmail.com (Sandro Tosi) Date: Thu, 20 Jan 2011 00:21:57 +0100 Subject: [Python-Dev] [Python-checkins] devguide: Short doc about where to get tech help related to developing Python.
In-Reply-To: References: Message-ID: Hi, On Wed, Jan 19, 2011 at 23:19, brett.cannon wrote: > +Where to Get Help > +================= > +If you are working on Python it is very possible you will come across an issue > +where you need some assistance in solving (this happens to core developers all > +the time). You have a couple of options depending on what kind of help you need. > +If the question involves process or tool usage then please check the developer's > +guide first as is should answer your question. as it should > +Filing a Bug > +------------ > +If you come across an odd error message that seems like a bug, then file a bug > +on the `issue tracker`_. In the bug you can explain that you are not sure why > +the error is coming up or that the exact nature of the problem is. Someone will ...or what the exact...? > +Asking a Technical Question > +--------------------------- > +You have two avenues of communication out of the :ref:`myriad of options > +available `. If you are comfortable with IRC you can try asking > +in #python-dev. 
Typically there are a couple of experienced developers, ranging > +from triagers to core developers, who can ask questions about developing for who can answer questions Cheers, -- Sandro Tosi (aka morph, morpheus, matrixhasu) My website: http://matrixhasu.altervista.org/ Me at Debian: http://wiki.debian.org/SandroTosi From brett at python.org Thu Jan 20 00:31:24 2011 From: brett at python.org (Brett Cannon) Date: Wed, 19 Jan 2011 15:31:24 -0800 Subject: [Python-Dev] Moving stuff out of Misc and over to the devguide In-Reply-To: References: Message-ID: OK, here is my plan that I will implement: MOVE ---------- developers.txt maintainers.rst README.gdb README.coverity README.Emacs DELETE (seem way too old to still be relevant; tell me if I am wrong) ----------- README.OpenBSD README.AIX cheatsheet LEAVE everything else (with README properly edited and simplified to only list files with non-obvious names) On Mon, Jan 17, 2011 at 12:32, Brett Cannon wrote: > There is a bunch of stuff in Misc that probably belongs in the > devguide (under Resources) instead of in svn. Here are the files I > think can be moved (in order of how strongly I think they should be > moved): > > PURIFY.README > README.coverty > README.klocwork > README.valgrind > Porting > developers.txt > maintainers.rst > SpecialBuilds.txt > > Now before anyone yells "that is inconvenient", don't forget that all > core developers can check out and edit the devguide, and that almost > all of the files listed (SpecialBuilds.txt is the exception) are > typically edited and viewed on their own. > > Anyway, if there is a file listed here you don't think should move out > of py3k and into the devguide, speak up. 
> From a.badger at gmail.com Thu Jan 20 00:44:19 2011 From: a.badger at gmail.com (Toshio Kuratomi) Date: Wed, 19 Jan 2011 15:44:19 -0800 Subject: [Python-Dev] Import and unicode: part two In-Reply-To: References: <1295440442.432.18.camel@marge> Message-ID: <20110119234419.GO22400@unaka.lan> On Wed, Jan 19, 2011 at 04:40:24PM -0500, Terry Reedy wrote: > On 1/19/2011 4:05 PM, Simon Cross wrote: > > >I have no problem with non-ASCII module identifiers being valid > >syntax. It's a question of whether attempting to translate a non-ASCII > > If the names are the same, ie, produced with the same sequence of > keystrokes in the save-as box and importing box, then there is no > translation, at least from the user's view. > > >module name into a file name (so the file can be imported) is a good > >idea and whether these sorts of files can be safely transferred among > >diverse filesystems. > > I believe we now have the situation that a package that works on *nix > could fail on Windows, whereas I believe that patch would *improve* > portability. > I'm not so sure about this.... You may have something that works on Windows and on *NIX under certain circumstances but it seems likely to fail when moving files between them (for instance, as packages downloaded from pypi). Additionally, many Unix filesystems don't specify a filesystem encoding for filenames; they deal in legal and illegal bytes which could lead to troubles. This problem of which encoding to use is a problem that can be seen on UNIX systems even now. Try this: echo 'print("hi")' > café.py convmv -f utf-8 -t latin1 café.py python3 -c 'import café' ASCII seems very sensible to me when faced with these ambiguities. Other options I can brainstorm that could be explored: * Specify an encoding per platform and stick to that. (So, for instance, all module names on posix platforms would have to be utf-8).
Force translation between encoding when installing packages (But that doesn't help for people that are creating their modules using their own build scripts rather than distutils, copying the files using raw tar, etc.) * Change import semantics to allow specifying the encoding of the module on the filesystem (seems really icky). -Toshio -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 198 bytes Desc: not available URL: From solipsis at pitrou.net Thu Jan 20 00:49:22 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 20 Jan 2011 00:49:22 +0100 Subject: [Python-Dev] Moving stuff out of Misc and over to the devguide References: Message-ID: <20110120004922.22e499e4@pitrou.net> On Wed, 19 Jan 2011 15:31:24 -0800 Brett Cannon wrote: > OK, here is my plan that I will implement: > > MOVE > ---------- > developers.txt > maintainers.rst > README.gdb > README.coverity > README.Emacs > > DELETE (seem way too old to still be relevant; tell me if I am wrong) > ----------- > README.OpenBSD > README.AIX > cheatsheet > README.gdb is useful to more than core developers and contributors, so I think it should stay inside Misc. Regards Antoine. From tjreedy at udel.edu Thu Jan 20 01:12:31 2011 From: tjreedy at udel.edu (Terry Reedy) Date: Wed, 19 Jan 2011 19:12:31 -0500 Subject: [Python-Dev] Import and unicode: part two In-Reply-To: <20110119234419.GO22400@unaka.lan> References: <1295440442.432.18.camel@marge> <20110119234419.GO22400@unaka.lan> Message-ID: On 1/19/2011 6:44 PM, Toshio Kuratomi wrote: >> I believe we now have the situation that a package that works on *nix >> could fail on Windows, whereas I believe that patch would *improve* >> portability. >> > I'm not so sure about this.... Forget that claim if it is not true. The patch will certainly improve consistency with a box so that files that run can also be imported. 
-- Terry Jan Reedy From tjreedy at udel.edu Thu Jan 20 01:15:27 2011 From: tjreedy at udel.edu (Terry Reedy) Date: Wed, 19 Jan 2011 19:15:27 -0500 Subject: [Python-Dev] Import and unicode: part two In-Reply-To: References: <1295440442.432.18.camel@marge> Message-ID: On 1/19/2011 6:05 PM, Alexander Belopolsky wrote: > On Wed, Jan 19, 2011 at 5:47 PM, Brett Cannon wrote: > .. >>> Indeed. Last time I looked, we still had cProfile in stdlib. >> >> Yes, but that is because no one got around to hiding cProfile behind >> profile before we released Python 3.0. I would still like to see it >> (slowly) go away from being directly visible. >> > > Another big offender is the idlelib package. Most of the modules > there are in mixed case. Given that the individual modules are not documented and that the only programs importing the individual modules are other idlelib modules (true?) then a rename should be possible. On the other hand, the same facts sort of make it unnecessary ;-). -- Terry Jan Reedy From victor.stinner at haypocalc.com Thu Jan 20 01:26:01 2011 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Thu, 20 Jan 2011 01:26:01 +0100 Subject: [Python-Dev] Import and unicode: part two In-Reply-To: <20110119234419.GO22400@unaka.lan> References: <1295440442.432.18.camel@marge> <20110119234419.GO22400@unaka.lan> Message-ID: <1295483161.12324.10.camel@marge> On Wednesday 19 January 2011 at 15:44 -0800, Toshio Kuratomi wrote: > Additionally, many unix filesystem don't specify a filesystem encoding for > filenames; they deal in legal and illegal bytes which could lead to > troubles. This problem of which encoding to use is a problem that can be > seen on UNIX systems even now. If the system is not correctly configured, it is not a bug in Python, but a bug in the system config. Python relies on the locale to choose the filesystem encoding (sys.getfilesystemencoding()). Python uses this encoding to decode and encode all filenames.
> * Specify an encoding per platform and stick to that. It doesn't work: on UNIX/BSD, the user chooses their own encoding and all programs will use it. Anyway, I don't see why it is a problem to have different encodings on different systems. Each system can use its own encoding. The bug that I'm trying to solve is a Python bug, not an OS bug. > * Change import semantics to allow specifying the encoding of the module on > the filesystem (seems really icky). This is a very bad idea. I introduced PYTHONFSENCODING environment variable in Python 3.2, but then quickly removed it, because it introduced a lot of inconsistencies. Victor From foom at fuhm.net Thu Jan 20 01:11:52 2011 From: foom at fuhm.net (James Y Knight) Date: Wed, 19 Jan 2011 19:11:52 -0500 Subject: [Python-Dev] Import and unicode: part two In-Reply-To: <20110119234419.GO22400@unaka.lan> References: <1295440442.432.18.camel@marge> <20110119234419.GO22400@unaka.lan> Message-ID: On Jan 19, 2011, at 6:44 PM, Toshio Kuratomi wrote: > This problem of which encoding to use is a problem that can be > seen on UNIX systems even now. Try this: > > echo 'print("hi")' > café.py > convmv -f utf-8 -t latin1 café.py > python3 -c 'import café' > > ASCII seems very sensible to me when faced with these ambiguities. > > Other options I can brainstorm that could be explored: > > * Specify an encoding per platform and stick to that. (So, for instance, > all module names on posix platforms would have to be utf-8). Force > translation between encoding when installing packages (But that doesn't > help for people that are creating their modules using their own build > scripts rather than distutils, copying the files using raw tar, etc.) > * Change import semantics to allow specifying the encoding of the module on > the filesystem (seems really icky). None of this is unique to import -- the same exact issue occurs with open(u'café'). I don't see any reason why import café
should be thought of as more of a problem, or treated any differently. It's reasonable to recommend that people use ASCII in their module names if they want wide portability, but it should still be supported to use non-ASCII. James From a.badger at gmail.com Thu Jan 20 02:30:31 2011 From: a.badger at gmail.com (Toshio Kuratomi) Date: Wed, 19 Jan 2011 17:30:31 -0800 Subject: [Python-Dev] Import and unicode: part two In-Reply-To: References: <1295440442.432.18.camel@marge> <20110119234419.GO22400@unaka.lan> Message-ID: <20110120013031.GP22400@unaka.lan> On Wed, Jan 19, 2011 at 07:11:52PM -0500, James Y Knight wrote: > On Jan 19, 2011, at 6:44 PM, Toshio Kuratomi wrote: > > This problem of which encoding to use is a problem that can be > > seen on UNIX systems even now. Try this: > > > > echo 'print("hi")' > café.py > > convmv -f utf-8 -t latin1 café.py > > python3 -c 'import café' > > > > ASCII seems very sensible to me when faced with these ambiguities. > > > > Other options I can brainstorm that could be explored: > > > > * Specify an encoding per platform and stick to that. (So, for instance, > > all module names on posix platforms would have to be utf-8). Force > > translation between encoding when installing packages (But that doesn't > > help for people that are creating their modules using their own build > > scripts rather than distutils, copying the files using raw tar, etc.) > > * Change import semantics to allow specifying the encoding of the module on > > the filesystem (seems really icky). > > None of this is unique to import -- the same exact issue occurs with open(u'café'). I don't see any reason why import café should be thought of as more of a problem, or treated any differently. > It's unique in several ways: 1) With open, you can specify a byte string:: open(b'caf\xe9.py').read() I don't know of any way to do that with import. This is needed when the filename is not compatible with your current locale.
2) import assigns a name to the module that it imports whereas open lets the
   programmer assign the name.  So even if you can specify what to use as
   a byte string for this filename on this particular filesystem you'd still
   end up with some ugly pseudo-representation of bytes when attempting to
   access it in code::

    import caf\xe9
    caf\xe9.do_something()

-Toshio

From a.badger at gmail.com Thu Jan 20 03:07:25 2011
From: a.badger at gmail.com (Toshio Kuratomi)
Date: Wed, 19 Jan 2011 18:07:25 -0800
Subject: [Python-Dev] Import and unicode: part two
In-Reply-To: <1295483161.12324.10.camel@marge>
References: <1295440442.432.18.camel@marge>
 <20110119234419.GO22400@unaka.lan>
 <1295483161.12324.10.camel@marge>
Message-ID: <20110120020725.GQ22400@unaka.lan>

On Thu, Jan 20, 2011 at 01:26:01AM +0100, Victor Stinner wrote:
> Le mercredi 19 janvier 2011 à 15:44 -0800, Toshio Kuratomi a écrit :
> > Additionally, many unix filesystems don't specify a filesystem encoding for
> > filenames; they deal in legal and illegal bytes which could lead to
> > troubles.  This problem of which encoding to use is a problem that can be
> > seen on UNIX systems even now.
>
> If the system is not correctly configured, it is not a bug in Python,
> but a bug in the system config. Python relies on the locale to choose
> the filesystem encoding (sys.getfilesystemencoding()). Python uses this
> encoding to decode and encode all filenames.
>
Saying that multiple encodings on a single system is a misconfiguration
every time it comes up does not make it true.  There have been multiple
examples of how you can end up with multiple encodings of filenames on
a single system listed in past threads: multiple users with different
encodings for their locales, mounting remote filesystems, downloading
a file....
To the existing list I'd add getting a package from pypi --
neither tar nor zip files contain encoding information about the filenames.
Therefore if I create an sdist of a python module with non-ascii filenames
under a latin1 locale and then upload it to pypi, people downloading it under
a utf-8 locale will end up not being able to use the module.

> > * Specify an encoding per platform and stick to that.
>
> It doesn't work: on UNIX/BSD, the user chooses its own encoding and all
> programs will use it.
>
The proposal is that you ignore that when talking about loading and creating
python modules.  (I mentioned distutils because my thought was that distutils
could grow the ability to translate from the system locale to a chosen
neutral encoding when running any of the setup.py dist commands, but that
doesn't address the issue when testing a module that you've just written, so
perhaps that's not necessary.)  Python modules would have a set of defined
filesystem encodings per system.  This prevents getting a mixture of
encodings of modules and having things work in one location but fail when
used somewhere else.  Instead, you get an upfront failure until you correct
the encoding.

> Anyway, I don't see why it is a problem to have different encodings on
> different systems. Each system can use its own encoding. The bug that
> I'm trying to solve is a Python bug, not an OS bug.
>
There is no OS bug here.  There is perhaps an OS design flaw but it's not
a flaw that will be going away soon (in part, because the present OS
designers do not see it as an OS flaw... to them it's a bug in code that
attempts to build a simpler interface on top of it.)

> > * Change import semantics to allow specifying the encoding of the module on
> >   the filesystem (seems really icky).
>
> This is a very bad idea. I introduced PYTHONFSENCODING environment
> variable in Python 3.2, but then quickly removed it, because it
> introduced a lot of inconsistencies.
>
Thanks for getting rid of that, PYTHONFSENCODING is a bad idea because it
doesn't solve the underlying issues.  However, when I say specifying the
encoding of the module on the filesystem, I don't mean something global like
PYTHONFSENCODING -- I mean something at the python code level::

    import café encoded_as('latin1')

After thinking about this one, though, I don't think it will work either.
This takes care of importing modules where the fs encoding of the module is
known but it doesn't where the fs encoding may be translated between
platforms.  I believe that this could arise when untarring a module on
windows using winzip or similar that gives you the option of translating
from utf-8 bytes into bytes that have meaning as characters on that
platform, for instance.

Do you have a solution to the problem?  I haven't looked at your patch so
perhaps you have an ingenious method of translating from the unicode
representation of the module in the import statement to the bytes in
arbitrary encodings on the filesystem that I haven't thought of.  If you
don't, however, then really - ASCII-only seems like the sanest of the three
solutions I can think of.

-Toshio

From victor.stinner at haypocalc.com Thu Jan 20 03:51:05 2011
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Thu, 20 Jan 2011 03:51:05 +0100
Subject: [Python-Dev] Import and unicode: part two
In-Reply-To: <20110120020725.GQ22400@unaka.lan>
References: <1295440442.432.18.camel@marge>
 <20110119234419.GO22400@unaka.lan>
 <1295483161.12324.10.camel@marge>
 <20110120020725.GQ22400@unaka.lan>
Message-ID: <1295491865.22752.22.camel@marge>

Le mercredi 19 janvier 2011 à 18:07 -0800, Toshio Kuratomi a écrit :
> Saying that multiple encodings on a single system is a misconfiguration
> every time it comes up does not make it true.
Yes, each filesystem can have its own encoding. For example, this is
supported by Linux. Python doesn't support such a configuration, but this
limitation is wider than the import machinery. If you consider it
important enough, please open an issue.

> To the existing list I'd add getting a package from pypi --
> neither tar nor zip files contain encoding information about the filenames.

ZIP contains a flag to indicate the encoding: cp437 or UTF-8. TAR has an
extension called "PAX" which stores filenames as UTF-8. But yes, most
tarballs store filenames as raw byte strings.

Anyway, if you would like to share your code on PyPI, you should not use
non-ASCII module names (or any other non-ASCII name/identifier :-)).

Python 3 supports non-ASCII identifiers (PEP 3131), but the developer is
responsible for deciding whether to use them or not, depending on his/her
audience. For a lesson at school, it is nice to write examples in the
mother language, instead of using "raw" english with ASCII identifiers
and filenames. In a school, you can use the same configuration
(encoding) on all computers.

> > > * Specify an encoding per platform and stick to that.
> >
> > It doesn't work: on UNIX/BSD, the user chooses its own encoding and all
> > programs will use it.
> >
> (...) This prevents getting a mixture of encodings of modules (...)

If you have an issue with encodings, we have to fix it when you create
a module (on disk), not when you load a module (it is too late).

> (...) I mean something at the python code level::
>
>     import café encoded_as('latin1')

Import a module using its byte name? You mean that the café filename was not
encoded to the Python filesystem encoding, but to another (wrong) encoding,
at the creation of the module. As written before, you should fix your
filename, instead of using an (ugly) workaround in Python.
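
A minimal sketch of the mismatch described above, assuming Python 3 and using only built-in str/bytes methods. The name café is the thread's running example, and the two codecs stand in for two hypothetical systems; nothing here touches a real filesystem:

```python
# The module name as the author typed it.
name = "café"

# Bytes a system with a UTF-8 filesystem encoding would write to disk.
utf8_bytes = name.encode("utf-8")            # b'caf\xc3\xa9'

# What a Latin-1 system sees when it decodes those same bytes: mojibake.
seen_as_latin1 = utf8_bytes.decode("latin-1")

# Bytes a Latin-1 system would have written for the same name.
latin1_bytes = name.encode("latin-1")        # b'caf\xe9'

# Those bytes are not valid UTF-8, so a strict UTF-8 decode fails.
try:
    latin1_bytes.decode("utf-8")
    decodable_as_utf8 = True
except UnicodeDecodeError:
    decodable_as_utf8 = False
```

The same four characters therefore map to two different byte strings, which is why a name created under one encoding may not be found under another.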
> I haven't looked at your patch so
> perhaps you have an ingenious method of translating from the unicode
> representation of the module in the import statement to the bytes in
> arbitrary encodings on the filesystem that I haven't thought of.

On Windows, my patch tries to avoid any conversion: it uses unicode
everywhere.

On other OSes, it uses the Python filesystem encoding to encode a module
name (as it is done for any other operation on the filesystem with a
unicode filename).

--

Python 3 supports bytes filenames to be able to read/copy/delete
undecodable filenames, filenames stored in an encoding different than the
system encoding, broken filenames. It is also possible to access these
files using PEP 383 (with surrogate characters). This is useful to use
Python on an old system.

> If you don't, however, then really - ASCII-only seems like the sanest
> of the three solutions I can think of.

But a (Python 3) module is not supposed to have a broken filename. If it
is the case, you had better fix its name, instead of trying to fix
the problem later (in Python).

With UTF-8 filesystem encoding (eg. on Mac OS X, and most Linux setups),
it is already possible to use non-ASCII module names.
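
As an illustration of the PEP 383 mechanism mentioned above, here is a small sketch assuming Python 3. It uses the `surrogateescape` error handler directly with an explicit UTF-8 codec rather than the real filesystem encoding, so the byte string is a made-up example and not a file on disk:

```python
import sys

# A Latin-1 encoded filename, which is not valid UTF-8.
raw = b"caf\xe9.py"

# PEP 383: each undecodable byte becomes a lone surrogate (U+DC80..U+DCFF).
as_str = raw.decode("utf-8", "surrogateescape")

# Encoding with the same handler restores the original bytes exactly.
round_trip = as_str.encode("utf-8", "surrogateescape")

# The encoding Python actually uses for filenames on this machine.
fs_encoding = sys.getfilesystemencoding()
```

The round trip is lossless, which is what makes such files readable, copyable and deletable even when their names cannot be decoded properly.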
Victor

From brett at python.org Thu Jan 20 04:40:49 2011
From: brett at python.org (Brett Cannon)
Date: Wed, 19 Jan 2011 19:40:49 -0800
Subject: [Python-Dev] Moving stuff out of Misc and over to the devguide
In-Reply-To: <20110120004922.22e499e4@pitrou.net>
References: <20110120004922.22e499e4@pitrou.net>
Message-ID:

On Wed, Jan 19, 2011 at 15:49, Antoine Pitrou wrote:
> On Wed, 19 Jan 2011 15:31:24 -0800
> Brett Cannon wrote:
>> OK, here is my plan that I will implement:
>>
>> MOVE
>> ----------
>> developers.txt
>> maintainers.rst
>> README.gdb
>> README.coverity
>> README.Emacs
>>
>> DELETE (seem way too old to still be relevant; tell me if I am wrong)
>> -----------
>> README.OpenBSD
>> README.AIX
>> cheatsheet
>>
>
> README.gdb is useful to more than core developers and contributors, so I
> think it should stay inside Misc.

That's true of README.Emacs as well. But I'm willing to bet more
people will find out about the gdb and Emacs details if we put them
online through search engines, blogs, and reading the devguide than
anyone ever did by digging through the Misc directory.

From a.badger at gmail.com Thu Jan 20 05:39:01 2011
From: a.badger at gmail.com (Toshio Kuratomi)
Date: Wed, 19 Jan 2011 20:39:01 -0800
Subject: [Python-Dev] Import and unicode: part two
In-Reply-To: <1295491865.22752.22.camel@marge>
References: <1295440442.432.18.camel@marge>
 <20110119234419.GO22400@unaka.lan>
 <1295483161.12324.10.camel@marge>
 <20110120020725.GQ22400@unaka.lan>
 <1295491865.22752.22.camel@marge>
Message-ID: <20110120043901.GR22400@unaka.lan>

On Thu, Jan 20, 2011 at 03:51:05AM +0100, Victor Stinner wrote:
> For a lesson at school, it is nice to write examples in the
> mother language, instead of using "raw" english with ASCII identifiers
> and filenames.

Then use this::

    import cafe as café

When you do things this way you do not have to translate between unknown
encodings into unicode.  Everything is within python source where you have
a defined encoding.
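
The `import ... as` approach above can be tried out with a short sketch. It assumes Python 3 (PEP 3131 identifiers); the temporary directory and the contents of `cafe.py` are hypothetical, created only for this demonstration:

```python
import importlib
import os
import sys
import tempfile

# Hypothetical module: an ASCII-named file "cafe.py" written for this demo.
tmpdir = tempfile.mkdtemp()
with open(os.path.join(tmpdir, "cafe.py"), "w", encoding="ascii") as f:
    f.write("def greet():\n    return 'hi'\n")

# Make the directory importable and drop any stale finder caches.
sys.path.insert(0, tmpdir)
importlib.invalidate_caches()

# ASCII name on disk, non-ASCII name inside the source file only.
import cafe as café

result = café.greet()
module_name = café.__name__
```

The filesystem only ever sees the ASCII name; the non-ASCII alias lives in the source file, whose encoding is declared (UTF-8 by default in Python 3).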
Teaching students to write non-portable code (relying on filesystem encoding
where your solution is, don't upload to pypi anything that has non-ascii
filenames) seems like the exact opposite of how you'd want to shape a young
student's understanding of good programming practices.

> In a school, you can use the same configuration
> (encoding) on all computers.
>
In a school computer lab perhaps.  But not on all the students' and
professors' machines.  How many professors will be cursing python when they
discover that the example code that they wrote on their Linux workstation
doesn't work when the students try to use it in their windows computer lab?
How many students will be upset when the code they turn in runs on their
professor's test machine if the lab computers were booted into the Linux
partition but not if they were booted into Windows?

> > > > * Specify an encoding per platform and stick to that.
> > >
> > > It doesn't work: on UNIX/BSD, the user chooses its own encoding and all
> > > programs will use it.
> > >
> > (...) This prevents getting a mixture of encodings of modules (...)
>
> If you have an issue with encodings, we have to fix it when you create
> a module (on disk), not when you load a module (it is too late).
>
It's not too late to throw a clear error of what's wrong.

> > I haven't looked at your patch so
> > perhaps you have an ingenious method of translating from the unicode
> > representation of the module in the import statement to the bytes in
> > arbitrary encodings on the filesystem that I haven't thought of.
>
> On Windows, my patch tries to avoid any conversion: it uses unicode
> everywhere.
>
> On other OSes, it uses the Python filesystem encoding to encode a module
> name (as it is done for any other operation on the filesystem with a
> unicode filename).
>
The other interfaces are somewhat of a red herring here.  As I wrote in
another email, importing modules has ramifications that open(), for
instance, does not.
Additionally, those other filesystem operations have
been growing the ability to take byte values and encoding parameters because
unicode translation via a single filesystem encoding is a good default but
not a complete solution.  I think that this problem demands a complete
solution, however, and it seems to me that limiting the scope of the problem
is the most pleasant method to accomplish this.  Your solution creates
modules which aren't portable.  One of my proposals creates python code
which isn't portable.  The other one suffers some of the same disadvantages
as your solution in portability but allows for tools that could
automatically correct modules.

> --
>
> Python 3 supports bytes filenames to be able to read/copy/delete
> undecodable filenames, filenames stored in an encoding different than the
> system encoding, broken filenames. It is also possible to access these
> files using PEP 383 (with surrogate characters). This is useful to use
> Python on an old system.
>
> > If you don't, however, then really - ASCII-only seems like the sanest
> > of the three solutions I can think of.
>
> But a (Python 3) module is not supposed to have a broken filename. If it
> is the case, you had better fix its name, instead of trying to fix
> the problem later (in Python).
>
We agree that there should not be broken module names.  However it seems we
very hotly disagree about the definition of that.  You think that if
a module is named appropriately on one system but is not portable to another
system, that's fine.  I think that portability between systems is very
important and sacrificing that so that someone can locally use a module with
non-ASCII characters doesn't have a justifiable reward.

> With UTF-8 filesystem encoding (eg. on Mac OS X, and most Linux setups),
> it is already possible to use non-ASCII module names.
>
Tangent: This is not true about Linux.
UTF-8 is a matter of the
interpretation of the filesystem bytes that the user specifies by setting
their system locale.  Setting system locale to ASCII for use in system-wide
scripts, is quite common as is changing locale settings in other parts of
the world (as I can tell you from the bug reports colleagues CC me on to fix
for the problems with unicode support in their python2 programs).  Allowing
module names incompatible with ascii without specifying an encoding will
just lead to bug reports down the line.

Relatively few programmers understand the difference between the python
unicode abstraction and the byte representations possible for those strings.
Allowing non-ascii characters in module filenames without specifying an
encoding sets a trap for these programmers to fall into when they move
beyond their studies to programming for customers, pypi downloaders, etc who
don't have the same environment as themselves.

-Toshio

From alexander.belopolsky at gmail.com Thu Jan 20 06:02:04 2011
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Thu, 20 Jan 2011 00:02:04 -0500
Subject: [Python-Dev] Import and unicode: part two
In-Reply-To: <20110120020725.GQ22400@unaka.lan>
References: <1295440442.432.18.camel@marge>
 <20110119234419.GO22400@unaka.lan>
 <1295483161.12324.10.camel@marge>
 <20110120020725.GQ22400@unaka.lan>
Message-ID:

On Wed, Jan 19, 2011 at 9:07 PM, Toshio Kuratomi wrote:
..
> Do you have a solution to the problem?  I haven't looked at your patch so
> perhaps you have an ingenious method of translating from the unicode
> representation of the module in the import statement to the bytes in
> arbitrary encodings on the filesystem that I haven't thought of.
If I understand what Victor's patch does correctly, it allows Python
on Windows to bypass translation from Unicode to bytes by using
Windows "wide character" APIs.

From v+python at g.nevcal.com Thu Jan 20 06:02:17 2011
From: v+python at g.nevcal.com (Glenn Linderman)
Date: Wed, 19 Jan 2011 21:02:17 -0800
Subject: [Python-Dev] Import and unicode: part two
In-Reply-To: <20110120043901.GR22400@unaka.lan>
References: <1295440442.432.18.camel@marge>
 <20110119234419.GO22400@unaka.lan>
 <1295483161.12324.10.camel@marge>
 <20110120020725.GQ22400@unaka.lan>
 <1295491865.22752.22.camel@marge>
 <20110120043901.GR22400@unaka.lan>
Message-ID: <4D37C1D9.7080801@g.nevcal.com>

On 1/19/2011 8:39 PM, Toshio Kuratomi wrote:
> use this::
>
>     import cafe as café
>
> When you do things this way you do not have to translate between unknown
> encodings into unicode.  Everything is within python source where you have
> a defined encoding.

This is a great way of converting non-portable module names, if the
module ever leaves the bounds of its computer, and runs into problems there.

It may be that the best practices for writing platform portable modules
should include

* ASCII module filenames
* Code that can handle 16 or 32 bit Unicode
* and likely some other things.

But for local code, having to think up an ASCII name for a module rather
than use the obvious native-language name, is just brain-burden when
creating the code.

Your demonstration of such an easy solution to the concerns you raise
convinces me more than ever that it is acceptable to allow non-ASCII
module names.  For those programmers in a single locale environment,
it'll just work.  And for those not in a single locale environment,
there is your above simple solution to achieve portability without
changing large numbers of lines of code.

Glenn
From glyph at twistedmatrix.com Thu Jan 20 06:11:33 2011
From: glyph at twistedmatrix.com (Glyph Lefkowitz)
Date: Thu, 20 Jan 2011 00:11:33 -0500
Subject: [Python-Dev] Import and unicode: part two
In-Reply-To: <4D37C1D9.7080801@g.nevcal.com>
References: <1295440442.432.18.camel@marge>
 <20110119234419.GO22400@unaka.lan>
 <1295483161.12324.10.camel@marge>
 <20110120020725.GQ22400@unaka.lan>
 <1295491865.22752.22.camel@marge>
 <20110120043901.GR22400@unaka.lan>
 <4D37C1D9.7080801@g.nevcal.com>
Message-ID: <5D5CC0A9-BD43-4496-872C-B70C5FC77490@twistedmatrix.com>

On Jan 20, 2011, at 12:02 AM, Glenn Linderman wrote:

> But for local code, having to think up an ASCII name for a module rather
> than use the obvious native-language name, is just brain-burden when
> creating the code.

Is it really?  You already had to type 'import', presumably if you can
think in Python you can think in ASCII.

(After my experiences with namespace crowding in Twisted, I'm inclined to
suggest something more like "import m_07117FE4A1EBD544965DC19573183DA2 as
café" - then I never need to worry about "café2" looking ugly or "cafe"
being incompatible :).)

From alexander.belopolsky at gmail.com Thu Jan 20 06:15:03 2011
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Thu, 20 Jan 2011 00:15:03 -0500
Subject: [Python-Dev] Import and unicode: part two
In-Reply-To: <20110120043901.GR22400@unaka.lan>
References: <1295440442.432.18.camel@marge>
 <20110119234419.GO22400@unaka.lan>
 <1295483161.12324.10.camel@marge>
 <20110120020725.GQ22400@unaka.lan>
 <1295491865.22752.22.camel@marge>
 <20110120043901.GR22400@unaka.lan>
Message-ID:

On Wed, Jan 19, 2011 at 11:39 PM, Toshio Kuratomi wrote:
..
> Teaching students to write non-portable code (relying on filesystem encoding
> where your solution is, don't upload to pypi anything that has non-ascii
> filenames) seems like the exact opposite of how you'd want to shape a young
> student's understanding of good programming practices.
>

Let's not confuse language definition with the quality of
implementation.  It would be a perfectly valid Python implementation
that would run on a system that does not even have a traditional
filesystem and "import foo" looks up foo module code in an in-memory
database.  Should Python be redefined so that module names are case
insensitive simply because case insensitive filesystems are still
popular?  I don't think so.

From v+python at g.nevcal.com Thu Jan 20 06:19:08 2011
From: v+python at g.nevcal.com (Glenn Linderman)
Date: Wed, 19 Jan 2011 21:19:08 -0800
Subject: [Python-Dev] Import and unicode: part two
In-Reply-To: <5D5CC0A9-BD43-4496-872C-B70C5FC77490@twistedmatrix.com>
References: <1295440442.432.18.camel@marge>
 <20110119234419.GO22400@unaka.lan>
 <1295483161.12324.10.camel@marge>
 <20110120020725.GQ22400@unaka.lan>
 <1295491865.22752.22.camel@marge>
 <20110120043901.GR22400@unaka.lan>
 <4D37C1D9.7080801@g.nevcal.com>
 <5D5CC0A9-BD43-4496-872C-B70C5FC77490@twistedmatrix.com>
Message-ID: <4D37C5CC.3040200@g.nevcal.com>

On 1/19/2011 9:11 PM, Glyph Lefkowitz wrote:
>
> On Jan 20, 2011, at 12:02 AM, Glenn Linderman wrote:
>
>> But for local code, having to think up an ASCII name for a module
>> rather than use the obvious native-language name, is just
>> brain-burden when creating the code.
>
> Is it really?  You already had to type 'import', presumably if you can
> think in Python you can think in ASCII.

There is a difference between memorizing and typing keywords, and
inventing new names in non-native scripts.
It is hard to even invent
all the names in one's native language; if restricted to inventing them,
even some of them, in some non-native script such as ASCII, it is just
brain-burden indeed.

>
> (After my experiences with namespace crowding in Twisted, I'm inclined
> to suggest something more like "import
> m_07117FE4A1EBD544965DC19573183DA2 as café" - then I never need to
> worry about "café2" looking ugly or "cafe" being incompatible :).)
>

Now if the stuff after m_ was the hex UTF-8 of "café", that could get
interesting :)

But now you are talking about automating the creation of ASCII file
names from the actual non-ASCII names of the modules, or something.
Sadly, the module is not required to contain its name, so if it differs
from the filename, some global view or non-Python annotation would be
required to create/maintain the mapping.  [This paragraph is only
semi-serious, like yours.]

From glyph at twistedmatrix.com Thu Jan 20 06:21:10 2011
From: glyph at twistedmatrix.com (Glyph Lefkowitz)
Date: Thu, 20 Jan 2011 00:21:10 -0500
Subject: [Python-Dev] Import and unicode: part two
In-Reply-To: <4D37C5CC.3040200@g.nevcal.com>
References: <1295440442.432.18.camel@marge>
 <20110119234419.GO22400@unaka.lan>
 <1295483161.12324.10.camel@marge>
 <20110120020725.GQ22400@unaka.lan>
 <1295491865.22752.22.camel@marge>
 <20110120043901.GR22400@unaka.lan>
 <4D37C1D9.7080801@g.nevcal.com>
 <5D5CC0A9-BD43-4496-872C-B70C5FC77490@twistedmatrix.com>
 <4D37C5CC.3040200@g.nevcal.com>
Message-ID: <6E9023A7-7701-4AE4-8B86-696C87A569BA@twistedmatrix.com>

On Jan 20, 2011, at 12:19 AM, Glenn Linderman wrote:

> Now if the stuff after m_ was the hex UTF-8 of "café", that could get
> interesting :)

(As it happens, it's the hex digest of the MD5 of the UTF-8 of café... ;-))
From alexander.belopolsky at gmail.com Thu Jan 20 06:26:05 2011
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Thu, 20 Jan 2011 00:26:05 -0500
Subject: [Python-Dev] Import and unicode: part two
In-Reply-To: <5D5CC0A9-BD43-4496-872C-B70C5FC77490@twistedmatrix.com>
References: <1295440442.432.18.camel@marge>
 <20110119234419.GO22400@unaka.lan>
 <1295483161.12324.10.camel@marge>
 <20110120020725.GQ22400@unaka.lan>
 <1295491865.22752.22.camel@marge>
 <20110120043901.GR22400@unaka.lan>
 <4D37C1D9.7080801@g.nevcal.com>
 <5D5CC0A9-BD43-4496-872C-B70C5FC77490@twistedmatrix.com>
Message-ID:

On Thu, Jan 20, 2011 at 12:11 AM, Glyph Lefkowitz wrote:
..
>> But for local code, having to think up an ASCII name for a module rather
>> than use the obvious native-language name, is just brain-burden when
>> creating the code.
>
> Is it really?  You already had to type 'import', presumably if you can think
> in Python you can think in ASCII.

Yes, it is a burden.  For example, the Russian word "щи" can be
transliterated into ASCII as "schi", "shchi", "stchi", or even "wji".
There are many incompatible standards and none of them is well-known or
"natural".  Reading transliterated Cyrillic text is not hard, but
guessing the correct spelling is nearly impossible.  Good programming
style guides recommend avoiding arbitrary contractions in variable
names for the same reason.
From a.badger at gmail.com Thu Jan 20 08:20:25 2011
From: a.badger at gmail.com (Toshio Kuratomi)
Date: Wed, 19 Jan 2011 23:20:25 -0800
Subject: [Python-Dev] Import and unicode: part two
In-Reply-To: <4D37C1D9.7080801@g.nevcal.com>
References: <1295440442.432.18.camel@marge>
 <20110119234419.GO22400@unaka.lan>
 <1295483161.12324.10.camel@marge>
 <20110120020725.GQ22400@unaka.lan>
 <1295491865.22752.22.camel@marge>
 <20110120043901.GR22400@unaka.lan>
 <4D37C1D9.7080801@g.nevcal.com>
Message-ID: <20110120072025.GS22400@unaka.lan>

On Wed, Jan 19, 2011 at 09:02:17PM -0800, Glenn Linderman wrote:
> On 1/19/2011 8:39 PM, Toshio Kuratomi wrote:
>
>     use this::
>
>         import cafe as café
>
>     When you do things this way you do not have to translate between unknown
>     encodings into unicode.  Everything is within python source where you have
>     a defined encoding.
>
>
> This is a great way of converting non-portable module names, if the module ever
> leaves the bounds of its computer, and runs into problems there.
>
You're missing a piece here.  If you mandate ascii you can convert to
a unicode name using "import as" because python knows that it has ascii text
from the filesystem when it converts it to an abstract unicode string that
you've specified in the program text.  You cannot go the other way because
python lacks the information (the encoding of the filename on the
filesystem) to do the transformation.

> Your demonstration of such an easy solution to the concerns you raise convinces
> me more than ever that it is acceptable to allow non-ASCII module names.  For
> those programmers in a single locale environment, it'll just work.  And for
> those not in a single locale environment, there is your above simple solution
> to achieve portability without changing large numbers of lines of code.
>
Does my demonstration that you can't do that mean that it's no longer
acceptable?
:-)

/me guesses that the relative merits of being forced to write portable code
vs convenience of writing a module name in your native script still have
a different balance in your mind than in mine, thus the smiley :-)

-Toshio

From v+python at g.nevcal.com Thu Jan 20 08:36:42 2011
From: v+python at g.nevcal.com (Glenn Linderman)
Date: Wed, 19 Jan 2011 23:36:42 -0800
Subject: [Python-Dev] Import and unicode: part two
In-Reply-To: <20110120072025.GS22400@unaka.lan>
References: <1295440442.432.18.camel@marge>
 <20110119234419.GO22400@unaka.lan>
 <1295483161.12324.10.camel@marge>
 <20110120020725.GQ22400@unaka.lan>
 <1295491865.22752.22.camel@marge>
 <20110120043901.GR22400@unaka.lan>
 <4D37C1D9.7080801@g.nevcal.com>
 <20110120072025.GS22400@unaka.lan>
Message-ID: <4D37E60A.5000906@g.nevcal.com>

On 1/19/2011 11:20 PM, Toshio Kuratomi wrote:
> On Wed, Jan 19, 2011 at 09:02:17PM -0800, Glenn Linderman wrote:
>> On 1/19/2011 8:39 PM, Toshio Kuratomi wrote:
>>
>>     use this::
>>
>>         import cafe as café
>>
>>     When you do things this way you do not have to translate between unknown
>>     encodings into unicode.  Everything is within python source where you have
>>     a defined encoding.
>>
>>
>> This is a great way of converting non-portable module names, if the module ever
>> leaves the bounds of its computer, and runs into problems there.
>>
> You're missing a piece here.  If you mandate ascii you can convert to
> a unicode name using "import as" because python knows that it has ascii text
> from the filesystem when it converts it to an abstract unicode string that
> you've specified in the program text.  You cannot go the other way because
> python lacks the information (the encoding of the filename on the
> filesystem) to do the transformation.
>
>> Your demonstration of such an easy solution to the concerns you raise convinces
>> me more than ever that it is acceptable to allow non-ASCII module names.  For
>> those programmers in a single locale environment, it'll just work.  And for
>> those not in a single locale environment, there is your above simple solution
>> to achieve portability without changing large numbers of lines of code.
>>
> Does my demonstration that you can't do that mean that it's no longer
> acceptable? :-)
>
> /me guesses that the relative merits of being forced to write portable code
> vs convenience of writing a module name in your native script still have
> a different balance in your mind than in mine, thus the smiley :-)
>
> -Toshio

Sadly, you didn't demonstrate it, you seem to have misunderstood my
statement, which was probably not all that clear, somehow.  Let me try
again.

User codes module café.py, tests, debugs, completes, is happy.

User moves code to a different computer, different locale, no é
character, module can't be found, is sad.

User renames file to cafefromuser.py, changes the import statement from

    import café

to

    import cafefromuser as café

module now imports successfully, no other code changes needed.

User is happy again, thanks Toshio for great solution to file system
encoding problem.

From sandro.tosi at gmail.com Thu Jan 20 10:22:53 2011
From: sandro.tosi at gmail.com (Sandro Tosi)
Date: Thu, 20 Jan 2011 10:22:53 +0100
Subject: [Python-Dev] [Python-checkins] devguide: Move Misc/maintainers.rst
 here and rename to experts.rst.
In-Reply-To:
References:
Message-ID:

Hi,

On Thu, Jan 20, 2011 at 04:56, brett.cannon wrote:
> +Unless a name is followed by a '*', you should never assign an issue to
> +that person, only make them nosy.  Names followed by a '*' may be assigned
> +issues involving the module or topic for which the name has a '*'.

Isn't the last sentence a bit weird?
I'm not a native speaker, but wouldn't
"Names followed by a '*' may have issues assigned for the modules...." be
a bit better?

OK, it's fairly minor; you can also ignore it :)

Cheers,
--
Sandro Tosi (aka morph, morpheus, matrixhasu)
My website: http://matrixhasu.altervista.org/
Me at Debian: http://wiki.debian.org/SandroTosi

From foom at fuhm.net Thu Jan 20 11:15:59 2011
From: foom at fuhm.net (James Y Knight)
Date: Thu, 20 Jan 2011 05:15:59 -0500
Subject: [Python-Dev] Import and unicode: part two
In-Reply-To: <20110120043901.GR22400@unaka.lan>
References: <1295440442.432.18.camel@marge>
 <20110119234419.GO22400@unaka.lan>
 <1295483161.12324.10.camel@marge>
 <20110120020725.GQ22400@unaka.lan>
 <1295491865.22752.22.camel@marge>
 <20110120043901.GR22400@unaka.lan>
Message-ID:

On Jan 19, 2011, at 11:39 PM, Toshio Kuratomi wrote:
> Tangent: This is not true about Linux.  UTF-8 is a matter of the
> interpretation of the filesystem bytes that the user specifies by setting
> their system locale.  Setting system locale to ASCII for use in system-wide
> scripts, is quite common as is changing locale settings in other parts of
> the world (as I can tell you from the bug reports colleagues CC me on to fix
> for the problems with unicode support in their python2 programs).

Fortunately, there's been some (slow) movement towards adding a "C.UTF-8"
locale and using that by default where "C" (ASCII) is currently used. So
that may be less of a problem in a few years' time.

James

From victor.stinner at haypocalc.com Thu Jan 20 12:51:29 2011
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Thu, 20 Jan 2011 12:51:29 +0100
Subject: [Python-Dev] Import and unicode: part two
In-Reply-To: <20110120043901.GR22400@unaka.lan>
References: <1295440442.432.18.camel@marge>
 <20110119234419.GO22400@unaka.lan>
 <1295483161.12324.10.camel@marge>
 <20110120020725.GQ22400@unaka.lan>
 <1295491865.22752.22.camel@marge>
 <20110120043901.GR22400@unaka.lan>
Message-ID: <1295524289.2016.116.camel@marge>

Le mercredi 19 janvier 2011 à
20:39 -0800, Toshio Kuratomi a ?crit : > Teaching students to write non-portable code (relying on filesystem encoding > where your solution is, don't upload to pypi anything that has non-ascii > filenames) seems like the exact opposite of how you'd want to shape a young > student's understanding of good programming practices. That was already discuted before: see PEP 3131. http://www.python.org/dev/peps/pep-3131/#common-objections If the teacher choose to use non-ASCII, (s)he is responsible to explain the consequences to his/her students :-) > > In a school, you can use the same configuration > > (encoding) on all computers. > > > In a school computer lab perhaps. But not on all the students' and > professors' machines. How many professors will be cursing python when they > discover that the example code that they wrote on their Linux workstation > doesn't work when the students try to use it in their windows computer lab? Because some students use a stupid or misconfigured OS, Python should only accept ASCII names? So, why do Python 3 support non-ASCII filenames: it is very well known that non-ASCII filenames is the root in many troubles! Should we simply drop unicode support for all filenames? And maybe restrict bytes filenames to bytes in [0; 127]? Or better, restrict to [32; 126] (U+007f causes some troubles in some terminals). I think that in 2011, non-ASCII filenames are well supported on all (modern) operating systems. Issues with non-ASCII filenames are OS specific and should be fixed by the user (the admin of the computer). > Additionally, those other filesystem operations have > been growing the ability to take byte values and encoding parameters because > unicode translation via a single filesystem encoding is a good default but > not a complete solution. If you are unable to configure correctly your system to decode/encode correctly filenames, you should just avoid non-ASCII characters in the module names. 
You only give theorical arguments: did you at least try to use non-ASCII module names on your system with Python 3.2? I suppose that it will just work and you will never notice that the unicode module name (on "import café") in encoded to bytes. It fails on on OSes using filesystem encodings other than UTF-8 (eg. Windows)... because of a Python bug, and I just asked if I have to fix this bug (or if we should deny non-ASCII names). If the bug is fixed, it will works everywhere. > Your solution creates modules which aren't portable More and more operating systems use a filesystem encoding able to encode any Unicode characters. ASCII-only always give you the best portability, but I think that today you can start to play with (at least) ISO-8859-1 characters (café should work on all operating systems). If you don't Unicode issues (I personally love them!), just use ASCII everywhere. > One of my proposals creates python code which isn't portable. The other one > suffers some of the same disadvantages as your solution in portability but > allows for tools that could automatically correct modules. __import__('café'.encode('UTF-8')) or __import__('café'.encode('ISO-8859-1')) is less portable than __import__('café'). > You think that if a module is named appropriately on one system but is not portable to another > system, that's fine. No, I am not saying that. I say that if your name is broken while you transfer your project from a system to another (eg. decompressing an archive creates filenames with mojibake in the filenames), you should fix your transfer procedure (eg. use another archive format, use a script to fix filenames, or anything else), but don't try to handle invalid filenames. > Setting system locale to ASCII for use in system-wide scripts This is stupid :-) Yes, on such system you, cannot open *any* non-ASCII file with Python 3 (except if you work, as Python 2, on bytes filenames).
Python cannot do anything to improve Unicode support on such system: only the administrator have to something to do for that. I know that you can give me many examples of systems where Unicode doesn't work because the system is not correctly configured. But my opinion is that we should support non-ASCII names because there are somewhere "some" systems where Unicode is fully functionnal :-) Victor From victor.stinner at haypocalc.com Thu Jan 20 13:06:15 2011 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Thu, 20 Jan 2011 13:06:15 +0100 Subject: [Python-Dev] [Python-checkins] r88119 - in python/branches/py3k/Doc: library/inspect.rst whatsnew/3.2.rst In-Reply-To: <20110120040319.9192BEEA55@mail.python.org> References: <20110120040319.9192BEEA55@mail.python.org> Message-ID: <1295525175.2016.120.camel@marge> On Thursday 20 January 2011 at 05:03 +0100, raymond.hettinger wrote: > * Python's import mechanism can now load modules installed in directories with > - non-ASCII characters in the path name. > + non-ASCII characters in the path name: > + > + >>> import møøse.bites > > (Required extensive work by Victor Stinner in :issue:`9425`.) Ooops, it is not the good example :-) In this example, møøse is not a module path, but a module name... And Python 3.2 doesn't support correctly non-ASCII module names on all operating systems yet (see "[Python-Dev] Import and unicode: part two" thread :-)). #9425 only concerns sys.path (eg. sys.path.append('/home/møøse/modules/')).
Victor From hodgestar+pythondev at gmail.com Thu Jan 20 13:08:20 2011 From: hodgestar+pythondev at gmail.com (Simon Cross) Date: Thu, 20 Jan 2011 14:08:20 +0200 Subject: [Python-Dev] Import and unicode: part two In-Reply-To: References: <1295440442.432.18.camel@marge> Message-ID: On Wed, Jan 19, 2011 at 5:01 PM, Simon Cross wrote: > On Wed, Jan 19, 2011 at 2:34 PM, Victor Stinner > wrote: >> ?(a) Python 3 doesn't support non-ASCII module names > > -0: I'm vaguely against this being supported because I'd rather not > have to deal with what happens when the guess regarding the filesystem > encoding is wrong. On the other hand, a general encouragement to stick > to ASCII module names is probably functionally equivalent without > imposing a hard restriction. I'm changing my vote on this to a +1 for two reasons: * Initially I thought this wasn't supported by Python at all but I see that currently it is supported but that support is broken (or at least limited to UTF-8 filesystem encodings). Since support is there, might as well make it better (especially if it tidies up the code base at the same time). * I still don't think it's a good idea to give modules non-ASCII names but the "consenting adults" approach suggests we should let people shoot themselves in the foot if they believe they have good reason to do so. Schiavo Simon From victor.stinner at haypocalc.com Thu Jan 20 13:41:36 2011 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Thu, 20 Jan 2011 13:41:36 +0100 Subject: [Python-Dev] [Python-checkins] r88121 - python/branches/py3k/Doc/whatsnew/3.2.rst In-Reply-To: <20110120090440.16F11EE98E@mail.python.org> References: <20110120090440.16F11EE98E@mail.python.org> Message-ID: <1295527296.9228.2.camel@marge> Le jeudi 20 janvier 2011 ? 10:04 +0100, raymond.hettinger a ?crit : > +Some operating systems allow direct access to the unencoded bytes in the > +environment. If so, the :attr:`os.supports_bytes_environ` constant will be > +true. 
> + > +For direct access to unencoded environment variables (if available), > +use the new :func:`os.getenvb` function or use :data:`os.environb` > +which is a bytes version of :data:`os.environ`. Hum, I think that "undecoded" bytes term is more appropriate. You can decode bytes and encode characters, but not the opposite. Victor From solipsis at pitrou.net Thu Jan 20 14:11:56 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 20 Jan 2011 14:11:56 +0100 Subject: [Python-Dev] r88122 - python/branches/py3k/Doc/whatsnew/3.2.rst References: <20110120094704.663C6EE988@mail.python.org> Message-ID: <20110120141156.156886f7@pitrou.net> On Thu, 20 Jan 2011 10:47:04 +0100 (CET) raymond.hettinger wrote: > > +Code Repository > +=============== > + > +In addition to the existing Subversion code repository at http://svn.python.org > +there is now a `Mercurial `_ repository at > +http://hg.python.org/ . Shouldn't we wait until that repository is ready? Right now people willing to work with it will get the surprise that it hasn't been updated for 3 months. Regards Antoine. From ncoghlan at gmail.com Thu Jan 20 14:16:51 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 20 Jan 2011 23:16:51 +1000 Subject: [Python-Dev] Import and unicode: part two In-Reply-To: References: <1295440442.432.18.camel@marge> Message-ID: On Thu, Jan 20, 2011 at 10:08 PM, Simon Cross wrote: > I'm changing my vote on this to a +1 for two reasons: > > * Initially I thought this wasn't supported by Python at all but I see > that currently it is supported but that support is broken (or at least > limited to UTF-8 filesystem encodings). Since support is there, might > as well make it better (especially if it tidies up the code base at > the same time). > > * I still don't think it's a good idea to give modules non-ASCII names > but the "consenting adults" approach suggests we should let people > shoot themselves in the foot if they believe they have good reason to > do so. 
I'm also +1 on this for the reasons Simon gives. I should have a chance to look at the patch this weekend. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Thu Jan 20 14:25:28 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 20 Jan 2011 23:25:28 +1000 Subject: [Python-Dev] [Python-checkins] r88121 - python/branches/py3k/Doc/whatsnew/3.2.rst In-Reply-To: <1295527296.9228.2.camel@marge> References: <20110120090440.16F11EE98E@mail.python.org> <1295527296.9228.2.camel@marge> Message-ID: On Thu, Jan 20, 2011 at 10:41 PM, Victor Stinner wrote: > On Thursday 20 January 2011 at 10:04 +0100, raymond.hettinger wrote: >> +Some operating systems allow direct access to the unencoded bytes in the >> +environment. If so, the :attr:`os.supports_bytes_environ` constant will be >> +true. >> + >> +For direct access to unencoded environment variables (if available), >> +use the new :func:`os.getenvb` function or use :data:`os.environb` >> +which is a bytes version of :data:`os.environ`. > > Hum, I think that "undecoded" bytes term is more appropriate. You can > decode bytes and encode characters, but not the opposite. I was going to say the same thing. "encoded", "undecoded" or "raw" would all work, but "unencoded" definitely isn't right. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com |
Brisbane, Australia From nadeem.vawda at gmail.com Thu Jan 20 14:59:39 2011 From: nadeem.vawda at gmail.com (Nadeem Vawda) Date: Thu, 20 Jan 2011 15:59:39 +0200 Subject: [Python-Dev] r88122 - python/branches/py3k/Doc/whatsnew/3.2.rst In-Reply-To: <20110120141156.156886f7@pitrou.net> References: <20110120094704.663C6EE988@mail.python.org> <20110120141156.156886f7@pitrou.net> Message-ID: On Thu, Jan 20, 2011 at 3:11 PM, Antoine Pitrou wrote: > On Thu, 20 Jan 2011 10:47:04 +0100 (CET) > raymond.hettinger wrote: >> >> +Code Repository >> +=============== >> + >> +In addition to the existing Subversion code repository at http://svn.python.org >> +there is now a `Mercurial `_ repository at >> +http://hg.python.org/ . > > Shouldn't we wait until that repository is ready? Right now people > willing to work with it will get the surprise that it hasn't been > updated for 3 months. Should that perhaps be http://code.python.org/hg/ ? I've been using that repository for the last two months or so, and it seems to be up to date (to within an hour or so, at least). From solipsis at pitrou.net Thu Jan 20 15:04:29 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 20 Jan 2011 15:04:29 +0100 Subject: [Python-Dev] r88122 - python/branches/py3k/Doc/whatsnew/3.2.rst In-Reply-To: References: <20110120094704.663C6EE988@mail.python.org> <20110120141156.156886f7@pitrou.net> Message-ID: <20110120150429.565a5649@pitrou.net> On Thu, 20 Jan 2011 15:59:39 +0200 Nadeem Vawda wrote: > On Thu, Jan 20, 2011 at 3:11 PM, Antoine Pitrou wrote: > > On Thu, 20 Jan 2011 10:47:04 +0100 (CET) > > raymond.hettinger wrote: > >> > >> +Code Repository > >> +=============== > >> + > >> +In addition to the existing Subversion code repository at http://svn.python.org > >> +there is now a `Mercurial `_ repository at > >> +http://hg.python.org/ . > > > > Shouldn't we wait until that repository is ready? 
Right now people > > willing to work with it will get the surprise that it hasn't been > > updated for 3 months. > > Should that perhaps be http://code.python.org/hg/ ? I've been using > that repository for the last two months or so, and it seems to be up > to date (to within an hour or so, at least). Indeed. However, this will not be the definitive repository (as used after the migration), so I'm not sure it deserves mentioning in the what's new. Also, it's technically not new at all :) Regards Antoine. From guido at python.org Thu Jan 20 17:46:23 2011 From: guido at python.org (Guido van Rossum) Date: Thu, 20 Jan 2011 08:46:23 -0800 Subject: [Python-Dev] Import and unicode: part two In-Reply-To: References: <1295440442.432.18.camel@marge> Message-ID: On Thu, Jan 20, 2011 at 5:16 AM, Nick Coghlan wrote: > On Thu, Jan 20, 2011 at 10:08 PM, Simon Cross > wrote: >> I'm changing my vote on this to a +1 for two reasons: >> >> * Initially I thought this wasn't supported by Python at all but I see >> that currently it is supported but that support is broken (or at least >> limited to UTF-8 filesystem encodings). Since support is there, might >> as well make it better (especially if it tidies up the code base at >> the same time). >> >> * I still don't think it's a good idea to give modules non-ASCII names >> but the "consenting adults" approach suggests we should let people >> shoot themselves in the foot if they believe they have good reason to >> do so. > > I'm also +1 on this for the reasons Simon gives. Same here. *Most* code will never be shared, or will only be shared between users in the same community. When it goes wrong it's also a learning opportunity. :-) > I should have a chance to look at the patch this weekend. 
-- --Guido van Rossum (python.org/~guido) From ateijelo at gmail.com Thu Jan 20 17:45:54 2011 From: ateijelo at gmail.com (Andy Teijelo) Date: Thu, 20 Jan 2011 11:45:54 -0500 Subject: [Python-Dev] Import and unicode: part two In-Reply-To: <6E9023A7-7701-4AE4-8B86-696C87A569BA@twistedmatrix.com> References: <1295440442.432.18.camel@marge> <20110119234419.GO22400@unaka.lan> <1295483161.12324.10.camel@marge> <20110120020725.GQ22400@unaka.lan> <1295491865.22752.22.camel@marge> <20110120043901.GR22400@unaka.lan> <4D37C1D9.7080801@g.nevcal.com> <5D5CC0A9-BD43-4496-872C-B70C5FC77490@twistedmatrix.com> <4D37C5CC.3040200@g.nevcal.com> <6E9023A7-7701-4AE4-8B86-696C87A569BA@twistedmatrix.com> Message-ID: <4D3866C2.2010007@gmail.com> (Hi, I'm writing from an address different to the one I'm subscribed with to the list because I don't have reverse dns in my mail server and mail.python.org rejects my messages. I hope that's not much trouble) Maybe Python should always use an ASCII encodable filename for modules: a translation of the module name into an ASCII encodable string that, preferably, was the same as the module name if the module name didn't have any non-ASCII characters. Like, if the code said: import cafe Python would look for a file named: cafe.py but if the code said: import café then Python would look, in any platform, for a file named: caf&#233;.py or caf&#xe9;.py or something nicer. Something along the lines of xmlcharrefreplace. Just an idea. Andy. On 1/20/11 12:21 a.m., Glyph Lefkowitz wrote: > > On Jan 20, 2011, at 12:19 AM, Glenn Linderman wrote: > >> Now if the stuff after m_ was the hex UTF-8 of "café", that could get >> interesting :) > > (As it happens, it's the hex digest of the MD5 of the UTF-8 of café...
;-)) > > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/andy%40lists.teijelo.net From a.badger at gmail.com Thu Jan 20 18:44:39 2011 From: a.badger at gmail.com (Toshio Kuratomi) Date: Thu, 20 Jan 2011 09:44:39 -0800 Subject: [Python-Dev] Import and unicode: part two In-Reply-To: <1295524289.2016.116.camel@marge> References: <1295440442.432.18.camel@marge> <20110119234419.GO22400@unaka.lan> <1295483161.12324.10.camel@marge> <20110120020725.GQ22400@unaka.lan> <1295491865.22752.22.camel@marge> <20110120043901.GR22400@unaka.lan> <1295524289.2016.116.camel@marge> Message-ID: <20110120174439.GT22400@unaka.lan> On Thu, Jan 20, 2011 at 12:51:29PM +0100, Victor Stinner wrote: > Le mercredi 19 janvier 2011 ? 20:39 -0800, Toshio Kuratomi a ?crit : > > Teaching students to write non-portable code (relying on filesystem encoding > > where your solution is, don't upload to pypi anything that has non-ascii > > filenames) seems like the exact opposite of how you'd want to shape a young > > student's understanding of good programming practices. > > That was already discuted before: see PEP 3131. > http://www.python.org/dev/peps/pep-3131/#common-objections > > If the teacher choose to use non-ASCII, (s)he is responsible to explain > the consequences to his/her students :-) > It's not discussed in that PEP section. The PEP section says this: "People claim that they will not be able to use a library if to do so they have to use characters they cannot type on their keyboards." Whether you can type it at your keyboard or not is not the problem here. The problem is portability. The students and professors are sharing code with each other. But because of a mixture of operating systems (let alone locale settings), the code written by one partner is unable to run on the computer of the other. 
If non-ascii filenames without a defined encoding are considered a feature, python cannot even issue a descriptive error when this occurs. It can only say that it could not find the module but not why. A restriction on module names to ascii only could actually state that module names are not allowed to be non-ASCII when it encounters the import line. > > > In a school, you can use the same configuration > > > (encoding) on all computers. > > > > > In a school computer lab perhaps. But not on all the students' and > > professors' machines. How many professors will be cursing python when they > > discover that the example code that they wrote on their Linux workstation > > doesn't work when the students try to use it in their windows computer lab? > > Because some students use a stupid or misconfigured OS, Python should > only accept ASCII names? Just a note -- you'll get much farther if you refrain from calling names. It just makes me think that you aren't reading and understanding the issue I'm raising. My examples that you're replying to involve two "properly configured" OS's. The Linux workstations are configured with a UTF-8 locale. The Windows OS's use wide character unicode. The problem occurs in that the code that one of the parties develops (either the students or the professors) is developed on one of those OS's and then used on the other OS. > So, why do Python 3 support non-ASCII > filenames: it is very well known that non-ASCII filenames is the root in > many troubles! Should we simply drop unicode support for all filenames? > And maybe restrict bytes filenames to bytes in [0; 127]? Or better, > restrict to [32; 126] (U+007f causes some troubles in some terminals). > If you want to argue that because python3 supports non-ascii filenames in other code, then the logical extension is that the import mechanism should support importing module names defined by byte sequences. 
I happen to think that import has a lot of differences between it and other filenames as I've said three times now. > I think that in 2011, non-ASCII filenames are well supported on all > (modern) operating systems. Issues with non-ASCII filenames are OS > specific and should be fixed by the user (the admin of the computer). > > > Additionally, those other filesystem operations have > > been growing the ability to take byte values and encoding parameters because > > unicode translation via a single filesystem encoding is a good default but > > not a complete solution. > > If you are unable to configure correctly your system to decode/encode > correctly filenames, you should just avoid non-ASCII characters in the > module names. > This seems like an argument to only have unicode versions of all filesystem operations. Since you've been spearheading the effort to have bytes versions of things that access filenames, environment variables, etc, I don't think that you seriously mean that. Perhaps there is a language issue here. > You only give theorical arguments: did you at least try to use non-ASCII > module names on your system with Python 3.2? I suppose that it will just > work and you will never notice that the unicode module name (on "import > café") in encoded to bytes. > Yes I did and I got it to fail a cornercase as I showed twice with the same example in other posts. However, I want to make clear here that the issue is not that I can create a non-ascii filename and then import it. The issue is that I can create a non-ascii filename and then try to share it with the usual tools and it won't work on the recipient's system. (A tangent is whether the recipient's system is physically distinct from mine or only has a different environment on the same physical host.) > It fails on on OSes using filesystem encodings other than UTF-8 (eg. > Windows)... because of a Python bug, and I just asked if I have to fix > this bug (or if we should deny non-ASCII names).
If the bug is fixed, it > will works everywhere. > I understand that your patch allows non-ASCII names to work on Windows. My issue is that non-ASCII names have ramifications beyond just, "works on Windows" "works on Linux". There's also the question of whether it works when you transfer modules between OS's. > > Your solution creates modules which aren't portable > > More and more operating systems use a filesystem encoding able to encode > any Unicode characters. ASCII-only always give you the best portability, > but I think that today you can start to play with (at least) ISO-8859-1 > characters (café should work on all operating systems). If you don't > Unicode issues (I personally love them!), just use ASCII everywhere. > I'd be happy to agree with your enthusiasm for unicode characters if your patch included a method to preserve portability between operating systems. > > One of my proposals creates python code which isn't portable. The other one > > suffers some of the same disadvantages as your solution in portability but > > allows for tools that could automatically correct modules. > > __import__('café'.encode('UTF-8')) or > __import__('café'.encode('ISO-8859-1')) is less portable than > __import__('café'). > Yep, this method is just as unportable as yours as I said in an analysis in a previous post. The other method is the one that's more portable but has painful drawbacks. (Also note that your example above ignores one of the differences between import and open() that I mentioned in a previous post: import assigns the module to a name automatically whereas open() [like __import__()] makes the programmer assign the name) > > You think that if a module is named appropriately on one system but is not portable to another > > system, that's fine. > > No, I am not saying that. > > I say that if your name is broken while you transfer your project from a > system to another (eg.
decompressing an archive creates filenames with > mojibake in the filenames), you should fix your transfer procedure (eg. > use another archive format, use a script to fix filenames, or anything > else), but don't try to handle invalid filenames. > So here's a revised summary: A module being able to be imported by the module author is of primary importance. Portability of modules relies upon third party tool support. Lacking that support, the modules may not be portable. > > Setting system locale to ASCII for use in system-wide scripts > > This is stupid :-) Yes, on such system you, cannot open *any* non-ASCII > file with Python 3 (except if you work, as Python 2, on bytes > filenames). > > Python cannot do anything to improve Unicode support on such system: > only the administrator have to something to do for that. > Python supports open() with a bytes argument for this reason. import does not support such a thing (and I think it would be more wrong for import to do so). > I know that you can give me many examples of systems where Unicode > doesn't work because the system is not correctly configured. But my > opinion is that we should support non-ASCII names because there are > somewhere "some" systems where Unicode is fully functionnal :-) > Comments like these make me think that you aren't understanding me which just makes me frustrated with you. OTOH, if you could acknowledge the points that I'm making and simply disagree with the relative merits of them then we could simply agree to disagree. -Toshio -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 198 bytes Desc: not available URL: From alexander.belopolsky at gmail.com Thu Jan 20 19:02:28 2011 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Thu, 20 Jan 2011 13:02:28 -0500 Subject: [Python-Dev] Import and unicode: part two In-Reply-To: <4D3866C2.2010007@gmail.com> References: <1295440442.432.18.camel@marge> <20110119234419.GO22400@unaka.lan> <1295483161.12324.10.camel@marge> <20110120020725.GQ22400@unaka.lan> <1295491865.22752.22.camel@marge> <20110120043901.GR22400@unaka.lan> <4D37C1D9.7080801@g.nevcal.com> <5D5CC0A9-BD43-4496-872C-B70C5FC77490@twistedmatrix.com> <4D37C5CC.3040200@g.nevcal.com> <6E9023A7-7701-4AE4-8B86-696C87A569BA@twistedmatrix.com> <4D3866C2.2010007@gmail.com> Message-ID: On Thu, Jan 20, 2011 at 11:45 AM, Andy Teijelo wrote: .. > but if the code said: > > import café > > then Python would look, in any platform, for a file named: > > caf&#233;.py or caf&#xe9;.py or something nicer. > > Something along the lines of xmlcharrefreplace. > Just an idea. Curiously, something like this already happens on OSX when filename is not valid UTF-8. For example,
>>> open(b'\xdb\xcd', 'w').close()
>>> open(b'\xdb\xcd')
<_io.TextIOWrapper name=b'\xdb\xcd' mode='r' encoding='UTF-8'>
but the actual file created is named "%DB%CD". (Looks like URL-encoding).
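A minimal sketch of the two ASCII-only spellings mentioned in this exchange, using only standard-library calls. The helper names are hypothetical, and nothing here is claimed to be what OS X or the import machinery does internally; it only shows that percent-encoding the raw bytes reproduces the "%DB%CD" name reported above, and what an xmlcharrefreplace-style module filename would look like.

```python
from urllib.parse import quote_from_bytes


def percent_encoded_name(raw: bytes) -> str:
    """Percent-encode every byte that is not URL-safe (hypothetical helper)."""
    return quote_from_bytes(raw, safe="")


def ascii_module_filename(module_name: str) -> str:
    """Build an ASCII-only filename for a module name (hypothetical helper).

    Pure-ASCII names pass through unchanged; other characters become
    XML character references via the 'xmlcharrefreplace' error handler.
    """
    ascii_name = module_name.encode("ascii", "xmlcharrefreplace").decode("ascii")
    return ascii_name + ".py"


print(percent_encoded_name(b"\xdb\xcd"))   # %DB%CD
print(ascii_module_filename("cafe"))       # cafe.py
print(ascii_module_filename("caf\u00e9"))  # caf&#233;.py
```

Either spelling keeps the on-disk name within ASCII, at the cost of a filename that no longer reads like the module name.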
From alexander.belopolsky at gmail.com Thu Jan 20 19:43:03 2011 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Thu, 20 Jan 2011 13:43:03 -0500 Subject: [Python-Dev] Import and unicode: part two In-Reply-To: <20110120174439.GT22400@unaka.lan> References: <1295440442.432.18.camel@marge> <20110119234419.GO22400@unaka.lan> <1295483161.12324.10.camel@marge> <20110120020725.GQ22400@unaka.lan> <1295491865.22752.22.camel@marge> <20110120043901.GR22400@unaka.lan> <1295524289.2016.116.camel@marge> <20110120174439.GT22400@unaka.lan> Message-ID: On Thu, Jan 20, 2011 at 12:44 PM, Toshio Kuratomi wrote: > .. My examples that you're replying to involve two "properly > configured" OS's. The Linux workstations are configured with a UTF-8 > locale. The Windows OS's use wide character unicode. The problem occurs in > that the code that one of the parties develops (either the students or the > professors) is developed on one of those OS's and then used on the other OS. > I re-read your posts on this thread, but could not find the examples that you refer to. ISTM, your hypothetical students should have no problem as long as their professor uses proper tools to package her code. For example, if she uses a recent version of zip that supports the Info-ZIP Unicode Comment Extra Field (see http://www.pkware.com/documents/casestudies/APPNOTE.TXT) and students use similarly up to date unzip tool, the shared code should work as expected. Similarly, I would be surprised if Samba server would not be able to present a shared Linux partition that uses UTF-8 encoding to a Windows client in a way that will make wopen() work as expected. The problem with current Python import mechanism is that it does not use wopen() on Windows and instead, attempts to encode Unicode module name into a mythical single-byte filesystem encoding (locale ANSI code page?) and calls byte-oriented open(char *) on the result.
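To illustrate why going through a single byte-oriented encoding is fragile, the sketch below encodes one module name under several codecs. The function name and the list of codecs are assumptions chosen for illustration; this is not the actual import code, only a picture of the encoding step described above.

```python
def encoded_candidates(module_name, encodings=("utf-8", "iso-8859-1", "cp1252", "ascii")):
    """Map each codec to the bytes it yields for module_name.

    A value of None means the codec cannot represent the name at all,
    which is the case where a byte-oriented open() has nothing to try.
    """
    results = {}
    for encoding in encodings:
        try:
            results[encoding] = module_name.encode(encoding)
        except UnicodeEncodeError:
            results[encoding] = None
    return results


# ascii() keeps the output printable even on an ASCII-only terminal.
for name in ("caf\u00e9", "\u65e5\u672c\u8a9e"):
    print(ascii(name), encoded_candidates(name))
```

The same name maps to different byte strings under different codecs and to nothing under others, so a file written under one filesystem encoding may not be found under another.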
From brett at python.org Thu Jan 20 19:42:19 2011 From: brett at python.org (Brett Cannon) Date: Thu, 20 Jan 2011 10:42:19 -0800 Subject: [Python-Dev] [Python-checkins] devguide: Short doc about where to get tech help related to developing Python. In-Reply-To: References: Message-ID: On Wed, Jan 19, 2011 at 15:21, Sandro Tosi wrote: > Hi, > > On Wed, Jan 19, 2011 at 23:19, brett.cannon wrote: >> +Where to Get Help >> +================= >> +If you are working on Python it is very possible you will come across an issue >> +where you need some assistance in solving (this happens to core developers all >> +the time). You have a couple of options depending on what kind of help you need. >> +If the question involves process or tool usage then please check the developer's >> +guide first as is should answer your question. > > as it should > >> +Filing a Bug >> +------------ >> +If you come across an odd error message that seems like a bug, then file a bug >> +on the `issue tracker`_. In the bug you can explain that you are not sure why >> +the error is coming up or that the exact nature of the problem is. Someone will > > ...or what the exact...? > >> +Asking a Technical Question >> +--------------------------- >> +You have two avenues of communication out of the :ref:`myriad of options >> +available `. If you are comfortable with IRC you can try asking >> +in #python-dev. Typically there are a couple of experienced developers, ranging >> +from triagers to core developers, who can ask questions about developing for > > who can answer questions They can ask as well. =) Anyway, all changes coming in the next push. 
> > Cheers, > -- > Sandro Tosi (aka morph, morpheus, matrixhasu) > My website: http://matrixhasu.altervista.org/ > Me at Debian: http://wiki.debian.org/SandroTosi > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/brett%40python.org > From brett at python.org Thu Jan 20 19:43:37 2011 From: brett at python.org (Brett Cannon) Date: Thu, 20 Jan 2011 10:43:37 -0800 Subject: [Python-Dev] [Python-checkins] devguide: Move Misc/maintainers.rst here and rename to experts.rst. In-Reply-To: References: Message-ID: It's just a bit wordy. I simplified it. On Thu, Jan 20, 2011 at 01:22, Sandro Tosi wrote: > Hi, > > On Thu, Jan 20, 2011 at 04:56, brett.cannon wrote: >> +Unless a name is followed by a '*', you should never assign an issue to >> +that person, only make them nosy. ?Names followed by a '*' may be assigned >> +issues involving the module or topic for which the name has a '*'. > > isn't last sentence a bit weird? I'm not native but "Names followed by > a '*' may issues assigned for the modules...." be a bit better? 
ok, > fairly minor you can also ignore it :) > > Cheers, > -- > Sandro Tosi (aka morph, morpheus, matrixhasu) > My website: http://matrixhasu.altervista.org/ > Me at Debian: http://wiki.debian.org/SandroTosi > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/brett%40python.org > From a.badger at gmail.com Thu Jan 20 20:27:43 2011 From: a.badger at gmail.com (Toshio Kuratomi) Date: Thu, 20 Jan 2011 11:27:43 -0800 Subject: [Python-Dev] Import and unicode: part two In-Reply-To: References: <20110119234419.GO22400@unaka.lan> <1295483161.12324.10.camel@marge> <20110120020725.GQ22400@unaka.lan> <1295491865.22752.22.camel@marge> <20110120043901.GR22400@unaka.lan> <1295524289.2016.116.camel@marge> <20110120174439.GT22400@unaka.lan> Message-ID: <20110120192743.GU22400@unaka.lan> On Thu, Jan 20, 2011 at 01:43:03PM -0500, Alexander Belopolsky wrote: > On Thu, Jan 20, 2011 at 12:44 PM, Toshio Kuratomi wrote: > > ..?My examples that you're replying to involve two "properly > > configured" OS's. ?The Linux workstations are configured with a UTF-8 > > locale. ?The Windows OS's use wide character unicode. ?The problem occurs in > > that the code that one of the parties develops (either the students or the > > professors) is developed on one of those OS's and then used on the other OS. > > > > I re-read your posts on this thread, but could not find the examples > that you refer to. > Examples might be a bad word in this context. Victor was commenting on the two brainstorm ideas for alternatives to ascii-only that I had. One was: * Mandate that every python module on a platform has a specific encoding (rather than the value of the locale) The other was: * allow using byte strings for import I think that both ideas are inferior to mandating that every python module filename is ascii. 
What I'm getting from Victor's posts is that he, at least, considers the portability problems to be ignorable because dealing with ambiguous file name encodings is something that he'd like to force third party tools to deal with. -Toshio From brett at python.org Thu Jan 20 20:43:59 2011 From: brett at python.org (Brett Cannon) Date: Thu, 20 Jan 2011 11:43:59 -0800 Subject: [Python-Dev] Moving stuff out of Misc and over to the devguide In-Reply-To: References: Message-ID: Short of moving README.coverity (I'm waiting to hear back from the company), I'm done with my tweaks to the directory. On Wed, Jan 19, 2011 at 15:31, Brett Cannon wrote: > OK, here is my plan that I will implement: > > MOVE > ---------- > developers.txt > maintainers.rst > README.gdb > README.coverity > README.Emacs > > DELETE (seem way too old to still be relevant; tell me if I am wrong) > ----------- > README.OpenBSD > README.AIX > cheatsheet > > LEAVE everything else (with README properly edited and simplified to > only list files with non-obvious names) > > On Mon, Jan 17, 2011 at 12:32, Brett Cannon wrote: >> There is a bunch of stuff in Misc that probably belongs in the >> devguide (under Resources) instead of in svn. Here are the files I >> think can be moved (in order of how strongly I think they should be >> moved): >> >> PURIFY.README >> README.coverty >> README.klocwork >> README.valgrind >> Porting >> developers.txt >> maintainers.rst >> SpecialBuilds.txt >> >> Now before anyone yells "that is inconvenient", don't forget that all >> core developers can check out and edit the devguide, and that almost >> all of the files listed (SpecialBuilds.txt is the exception) are >> typically edited and viewed on their own. 
>> >> Anyway, if there is a file listed here you don't think should move out >> of py3k and into the devguide, speak up. >> > From skip at pobox.com Thu Jan 20 20:59:34 2011 From: skip at pobox.com (skip at pobox.com) Date: Thu, 20 Jan 2011 13:59:34 -0600 Subject: [Python-Dev] Moving stuff out of Misc and over to the devguide In-Reply-To: References: Message-ID: <19768.37926.530229.216373@montanaro.dyndns.org> Brett, I'm sure I just missed it, but where is the devguide in the Subversion tree? Thx, Skip From mal at egenix.com Thu Jan 20 21:09:47 2011 From: mal at egenix.com (M.-A. Lemburg) Date: Thu, 20 Jan 2011 21:09:47 +0100 Subject: [Python-Dev] [Python-checkins] r88127 - in python/branches/py3k/Misc: README.AIX README.OpenBSD cheatsheet In-Reply-To: <20110120193435.89D3CEEA5B@mail.python.org> References: <20110120193435.89D3CEEA5B@mail.python.org> Message-ID: <4D38968B.9060103@egenix.com> brett.cannon wrote: > Author: brett.cannon > Date: Thu Jan 20 20:34:35 2011 > New Revision: 88127 > > Log: > Remove some outdated files from Misc. > > Removed: > python/branches/py3k/Misc/README.AIX Are you sure that the AIX README is outdated ? It explains some of the details of why there are scripts like ld_so_aix which are still needed on AIX. > python/branches/py3k/Misc/README.OpenBSD Same here. Does OpenBSD 4.x still have the issues mentioned in the file. > python/branches/py3k/Misc/cheatsheet Wouldn't it be better to update this useful file (as part of your PSF grant) ? Most of it still applies to Py3. Regarding some other things you removed or moved: > D SVN-Python3/Misc/maintainers.rst > D SVN-Python3/Misc/developers.txt Why were these removed from the source archive ? They are useful to have around for users wanting to report bugs and are useful to follow the development of the core team between different Python versions. > D SVN-Python3/Misc/python-mode.el Why is this gone ? 
It's a useful file for Emacs users and usually more recent than what you get with your Emacs installation. > D SVN-Python3/Misc/AIX-NOTES I guess this was renamed to README.AIX before you removed it. See above. > D SVN-Python3/Misc/PURIFY.README Why is this outdated ? Should probably be renamed to README.Purify. > D SVN-Python3/Misc/RFD That's a piece of Python history. These nuggets should stay in the Python source archive, IMHO. > D SVN-Python3/Misc/setuid-prog.c This is useful for people writing setuid programs in Python and avoids many of the usual pitfalls: http://mail.python.org/pipermail/python-list/1999-April/620658.html -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Jan 20 2011) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try our new mxODBC.Connect Python Database Interface for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From brett at python.org Thu Jan 20 21:11:58 2011 From: brett at python.org (Brett Cannon) Date: Thu, 20 Jan 2011 12:11:58 -0800 Subject: [Python-Dev] Moving stuff out of Misc and over to the devguide In-Reply-To: <19768.37926.530229.216373@montanaro.dyndns.org> References: <19768.37926.530229.216373@montanaro.dyndns.org> Message-ID: It's not in the svn tree; it's an Hg repo: ssh://hg at hg.python.org/devguide . The link is also listed in the Resources section of the devguide. On Thu, Jan 20, 2011 at 11:59, wrote: > > Brett, > > I'm sure I just missed it, but where is the devguide in the Subversion tree? 
> > Thx, > > Skip > From solipsis at pitrou.net Thu Jan 20 21:23:21 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 20 Jan 2011 21:23:21 +0100 Subject: [Python-Dev] [Python-checkins] r88127 - in python/branches/py3k/Misc: README.AIX README.OpenBSD cheatsheet References: <20110120193435.89D3CEEA5B@mail.python.org> <4D38968B.9060103@egenix.com> Message-ID: <20110120212321.385b4690@pitrou.net> On Thu, 20 Jan 2011 21:09:47 +0100 "M.-A. Lemburg" wrote: > brett.cannon wrote: > > Author: brett.cannon > > Date: Thu Jan 20 20:34:35 2011 > > New Revision: 88127 > > > > Log: > > Remove some outdated files from Misc. > > > > Removed: > > python/branches/py3k/Misc/README.AIX > > Are you sure that the AIX README is outdated ? It explains some > of the details of why there are scripts like ld_so_aix which are > still needed on AIX. If someone wants to contribute an up-to-date version they're welcome. The version which has been deleted was totally obsolete. http://bugs.python.org/issue10709 From glyph at twistedmatrix.com Thu Jan 20 21:27:08 2011 From: glyph at twistedmatrix.com (Glyph Lefkowitz) Date: Thu, 20 Jan 2011 15:27:08 -0500 Subject: [Python-Dev] Import and unicode: part two In-Reply-To: References: <1295440442.432.18.camel@marge> Message-ID: On Jan 20, 2011, at 11:46 AM, Guido van Rossum wrote: > On Thu, Jan 20, 2011 at 5:16 AM, Nick Coghlan wrote: >> On Thu, Jan 20, 2011 at 10:08 PM, Simon Cross >> wrote: >>> I'm changing my vote on this to a +1 for two reasons: >>> >>> * Initially I thought this wasn't supported by Python at all but I see >>> that currently it is supported but that support is broken (or at least >>> limited to UTF-8 filesystem encodings). Since support is there, might >>> as well make it better (especially if it tidies up the code base at >>> the same time). 
>>> >>> * I still don't think it's a good idea to give modules non-ASCII names >>> but the "consenting adults" approach suggests we should let people >>> shoot themselves in the foot if they believe they have good reason to >>> do so. >> >> I'm also +1 on this for the reasons Simon gives. > > Same here. *Most* code will never be shared, or will only be shared > between users in the same community. When it goes wrong it's also a > learning opportunity. :-) Despite my usual proclivity for being contrarian, I find myself in agreement here. Linux users with locales that don't specify UTF-8 frankly _should_ have to deal with all kinds of nastiness until they can transcode their filesystems. MacOS and Windows both have a "right" answer here and your third-party tools shouldn't create mojibake in your filenames. However, I feel that we should not necessarily be making non-ASCII programmers second-class citizens, if they are to be supported at all. The obvious outcome of the current regime is, if you want your code to work in the wider world, you have to make everything ASCII, so non-ASCII programmers have to do a huge amount of extra work to prepare their stuff for distribution. As an english speaker I'd be happy about that, but as a person with a lot of Chinese in-laws, it gives me pause. There is a difference between sharing code for inspection and editing (where a little codec pain is good for the soul: set your locale to UTF-8 and forget it already!) and sharing code so that a (non-programming) user can just run it. If I can write software in English and distribute it to Chinese people, fair's fair, they should be able to write it in chinese and have it work on my computer. To support the latter, could we just make sure that zipimport has a consistent, non-locale-or-operating-system-dependent interpretation of encoding? That way a distributed egg would be importable from a zipfile regardless of how screwed up the distribution target machine's filesystem is. 
(And this is yet more motivation for distributors to set zip_safe=True.) -------------- next part -------------- An HTML attachment was scrubbed... URL: From brett at python.org Thu Jan 20 21:37:14 2011 From: brett at python.org (Brett Cannon) Date: Thu, 20 Jan 2011 12:37:14 -0800 Subject: [Python-Dev] [Python-checkins] r88127 - in python/branches/py3k/Misc: README.AIX README.OpenBSD cheatsheet In-Reply-To: <4D38968B.9060103@egenix.com> References: <20110120193435.89D3CEEA5B@mail.python.org> <4D38968B.9060103@egenix.com> Message-ID: On Thu, Jan 20, 2011 at 12:09, M.-A. Lemburg wrote: > brett.cannon wrote: >> Author: brett.cannon >> Date: Thu Jan 20 20:34:35 2011 >> New Revision: 88127 >> >> Log: >> Remove some outdated files from Misc. >> >> Removed: >> ? ?python/branches/py3k/Misc/README.AIX > > Are you sure that the AIX README is outdated ? It explains some > of the details of why there are scripts like ld_so_aix which are > still needed on AIX. > I asked earlier if anyone thought they were not and no one spoke up. Same goes for README.OpenBSD. >> ? ?python/branches/py3k/Misc/README.OpenBSD > > Same here. Does OpenBSD 4.x still have the issues mentioned in the > file. > >> ? ?python/branches/py3k/Misc/cheatsheet > > Wouldn't it be better to update this useful file (as part of your > PSF grant) ? Most of it still applies to Py3. That file was not even updated to cover context managers and the 'with' keyword so it's been outdated for years and for at least a couple of releases now. If no one has cared to update it for the last two releases of Python 2.x I don't see a point in my spending time doing an update, especially considering it is a duplicate of official docs which is just asking for maintenance trouble. > > Regarding some other things you removed or moved: > >> D ? ?SVN-Python3/Misc/maintainers.rst >> D ? ?SVN-Python3/Misc/developers.txt > > Why were these removed from the source archive ? 
They are useful > to have around for users wanting to report bugs and are useful > to follow the development of the core team between different > Python versions. They are in the devguide now. > >> D ? ?SVN-Python3/Misc/python-mode.el > > Why is this gone ? It's a useful file for Emacs users and usually > more recent than what you get with your Emacs installation. Barry removed that (I think) two months ago; I was simply updating the README to reflect the actual state of the directory. > >> D ? ?SVN-Python3/Misc/AIX-NOTES > > I guess this was renamed to README.AIX before you removed it. > See above. > >> D ? ?SVN-Python3/Misc/PURIFY.README > > Why is this outdated ? > Should probably be renamed to README.Purify. Because Barry said it was considering it contained an email that has not worked in a decade. > >> D ? ?SVN-Python3/Misc/RFD > > That's a piece of Python history. These nuggets should stay > in the Python source archive, IMHO. Once again, it was already not there and this is just a cleanup of the file; I didn't delete it. > >> D ? ?SVN-Python3/Misc/setuid-prog.c > > This is useful for people writing setuid programs in Python and > avoids many of the usual pitfalls: Another cleanup of the file. -Brett > > http://mail.python.org/pipermail/python-list/1999-April/620658.html > > -- > Marc-Andre Lemburg > eGenix.com > > Professional Python Services directly from the Source ?(#1, Jan 20 2011) >>>> Python/Zope Consulting and Support ... ? ? ? ?http://www.egenix.com/ >>>> mxODBC.Zope.Database.Adapter ... ? ? ? ? ? ? http://zope.egenix.com/ >>>> mxODBC, mxDateTime, mxTextTools ... ? ? ? ?http://python.egenix.com/ > ________________________________________________________________________ > > ::: Try our new mxODBC.Connect Python Database Interface for free ! :::: > > > ? eGenix.com Software, Skills and Services GmbH ?Pastor-Loeh-Str.48 > ? ?D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg > ? ? ? ? ? Registered at Amtsgericht Duesseldorf: HRB 46611 > ? 
http://www.egenix.com/company/contact/ > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/brett%40python.org > From nyamatongwe at gmail.com Thu Jan 20 21:47:32 2011 From: nyamatongwe at gmail.com (Neil Hodgson) Date: Fri, 21 Jan 2011 07:47:32 +1100 Subject: [Python-Dev] Import and unicode: part two In-Reply-To: <20110120174439.GT22400@unaka.lan> References: <1295440442.432.18.camel@marge> <20110119234419.GO22400@unaka.lan> <1295483161.12324.10.camel@marge> <20110120020725.GQ22400@unaka.lan> <1295491865.22752.22.camel@marge> <20110120043901.GR22400@unaka.lan> <1295524289.2016.116.camel@marge> <20110120174439.GT22400@unaka.lan> Message-ID: Toshio Kuratomi: > My examples that you're replying to involve two "properly > configured" OS's. The Linux workstations are configured with a UTF-8 > locale. The Windows OS's use wide character unicode. The problem occurs in > that the code that one of the parties develops (either the students or the > professors) is developed on one of those OS's and then used on the other OS. This implies a symmetric issue, but I cannot see how there can be a problem with non-ASCII module names on Windows, as the file system allows all Unicode characters and so can represent any module name. OS X is also based on Unicode file names. While it is possible to mount file systems on Windows or OS X that do not support Unicode file names, this is a very unusual situation that will cause problems in other ways. Common Linux distributions like Ubuntu and Fedora now default to UTF-8 locales. The situations in which users may encounter installations that do not support Unicode file names have reduced greatly. 
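[Editorial sketch, not part of the original message.] The point about which installations can represent a given module name can be checked roughly as follows. This is a minimal sketch, assuming that "representable" simply means "encodable with the filesystem encoding Python reports"; the helper name `encodable` is hypothetical and not an existing API.

```python
import sys


def encodable(name, encoding=None):
    """Return True if `name` can be encoded with `encoding`.

    `encodable` is a hypothetical helper for illustration only. When no
    encoding is given, the interpreter's filesystem encoding is used.
    """
    encoding = encoding or sys.getfilesystemencoding()
    try:
        name.encode(encoding)  # strict error handling: fail on any gap
    except UnicodeEncodeError:
        return False
    return True


# The encoding Python will use for file names on this machine.
print(sys.getfilesystemencoding())

# A non-ASCII module name fits a UTF-8 locale but not an ASCII ("C") one.
print(encodable("café", "utf-8"))   # True
print(encodable("café", "ascii"))   # False
```

On a system whose filesystem encoding is UTF-8 any module name passes this check; under an ASCII locale only pure-ASCII names do, which is the case the thread is arguing about.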
Neil From solipsis at pitrou.net Thu Jan 20 21:55:22 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 20 Jan 2011 21:55:22 +0100 Subject: [Python-Dev] Import and unicode: part two References: <1295440442.432.18.camel@marge> Message-ID: <20110120215522.61429d36@pitrou.net> On Thu, 20 Jan 2011 15:27:08 -0500 Glyph Lefkowitz wrote: > > To support the latter, could we just make sure that zipimport has a consistent, > non-locale-or-operating-system-dependent interpretation of encoding? It already has, but it's dependent on a flag in the zip file itself (actually, one flag per archived file in the zip it seems). (by the way, it would be nice if your text/mail editor wrapped lines at 80 characters or something) Regards Antoine. From v+python at g.nevcal.com Thu Jan 20 21:59:15 2011 From: v+python at g.nevcal.com (Glenn Linderman) Date: Thu, 20 Jan 2011 12:59:15 -0800 Subject: [Python-Dev] Import and unicode: part two In-Reply-To: References: <1295440442.432.18.camel@marge> Message-ID: <4D38A223.1020908@g.nevcal.com> On 1/20/2011 12:27 PM, Glyph Lefkowitz wrote: > To support the latter, could we just make sure that zipimport has a > consistent, non-locale-or-operating-system-dependent interpretation of > encoding? That way a distributed egg would be importable from a zipfile > regardless of how screwed up the distribution target machine's > filesystem is. (And this is yet more motivation for distributors to set > zip_safe=True.) I guess zip_safe is a distutils thing, and I haven't (yet) used distutils. But regarding zip files, I was trying to figure out if ZipFile module supported the CP437/UTF-8 flag, but its documentation seems to predate that concept, and just talks about unencoded byte streams. Yet, I think I have Python3 code that passes str to the filenames, and that works, so some amount of encoding and decoding to something must be happening behind the documentation's back? 
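[Editorial sketch, not part of the original message.] The per-entry UTF-8 flag mentioned above can be inspected from Python. This is a minimal sketch, assuming the behaviour of the `zipfile` module in current Python 3 releases (names that are not pure ASCII are stored as UTF-8 with general-purpose flag bit 11 set); the module as it stood when this thread was written may have differed. The constant name `UTF8_FLAG` is mine.

```python
import io
import zipfile

UTF8_FLAG = 0x800  # general-purpose bit 11: the entry name is UTF-8

# Build a small archive in memory with one ASCII and one non-ASCII name.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("plain.py", "x = 1\n")
    zf.writestr("café.py", "x = 2\n")

# Re-open the archive and look at the flag stored for each entry.
with zipfile.ZipFile(io.BytesIO(buf.getvalue())) as zf:
    flags = {
        info.filename: bool(info.flag_bits & UTF8_FLAG)
        for info in zf.infolist()
    }

print(flags)  # {'plain.py': False, 'café.py': True}
```

Because the flag travels inside the archive, the name read back does not depend on the locale of the machine that opens the file.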
It does seem that if a ZipFile is created with the UTF-8 flag turned on, that Python should respect that, and that should be independent of the file system configured encoding on the local machine on which the ZipFile is used (as long as the name of the ZipFile is usable). I do know that listing filenames from a zip file created without the UTF-8 flag, using ZipFile to access it and place the names inside a web page that specifies its encoding to be UTF-8 produces illegal characters, so I've become tuned in recently to the zip files do have such a flag, and have been learning the right options to turn it on for the command line tools I use to create zip files... but was surprised when investigating the same for ZipFile. From sandro.tosi at gmail.com Thu Jan 20 22:06:53 2011 From: sandro.tosi at gmail.com (Sandro Tosi) Date: Thu, 20 Jan 2011 22:06:53 +0100 Subject: [Python-Dev] [Python-checkins] devguide: Move Misc/README.Emacs to here. In-Reply-To: References: Message-ID: Hi, On Thu, Jan 20, 2011 at 20:33, brett.cannon wrote: > +.. > + ? Local Variables: > + ? mode: indented-text > + ? indent-tabs-mode: nil > + ? sentence-end-double-space: t > + ? fill-column: 78 > + ? coding: utf-8 > + ? End: maybe this can be removed now Cheers, -- Sandro Tosi (aka morph, morpheus, matrixhasu) My website: http://matrixhasu.altervista.org/ Me at Debian: http://wiki.debian.org/SandroTosi From alexander.belopolsky at gmail.com Thu Jan 20 22:09:58 2011 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Thu, 20 Jan 2011 16:09:58 -0500 Subject: [Python-Dev] [Python-checkins] r88127 - in python/branches/py3k/Misc: README.AIX README.OpenBSD cheatsheet In-Reply-To: References: <20110120193435.89D3CEEA5B@mail.python.org> <4D38968B.9060103@egenix.com> Message-ID: On Thu, Jan 20, 2011 at 3:37 PM, Brett Cannon wrote: > On Thu, Jan 20, 2011 at 12:09, M.-A. Lemburg wrote: .. >>> ? 
?python/branches/py3k/Misc/cheatsheet >> >> Wouldn't it be better to update this useful file (as part of your >> PSF grant) ? Most of it still applies to Py3. > > That file was not even updated to cover context managers and the > 'with' keyword so it's been outdated for years and for at least a > couple of releases now. If no one has cared to update it for the last > two releases of Python 2.x I don't see a point in my spending time > doing an update, especially considering it is a duplicate of official > docs which is just asking for maintenance trouble. > You should probably close issue4819 with "won't fix" in this case. I am with MAL on this one, though. I don't think equivalent presentation is duplicated anywhere in the docs. It would be better to have it updated and moved to Doc. From brett at python.org Thu Jan 20 22:13:46 2011 From: brett at python.org (Brett Cannon) Date: Thu, 20 Jan 2011 13:13:46 -0800 Subject: [Python-Dev] [Python-checkins] r88127 - in python/branches/py3k/Misc: README.AIX README.OpenBSD cheatsheet In-Reply-To: References: <20110120193435.89D3CEEA5B@mail.python.org> <4D38968B.9060103@egenix.com> Message-ID: On Thu, Jan 20, 2011 at 13:09, Alexander Belopolsky wrote: > On Thu, Jan 20, 2011 at 3:37 PM, Brett Cannon wrote: >> On Thu, Jan 20, 2011 at 12:09, M.-A. Lemburg wrote: > .. >>>> ? ?python/branches/py3k/Misc/cheatsheet >>> >>> Wouldn't it be better to update this useful file (as part of your >>> PSF grant) ? Most of it still applies to Py3. >> >> That file was not even updated to cover context managers and the >> 'with' keyword so it's been outdated for years and for at least a >> couple of releases now. If no one has cared to update it for the last >> two releases of Python 2.x I don't see a point in my spending time >> doing an update, especially considering it is a duplicate of official >> docs which is just asking for maintenance trouble. >> > > You should probably close issue4819 with "won't fix" in this case. 
> > > I am with MAL on this one, though. I don't think equivalent > presentation is duplicated anywhere in the docs. It would be better > to have it updated and moved to Doc. > If someone wants to update it, I'm not objecting; I'm just saying I view getting the devguide done and moving on to the Python 2 -> 3 porting guide as more important. From g.brandl at gmx.net Thu Jan 20 22:24:32 2011 From: g.brandl at gmx.net (Georg Brandl) Date: Thu, 20 Jan 2011 22:24:32 +0100 Subject: [Python-Dev] Import and unicode: part two In-Reply-To: <1295524289.2016.116.camel@marge> References: <1295440442.432.18.camel@marge> <20110119234419.GO22400@unaka.lan> <1295483161.12324.10.camel@marge> <20110120020725.GQ22400@unaka.lan> <1295491865.22752.22.camel@marge> <20110120043901.GR22400@unaka.lan> <1295524289.2016.116.camel@marge> Message-ID: On 20.01.2011 12:51, Victor Stinner wrote: > You only give theorical arguments Read Anathem lately? ;) Georg From ncoghlan at gmail.com Fri Jan 21 01:00:14 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 21 Jan 2011 10:00:14 +1000 Subject: [Python-Dev] Import and unicode: part two In-Reply-To: <20110120192743.GU22400@unaka.lan> References: <20110119234419.GO22400@unaka.lan> <1295483161.12324.10.camel@marge> <20110120020725.GQ22400@unaka.lan> <1295491865.22752.22.camel@marge> <20110120043901.GR22400@unaka.lan> <1295524289.2016.116.camel@marge> <20110120174439.GT22400@unaka.lan> <20110120192743.GU22400@unaka.lan> Message-ID: On Fri, Jan 21, 2011 at 5:27 AM, Toshio Kuratomi wrote: > I think that both ideas are inferior to mandating that every python module > filename is ascii. What I'm getting from Victor's posts is that he, at > least, considers the portability problems to be ignorable because dealing > with ambiguous file name encodings is something that he'd like to force > third party tools to deal with. I think you're starting from an incorrect premise: we *already* allow non-ASCII module names in Py3k. 
They just don't always work properly, which is why people are currently much, much better off using pure ASCII for their module names (as ASCII is still the lowest common denominator for internet communication). However, you are proposing that, instead of attempting to fix at least some of the cases where it doesn't work, we throw up our hands and tell people "Since some poorly configured systems have trouble with this feature, we're taking it away from everybody. Sorry if this breaks your code." While there may be situations where that's a valid approach, this isn't one of them. Yes, non-ASCII filenames are problems for all sorts of reasons (with Python's historically poor support being one of them). The idea is that we're striving to no longer be part of that problem, even if it isn't within our power to fix it entirely. Once we fix the core to handle various Unicode issues, then over time that support can ripple out through the rest of the Python ecosystem - we don't expect everything to magically "just work" as soon as the basic issue in the core is fixed. It's going to be *years* before non-ASCII file names are as portable as pure ASCII ones (it kind of reminds me of the era when you had to avoid spaces in filenames because so many applications choked on them, even after the OS had been updated to support them). As for the question of filenames not being re-encoded properly when copied between two systems, yes, that *is* a problem with the third party tools used to do the copying. Such tools will break any code that uses the str APIs to access the filesystem. To deal with the case of undecodable filenames that the import system skips over, it is certainly possible that importlib or runpy (probably the former) could acquire a function that allowed a named file to be imported directly (with a specific module name) rather than requiring the import system to search for it. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | 
Brisbane, Australia From ncoghlan at gmail.com Fri Jan 21 01:58:54 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 21 Jan 2011 10:58:54 +1000 Subject: [Python-Dev] [Python-checkins] devguide: Copy over the dev FAQ and *only* strip out stuff covered elsewhere in the In-Reply-To: References: Message-ID: On Fri, Jan 21, 2011 at 6:42 AM, brett.cannon wrote: > brett.cannon pushed 82d3a1b694b3 to devguide: > > http://hg.python.org/devguide/rev/82d3a1b694b3 > changeset: ? 167:82d3a1b694b3 > user: ? ? ? ?Brett Cannon > date: ? ? ? ?Thu Jan 20 12:40:47 2011 -0800 > summary: > ?Copy over the dev FAQ and *only* strip out stuff covered elsewhere in the devguide. Nick Coghlan should be a happy boy after this. Yay, thanks :) Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From ezio.melotti at gmail.com Fri Jan 21 03:31:33 2011 From: ezio.melotti at gmail.com (Ezio Melotti) Date: Fri, 21 Jan 2011 03:31:33 +0100 Subject: [Python-Dev] [Python-checkins] r87815 - peps/trunk/pep-3333.txt In-Reply-To: <20110107153928.3CE34EE988@mail.python.org> References: <20110107153928.3CE34EE988@mail.python.org> Message-ID: On Fri, Jan 7, 2011 at 4:39 PM, phillip.eby wrote: > Author: phillip.eby > Date: Fri Jan 7 16:39:27 2011 > New Revision: 87815 > > Log: > More bytes I/O fixes > > > Modified: > peps/trunk/pep-3333.txt > > Modified: peps/trunk/pep-3333.txt > > ============================================================================== > --- peps/trunk/pep-3333.txt (original) > +++ peps/trunk/pep-3333.txt Fri Jan 7 16:39:27 2011 > @@ -310,9 +310,9 @@ > elif not headers_sent: > # Before the first output, send the stored headers > status, response_headers = headers_sent[:] = headers_set > - sys.stdout.write('Status: %s\r\n' % status) > + sys.stdout.buffer.write('Status: %s\r\n' % status) > for header in response_headers: > - sys.stdout.write('%s: %s\r\n' % header) > + sys.stdout.buffer.write('%s: %s\r\n' % header) > Also note that .buffer might not 
be available in some cases (i.e. when sys.stdout has been replaced with other objects). > sys.stdout.write('\r\n') > > sys.stdout.buffer.write(data) > _______________________________________________ > Python-checkins mailing list > Python-checkins at python.org > http://mail.python.org/mailman/listinfo/python-checkins > From foom at fuhm.net Fri Jan 21 04:16:36 2011 From: foom at fuhm.net (James Y Knight) Date: Thu, 20 Jan 2011 22:16:36 -0500 Subject: [Python-Dev] [Python-checkins] r87815 - peps/trunk/pep-3333.txt In-Reply-To: References: <20110107153928.3CE34EE988@mail.python.org> Message-ID: <884F4E19-1E56-44C1-9801-9128ADD99743@fuhm.net> On Jan 20, 2011, at 9:31 PM, Ezio Melotti wrote: >> Modified: peps/trunk/pep-3333.txt >> ============================================================================== >> --- peps/trunk/pep-3333.txt (original) >> +++ peps/trunk/pep-3333.txt Fri Jan 7 16:39:27 2011 >> @@ -310,9 +310,9 @@ >> elif not headers_sent: >> # Before the first output, send the stored headers >> status, response_headers = headers_sent[:] = headers_set >> - sys.stdout.write('Status: %s\r\n' % status) >> + sys.stdout.buffer.write('Status: %s\r\n' % status) >> for header in response_headers: >> - sys.stdout.write('%s: %s\r\n' % header) >> + sys.stdout.buffer.write('%s: %s\r\n' % header) > > Also note that .buffer might not be available in some cases (i.e. when sys.stdout has been replaced with other objects). Do you have a recommendation for a better way to do bytes I/O on stdin/stdout, then? Just saying that .buffer might not be available isn't a very useful comment unless there's a replacement idiom... 
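[Editorial sketch, not part of the original message.] One possible shape for such a replacement idiom is a small wrapper that prefers the binary layer and falls back to text. This is only a sketch under stated assumptions: the helper name `write_bytes` is hypothetical, and the choice of latin-1 for the fallback is my assumption (it maps every byte to one character, which suits header-style data); neither comes from the PEP or from this thread.

```python
import io
import sys


def write_bytes(stream, data):
    """Write `data` (bytes) to a text stream, using .buffer when present.

    Hypothetical helper. If the stream exposes an underlying binary
    buffer, pending text is flushed first so output stays in order and
    the bytes are written unchanged. Otherwise the bytes are decoded as
    latin-1 (an assumption) and written as text.
    """
    buffer = getattr(stream, "buffer", None)
    if buffer is not None:
        stream.flush()
        buffer.write(data)
        buffer.flush()
    else:
        stream.write(data.decode("latin-1"))


# A text wrapper over a binary stream: the bytes reach the raw layer.
raw = io.BytesIO()
wrapped = io.TextIOWrapper(raw, encoding="utf-8")
write_bytes(wrapped, b"Status: 200 OK\r\n")

# A replaced stdout such as io.StringIO has no .buffer attribute.
replaced = io.StringIO()
write_bytes(replaced, b"Status: 200 OK\r\n")

# Works the same way on the real sys.stdout.
write_bytes(sys.stdout, b"Status: 200 OK\r\n")
```

Note also that the calls in the quoted diff pass str objects to .buffer.write(), which expects bytes, so the strings would need encoding first.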
James From foom at fuhm.net Fri Jan 21 04:25:17 2011 From: foom at fuhm.net (James Y Knight) Date: Thu, 20 Jan 2011 22:25:17 -0500 Subject: [Python-Dev] Import and unicode: part two In-Reply-To: <20110120215522.61429d36@pitrou.net> References: <1295440442.432.18.camel@marge> <20110120215522.61429d36@pitrou.net> Message-ID: <863D108E-9B6E-4F9D-8D1F-D2B87B0EAC69@fuhm.net> On Jan 20, 2011, at 3:55 PM, Antoine Pitrou wrote: > On Thu, 20 Jan 2011 15:27:08 -0500 > Glyph Lefkowitz wrote: >> >> To support the latter, could we just make sure that zipimport has a consistent, >> non-locale-or-operating-system-dependent interpretation of encoding? > > It already has, but it's dependent on a flag in the zip file itself > (actually, one flag per archived file in the zip it seems). > > (by the way, it would be nice if your text/mail editor wrapped lines at > 80 characters or something) You could complain to Apple, but it seems unlikely that they'd change it. They broke it intentionally in OSX 10.6.2 for better compatibility with MS Outlook. (for the technically inclined: It still wraps lines at 80 characters in the raw message, but it uses quoted-printable encoding to escape the line-breaks, so mail readers which decode quoted-printable but can't flow text are now S.O.L. Apple used to use the nice format=flowed standard instead.) James From ncoghlan at gmail.com Fri Jan 21 06:59:13 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 21 Jan 2011 15:59:13 +1000 Subject: [Python-Dev] Import and unicode: part two In-Reply-To: References: <1295440442.432.18.camel@marge> Message-ID: On Fri, Jan 21, 2011 at 3:44 PM, Atsuo Ishimoto wrote: > I don't want Python to encourage people to use non-ascii module names. > Today, seeing UnicodeEncodingError is one of popular reasons for > newbies to abandon learning Python in Japan. Non-ascii module name is > an another source of confusion for newbies. 
> > Experienced Japanese programmers may not use non-ascii module names to > avoid encoding issues. > > But novice programmers or non-programmers willing to learn programming > with Python will wish to use Japanese module names. Their programs > will stop working if they copy them to another environment. Sooner or > later, they will see a strange ImportError and will start complaining > "Python sucks! Python doesn't support Japanese!" on Twitter. > > Copying files with non-ascii file names across platforms is not as easy as it > sounds. What happens if I copy such files from OSX to my web hosting > server? Results might differ depending on tools I use to copy and > platforms. These all sound like good reasons to continue to *advise* against using non-ASCII module names. But aside from that, they sound exactly like a lot of the arguments we heard when Py3k started enforcing the bytes/text distinction more rigorously: "you're going to break stuff!". Yes, we know. But if core software development components like Python don't try to improve their Unicode support, how is the situation ever going to get better? Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From tjreedy at udel.edu Fri Jan 21 07:17:07 2011 From: tjreedy at udel.edu (Terry Reedy) Date: Fri, 21 Jan 2011 01:17:07 -0500 Subject: [Python-Dev] Import and unicode: part two In-Reply-To: <20110120174439.GT22400@unaka.lan> References: <1295440442.432.18.camel@marge> <20110119234419.GO22400@unaka.lan> <1295483161.12324.10.camel@marge> <20110120020725.GQ22400@unaka.lan> <1295491865.22752.22.camel@marge> <20110120043901.GR22400@unaka.lan> <1295524289.2016.116.camel@marge> <20110120174439.GT22400@unaka.lan> Message-ID: On 1/20/2011 12:44 PM, Toshio Kuratomi wrote: > The problem occurs in > that the code that one of the parties develops (either the students or the > professors) is developed on one of those OS's and then used on the other OS. 
The problem that I reported and hope will be fixed is that private code written and tested on one machine, which will never be distributed, could not be imported on the *same* machine, with nothing changed on that machine except for writing a second file that does the import. If filenames get mangled when file are transported (admittedly more likely with non-ascii chars), that is a different issue. -- Terry Jan Reedy From g.brandl at gmx.net Fri Jan 21 08:33:48 2011 From: g.brandl at gmx.net (Georg Brandl) Date: Fri, 21 Jan 2011 08:33:48 +0100 Subject: [Python-Dev] r88121 - python/branches/py3k/Doc/whatsnew/3.2.rst In-Reply-To: <20110120090440.16F11EE98E@mail.python.org> References: <20110120090440.16F11EE98E@mail.python.org> Message-ID: Am 20.01.2011 10:04, schrieb raymond.hettinger: > +os > +-- > + > +Different operating systems use various encodings for filenames and environment > +variables. The :mod:`os` module provides two new functions, > +:func:`~os.fsencode` and :func:`~os.fsdecode`, for encoding and decoding > +filenames: > + > +>>> filename = '???????' > +>>> os.fsencode(filename) > +b'\xd1\x81\xd0\xbb\xd0\xbe\xd0\xb2\xd0\xb0\xd1\x80\xd1\x8c' > +>>> open(os.fsencode(filename)) Please do not include Cyrillic characters directly in the source -- it breaks the LaTeX PDF build. A non-ascii name from the latin-1 range should be fine. Georg From ncoghlan at gmail.com Fri Jan 21 08:55:25 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 21 Jan 2011 17:55:25 +1000 Subject: [Python-Dev] Import and unicode: part two In-Reply-To: References: <1295440442.432.18.camel@marge> Message-ID: On Fri, Jan 21, 2011 at 4:44 PM, Atsuo Ishimoto wrote: > On Fri, Jan 21, 2011 at 2:59 PM, Nick Coghlan wrote: >> >> These all sound like good reasons to continue to *advise* against >> using non-ASCII module names. 
But aside from that, they sound exactly >> like a lot of the arguments we heard when Py3k started enforcing the >> bytes/text distinction more rigorously: "you're going to break >> stuff!". > > No, non-ASCII module names are new breakage you are going to introduce now :) No, they're not. Non-ASCII module names *already work* in Python 3.1 on UTF-8 filesystems. The portability problem you're complaining about exists now, and Victor is trying to at least partially alleviate it by making these filenames work correctly on more properly configured systems (such as Windows). It won't go away until all filesystem manipulation tools are properly Unicode aware, but that's no reason for us to continue to unnecessarily exacerbate the problem. Given imp_cafe.py: import caf? And caf?.py: print('Hello world from {}'.format(__name__)) I get the following result: ~$ python3.1 imp_cafe.py Hello world from caf? Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From ishimoto at gembook.org Fri Jan 21 06:44:48 2011 From: ishimoto at gembook.org (Atsuo Ishimoto) Date: Fri, 21 Jan 2011 14:44:48 +0900 Subject: [Python-Dev] Import and unicode: part two In-Reply-To: References: <1295440442.432.18.camel@marge> Message-ID: On Fri, Jan 21, 2011 at 1:46 AM, Guido van Rossum wrote: > On Thu, Jan 20, 2011 at 5:16 AM, Nick Coghlan wrote: >> On Thu, Jan 20, 2011 at 10:08 PM, Simon Cross >> wrote: >>> I'm changing my vote on this to a +1 for two reasons: >>> >>> * Initially I thought this wasn't supported by Python at all but I see >>> that currently it is supported but that support is broken (or at least >>> limited to UTF-8 filesystem encodings). Since support is there, might >>> as well make it better (especially if it tidies up the code base at >>> the same time). 
>>> >>> * I still don't think it's a good idea to give modules non-ASCII names >>> but the "consenting adults" approach suggests we should let people >>> shoot themselves in the foot if they believe they have good reason to >>> do so. >> >> I'm also +1 on this for the reasons Simon gives. > > Same here. *Most* code will never be shared, or will only be shared > between users in the same community. When it goes wrong it's also a > learning opportunity. :-) > I don't want Python to encourage people to use non-ascii module names. Today, seeing UnicodeEncodingError is one of popular reasons for newbies to abandon learning Python in Japan. Non-ascii module name is an another source of confusion for newbies. Experienced Japanese programmers may not use non-ascii module names to avoid encoding issues. But novice programmers or non-programmers willing to learn programming with Python will wish to use Japanese module names. Their programs will stop working if they copy them to another environment. Sooner or later, they will see storange ImportError and will start complaining "Python sucks! Python doesn't support Japanese!" on Twitter. Copying files with non-ascii file name over platform is not easy as it sounds. What happen if I copy such files from OSX to my web hosting server ? Results might differ depending on tools I use to copy and platforms. Is it a good opportunity to start learnig abound encodings? I don't think so. They should learn concepts of charater set and encodings, Unicode and JIS character sets, some kind of Japanse encodings, number of platform specifix issues, non-standard extention of Microsoft and Apple, and so on. I think they should defer learning these messes until they get ready. 
-- Atsuo Ishimoto Mail: ishimoto at gembook.org Blog: http://d.hatena.ne.jp/atsuoishimoto/ Twitter: atsuoishimoto From ishimoto at gembook.org Fri Jan 21 07:44:43 2011 From: ishimoto at gembook.org (Atsuo Ishimoto) Date: Fri, 21 Jan 2011 15:44:43 +0900 Subject: [Python-Dev] Import and unicode: part two In-Reply-To: References: <1295440442.432.18.camel@marge> Message-ID: On Fri, Jan 21, 2011 at 2:59 PM, Nick Coghlan wrote: > > These all sound like good reasons to continue to *advise* against > using non-ASCII module names. But aside from that, they sound exactly > like a lot of the arguments we heard when Py3k started enforcing the > bytes/text distinction more rigorously: "you're going to break > stuff!". No, non-ASCII module names are new breakage you are going to introduce now :) If the advice against using non-ASCII module names is reasonable, why bother supporting them? > > Yes, we know. But if core software development components like Python > don't try to improve their Unicode support, how is the situation ever > going to get better? > Java, a leading language of IT industry, have already support non-ASCII class files for years. But I've never seen such files in production in Japan, and didn't improve situation until now. -- Atsuo Ishimoto Mail: ishimoto at gembook.org Blog: http://d.hatena.ne.jp/atsuoishimoto/ Twitter: atsuoishimoto From stephen at xemacs.org Fri Jan 21 09:45:44 2011 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Fri, 21 Jan 2011 17:45:44 +0900 Subject: [Python-Dev] Import and unicode: part two In-Reply-To: References: <1295440442.432.18.camel@marge> Message-ID: <871v467ih3.fsf@uwakimon.sk.tsukuba.ac.jp> Nick Coghlan writes: > On Fri, Jan 21, 2011 at 3:44 PM, Atsuo Ishimoto wrote: > > I don't want Python to encourage people to use non-ascii module names. I don't think anybody is *encouraging* it. 
The argument is for *permitting* it, partly for consistency with other identifiers, and partly because of Python's usual "consenting adults" standard for permitting "dangerous" practices. I realize this is a somewhat problematic distinction in Japan, for several reasons, but it's really not one that can be avoided in computing in any case. The sooner novice programmers learn it, the better. > > Today, seeing UnicodeEncodingError is one of popular reasons for > > newbies to abandon learning Python in Japan. Non-ascii module name is > > an another source of confusion for newbies. > > > > Experienced Japanese programmers may not use non-ascii module names to > > avoid encoding issues. > > > > But novice programmers or non-programmers willing to learn programming > > with Python will wish to use Japanese module names. Their programs > > will stop working if they copy them to another environment. Sooner or > > later, they will see storange ImportError and will start complaining > > "Python sucks! Python doesn't support Japanese!" on Twitter. So ask them, "What language *does* 'support Japanese'?" ;-) Seriously, "support Japanese" is an impossibly hard standard in the current environment. Not only does Japan have 5 more or less standard encodings still in daily use (EUC-JP, ISO-2022-JP, Shift JIS, UTF-8, and UTF-16LE), but many major IT companies have their own variants of the JIS standard character repertoire (all of the variant ideographs I've seen in the wild are in Unicode, but many corporate repertoires add extra symbols that are not), and of course some Microsoft utilities insist on using the deprecated UTF-8 signature with UTF-8. That said, I really don't see module names as a particular problem. By the time your novice is using her own modules (as opposed to importing stdlib and PyPI add-on modules, all with ASCII-only names), she'll be doing file I/O which has all the same problems, AFAICS. 
True, file names will be strings rather than identifiers, but I don't see why that matters. > > Copying files with non-ascii file name over platform is not easy as it > > sounds. Agreed, it's not trivial. But it's not that hard, either[1], and web hosts and others *could* help by providing checkers for languages that they support. > > What happen if I copy such files from OSX to my web hosting > > server ? Results might differ depending on tools I use to copy and > > platforms. I don't see why this problem is specific to Python modules, as opposed to any file name. > These all sound like good reasons to continue to *advise* against > using non-ASCII module names. +1 > But aside from that, they sound exactly like a lot of the arguments > we heard when Py3k started enforcing the bytes/text distinction > more rigorously: "you're going to break stuff!". Well, not exactly. Enforcing the bytes/text distinction was a change in the definition of Python; breakage was our fault. The change was made because in the (not so) long run it would reduce new breakage. Here, Python is fine (or at least we have some pretty good ideas how to fix it), it's the world that's broken. *Especially* Japan, with its five standard encodings in daily use and scads of private variant repertoires masquerading as standard encodings on top of that. But the whole world is broken because of the NFD/NFC thing. AFAIK, the only file system that tries to enforce an NF is Mac OS X HFS+, and (unfortunately for portability *from* Mac OS X *to* other systems) they chose NFD. Proper NFD support is arguably better for a number of reasons (for one, people regularly invent new composition sequences that will not have precomposed glyphs in any font), but NFC has the advantage that existing fonts support precomposed standard characters while many display engines do not support composition properly yet. And it's likely to stay broken for a while: the move to conformant display engines is going to take more time. 
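A minimal sketch of the NFC/NFD mismatch described above, using only the standard unicodedata module. The name "café" and the helper same_name() are hypothetical, chosen for illustration; nothing here is what the import system actually does.

```python
import unicodedata

# Hypothetical module name, written in precomposed (NFC) form.
name = "caf\u00e9"

nfc = unicodedata.normalize("NFC", name)  # 'e with acute' as one code point
nfd = unicodedata.normalize("NFD", name)  # 'e' followed by a combining accent

# The two forms display identically but are different code point sequences,
# so a byte-for-byte or code-point comparison of file names fails.
print(len(nfc), len(nfd))
print(nfc == nfd)


def same_name(a, b):
    """Compare two names after normalizing both to NFC (illustrative helper)."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


print(same_name(nfc, nfd))
```

A file created under one form and looked up under the other is the portability trap: the names look equal to a human but only compare equal after normalization.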
I still don't see this as a reason to give up on non-ASCII module names. Just have the documentation warn that many non-ASCII names will be non-portable, so use on multiple systems will require care (maybe gloss that with "probably more care than you want to take"). Footnotes: [1] I actually find copying file names with spaces to be a bigger problem, because it's so hard to get shell quoting right. From martin at v.loewis.de Fri Jan 21 10:53:33 2011 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Fri, 21 Jan 2011 10:53:33 +0100 Subject: [Python-Dev] Import and unicode: part two In-Reply-To: References: <1295440442.432.18.camel@marge> Message-ID: <4D39579D.4060408@v.loewis.de> > I don't want Python to encourage people to use non-ascii module names. I don't think the feature is open for debate anymore. PEP 3131 has been accepted (after *long* debates), and I'll pronounce that supporting non-ASCII module names is a direct consequence of having it accepted. Of course, there may be limitations with respect to operating systems, and in the way Python modules integrate with the file system - but that non-ASCII module names must be supported is really out of question. If you would like this to be reverted, you need to write another PEP. Regards, Martin From stephen at xemacs.org Fri Jan 21 11:42:14 2011 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Fri, 21 Jan 2011 19:42:14 +0900 Subject: [Python-Dev] Import and unicode: part two In-Reply-To: References: <1295440442.432.18.camel@marge> Message-ID: <87zkqu5yih.fsf@uwakimon.sk.tsukuba.ac.jp> Atsuo Ishimoto writes: > Java, a leading language of IT industry, have already support > non-ASCII class files for years. But I've never seen such files in > production in Japan, and didn't improve situation until now. So why wouldn't Python work the same way? The rest of the world can use non-ASCII modules names sparingly, and Japanese programmers can avoid them diligently. 
Or learn to use them properly and teach each other; if anybody has the experience of multiple encodings needed to figure out a good way to use the native language in program identifiers despite the encoding problem, my bet is it would be Japan. From solipsis at pitrou.net Fri Jan 21 12:31:29 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 21 Jan 2011 12:31:29 +0100 Subject: [Python-Dev] Import and unicode: part two In-Reply-To: <863D108E-9B6E-4F9D-8D1F-D2B87B0EAC69@fuhm.net> References: <1295440442.432.18.camel@marge> <20110120215522.61429d36@pitrou.net> <863D108E-9B6E-4F9D-8D1F-D2B87B0EAC69@fuhm.net> Message-ID: <20110121123129.3c05e129@pitrou.net> On Thu, 20 Jan 2011 22:25:17 -0500 James Y Knight wrote: > > On Jan 20, 2011, at 3:55 PM, Antoine Pitrou wrote: > > > On Thu, 20 Jan 2011 15:27:08 -0500 > > Glyph Lefkowitz wrote: > >> > >> To support the latter, could we just make sure that zipimport has a consistent, > >> non-locale-or-operating-system-dependent interpretation of encoding? > > > > It already has, but it's dependent on a flag in the zip file itself > > (actually, one flag per archived file in the zip it seems). > > > > (by the way, it would be nice if your text/mail editor wrapped lines at > > 80 characters or something) > > You could complain to Apple, but it seems unlikely that they'd change it. They broke it intentionally in OSX 10.6.2 for better compatibility with MS Outlook. > > (for the technically inclined: It still wraps lines at 80 characters in the raw message, but it uses quoted-printable encoding to escape the line-breaks, so mail readers which decode quoted-printable but can't flow text are now S.O.L. Apple used to use the nice format=flowed standard instead.) I think most mail readers are able to word-wrap raw text correctly (even though it still makes your messages look bad amongst a thread of nicely-formatted 80-column messages). The real annoyance is when reading Web archives of mailing-lists, e.g. 
http://twistedmatrix.com/pipermail/twisted-python/2011-January/023346.html Regards Antoine. From solipsis at pitrou.net Fri Jan 21 12:34:42 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 21 Jan 2011 12:34:42 +0100 Subject: [Python-Dev] [Python-checkins] r87815 - peps/trunk/pep-3333.txt References: <20110107153928.3CE34EE988@mail.python.org> <884F4E19-1E56-44C1-9801-9128ADD99743@fuhm.net> Message-ID: <20110121123442.65621877@pitrou.net> On Thu, 20 Jan 2011 22:16:36 -0500 James Y Knight wrote: > > On Jan 20, 2011, at 9:31 PM, Ezio Melotti wrote: > >> Modified: peps/trunk/pep-3333.txt > >> ============================================================================== > >> --- peps/trunk/pep-3333.txt (original) > >> +++ peps/trunk/pep-3333.txt Fri Jan 7 16:39:27 2011 > >> @@ -310,9 +310,9 @@ > >> elif not headers_sent: > >> # Before the first output, send the stored headers > >> status, response_headers = headers_sent[:] = headers_set > >> - sys.stdout.write('Status: %s\r\n' % status) > >> + sys.stdout.buffer.write('Status: %s\r\n' % status) > >> for header in response_headers: > >> - sys.stdout.write('%s: %s\r\n' % header) > >> + sys.stdout.buffer.write('%s: %s\r\n' % header) > > > > Also note that .buffer might not be available in some cases (i.e. when sys.stdout has been replaced with other objects). > > Do you have a recommendation for a better way to do bytes I/O on stdin/sydout, then?...just saying that .buffer might not be available isn't a very useful comment unless there's a replacement idiom... Well, this is the recommmendation. There's no reason for sys.stdXXX.buffer not to exist if you have full control over the application (which you normally have if you do CGI). Regards Antoine. 
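A rough sketch of the idiom under discussion: write header lines as bytes through the text stream's .buffer, and fail loudly when the stream has been replaced by an object without one. The helper names binary_stream() and write_header() are made up for this example, and the ISO-8859-1 encoding of the header line is an assumption, not something stated in the message above.

```python
import io
import sys


def binary_stream(text_stream):
    """Return the binary stream underneath a text stream, or raise."""
    buffer = getattr(text_stream, "buffer", None)
    if buffer is None:
        # Happens when sys.stdout was replaced, e.g. by io.StringIO.
        raise TypeError("stream has no .buffer; cannot write bytes to it")
    return buffer


def write_header(text_stream, name, value):
    """Write one 'Name: value' header line as bytes (assumed ISO-8859-1)."""
    line = "%s: %s\r\n" % (name, value)
    binary_stream(text_stream).write(line.encode("iso-8859-1"))
```

In a CGI script this would be called as write_header(sys.stdout, "Status", "200 OK"). Note that the line must be encoded first; passing a str to sys.stdout.buffer.write() raises TypeError.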

From foom at fuhm.net  Fri Jan 21 14:23:31 2011
From: foom at fuhm.net (James Y Knight)
Date: Fri, 21 Jan 2011 08:23:31 -0500
Subject: [Python-Dev] Mail archive line wrapping (Was: Import and unicode: part two)
In-Reply-To: <20110121123129.3c05e129@pitrou.net>
References: <1295440442.432.18.camel@marge>
	<20110120215522.61429d36@pitrou.net>
	<863D108E-9B6E-4F9D-8D1F-D2B87B0EAC69@fuhm.net>
	<20110121123129.3c05e129@pitrou.net>
Message-ID: 

On Jan 21, 2011, at 6:31 AM, Antoine Pitrou wrote:

> On Thu, 20 Jan 2011 22:25:17 -0500
> James Y Knight wrote:
>> 
>> On Jan 20, 2011, at 3:55 PM, Antoine Pitrou wrote:
>>> (by the way, it would be nice if your text/mail editor wrapped lines at
>>> 80 characters or something)
>> 
>> You could complain to Apple, but it seems unlikely that they'd change it. They broke it intentionally in OSX 10.6.2 for better compatibility with MS Outlook.
>> 
>> (for the technically inclined: It still wraps lines at 80 characters in the raw message, but it uses quoted-printable encoding to escape the line-breaks, so mail readers which decode quoted-printable but can't flow text are now S.O.L. Apple used to use the nice format=flowed standard instead.)
> 
> I think most mail readers are able to word-wrap raw text correctly
> (even though it still makes your messages look bad amongst a thread of
> nicely-formatted 80-column messages).
> The real annoyance is when reading Web archives of mailing-lists, e.g.
> http://twistedmatrix.com/pipermail/twisted-python/2011-January/023346.html

Well, yes, that's a pretty annoying bug in mailman, isn't it? If only anyone around here was involved in mailman and could fix it! :) [I've attempted to cc this to mailman-users with this message, but since I'm not subscribed I dunno if it'll make it or not.]

I have this in my user CSS override file to fix the issue for myself globally on all such archives out in the world:
/* Mailing list archives */
html>body>pre { white-space: pre-wrap !important; }

But really, pipermail should just output a suitable style itself, e.g.:
 or a  in the header.

That's supported on all browsers since FF3.0, IE8, Safari 3, Opera 8. There are various nonstandard CSS selectors for reaching older browsers (IE5.5+, Firefox pre-1.0+, Opera 4+)...But by the time this change gets made in mailman, and released, and gets into the distros that the various list hosts around the web use, and those hosts get upgraded, I doubt anyone will actually even be able to run those old browsers anymore.

James

From solipsis at pitrou.net  Fri Jan 21 14:30:21 2011
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Fri, 21 Jan 2011 14:30:21 +0100
Subject: [Python-Dev] Mail archive line wrapping (Was: Import and
	unicode: part two)
In-Reply-To: 
References: <1295440442.432.18.camel@marge>
	<20110120215522.61429d36@pitrou.net>
	<863D108E-9B6E-4F9D-8D1F-D2B87B0EAC69@fuhm.net>
	<20110121123129.3c05e129@pitrou.net>
Message-ID: <20110121143021.4a75954b@pitrou.net>

On Fri, 21 Jan 2011 08:23:31 -0500
James Y Knight  wrote:
> > 
> > I think most mail readers are able to word-wrap raw text correctly
> > (even though it still makes your messages look bad amongst a thread of
> > nicely-formatted 80-column messages).
> > The real annoyance is when reading Web archives of mailing-lists, e.g.
> > http://twistedmatrix.com/pipermail/twisted-python/2011-January/023346.html
> 
> Well, yes, that's a pretty annoying bug in mailman, isn't it? If only anyone around here was involved in mailman and could fix it! :) [I've attempted to cc this to mailman-users with this message, but since I'm not subscribed I dunno if it'll make it or not.]

Why is this a bug in mailman? Mailman archives messages as they are
sent (well, perhaps it mangles e-mail addresses). If someone
draws a nice ASCII-art diagram which requires 90 columns instead of 80,
you wouldn't want the archive to break its rendering.

So, it's really the mail client (or its user :-)) which should handle
word-wrapping, not some downstream tool which has no idea of the
original intent.

> I have this in my user CSS override file to fix the issue for myself globally on all such archives out in the world:
> /* Mailing list archives */
> html>body>pre { white-space: pre-wrap !important; }

That doesn't wrap to 80 characters, does it? Only whatever the
current window/container width is, which isn't necessarily the right
thing (if that makes lines 160 characters long, it's still quite
uncomfortable to read).
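
To illustrate the difference between wrapping at a fixed column and wrapping at the container width, here is a small sketch using the standard textwrap module. The function wrap_long_lines() is hypothetical and is not what Mailman or Pipermail does; it also shows the trade-off mentioned above, since any line over the limit (including wide ASCII art) gets re-flowed.

```python
import textwrap


def wrap_long_lines(text, width=80):
    """Re-wrap only lines longer than `width`; shorter lines stay as sent."""
    out = []
    for line in text.splitlines():
        if len(line) <= width:
            out.append(line)
        else:
            # textwrap.wrap() returns [] for whitespace-only input,
            # so fall back to a single empty line in that case.
            out.extend(textwrap.wrap(line, width=width) or [""])
    return "\n".join(out)
```

Lines already within 80 columns pass through untouched, so only unwrapped paragraphs are changed.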

Regards

Antoine.

From barry at python.org  Fri Jan 21 14:31:10 2011
From: barry at python.org (Barry Warsaw)
Date: Fri, 21 Jan 2011 08:31:10 -0500
Subject: [Python-Dev] Mail archive line wrapping (Was: Import and
 unicode: part two)
In-Reply-To: 
References: <1295440442.432.18.camel@marge>
	<20110120215522.61429d36@pitrou.net>
	<863D108E-9B6E-4F9D-8D1F-D2B87B0EAC69@fuhm.net>
	<20110121123129.3c05e129@pitrou.net>
Message-ID: <20110121083110.5445ce58@python.org>

On Jan 21, 2011, at 08:23 AM, James Y Knight wrote:

>Well, yes, that's a pretty annoying bug in mailman, isn't it? If only anyone
>around here was involved in mailman and could fix it! :) [I've attempted to
>cc this to mailman-users with this message, but since I'm not subscribed I
>dunno if it'll make it or not.]

Technically, Pipermail, but jeebus how I hate hacking on that code. :)  Although
it's been futile for the last decade, maybe this time will work: volunteers
wanted!

-Barry

From ishimoto at gembook.org  Fri Jan 21 16:07:04 2011
From: ishimoto at gembook.org (Atsuo Ishimoto)
Date: Sat, 22 Jan 2011 00:07:04 +0900
Subject: [Python-Dev] Import and unicode: part two
In-Reply-To: <871v467ih3.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <1295440442.432.18.camel@marge>
	<871v467ih3.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: 

On Fri, Jan 21, 2011 at 5:45 PM, Stephen J. Turnbull  wrote:
> Nick Coghlan writes:
>  > On Fri, Jan 21, 2011 at 3:44 PM, Atsuo Ishimoto  wrote:
>
>  > > I don't want Python to encourage people to use non-ascii module names.
>
> I don't think anybody is *encouraging* it.  The argument is for
> *permitting* it, partly for consistency with other identifiers, and
> partly because of Python's usual "consenting adults" standard for
> permitting "dangerous" practices.

I'm sorry, I was not clear. I was afraid that calling it a "learning
opportunity" would tempt people to try non-ASCII module names.
These days, even non-technical people have access to Windows, Mac
and Linux boxes at the same time, so the chances of being annoyed by
broken non-ASCII file names are pretty high.

>
> I still don't see this as a reason to give up on non-ASCII module
> names.  Just have the documentation warn that many non-ASCII names
> will be non-portable, so use on multiple systems will require care
> (maybe gloss that with "probably more care than you want to take").
>
Nice gloss.

-- 
Atsuo Ishimoto
Mail: ishimoto at gembook.org
Blog: http://d.hatena.ne.jp/atsuoishimoto/
Twitter: atsuoishimoto

From status at bugs.python.org  Fri Jan 21 18:07:04 2011
From: status at bugs.python.org (Python tracker)
Date: Fri, 21 Jan 2011 18:07:04 +0100 (CET)
Subject: [Python-Dev] Summary of Python tracker Issues
Message-ID: <20110121170704.496D01CB5A@psf.upfronthosting.co.za>


ACTIVITY SUMMARY (2011-01-14 - 2011-01-21)
Python tracker at http://bugs.python.org/

To view or respond to any of the issues listed below, click on the issue.
Do NOT respond to this message.

Issues counts and deltas:
  open    2527 (+29)
  closed 20228 (+36)
  total  22755 (+65)

Open issues with patches: 1062 


Issues opened (44)
==================

#10896: trace module compares directories as strings (--ignore-dir)
http://bugs.python.org/issue10896  reopened by SilentGhost

#10909: thread hang, possibly related to print
http://bugs.python.org/issue10909  opened by PythonInTheGrass

#10910: pyport.h FreeBSD/Mac OS X "fix" causes errors in C++ compilati
http://bugs.python.org/issue10910  opened by X-Istence

#10911: cgi: add more tests
http://bugs.python.org/issue10911  opened by haypo

#10914: Python sub-interpreter test
http://bugs.python.org/issue10914  opened by pitrou

#10915: Make the PyGILState API compatible with multiple interpreters
http://bugs.python.org/issue10915  opened by pitrou

#10919: Environment variables are not expanded in _winreg when using R
http://bugs.python.org/issue10919  opened by rjnienaber

#10922: Unexpected exception when calling function_proxy.__class__.__c
http://bugs.python.org/issue10922  opened by DasIch

#10924: Adding salt and Modular Crypt Format to crypt library.
http://bugs.python.org/issue10924  opened by jafo

#10925: Document pure Python version of integer-to-float correctly-rou
http://bugs.python.org/issue10925  opened by mark.dickinson

#10932: distutils.core.setup - data_files misbehaviour ?
http://bugs.python.org/issue10932  opened by Thorsten.Simons

#10933: Tracing disabled when a recursion error is triggered (even if 
http://bugs.python.org/issue10933  opened by fabioz

#10936: Simple CSS fix for left margin at docs.python.org
http://bugs.python.org/issue10936  opened by cdunn2001

#10937: WinPE 64 bit execution results with errors
http://bugs.python.org/issue10937  opened by gettingback2basics

#10938: Provide links to system specific strftime/ptime docs
http://bugs.python.org/issue10938  opened by hdiogenes

#10939: imaplib: Internaldate2tuple raises KeyError parsing month and 
http://bugs.python.org/issue10939  opened by lavajoe

#10940: IDLE 3.2 hangs with Cmd-M hotkey on OS X 10.6 with 64-bit inst
http://bugs.python.org/issue10940  opened by rhettinger

#10941: imaplib: Internaldate2tuple produces wrong result if date is n
http://bugs.python.org/issue10941  opened by lavajoe

#10942: xml.etree.ElementTree.tostring returns type bytes, expected ty
http://bugs.python.org/issue10942  opened by JTMoon79

#10945: bdist_wininst depends on MBCS codec, unavailable on non-Window
http://bugs.python.org/issue10945  opened by eric.araujo

#10948: Trouble with dir_util created dir cache
http://bugs.python.org/issue10948  opened by diegoqueiroz

#10949: logging.RotatingFileHandler not robust enough
http://bugs.python.org/issue10949  opened by kalt

#10951: gcc 4.6 warnings
http://bugs.python.org/issue10951  opened by haypo

#10952: Don't normalize module names to NFKC?
http://bugs.python.org/issue10952  opened by haypo

#10954: No warning for csv.writer API change
http://bugs.python.org/issue10954  opened by lregebro

#10955: Possible regression with stdlib in zipfile
http://bugs.python.org/issue10955  opened by ronaldoussoren

#10956: file.write and file.read don't handle EINTR
http://bugs.python.org/issue10956  opened by eggy

#10957: Python FAQ grammar error
http://bugs.python.org/issue10957  opened by jerry.seutter

#10960: os.stat() does not mention that it follow symlinks by default
http://bugs.python.org/issue10960  opened by mmarkk

#10961: Pydoc touchups in new browser for 3.2
http://bugs.python.org/issue10961  opened by ron_adam

#10963: "subprocess" can raise OSError (EPIPE) when communicating with
http://bugs.python.org/issue10963  opened by dmalcolm

#10964: Mac installer need not add things to /usr/local
http://bugs.python.org/issue10964  opened by reowen

#10965: dev task of documenting undocumented APIs
http://bugs.python.org/issue10965  opened by brett.cannon

#10966: eliminate use of ImportError implicitly representing TestSkipp
http://bugs.python.org/issue10966  opened by brett.cannon

#10967: move regrtest over to using more unittest infrastructure
http://bugs.python.org/issue10967  opened by brett.cannon

#10968: threading.Timer should be a class so that it can be derived
http://bugs.python.org/issue10968  opened by Kain94

#10969: Make Tcl recommendation more prominent
http://bugs.python.org/issue10969  opened by rhettinger

#10970: "string".encode('base64') is not the same as base64.b64encode(
http://bugs.python.org/issue10970  opened by mahmoudimus

#10971: python Lib/test/regrtest.py -R 3:3: test_zipimport_support fai
http://bugs.python.org/issue10971  opened by haypo

#10972: zipfile: add "unicode" option to the force the filename encodi
http://bugs.python.org/issue10972  opened by haypo

#10973: '??' not working with IDLE 3.2rc1 - OSX 10.6.6
http://bugs.python.org/issue10973  opened by naguilera

#10943: abitype: Need better support to port C extension modules to th
http://bugs.python.org/issue10943  opened by fhaxbox66 at googlemail.com

#10947: imaplib: Internaldate2tuple and ParseFlags require (and latter
http://bugs.python.org/issue10947  opened by lavajoe

#10946: bdist doesn't pass --skip-build on to subcommands
http://bugs.python.org/issue10946  opened by eric.araujo



Most recent 15 issues with no replies (15)
==========================================

#10971: python Lib/test/regrtest.py -R 3:3: test_zipimport_support fai
http://bugs.python.org/issue10971

#10970: "string".encode('base64') is not the same as base64.b64encode(
http://bugs.python.org/issue10970

#10967: move regrtest over to using more unittest infrastructure
http://bugs.python.org/issue10967

#10965: dev task of documenting undocumented APIs
http://bugs.python.org/issue10965

#10960: os.stat() does not mention that it follow symlinks by default
http://bugs.python.org/issue10960

#10957: Python FAQ grammar error
http://bugs.python.org/issue10957

#10949: logging.RotatingFileHandler not robust enough
http://bugs.python.org/issue10949

#10943: abitype: Need better support to port C extension modules to th
http://bugs.python.org/issue10943

#10933: Tracing disabled when a recursion error is triggered (even if 
http://bugs.python.org/issue10933

#10925: Document pure Python version of integer-to-float correctly-rou
http://bugs.python.org/issue10925

#10910: pyport.h FreeBSD/Mac OS X "fix" causes errors in C++ compilati
http://bugs.python.org/issue10910

#10909: thread hang, possibly related to print
http://bugs.python.org/issue10909

#10891: Tweak sorting howto to eliminate redundancy
http://bugs.python.org/issue10891

#10886: Unhelpful backtrace for multiprocessing.Queue
http://bugs.python.org/issue10886

#10885: multiprocessing docs
http://bugs.python.org/issue10885



Most recent 15 issues waiting for review (15)
=============================================

#10972: zipfile: add "unicode" option to the force the filename encodi
http://bugs.python.org/issue10972

#10963: "subprocess" can raise OSError (EPIPE) when communicating with
http://bugs.python.org/issue10963

#10961: Pydoc touchups in new browser for 3.2
http://bugs.python.org/issue10961

#10956: file.write and file.read don't handle EINTR
http://bugs.python.org/issue10956

#10955: Possible regression with stdlib in zipfile
http://bugs.python.org/issue10955

#10949: logging.RotatingFileHandler not robust enough
http://bugs.python.org/issue10949

#10947: imaplib: Internaldate2tuple and ParseFlags require (and latter
http://bugs.python.org/issue10947

#10941: imaplib: Internaldate2tuple produces wrong result if date is n
http://bugs.python.org/issue10941

#10939: imaplib: Internaldate2tuple raises KeyError parsing month and 
http://bugs.python.org/issue10939

#10925: Document pure Python version of integer-to-float correctly-rou
http://bugs.python.org/issue10925

#10924: Adding salt and Modular Crypt Format to crypt library.
http://bugs.python.org/issue10924

#10922: Unexpected exception when calling function_proxy.__class__.__c
http://bugs.python.org/issue10922

#10915: Make the PyGILState API compatible with multiple interpreters
http://bugs.python.org/issue10915

#10914: Python sub-interpreter test
http://bugs.python.org/issue10914

#10908: Improvements to trace._Ignore
http://bugs.python.org/issue10908



Top 10 most discussed issues (10)
=================================

#10955: Possible regression with stdlib in zipfile
http://bugs.python.org/issue10955  20 msgs

#3080: Full unicode import system
http://bugs.python.org/issue3080  18 msgs

#10952: Don't normalize module names to NFKC?
http://bugs.python.org/issue10952  15 msgs

#10915: Make the PyGILState API compatible with multiple interpreters
http://bugs.python.org/issue10915  14 msgs

#10924: Adding salt and Modular Crypt Format to crypt library.
http://bugs.python.org/issue10924  13 msgs

#4819: Misc/cheatsheet needs updating
http://bugs.python.org/issue4819  10 msgs

#10936: Simple CSS fix for left margin at docs.python.org
http://bugs.python.org/issue10936   8 msgs

#10948: Trouble with dir_util created dir cache
http://bugs.python.org/issue10948   8 msgs

#10968: threading.Timer should be a class so that it can be derived
http://bugs.python.org/issue10968   8 msgs

#8957: strptime(.., '%c') fails to parse output of strftime('%c', ..)
http://bugs.python.org/issue8957   7 msgs



Issues closed (36)
==================

#2644: errors from msync ignored in mmap_object_dealloc
http://bugs.python.org/issue2644  closed by brian.curtin

#6075: Patch for IDLE/OS X to work with Tk-Cocoa
http://bugs.python.org/issue6075  closed by ned.deily

#8846: cgi.py bug report + fix: tailing carriage return and newline c
http://bugs.python.org/issue8846  closed by wobsta

#9532: pipe.read hang, when calling commands.getstatusoutput in multi
http://bugs.python.org/issue9532  closed by r.david.murray

#10238: ctypes not building under OS X 10.6 with LLVM/Clang 2.8
http://bugs.python.org/issue10238  closed by brett.cannon

#10451: memoryview can be used to write into readonly buffer
http://bugs.python.org/issue10451  closed by pitrou

#10843: OS X installer: install the Tools source directory
http://bugs.python.org/issue10843  closed by ned.deily

#10874: test_urllib2 shouldn't use is operator for comparing strings
http://bugs.python.org/issue10874  closed by pitrou

#10887: Add link to development ML
http://bugs.python.org/issue10887  closed by eric.araujo

#10898: posixmodule.c redefines FSTAT
http://bugs.python.org/issue10898  closed by pitrou

#10903: ZipExtFile:_update_crc fails for CRC >= 0x80000000
http://bugs.python.org/issue10903  closed by arindam

#10904: PYTHONIOENCODING is not in manpage
http://bugs.python.org/issue10904  closed by pebbe

#10906: wsgiref should mention that CGI scripts usually expect HTTPS v
http://bugs.python.org/issue10906  closed by georg.brandl

#10912: PyObject_RichCompare differs in behaviour from PyObject_RichCo
http://bugs.python.org/issue10912  closed by eli.bendersky

#10913: Deprecate PyEval_AcquireLock() and PyEval_ReleaseLock()
http://bugs.python.org/issue10913  closed by pitrou

#10916: mmap segfault
http://bugs.python.org/issue10916  closed by pitrou

#10917: PEP 333 link to CGI specification is broken
http://bugs.python.org/issue10917  closed by georg.brandl

#10918: **kwargs unnecessarily restricted in concurrent.futures 'submi
http://bugs.python.org/issue10918  closed by bquinlan

#10920: cp65001, PowerShell, Python crash.
http://bugs.python.org/issue10920  closed by haypo

#10921: imaplib: Internaldate2tuple() string/bytes issues, does not ha
http://bugs.python.org/issue10921  closed by lavajoe

#10923: Deadlock because of the import lock when loading the utf8 code
http://bugs.python.org/issue10923  closed by haypo

#10926: Some Invalid Relative Imports succeed in Py 3.0 & 3.1 [& corre
http://bugs.python.org/issue10926  closed by r.david.murray

#10927: Allow universal line endings in filecmp module
http://bugs.python.org/issue10927  closed by r.david.murray

#10928: Strange input processing
http://bugs.python.org/issue10928  closed by r.david.murray

#10929: telnetlib does not send FIN when self.close() issued
http://bugs.python.org/issue10929  closed by r.david.murray

#10930: dict.setdefault: Bug: default argument is ALWAYS evaluated, i.
http://bugs.python.org/issue10930  closed by amaury.forgeotdarc

#10931: print() from pipe enclosed between {b'} and {'}-pair on python
http://bugs.python.org/issue10931  closed by amaury.forgeotdarc

#10934: imaplib: Internaldate2tuple() is documented to return UTC, but
http://bugs.python.org/issue10934  closed by belopolsky

#10935: wsgiref.handlers.BaseHandler and subclasses of str
http://bugs.python.org/issue10935  closed by eric.araujo

#10944: ctypes documentation does not mention c_bool in table of stand
http://bugs.python.org/issue10944  closed by georg.brandl

#10950: ServerProxy returns bad XML
http://bugs.python.org/issue10950  closed by georg.brandl

#10953: safely eval serialized dict/list data from arbitrary string ov
http://bugs.python.org/issue10953  closed by georg.brandl

#10958: stat.S_ISLNK() does not wok!
http://bugs.python.org/issue10958  closed by amaury.forgeotdarc

#10959: mmap crash
http://bugs.python.org/issue10959  closed by pitrou

#10962: gdb support broken
http://bugs.python.org/issue10962  closed by pitrou

#1535504: CGIHTTPServer doesn't handle path names with embeded space
http://bugs.python.org/issue1535504  closed by georg.brandl

From victor.stinner at haypocalc.com  Fri Jan 21 18:29:18 2011
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Fri, 21 Jan 2011 18:29:18 +0100
Subject: [Python-Dev] Import and unicode: patch is ready for a review and
	tests
Message-ID: <1295630958.15444.3.camel@marge>

Hi,

It looks like some people fear that non-ASCII module names will cause
interoperability problems: you can try my patch attached to
issue #3080 to prevent these issues and fix all bugs :-)

http://bugs.python.org/issue3080

I should maybe create a dummy Python project using non-ASCII module
names to test it.

I posted my patch on Rietveld:

http://codereview.appspot.com/3972045

Victor
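
A dummy test along the lines suggested above might look like the
following sketch. It is not taken from the patch on issue #3080: the
module name "café" and the helper import_non_ascii_module() are
hypothetical, and it assumes a filesystem encoding (such as UTF-8) that
can represent the name.

```python
import importlib
import os
import sys
import tempfile


def import_non_ascii_module(module_name="café"):
    """Write a one-line module into a temp directory and import it by name."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, module_name + ".py")
        with open(path, "w", encoding="utf-8") as f:
            f.write("VALUE = 42\n")
        sys.path.insert(0, tmp)
        try:
            # Make sure the import system notices the freshly written file.
            importlib.invalidate_caches()
            module = importlib.import_module(module_name)
            return module.VALUE
        finally:
            # Leave sys.path and sys.modules as we found them.
            sys.path.remove(tmp)
            sys.modules.pop(module_name, None)


print(import_non_ascii_module())
```

Running the same check on Windows, Mac OS X and Linux would be one way
to see where the interoperability problems actually show up.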


From brett at python.org  Fri Jan 21 19:21:05 2011
From: brett at python.org (Brett Cannon)
Date: Fri, 21 Jan 2011 10:21:05 -0800
Subject: [Python-Dev] [Python-checkins] devguide: Copy over the dev FAQ
 and *only* strip out stuff covered elsewhere in the
In-Reply-To: 
References: 
	
Message-ID: 

On Thu, Jan 20, 2011 at 16:58, Nick Coghlan  wrote:
> On Fri, Jan 21, 2011 at 6:42 AM, brett.cannon
>  wrote:
>> brett.cannon pushed 82d3a1b694b3 to devguide:
>>
>> http://hg.python.org/devguide/rev/82d3a1b694b3
>> changeset:   167:82d3a1b694b3
>> user:        Brett Cannon
>> date:        Thu Jan 20 12:40:47 2011 -0800
>> summary:
>>  Copy over the dev FAQ and *only* strip out stuff covered elsewhere in the devguide. Nick Coghlan should be a happy boy after this.
>
> Yay, thanks :)

Watch what you wish for since you are now maintaining it. =)

-Brett

>
> Cheers,
> Nick.
>
> --
> Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
> _______________________________________________
> Python-checkins mailing list
> Python-checkins at python.org
> http://mail.python.org/mailman/listinfo/python-checkins
>

From brett at python.org  Fri Jan 21 19:22:24 2011
From: brett at python.org (Brett Cannon)
Date: Fri, 21 Jan 2011 10:22:24 -0800
Subject: [Python-Dev] [Python-checkins] devguide: Move Misc/README.Emacs
 to here.
In-Reply-To: 
References: 
	
Message-ID: 

It's the Emacs lovers who put that stuff in all of their files, so I
ain't touching it.

On Thu, Jan 20, 2011 at 13:06, Sandro Tosi  wrote:
> Hi,
>
> On Thu, Jan 20, 2011 at 20:33, brett.cannon  wrote:
>> +..
>> +   Local Variables:
>> +   mode: indented-text
>> +   indent-tabs-mode: nil
>> +   sentence-end-double-space: t
>> +   fill-column: 78
>> +   coding: utf-8
>> +   End:
>
> maybe this can be removed now
>
> Cheers,
> --
> Sandro Tosi (aka morph, morpheus, matrixhasu)
> My website: http://matrixhasu.altervista.org/
> Me at Debian: http://wiki.debian.org/SandroTosi
>

From martin at v.loewis.de  Sat Jan 22 00:50:33 2011
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Sat, 22 Jan 2011 00:50:33 +0100
Subject: [Python-Dev] Import and unicode: part two
In-Reply-To: 
References: <1295440442.432.18.camel@marge>								<871v467ih3.fsf@uwakimon.sk.tsukuba.ac.jp>
	
Message-ID: <4D3A1BC9.40604@v.loewis.de>

>> I don't think anybody is *encouraging* it.  The argument is for
>> *permitting* it, partly for consistency with other identifiers, and
>> partly because of Python's usual "consenting adults" standard for
>> permitting "dangerous" practices.
> 
> I'm sorry, I was not clear. I was afraid that saying "learning
> opportunity" would tempt people to try non-ASCII module names.
> These days, even non-technical people have access to Windows, Mac
> and Linux boxes at the same time. So chances of being annoyed by broken
> non-ASCII named files are pretty high.

Actually, as long as people only involve Windows, or only involve Mac,
it will all work just fine. It's only when they use non-Mac Unix
(such as Linux), or try to move files across systems using sub-prime
technology (such as your typical Windows zip utility), that they will run
into problems. But then it will be clear whom to blame - and people
run into the same problems regardless of whether they move Python modules
or regular files (say, Word documents).

So the more people get confronted with the poor support of non-ASCII
file names in tools, the faster the tools will improve. It took PKWARE
many years to come up with a reasonable Unicode story - but now it's
really the tools that need to catch up, not the spec.

Regards,
Martin
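
For reference, the Unicode support in the ZIP spec is presumably the
UTF-8 flag, bit 11 of the general purpose bit flag. The sketch below
shows how Python's own zipfile module sets that flag for non-ASCII
member names; the helper utf8_flag_is_set() and the member names are
made up for illustration, and this says nothing about how third-party
zip tools behave.

```python
import io
import zipfile

UTF8_FLAG = 0x800  # bit 11 of the general purpose bit flag


def utf8_flag_is_set(member_name):
    """Store one member in an in-memory archive and report its UTF-8 flag."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr(member_name, "VALUE = 1\n")
    buffer.seek(0)
    with zipfile.ZipFile(buffer) as archive:
        info = archive.infolist()[0]
        # The name should survive the round trip unchanged.
        assert info.filename == member_name
        return bool(info.flag_bits & UTF8_FLAG)


print(utf8_flag_is_set("module.py"))
print(utf8_flag_is_set("módulo.py"))
```

A tool that ignores the flag would decode the second name with a legacy
code page, which is where the broken file names come from.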

From ncoghlan at gmail.com  Sat Jan 22 03:35:39 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 22 Jan 2011 12:35:39 +1000
Subject: [Python-Dev] Triagers and checkin access to the devguide repository
Message-ID: 

Given that some of the dev guide docs cover triaging and other aspects
of managing issues on the tracker, does it make sense to offer
devguide checkin access to triagers that want it?

Regards,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From msapiro at value.net  Sat Jan 22 02:39:19 2011
From: msapiro at value.net (Mark Sapiro)
Date: Fri, 21 Jan 2011 17:39:19 -0800
Subject: [Python-Dev] Mail archive line wrapping (Was: Import and
 unicode:	part two)
In-Reply-To: 
References: <1295440442.432.18.camel@marge>							<20110120215522.61429d36@pitrou.net>	<863D108E-9B6E-4F9D-8D1F-D2B87B0EAC69@fuhm.net>	<20110121123129.3c05e129@pitrou.net>
	
Message-ID: <4D3A3547.9010508@value.net>

On 11:59 AM, James Y Knight wrote:
> 
> Well, yes, that's a pretty annoying bug in mailman, isn't it? If only anyone around here was involved in mailman and could fix it! :) [I've attempted to cc this to mailman-users with this message, but since I'm not subscribed I dunno if it'll make it or not.]
> 
> I have this in my user CSS override file to fix the issue for myself globally on all such archives out in the world:
> /* Mailing list archives */
> html>body>pre { white-space: pre-wrap !important; }
> 
> But really, pipermail should just output a suitable style itself, e.g.: 
 or a  in the header.


This is mailman bug 266467. It was fixed in
Mailman 2.1.13.

-- 
Mark Sapiro        The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan


From swamiyeswanth at hotmail.com  Sat Jan 22 19:16:24 2011
From: swamiyeswanth at hotmail.com (yeswanth)
Date: Sat, 22 Jan 2011 23:46:24 +0530
Subject: [Python-Dev] web framework for py3k
Message-ID: 

I would like to help port a web framework to py3k. I want to know
which one is good and which can be ported easily. I would also need
some guidance for this work, as I am just a beginner here.

Thanks
Yeswanth

From alexander.belopolsky at gmail.com  Sat Jan 22 19:25:19 2011
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Sat, 22 Jan 2011 13:25:19 -0500
Subject: [Python-Dev] [Python-checkins] r88140 - in
	python/branches/py3k: Misc/NEWS Modules/zipimport.c
In-Reply-To: <20110122103029.CE0F8EEA11@mail.python.org>
References: <20110122103029.CE0F8EEA11@mail.python.org>
Message-ID: 

On Sat, Jan 22, 2011 at 5:30 AM, victor.stinner
 wrote:
..
> zipimport uses ASCII encoding instead of *cp497* to decode filenames, ..

Shouldn't this be "instead of *cp437*"?

From tjreedy at udel.edu  Sat Jan 22 19:55:41 2011
From: tjreedy at udel.edu (Terry Reedy)
Date: Sat, 22 Jan 2011 13:55:41 -0500
Subject: [Python-Dev] web framework for py3k
In-Reply-To: 
References: 
Message-ID: 

On 1/22/2011 1:16 PM, yeswanth wrote:

In general, this list is for development of Python, CPython, and its 
stdlib, not 3rd party modules.

> I would want to help porting some web framework for py3k ..

The target of any such efforts should be 3.2 as it has changes intended 
to help web programming.

> I want to know which one is good and which can be ported easily

Opinions will depend on the person. Such questions might be better asked 
on Python list or the specialized web-sig list, where there are more 
people involved with web frameworks. Most frameworks have their own 
lists. Some can be accessed as newsgroups at news.gmane.org in the 
gmane.comp.python hierarchy.

-- 
Terry Jan Reedy


From tjreedy at udel.edu  Sat Jan 22 20:04:00 2011
From: tjreedy at udel.edu (Terry Reedy)
Date: Sat, 22 Jan 2011 14:04:00 -0500
Subject: [Python-Dev] What's new 2.x in 3.x docs.
Message-ID: 

The 3.x docs mostly started fresh with 3.0. The major exception is the 
What's new section, which goes back to 2.0. The 2.x stuff comprises 
about 650KB in the repository and whatever that translates into in the 
distribution. I cannot imagine that anyone who only has 3.x and no 2.x 
version would have any interest in the 2.x history. And of course, the 
complete 2.x history will always be available with the latest 2.7.z. And 
the cover page for 3.x could even say so and include a link. So why not 
remove it from the 3.2 release (and have two separate pages for the 
online version)?

-- 
Terry Jan Reedy


From solipsis at pitrou.net  Sat Jan 22 20:20:20 2011
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Sat, 22 Jan 2011 20:20:20 +0100
Subject: [Python-Dev] What's new 2.x in 3.x docs.
References: 
Message-ID: <20110122202020.51920405@pitrou.net>

On Sat, 22 Jan 2011 14:04:00 -0500
Terry Reedy  wrote:
> The 3.x docs mostly started fresh with 3.0. The major exception is the 
> What's new section, which goes back to 2.0. The 2.x stuff comprises 
> about 650KB in the repository and whatever that translates into in the 
> distribution. I cannot imagine that anyone who only has 3.x and no 2.x 
> version would have any interest in the 2.x history. And of course, the 
> complete 2.x history will always be available with the latest 2.7.z. And 
> the cover page for 3.x could even say so and include a link. So why not 
> remove it from the 3.2 release (and have two separate pages for the 
> online version)?

Well, is there any point in doing so, apart from saving 650KB in the
repository? I'm not sure we care about the latter (right now the
whole source tree is more than 50MB, and that's without version
control information).

Regards

Antoine.



From brett at python.org  Sat Jan 22 21:54:41 2011
From: brett at python.org (Brett Cannon)
Date: Sat, 22 Jan 2011 12:54:41 -0800
Subject: [Python-Dev] Triagers and checkin access to the devguide
	repository
In-Reply-To: 
References: 
Message-ID: 

On Fri, Jan 21, 2011 at 18:35, Nick Coghlan  wrote:
> Given that some of the dev guide docs cover triaging and other aspects
> of managing issues on the tracker, does it make sense to offer
> devguide checkin access to triagers that want it?
>

There are enough triagers with commit privileges that this is probably
not needed.

-Brett

> Regards,
> Nick.
>
> --
> Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
>

From raymond.hettinger at gmail.com  Sat Jan 22 22:23:15 2011
From: raymond.hettinger at gmail.com (Raymond Hettinger)
Date: Sat, 22 Jan 2011 13:23:15 -0800
Subject: [Python-Dev] What's new 2.x in 3.x docs.
In-Reply-To: 
References: 
Message-ID: <60D93B3F-FF28-4BFF-AD77-F2316F5BAC53@gmail.com>


On Jan 22, 2011, at 11:04 AM, Terry Reedy wrote:

> The 3.x docs mostly started fresh with 3.0. The major exception is the What's new section, which goes back to 2.0. The 2.x stuff comprises about 650KB in the repository and whatever that translates into in the distribution. I cannot imagine that anyone who only has 3.x and no 2.x version would have any interest in the 2.x history. And of course, the complete 2.x history will always be available with the latest 2.7.z. And the cover page for 3.x could even say so and include a link. So why not remove it from the 3.2 release (and have two separate pages for the online version)?

I think there is value in the older whatsnew docs.  They provide a readable introduction to various features and nicely augment the plain docs, which can be a little dry.

+1 for keeping the links as-is.  Removing them takes away a resource and gains nothing.


Raymond



From victor.stinner at haypocalc.com  Sun Jan 23 01:12:26 2011
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Sun, 23 Jan 2011 01:12:26 +0100
Subject: [Python-Dev] [Python-checkins] r88140 - in
 python/branches/py3k: Misc/NEWS Modules/zipimport.c
In-Reply-To: 
References: <20110122103029.CE0F8EEA11@mail.python.org>
	
Message-ID: <1295741546.18456.0.camel@marge>

On Saturday 22 January 2011 at 13:25 -0500, Alexander Belopolsky wrote:
> > zipimport uses ASCII encoding instead of *cp497* to decode filenames, ..
> 
> Shouldn't this be "instead of *cp437*"?

Woops, correct: fixed in r88145.

Victor
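
To illustrate why the codec name matters here: cp437 maps every byte
value to a character, while ASCII rejects anything above 0x7F. The
bytes below are a made-up member name, not one taken from r88140.

```python
raw_name = b"caf\x82.py"  # hypothetical member name; 0x82 is "é" in cp437

# cp437 assigns a character to every byte, so this always succeeds.
print(raw_name.decode("cp437"))

# ASCII only covers 0x00-0x7F, so the same bytes are rejected.
try:
    raw_name.decode("ascii")
except UnicodeDecodeError as exc:
    print("ascii cannot decode it:", exc.reason)
```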


From brett at python.org  Sun Jan 23 02:08:00 2011
From: brett at python.org (Brett Cannon)
Date: Sat, 22 Jan 2011 17:08:00 -0800
Subject: [Python-Dev] Beta version of the new devguide
Message-ID: 

http://docs.python.org/devguide/

If you are a core developer and have a correction you want to make you
can simply check out the devguide yourself (link is in the Resources
section of the devguide) and make the corrections yourself. Otherwise
reply here (you can email me directly but I already have instances of
multiple people telling me about the same spelling mistake so it's
nice to have it public so people know when I have been informed).

As for what is left to do, there are a few things. One is fixing
some issues to allow test coverage to be run for the entire test suite
(see the coverage docs to know what issues are tracking the problems).
I will work on this next if no one beats me to it as both issues
should be relatively simple to do.

Two, what should the final URL be? Georg picked the current one and I
am happy with it.

Three, where should it be linked from? docs.python.org homepage?

Four, what to do with www.python.org/dev/? Redirect for all the pages?

Otherwise I consider the devguide ready to go. My next thing will be
an "official" HOWTO on dealing with Python 2/3 porting/maintenance.

From brett at python.org  Sun Jan 23 02:14:44 2011
From: brett at python.org (Brett Cannon)
Date: Sat, 22 Jan 2011 17:14:44 -0800
Subject: [Python-Dev] Beta version of the new devguide
In-Reply-To: 
References: 
Message-ID: 

And I forgot to mention I also plan to edit the help text on the
various fields on the issue tracker to point to the triaging doc.

On Sat, Jan 22, 2011 at 17:08, Brett Cannon  wrote:
> http://docs.python.org/devguide/
>
> If you are a core developer and have a correction you want to make you
> can simply check out the devguide yourself (link is in the Resources
> section of the devguide) and make the corrections yourself. Otherwise
> reply here (you can email me directly but I already have instances of
> multiple people telling me about the same spelling mistake so it's
> nice to have it public so people know when I have been informed).
>
> As for what is left to do, there are a few things. One is fixing
> some issues to allow test coverage to be run for the entire test suite
> (see the coverage docs to know what issues are tracking the problems).
> I will work on this next if no one beats me to it as both issues
> should be relatively simple to do.
>
> Two, what should the final URL be? Georg picked the current one and I
> am happy with it.
>
> Three, where should it be linked from? docs.python.org homepage?
>
> Four, what to do with www.python.org/dev/? Redirect for all the pages?
>
> Otherwise I consider the devguide ready to go. My next thing will be
> an "official" HOWTO on dealing with Python 2/3 porting/maintenance.
>

From ncoghlan at gmail.com  Sun Jan 23 02:48:29 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 23 Jan 2011 11:48:29 +1000
Subject: [Python-Dev] What's new 2.x in 3.x docs.
In-Reply-To: <60D93B3F-FF28-4BFF-AD77-F2316F5BAC53@gmail.com>
References: 
	<60D93B3F-FF28-4BFF-AD77-F2316F5BAC53@gmail.com>
Message-ID: 

On Sun, Jan 23, 2011 at 7:23 AM, Raymond Hettinger
 wrote:
> On Jan 22, 2011, at 11:04 AM, Terry Reedy wrote:
>
>> The 3.x docs mostly started fresh with 3.0. The major exception is the What's new section, which goes back to 2.0. The 2.x stuff comprises about 650KB in the repository and whatever that translates into in the distribution. I cannot imagine that anyone who only has 3.x and no 2.x version would have any interest in the 2.x history. And of course, the complete 2.x history will always be available with the latest 2.7.z. And the cover page for 3.x could even say so and include a link. So why not remove it from the 3.2 release (and have two separate pages for the online version)?
>
> I think there is value in the older whatsnew docs.  They provide a readable introduction to various features and nicely augment the plain docs, which can be a little dry.
>
> +1 for keeping the links as-is.  Removing them takes away a resource and gains nothing.

They're also a useful resource when developing compatibility guides
for projects that target older versions (including ones that support
py3k via 2to3).

With the latest 3.x release always being at the top, I agree with
Raymond that retaining the history is a better option.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From tjreedy at udel.edu  Sun Jan 23 04:05:00 2011
From: tjreedy at udel.edu (Terry Reedy)
Date: Sat, 22 Jan 2011 22:05:00 -0500
Subject: [Python-Dev] What's new 2.x in 3.x docs.
In-Reply-To: <20110122202020.51920405@pitrou.net>
References:  <20110122202020.51920405@pitrou.net>
Message-ID: 

On 1/22/2011 2:20 PM, Antoine Pitrou wrote:
> On Sat, 22 Jan 2011 14:04:00 -0500
> Terry Reedy  wrote:
>> The 3.x docs mostly started fresh with 3.0. The major exception is the
>> What's new section, which goes back to 2.0. The 2.x stuff comprises
>> about 650KB in the repository and whatever that translates into in the
>> distribution. I cannot imagine that anyone who only has 3.x and no 2.x
>> version would have any interest in the 2.x history. And of course, the
>> complete 2.x history will always be available with the latest 2.7.z. And
>> the cover page for 3.x could even say so and include a link. So why not
>> remove it from the 3.2 release (and have two separate pages for the
>> online version)?
>
> Well, is there any point in doing so, apart from saving 650KB in the
> repository? I'm not sure we care about the latter (right now the
> whole source tree is more than 50MB, and that's without version
> control information).

I was only proposing actual removal of what to me is noise from the
Windows help file (now 5.6 MB) with a link to the online version. But
the idea is rejected. Fini.

-- 
Terry Jan Reedy


From prasun3 at gmail.com  Sun Jan 23 08:20:10 2011
From: prasun3 at gmail.com (Prasun Ratn)
Date: Sat, 22 Jan 2011 23:20:10 -0800
Subject: [Python-Dev] build problem
Message-ID: 

Hello
    I got the latest copy of python source from svn and was trying to
build it on Windows Vista (32 bit) using Microsoft Visual Express
2008.

    I got the following error:

5>"C:\Program Files\TortoiseSVN\bin\subwcrev.exe" ..
..\Modules\getbuildinfo.c
"E:\coding\py3kclean\py3k\PCbuild\Win32-temp-Debug\pythoncore\\getbuildinfo2.c"
5>'C:\Program' is not recognized as an internal or external command,

    Adding an extra set of quotes around the command seems to fix
this. I've attached a patch.


Thanks
Prasun
-------------- next part --------------
A non-text attachment was scrubbed...
Name: buildinfo.patch
Type: application/octet-stream
Size: 1356 bytes
Desc: not available
URL: 

From martin at v.loewis.de  Sun Jan 23 19:18:35 2011
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Sun, 23 Jan 2011 19:18:35 +0100
Subject: [Python-Dev] build problem
In-Reply-To: 
References: 
Message-ID: <4D3C70FB.60102@v.loewis.de>

>     Adding an extra set of quotes around the command seems to fix
> this. I've attached a patch.

This is puzzling: a) AFAICT, the code works on all other systems as it
stands, and b) putting this many quotes into the command line is not
plausible.

Do you have any strange settings on your computer, such as using a
non-standard cmd shell?

Regards,
Martin

From martin at v.loewis.de  Sun Jan 23 19:23:02 2011
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Sun, 23 Jan 2011 19:23:02 +0100
Subject: [Python-Dev] web framework for py3k
In-Reply-To: 
References: 
Message-ID: <4D3C7206.6080609@v.loewis.de>

On 22.01.2011 19:16, yeswanth wrote:
> I would like to help port a web framework to py3k. I want to know
> which one is good and which can be ported easily. I would also need
> some guidance for this work, as I am just a beginner here.

Yeswanth,

Terry already indicated that this is the wrong list. The right list
would be the python-porting list:

http://mail.python.org/mailman/listinfo/python-porting

As for which framework can be ported easily: that is, unfortunately,
difficult to tell in advance. I'd expect that the sheer number of lines
gives a good indicator: the smaller the framework, the easier it
should be to port.

HTH,
Martin

From solipsis at pitrou.net  Sun Jan 23 19:58:41 2011
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Sun, 23 Jan 2011 19:58:41 +0100
Subject: [Python-Dev] r88147 - in python/branches/py3k: Misc/NEWS
 Modules/_pickle.c Tools/scripts/find_recursionlimit.py
References: <20110123171226.131CCEE98B@mail.python.org>
Message-ID: <20110123195841.40d2bbff@pitrou.net>

On Sun, 23 Jan 2011 18:12:26 +0100 (CET)
antoine.pitrou  wrote:
> Author: antoine.pitrou
> Date: Sun Jan 23 18:12:25 2011
> New Revision: 88147
> 
> Log:
> Issue #10987: Fix the recursion limit handling in the _pickle module.

I forgot to mention that it was ok'ed by Georg, so there it is.

Regards

Antoine.



From brett at python.org  Sun Jan 23 21:22:41 2011
From: brett at python.org (Brett Cannon)
Date: Sun, 23 Jan 2011 12:22:41 -0800
Subject: [Python-Dev] Beta version of the new devguide
In-Reply-To: <20110123075621.468d07c9@dino>
References: 
	<20110123075621.468d07c9@dino>
Message-ID: 

On Sat, Jan 22, 2011 at 23:56, Mark Summerfield  wrote:
> Hi Brett,
>
> On Sat, 22 Jan 2011 17:08:00 -0800
> Brett Cannon  wrote:
>> http://docs.python.org/devguide/
>
> Personally, I found the first paragraph of "Contributing" a bit
> off-putting.
>
> How about replacing:
>
>     People who wish to contribute to Python must read the following
>     documents in the order provided. You can stop where you feel
>     comfortable and begin contributing immediately without reading and
>     understanding these documents all at once, but please do not skip
>     around within the documentation as everything is written assuming
>     preceding documentation has been read.
>
> With something like:
>
>     The Python core development team always welcomes new contributors,
>     so we are very glad of your interest! Please read the following
>     documents---in the order shown---to ensure that you understand how
>     Python's development process works. This will ensure that your
>     contributions are considered purely on their merit and don't get
>     rejected due to missing or incorrectly performing a step in the
>     process.
>

I'll see what I can do.

> In "Getting Set Up" it describes how to build a pydebug build. Is that
> really necessary for those who plan only to contribute by working on
> pure Python code?
>

Yes, there is actually a laundry list of reasons even people only
working on the stdlib should use a pydebug build.

> I had a quick skim over the rest and got the feeling that no clear
> distinction is made between C and Python work. Personally, I feel that
> more of a distinction should be made since not everyone will be
> confident or interested in C. (And maybe more distinction should be made
> between working on CPython and the standard library?)

I don't see where the distinction between extensions and Python code
would serve a purpose beyond clouding the documents by adding more
details. People who know both are fine and the people who don't can
start off ignorant and work their way up.

As for CPython/Python distinction, they are so intertwined at the
moment that the distinction is once again not worth it beyond what I
have already done. When the stdlib is separated from CPython then the
delineation of one over the other will become worth it.

>
> Overall I think this document is *extremely welcome* and I am very glad
> you have done it. I'm sure that once it starts to get known it will help
> add to the pool of people contributing to Python as well as helping to
> keep the processes clear:-)

=) That's the hope.

-Brett

>
> --
> Mark Summerfield, Qtrac Ltd, www.qtrac.eu
>     C++, Python, Qt, PyQt - training and consultancy
>         "Advanced Qt Programming" - ISBN 0321635906
>             http://www.qtrac.eu/aqpbook.html
>

From victor.stinner at haypocalc.com  Sun Jan 23 21:22:31 2011
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Sun, 23 Jan 2011 21:22:31 +0100
Subject: [Python-Dev] build problem
In-Reply-To: <4D3C70FB.60102@v.loewis.de>
References: 
	<4D3C70FB.60102@v.loewis.de>
Message-ID: <1295814151.22114.4.camel@marge>

On Sunday 23 January 2011 at 19:18 +0100, "Martin v. Löwis" wrote:
> >     Adding an extra set of quotes around the command seems to fix
> > this. I've attached a patch.

Hey! I already wrote exactly the same patch! But I didn't propose it
upstream because I was unable to reproduce the bug.

> This is puzzling: a) AFAICT, the code works on all other system as it
> stands,

I had this issue already twice, but later (after a reboot? I don't
remember) it worked again (without the patch). It might be related to an
upgrade of TortoiseSVN (try to upgrade TortoiseSVN without rebooting).

> b) putting this many quotes into the command line is not plausible.

""c:\path\to\subwcrev.exe" arg1 arg2 ..." just works. I don't understand
why (strange syntax), but it works :-)

When I had the problem, it worked with extra quotes, but not without. It
is strange because the program ("c:\path\to\subwcrev.exe") existed!?

Victor


From martin at v.loewis.de  Sun Jan 23 23:10:55 2011
From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=)
Date: Sun, 23 Jan 2011 23:10:55 +0100
Subject: [Python-Dev] build problem
In-Reply-To: <1295814151.22114.4.camel@marge>
References: 	<4D3C70FB.60102@v.loewis.de>
	<1295814151.22114.4.camel@marge>
Message-ID: <4D3CA76F.9070702@v.loewis.de>

> ""c:\path\to\subwcrev.exe" arg1 arg2 ..." just works. I don't understand
> why (strange syntax), but it works :-)
> 
> When I had the problem, it worked with extra quotes, but not without. It
> is strange because the program ("c:\path\to\subwcrev.exe") existed!?

I'd really like to understand it before changing it. The part "it
sometimes works, then fails" is particularly puzzling, and indicates
that the *actual* problem is entirely unrelated to the quoting.

Regards,
Martin

From tjreedy at udel.edu  Mon Jan 24 00:45:50 2011
From: tjreedy at udel.edu (Terry Reedy)
Date: Sun, 23 Jan 2011 18:45:50 -0500
Subject: [Python-Dev] r88147 - in python/branches/py3k: Misc/NEWS
 Modules/_pickle.c Tools/scripts/find_recursionlimit.py
In-Reply-To: <20110123195841.40d2bbff@pitrou.net>
References: <20110123171226.131CCEE98B@mail.python.org>
	<20110123195841.40d2bbff@pitrou.net>
Message-ID: 

On 1/23/2011 1:58 PM, Antoine Pitrou wrote:

>> Issue #10987: Fix the recursion limit handling in the _pickle module.

12 hours after the report!

I am still curious why a previous exception changed pickle behavior, and 
only in 3.2, but I would rather you fix another bug than spending much 
time to get me up to speed on the intricacies of _pickle ;-).

-- 
Terry Jan Reedy


From fuzzyman at voidspace.org.uk  Mon Jan 24 00:54:08 2011
From: fuzzyman at voidspace.org.uk (Michael Foord)
Date: Sun, 23 Jan 2011 23:54:08 +0000
Subject: [Python-Dev] Beta version of the new devguide
In-Reply-To: 
References: 	<20110123075621.468d07c9@dino>
	
Message-ID: <4D3CBFA0.1000402@voidspace.org.uk>

On 23/01/2011 20:22, Brett Cannon wrote:
> [snip...]
>> I had a quick skim over the rest and got the feeling that no clear
>> distinction is made between C and Python work. Personally, I feel that
>> more of a distinction should be made since not everyone will be
>> confident or interested in C. (And maybe more distinction should be made
>> between working on CPython and the standard library?)
> I don't see where the distinction between extensions and Python code
> would serve a purpose beyond clouding the documents by adding more
> details. People who know both are fine and the people who don't can
> start off ignorant and work their way up.
>

I think a lot of people assume that unless they know C they can't 
contribute to Python. I don't know where the best place is but it would 
be good to make it *clear* that this isn't true.

All the best,

Michael Foord

> As for CPython/Python distinction, they are so intertwined at the
> moment that the distinction is once again not worth it beyond what I
> have already done. When the stdlib is separated from CPython then the
> delineation of one over the other will become worth it.
>
>> Overall I think this document is *extremely welcome* and I am very glad
>> you have done it. I'm sure that once it starts to get known it will help
>> add to the pool of people contributing to Python as well as helping to
>> keep the processes clear:-)
> =) That's the hope.
>
> -Brett
>
>> --
>> Mark Summerfield, Qtrac Ltd, www.qtrac.eu
>>     C++, Python, Qt, PyQt - training and consultancy
>>         "Advanced Qt Programming" - ISBN 0321635906
>>             http://www.qtrac.eu/aqpbook.html
>>


-- 
http://www.voidspace.org.uk/

May you do good and not evil
May you find forgiveness for yourself and forgive others
May you share freely, never taking more than you give.
-- the sqlite blessing http://www.sqlite.org/different.html


From solipsis at pitrou.net  Mon Jan 24 01:02:32 2011
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Mon, 24 Jan 2011 01:02:32 +0100
Subject: [Python-Dev] r88147 - in python/branches/py3k: Misc/NEWS
 Modules/_pickle.c Tools/scripts/find_recursionlimit.py
References: <20110123171226.131CCEE98B@mail.python.org>
	<20110123195841.40d2bbff@pitrou.net> 
Message-ID: <20110124010232.0f3c5823@pitrou.net>

On Sun, 23 Jan 2011 18:45:50 -0500
Terry Reedy  wrote:

> On 1/23/2011 1:58 PM, Antoine Pitrou wrote:
> 
> >> Issue #10987: Fix the recursion limit handling in the _pickle module.
> 
> 12 hours after the report!
> 
> I am still curious why a previous exception changed pickle behavior, and 
> only in 3.2, but I would rather you fix another bug than spending much 
> time to get me up to speed on the intricacies of _pickle ;-).

It was not about a previous exception. The issue is that pickle
detected the recursion overflow but returned a successful status after
having set the exception. This is the kind of mistake that produces
strange "delayed" exceptions.

Regards

Antoine.



From martin at v.loewis.de  Mon Jan 24 01:28:26 2011
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 24 Jan 2011 01:28:26 +0100
Subject: [Python-Dev] r88147 - in python/branches/py3k: Misc/NEWS
 Modules/_pickle.c Tools/scripts/find_recursionlimit.py
In-Reply-To: 
References: <20110123171226.131CCEE98B@mail.python.org>	<20110123195841.40d2bbff@pitrou.net>
	
Message-ID: <4D3CC7AA.3070204@v.loewis.de>

> I am still curious why a previous exception changed pickle behavior, and
> only in 3.2, but I would rather you fix another bug than spending much
> time to get me up to speed on the intricacies of _pickle ;-).

IIUC, the code change made pickle actually aware of the exception,
rather than just setting it in the thread state, but then happily
declaring that pickle succeeded (with what would turn out to be
incorrect data).

As for why an explicit exception breaks the reporting, and omitting
it makes it report the exception correctly:

the report that it gave wasn't actually correct. I got

Traceback (most recent call last):
  File "a.py", line 4, in <module>
    for i in range(100):
RuntimeError: maximum recursion depth exceeded while pickling an object

So the exception is reported on the range call, or the for loop.
After the change, we get

Traceback (most recent call last):
  File "a.py", line 7, in <module>
    _pickle.Pickler(io.BytesIO(), protocol=-1).dump(l)
RuntimeError: maximum recursion depth exceeded while pickling an object

So it appears that the interpreter would actually pick up the exception
set by pickle, and attribute it to the for loop. When you add an
explicit raise, this raise will clear the stack overflow exception,
and set the new exception. So the error manages to pass silently,
without being explicitly silenced.

I wonder whether we could sprinkle more exception-set? checks in
the interpreter loop, at least in debug mode.

It's a design flaw in CPython that there are two ways to report
an exception: either through the thread state, or through the return
value. I don't think this flaw can be fully fixed. However, I
wonder whether static analysis of the C code could produce better
detection of this kind of bug.
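
To make the "two ways to report an exception" point concrete, here is a
small pure-Python model of it. This is only an analogy, not CPython code:
ThreadState, buggy_dump, fixed_dump and run are all made-up names, and the
real mechanism lives in C. It shows why an error that is recorded in the
thread state while the function returns success gets blamed on some later,
unrelated step.

```python
class ThreadState:
    """Toy stand-in for the per-thread 'current exception' slot."""

    def __init__(self):
        self.pending = None


def buggy_dump(tstate, depth, limit=3):
    """Model of the bug: records the error but still reports success."""
    if depth > limit:
        tstate.pending = RuntimeError("maximum recursion depth exceeded")
    return 0  # BUG: success status even when an error was recorded


def fixed_dump(tstate, depth, limit=3):
    """Model of the fix: the status code agrees with the error slot."""
    if depth > limit:
        tstate.pending = RuntimeError("maximum recursion depth exceeded")
        return -1
    return 0


def run(dump):
    """Toy 'interpreter': trusts the status code, checks the slot only later."""
    tstate = ThreadState()
    events = []
    if dump(tstate, depth=10) < 0:
        events.append("error raised at the dump() call")
        return events
    events.append("dump() looked successful")
    # Some unrelated later step happens to notice the stale error.
    if tstate.pending is not None:
        events.append("error raised at an unrelated later step")
    return events


print(run(buggy_dump))
print(run(fixed_dump))
```

In the buggy variant the caller carries on as if nothing happened, which
matches the traceback above pointing at the for loop rather than at the
dump() call.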

Regards,
Martin

From solipsis at pitrou.net  Mon Jan 24 02:16:44 2011
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Mon, 24 Jan 2011 02:16:44 +0100
Subject: [Python-Dev] r88147 - in python/branches/py3k: Misc/NEWS
 Modules/_pickle.c Tools/scripts/find_recursionlimit.py
References: <20110123171226.131CCEE98B@mail.python.org>
	<20110123195841.40d2bbff@pitrou.net> 
	<4D3CC7AA.3070204@v.loewis.de>
Message-ID: <20110124021644.7dc67ccb@pitrou.net>

On Mon, 24 Jan 2011 01:28:26 +0100
"Martin v. Löwis" wrote:
> 
> I wonder whether we could sprinkle more exception-set? checks in
> the interpreter loop, at least in debug mode.

Yes, this would be nice. Nicer if it can be centralized, of course.
That said, it probably wouldn't have helped here, since the code which
exhibited the bug (the find_recursionlimit.py script) is basically
never run automatically, and very rarely by a human.

Regards

Antoine.



From prasun3 at gmail.com  Mon Jan 24 04:08:33 2011
From: prasun3 at gmail.com (prasun3 at gmail.com)
Date: Sun, 23 Jan 2011 19:08:33 -0800
Subject: [Python-Dev] build problem
In-Reply-To: <4D3CA76F.9070702@v.loewis.de>
References: 
	<4D3C70FB.60102@v.loewis.de> <1295814151.22114.4.camel@marge>
	<4D3CA76F.9070702@v.loewis.de>
Message-ID: 

On Sun, Jan 23, 2011 at 2:10 PM, "Martin v. Löwis" wrote:
>> ""c:\path\to\subwcrev.exe" arg1 arg2 ..." just works. I don't understand
>> why (strange syntax), but it works :-)
>>
>> When I had the problem, it worked with extra quotes, but not without. It
>> is strange because the program ("c:\path\to\subwcrev.exe") existed!?
>
> I'd really like to understand it before changing it. The part "it
> sometimes works, then fails" is particularly puzzling, and indicates
> that the *actual* problem is entirely unrelated to the quoting.

I used ProcMon to track down the actual command that the system() call creates.

The unmodified code produces this:
C:\Windows\system32\cmd.exe /c "C:\Program
Files\TortoiseSVN\bin\subwcrev.exe" .. ..\Modules\getbuildinfo.c
"E:\coding\py3k\PCbuild\Win32-temp-pgi\pythoncore\\getbuildinfo2.c"

whereas my patch produces this:
C:\Windows\system32\cmd.exe /c ""C:\Program
Files\TortoiseSVN\bin\subwcrev.exe" .. ..\Modules\getbuildinfo.c
"E:\coding\py3k\PCbuild\Win32-temp-pgi\pythoncore\\getbuildinfo2.c""

I pasted those two lines on the command prompt. The first results in
the error "'C:\Program' is not recognized ...... ". The second one
does the right thing.
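
A possible explanation, as far as I understand the quote handling described
in the output of "cmd /?": when the text after /c starts with a quote and
does not consist of exactly one quoted executable name, cmd.exe drops the
first quote and the last quote on the line. The sketch below only models
that one rule on strings (it does not call cmd.exe). The function names and
the shortened E:\out path are made up for illustration.

```python
def strip_outer_quotes(command_line):
    """Model of cmd.exe's fallback rule: drop the leading and the last quote."""
    if not command_line.startswith('"'):
        return command_line
    rest = command_line[1:]
    last = rest.rfind('"')
    if last == -1:
        return rest
    return rest[:last] + rest[last + 1:]


def wrap_for_system(command_line):
    """Add the extra pair of quotes that the patch puts around the command."""
    return '"' + command_line + '"'


# Hypothetical command, shaped like the one ProcMon showed.
original = (r'"C:\Program Files\TortoiseSVN\bin\subwcrev.exe" .. '
            r'..\Modules\getbuildinfo.c "E:\out\getbuildinfo2.c"')

# Without the extra quotes the program path loses its opening quote,
# so the shell would see "C:\Program" as the command name.
print(strip_outer_quotes(original))

# With the extra quotes, stripping restores the intended command.
print(strip_outer_quotes(wrap_for_system(original)))
```

This would account for the "'C:\Program' is not recognized" error, but not
for the build working on some days and failing on others.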

It would be great if someone could run ProcMon on a "normal" system
and see what command is created.

Thanks
Prasun

From stephen at xemacs.org  Mon Jan 24 03:33:28 2011
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Mon, 24 Jan 2011 11:33:28 +0900
Subject: [Python-Dev] Import and unicode: part two
In-Reply-To: <4D3A1BC9.40604@v.loewis.de>
References: <1295440442.432.18.camel@marge>
	
	
	
	
	
	
	
	<871v467ih3.fsf@uwakimon.sk.tsukuba.ac.jp>
	
	<4D3A1BC9.40604@v.loewis.de>
Message-ID: <87y66b58uf.fsf@uwakimon.sk.tsukuba.ac.jp>

"Martin v. Löwis" writes:

 > Actually, as long as people only involve Windows, or only involve Mac,
 > it will all work just fine. It's only when they use non-Mac Unix
 > (such as Linux), or try to move files across systems using sub-prime
 > technology (such as your typical Windows zip utility) they will run
 > into problems.

I believe that the kind of thing that Ishimoto-san has in mind is
things like "smart cameras" that will upload your photos to your blog
with one touch on the camera's screen and other "Web 2.0 for the rest
of us" apps.  What with the popularity of Linux and *BSD for such
sites, it's easy to imagine problems of the kind he describes
occurring between those (which will probably be using Shift JIS in
Japan) apps and the websites.

Why people with the skills to be actually using Python would have a
problem like that, I don't know, but my experience with Japanese
vendors is no different from anywhere else: they put the blame for
bugs in systems on any convenient component other than their own or
close business partners'.  Open source is especially convenient
because of the NO WARRANTY section prominently displayed in all
licenses.

 > So the more people get confronted with the poor support of non-ASCII
 > file names in tools, the faster the tools will improve. It took PKWARE
 > many years to come up with a reasonable Unicode story - but now it's
 > really the tools that need to catch up, not the spec.

I still agree with this point of view, but there is some scope for
discussion of whether these tools should be "included batteries" or
not.  (Unfortunately I'm not in a position to volunteer to help with
them for some time. :-( )


From guido at python.org  Mon Jan 24 05:07:45 2011
From: guido at python.org (Guido van Rossum)
Date: Sun, 23 Jan 2011 20:07:45 -0800
Subject: [Python-Dev] Import and unicode: part two
In-Reply-To: <87y66b58uf.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <1295440442.432.18.camel@marge>
	
	
	
	
	
	
	
	<871v467ih3.fsf@uwakimon.sk.tsukuba.ac.jp>
	
	<4D3A1BC9.40604@v.loewis.de> <87y66b58uf.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: 

On Sun, Jan 23, 2011 at 6:33 PM, Stephen J. Turnbull  wrote:
> "Martin v. Löwis" writes:
>  > Actually, as long as people only involve Windows, or only involve Mac,
>  > it will all work just fine. It's only when they use non-Mac Unix
>  > (such as Linux), or try to move files across systems using sub-prime
>  > technology (such as your typical Windows zip utility) they will run
>  > into problems.
>
> I believe that the kind of thing that Ishimoto-san has in mind is
> things like "smart cameras" that will upload your photos to your blog
> with one touch on the camera's screen and other "Web 2.0 for the rest
> of us" apps.  What with the popularity of Linux and *BSD for such
> sites, it's easy to imagine problems of the kind he describes
> occurring between those (which will probably be using Shift JIS in
> Japan) apps and the websites.

Really? I would have thought that cell phones have long been the
platforms most supportive of Unicode. IIRC Nokia's Python port to S60
*required* Unicode strings for all system interfaces. Android, using
Java, also is pretty much all Unicode inside. Am I naive to generalize
from these two examples?

(This is not meant as a rhetorical question -- I may well be missing
something and am genuinely curious about the answer.)

-- 
--Guido van Rossum (python.org/~guido)

From stephen at xemacs.org  Mon Jan 24 10:19:00 2011
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Mon, 24 Jan 2011 18:19:00 +0900
Subject: [Python-Dev] Import and unicode: part two
In-Reply-To: 
References: <1295440442.432.18.camel@marge>
	
	
	
	
	
	
	
	<871v467ih3.fsf@uwakimon.sk.tsukuba.ac.jp>
	
	<4D3A1BC9.40604@v.loewis.de>
	<87y66b58uf.fsf@uwakimon.sk.tsukuba.ac.jp>
	
Message-ID: <87pqrm64mz.fsf@uwakimon.sk.tsukuba.ac.jp>

Guido van Rossum writes:

 > Really? I would have thought that cell phones have long been the
 > platforms most supportive of Unicode.

I would think so too, except in Japan.

However, my previous phones exposed file systems with names encoded in
Shift JIS to USB and IR browsers.  (My current one uses
Bluetooth, and I don't know how to "get at" the filesystem itself.)  A
lot of these devices also tend to present themselves as VFAT-formatted
drives (a la a USB memory stick), and Shift JIS is very commonly used
on those for reasons I don't really understand.

In any case, AIUI here the problem is like the problem of refactoring
a "make"-based system.  There are identifiers which are "spelled" one
way inside of files which need to match the "spelling" of names of
external filesystem objects.  If you transport such a set of files to
a POSIX system (which AFAIK most servers still are), then it's quite
possible that the file names will get translated to the locale's
encoding while the identifiers will not.
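
A minimal sketch of that mismatch, assuming a made-up module name and a
made-up helper called matches: the identifier inside the source file stays
the same, but the bytes of the file name only decode back to it if the
system uses the encoding the name was written in.

```python
# Hypothetical module name, written with escapes so this snippet stays ASCII.
module_name = "\u65e5\u672c\u8a9e"

# The same name as raw file name bytes under two different encodings.
as_shift_jis = module_name.encode("shift_jis")
as_utf8 = module_name.encode("utf-8")


def matches(identifier, raw_filename, fs_encoding):
    """Return True if the raw file name decodes to the identifier."""
    try:
        return raw_filename.decode(fs_encoding) == identifier
    except UnicodeDecodeError:
        return False


# File written on a Shift JIS system, looked up on a Shift JIS system.
print(matches(module_name, as_shift_jis, "shift_jis"))

# Same file name bytes copied unchanged to a system that assumes UTF-8.
print(matches(module_name, as_shift_jis, "utf-8"))
```

The second lookup fails because the Shift JIS bytes are not valid UTF-8, so
the import would not find a file whose name matches the identifier.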

From martin at v.loewis.de  Mon Jan 24 10:45:35 2011
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 24 Jan 2011 10:45:35 +0100
Subject: [Python-Dev] Import and unicode: part two
In-Reply-To: <87pqrm64mz.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <1295440442.432.18.camel@marge>								<871v467ih3.fsf@uwakimon.sk.tsukuba.ac.jp>		<4D3A1BC9.40604@v.loewis.de>	<87y66b58uf.fsf@uwakimon.sk.tsukuba.ac.jp>	
	<87pqrm64mz.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <4D3D4A3F.1020502@v.loewis.de>

>  > Really? I would have thought that cell phones have long been the
>  > platforms most supportive of Unicode.
> 
> I would think so too, except in Japan.
> 
> However, my previous phones exposed file systems with names encoded in
> Shift JIS to USB and IR browsers.  (My current one uses
> Bluetooth, and I don't know how to "get at" the filesystem itself.)  A
> lot of these devices also tend to present themselves as VFAT-formatted
> drives (a la a USB memory stick), and Shift JIS is very commonly used
> on those for reasons I don't really understand.

It's one thing how the file systems are formatted, but another thing
how they are presented to APIs. For example, the phones using Windows CE
would have to convert the file names to Unicode in the OS kernel.

So: for these phones - do you know how they present file names to the
application?

Regards,
Martin

From ncoghlan at gmail.com  Mon Jan 24 11:33:07 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 24 Jan 2011 20:33:07 +1000
Subject: [Python-Dev] Beta version of the new devguide
In-Reply-To: 
References: 
	<20110123075621.468d07c9@dino>
	
Message-ID: 

On Mon, Jan 24, 2011 at 6:22 AM, Brett Cannon  wrote:
>> In "Getting Set Up" it describes how to build a pydebug build. Is that
>> really necessary for those who plan only to contribute by working on
>> pure Python code?
>>
>
> Yes, there is actually a laundry list of reasons even people only
> working on the stdlib should use a pydebug build.

And one big reason why I don't unless I have a specific need to check
something with it - it makes the already quite long running time for
the full test suite take even longer :)

I figure it's beneficial to have people running a mixture of debug and
release builds anyway - it helps catch things that work in one mode
and not the other.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From stephen at xemacs.org  Mon Jan 24 11:35:22 2011
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Mon, 24 Jan 2011 19:35:22 +0900
Subject: [Python-Dev] Import and unicode: part two
In-Reply-To: <4D3D4A3F.1020502@v.loewis.de>
References: <1295440442.432.18.camel@marge>
	
	
	
	
	
	
	
	<871v467ih3.fsf@uwakimon.sk.tsukuba.ac.jp>
	
	<4D3A1BC9.40604@v.loewis.de>
	<87y66b58uf.fsf@uwakimon.sk.tsukuba.ac.jp>
	
	<87pqrm64mz.fsf@uwakimon.sk.tsukuba.ac.jp>
	<4D3D4A3F.1020502@v.loewis.de>
Message-ID: <87lj2a613p.fsf@uwakimon.sk.tsukuba.ac.jp>

"Martin v. Löwis" writes:

 > It's one thing how the file systems are formatted, but another thing
 > how they are presented to APIs. For example, the phones using Windows CE
 > would have to convert the file names to Unicode in the OS kernel.
 > 
 > So: for these phones - do you know how they present file names to the
 > application?

First of all, these aren't just phones; these are all kinds of gadgets
(the example I gave was a camera).  They're not as smart as an Android
or iPhone-like device, and I don't know what OS they use.

As for "presentation to the application", as I said, my older phones
presented themselves as "removable memory devices" (specifically on
the USB port), with VFAT-formatted file systems and Shift JIS file
names.  In that case you can surely have the kinds of problems
described, even if the app is not running on the device itself.  I
don't know if this is still true of more modern devices, but I was a
little shocked that it was true at all, even 5 or 6 years ago.

That may be one reason why the phone I have now doesn't provide a USB
interface at all.  That kind of interface is not only unnecessary with
Bluetooth, but Bluetooth uses more robust protocols.

From ncoghlan at gmail.com  Mon Jan 24 11:51:34 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 24 Jan 2011 20:51:34 +1000
Subject: [Python-Dev] Import and unicode: part two
In-Reply-To: <87lj2a613p.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <1295440442.432.18.camel@marge>
	
	
	
	
	
	
	
	<871v467ih3.fsf@uwakimon.sk.tsukuba.ac.jp>
	
	<4D3A1BC9.40604@v.loewis.de>
	<87y66b58uf.fsf@uwakimon.sk.tsukuba.ac.jp>
	
	<87pqrm64mz.fsf@uwakimon.sk.tsukuba.ac.jp>
	<4D3D4A3F.1020502@v.loewis.de>
	<87lj2a613p.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: 

On Mon, Jan 24, 2011 at 8:35 PM, Stephen J. Turnbull  wrote:
> First of all, these aren't just phones; these are all kinds of gadgets
> (the example I gave was a camera).  They're not as smart as an Android
> or iPhone-like device, and I don't know what OS they use.

We're getting a little far afield from the original question though -
once it was pointed out that non-ASCII module names already work on
some systems but not others, it became fairly clear that Victor's
patch is about fixing an existing feature to be more robust rather
than adding something new.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From list at qtrac.plus.com  Sun Jan 23 08:56:21 2011
From: list at qtrac.plus.com (Mark Summerfield)
Date: Sun, 23 Jan 2011 07:56:21 +0000
Subject: [Python-Dev] Beta version of the new devguide
In-Reply-To: 
References: 
Message-ID: <20110123075621.468d07c9@dino>

Hi Brett,

On Sat, 22 Jan 2011 17:08:00 -0800
Brett Cannon  wrote:
> http://docs.python.org/devguide/

Personally, I found the first paragraph of "Contributing" a bit
off-putting.

How about replacing:

    People who wish to contribute to Python must read the following
    documents in the order provided. You can stop where you feel
    comfortable and begin contributing immediately without reading and
    understanding these documents all at once, but please do not skip
    around within the documentation as everything is written assuming
    preceding documentation has been read.

With something like:

    The Python core development team always welcomes new contributors,
    so we are very glad of your interest! Please read the following
    documents---in the order shown---to ensure that you understand how
    Python's development process works. This will ensure that your
    contributions are considered purely on their merit and don't get
    rejected due to missing or incorrectly performing a step in the
    process.

In "Getting Set Up" it describes how to build a pydebug build. Is that
really necessary for those who plan only to contribute by working on
pure Python code?

I had a quick skim over the rest and got the feeling that no clear
distinction is made between C and Python work. Personally, I feel that
more of a distinction should be made since not everyone will be
confident or interested in C. (And maybe more distinction should be made
between working on CPython and the standard library?)

Overall I think this document is *extremely welcome* and I am very glad
you have done it. I'm sure that once it starts to get known it will help
add to the pool of people contributing to Python as well as helping to
keep the processes clear:-)

-- 
Mark Summerfield, Qtrac Ltd, www.qtrac.eu
    C++, Python, Qt, PyQt - training and consultancy
        "Advanced Qt Programming" - ISBN 0321635906
            http://www.qtrac.eu/aqpbook.html

From solipsis at pitrou.net  Mon Jan 24 12:29:58 2011
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Mon, 24 Jan 2011 12:29:58 +0100
Subject: [Python-Dev] Beta version of the new devguide
References: 
	<20110123075621.468d07c9@dino>
	
	
Message-ID: <20110124122958.589246ed@pitrou.net>

On Mon, 24 Jan 2011 20:33:07 +1000
Nick Coghlan  wrote:
> On Mon, Jan 24, 2011 at 6:22 AM, Brett Cannon  wrote:
> >> In "Getting Set Up" it describes how to build a pydebug build. Is that
> >> really necessary for those who plan only to contribute by working on
> >> pure Python code?
> >>
> >
> > Yes, there is actually a laundry list of reasons even people only
> > working on the stdlib should use a pydebug build.
> 
> And one big reason why I don't unless I have a specific need to check
> something with it - it makes the already quite long running time for
> the full test suite take even longer :)

Please try the -j option to regrtest.

Regards

Antoine.



From earney at umsystem.edu  Mon Jan 24 14:56:56 2011
From: earney at umsystem.edu (Earney, Billy C.)
Date: Mon, 24 Jan 2011 07:56:56 -0600
Subject: [Python-Dev] tahoe-lafs
Message-ID: <59ACE054B23B6045ACC43DBD778609BD6867D17257@UM-EMAIL02.um.umsystem.edu>

Greetings!

I know that this list is for python development questions/comments, but I wanted to bring up the tahoe-lafs project if people are interested in a project developed in python that allows for secure distributed storage.  For more information see http://tahoe-lafs.org

For those of you interested in joining a tahoe-lafs storage grid, I'm a member of a newly created storage grid called volunteer-grid2, and we are currently looking for new members.  The requirements to be a member can be viewed at http://bigpig.org/twiki/bin/view/Main/AboutVolunteerGrid2

Billy Earney
earney at umsystem.edu
Programmer/Analyst-Expert
MySQL Certified DBA

Office of Social and Economic Data Analysis (OSEDA)
University of Missouri
Phone: 573-882-7396
Fax: 573-884-4635


From solipsis at pitrou.net  Mon Jan 24 15:14:27 2011
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Mon, 24 Jan 2011 15:14:27 +0100
Subject: [Python-Dev] tahoe-lafs
References: <59ACE054B23B6045ACC43DBD778609BD6867D17257@UM-EMAIL02.um.umsystem.edu>
Message-ID: <20110124151427.3526dd1a@pitrou.net>

On Mon, 24 Jan 2011 07:56:56 -0600
"Earney, Billy C."  wrote:
> Greetings!
> 
> I know that this list is for python development questions/comments, but I wanted to bring up the tahoe-lafs project [...]

You should really post such messages to comp.lang.python.



From ncoghlan at gmail.com  Mon Jan 24 16:33:04 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 25 Jan 2011 01:33:04 +1000
Subject: [Python-Dev] Beta version of the new devguide
In-Reply-To: <20110124122958.589246ed@pitrou.net>
References: 
	<20110123075621.468d07c9@dino>
	
	
	<20110124122958.589246ed@pitrou.net>
Message-ID: 

On Mon, Jan 24, 2011 at 9:29 PM, Antoine Pitrou  wrote:
> On Mon, 24 Jan 2011 20:33:07 +1000
> Nick Coghlan  wrote:
>> On Mon, Jan 24, 2011 at 6:22 AM, Brett Cannon  wrote:
>> >> In "Getting Set Up" it describes how to build a pydebug build. Is that
>> >> really necessary for those who plan only to contribute by working on
>> >> pure Python code?
>> >>
>> >
>> > Yes, there is actually a laundry list of reasons even people only
>> > working on the stdlib should use a pydebug build.
>>
>> And one big reason why I don't unless I have a specific need to check
>> something with it - it makes the already quite long running time for
>> the full test suite take even longer :)
>
> Please try the -j option to regrtest.

While I must admit I'm still not in the habit of running tests in
parallel, that's a substantial speed improvement regardless of build
type, so a non-debug build is still noticeably faster.

release (with -j4): 2 min 25 sec (3 min wall clock time)
pydebug (with -j4): 4 min 43 sec (10 min wall clock time)

Given that I typically *don't* need the extra info from a debug build
to analyse problems and a full configure and rebuild cycle takes less
time than a single pydebug test run, I'll happily stick with the much
faster test execution that comes from using a release build.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ncoghlan at gmail.com  Mon Jan 24 16:35:05 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 25 Jan 2011 01:35:05 +1000
Subject: [Python-Dev] tahoe-lafs
In-Reply-To: <59ACE054B23B6045ACC43DBD778609BD6867D17257@UM-EMAIL02.um.umsystem.edu>
References: <59ACE054B23B6045ACC43DBD778609BD6867D17257@UM-EMAIL02.um.umsystem.edu>
Message-ID: 

On Mon, Jan 24, 2011 at 11:56 PM, Earney, Billy C. wrote:

> Greetings!
>
>
>
> I know that this list is for python development questions/comments,
>

People that post questions innocently unaware of the nature of this list
have an excuse.

You don't.

This is not a good way to encourage people to think well of you or your
project.

Regards,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From solipsis at pitrou.net  Mon Jan 24 16:46:01 2011
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Mon, 24 Jan 2011 16:46:01 +0100
Subject: [Python-Dev] Beta version of the new devguide
References: 
Message-ID: <20110124164601.6b984c6c@pitrou.net>

On Sat, 22 Jan 2011 17:08:00 -0800
Brett Cannon  wrote:
> 
> Two, what should the final URL be? Georg picked the current one and I
> am happy with it.

Ditto for me.

> Three, where should it be linked from? docs.python.org homepage?
> Four, what to do with www.python.org/dev/? Redirect for all the pages?

Right, this whole area (wpo/dev) looks obsolete to me. The devguide
allows us to easily edit and improve development-related docs, which is
great! It should be accessible easily from the main site. Perhaps
"core development" should be renamed "contributing" and redirect to the
devguide. Also, the submenu displayed below "core development" can be
trimmed dramatically.

(then there's the question of whether the devguide should be
exhaustive; should it contain reference-like material about
all aspects of core development?)

Regards

Antoine.



From vstinner at edenwall.com  Mon Jan 24 16:39:39 2011
From: vstinner at edenwall.com (Victor Stinner)
Date: Mon, 24 Jan 2011 16:39:39 +0100
Subject: [Python-Dev] Import and unicode: part two
In-Reply-To: <87lj2a613p.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <1295440442.432.18.camel@marge> <4D3D4A3F.1020502@v.loewis.de>
	<87lj2a613p.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <201101241639.39345.vstinner@edenwall.com>

On Monday, 24 January 2011 11:35:22, Stephen J. Turnbull wrote:
> ... VFAT-formatted file systems and Shift JIS file names ...

I missed something: VFAT stores filenames as unicode (whereas FAT only
supports byte filenames). Well, VFAT stores filenames twice: as an 8+3 byte
string and as a unicode (UTF-16-LE) string of up to 255 characters.

On which OS do you access this VFAT file system? On Windows, you have two
APIs: bytes (*A) and wide character (*W). If you use the wide character API,
there is no explicit encoding at all. Linux has two mount options to control
unicode on a VFAT filesystem: "codepage" for the byte filenames (use Shift
JIS here) and "iocharset" for the unicode filenames (I don't understand this
option). Anyway, both systems support unicode filenames.

I suppose that Shift JIS is used to encode the filename in the 8+3 byte string 
form.
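
From Python, the same bytes/unicode split is visible in the os module: pass
a str path and you get str names back, pass a bytes path and you get bytes
names. A minimal sketch, using a temporary directory and an ASCII file name
so that it runs anywhere (list_both_ways is a made-up helper; how these
calls map onto the Windows *A and *W functions depends on the Python
version, so treat this as an illustration only):

```python
import os
import tempfile


def list_both_ways(directory):
    """List a directory through the str API and through the bytes API."""
    text_names = os.listdir(directory)               # str in, str out
    byte_names = os.listdir(os.fsencode(directory))  # bytes in, bytes out
    return sorted(text_names), sorted(byte_names)


with tempfile.TemporaryDirectory() as tmp:
    # Create one empty file to list.
    open(os.path.join(tmp, "module.py"), "w").close()
    text_names, byte_names = list_both_ways(tmp)
    print(text_names, byte_names)
```

os.fsdecode() and os.fsencode() convert between the two forms using the
file system encoding, which is where a Shift JIS versus UTF-8 disagreement
would show up.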

Victor

From spoettl at hotmail.com  Mon Jan 24 17:39:54 2011
From: spoettl at hotmail.com (Stefan Spoettl)
Date: Mon, 24 Jan 2011 16:39:54 +0000
Subject: [Python-Dev] (no subject)
Message-ID: 


Using: Python 2.7.0+ (r27:82500, Sep 15 2010, 18:14:55) [GCC 4.4.5] on linux2
(Ubuntu 10.10)

Method to reproduce error:

1. Defining a module which is later imported by another:
---------------------------------------------------------------------
class SomeThing:

    def __init__(self):
        self.variable = 'Where is my bytecode?'

    def deliver(self):
        return self.variable

if __name__ == '__main__':
    obj = SomeThing()
    print obj.deliver()
---------------------------------------------------------------------

2. Run this module:
Output of the Python Shell: Where is my bytecode?
>>>

3. Defining the importing module:
---------------------------------------------------------------------
import SomeThing

class UseSomeThing:

    def __init__(self, something):
        self.anything = something

    def giveanything(self):
        return self.anything

if __name__ == '__main__':
    anything = UseSomeThing(SomeThing.SomeThing().deliver()).giveanything()
    print anything
---------------------------------------------------------------------

4. Run this module:
Output of the Python Shell: Where is my bytecode?
>>>
(One can find SomeThing.pyc on the disc.)

5. Changing the imported module:
---------------------------------------------------------------------
class SomeThing:

    def __init__(self):
        self.variable = 'What the hell is this? It could not be Python!'

    def deliver(self):
        return self.variable

if __name__ == '__main__':
    obj = SomeThing()
    print obj.deliver()
---------------------------------------------------------------------

6. Run the changed module:
Output of the Python Shell: What the hell is this? It could not be Python!
>>>

7. Run the importing module again:
Output of the Python Shell: Where is my bytecode?
>>>

8. Deleting the bytecode of the imported module has no effect!

Remark: I think that I observed a similar effect late last night on Windows XP
with Python 2.7.1 and Python 3.1.3. But when I tried it out this morning the
error didn't appear. So it may be that the Python interpreter isn't working
correctly only on Ubuntu 10.10.

From victor.stinner at haypocalc.com  Mon Jan 24 17:51:42 2011
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Mon, 24 Jan 2011 17:51:42 +0100
Subject: [Python-Dev] Import and unicode: part two
In-Reply-To: <201101241639.39345.vstinner@edenwall.com>
References: <1295440442.432.18.camel@marge>
	<87lj2a613p.fsf@uwakimon.sk.tsukuba.ac.jp>
	<201101241639.39345.vstinner@edenwall.com>
Message-ID: <201101241751.42470.victor.stinner@haypocalc.com>

On Monday 24 January 2011 16:39:39, Victor Stinner wrote:
> On Monday 24 January 2011 11:35:22, Stephen J. Turnbull wrote:
> > ... VFAT-formatted file systems and Shift JIS file names ...
> 
> I missed something: VFAT stores filenames as unicode (whereas FAT only
> supports byte filenames). Well, VFAT stores filenames twice: as a 8+3 byte
> strings and as a 255 unicode (UTF-16-LE) string (UTF-16-LE).
> 
> On which OS do you access this VFAT file system? On Windows, you have two
> APIs: bytes (*A) and wide character (*W). If you use the wide character,
> there is explicit encoding at all.

Oops, I meant: there is *no* explicit encoding at all.

Victor

From earney at umsystem.edu  Mon Jan 24 17:18:16 2011
From: earney at umsystem.edu (Earney, Billy C.)
Date: Mon, 24 Jan 2011 10:18:16 -0600
Subject: [Python-Dev] tahoe-lafs
In-Reply-To: 
References: <59ACE054B23B6045ACC43DBD778609BD6867D17257@UM-EMAIL02.um.umsystem.edu>
	
Message-ID: <59ACE054B23B6045ACC43DBD778609BD6867D1731B@UM-EMAIL02.um.umsystem.edu>

I want to make it clear that I am in no way associated with the tahoe-lafs project.  I do not want my email to make that project look bad.  That was not my intention.

Billy Earney
earney at umsystem.edu
Programmer/Analyst-Expert
MySQL Certified DBA

Office of Social and Economic Data Analysis (OSEDA)
University of Missouri
Phone: 573-882-7396
Fax: 573-884-4635

From: Nick Coghlan [mailto:ncoghlan at gmail.com]
Sent: Monday, January 24, 2011 9:35 AM
To: Earney, Billy C.
Cc: Python-Dev at python.org
Subject: Re: [Python-Dev] tahoe-lafs

On Mon, Jan 24, 2011 at 11:56 PM, Earney, Billy C. wrote:
Greetings!

I know that this list is for python development questions/comments,

People that post questions innocently unaware of the nature of this list have an excuse.

You don't.

This is not a good way to encourage people to think well of you or your project.

Regards,
Nick.

--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From phd at phdru.name  Mon Jan 24 17:49:12 2011
From: phd at phdru.name (Oleg Broytman)
Date: Mon, 24 Jan 2011 19:49:12 +0300
Subject: [Python-Dev] Import and unicode: part two
In-Reply-To: <201101241639.39345.vstinner@edenwall.com>
References: <1295440442.432.18.camel@marge> <4D3D4A3F.1020502@v.loewis.de>
	<87lj2a613p.fsf@uwakimon.sk.tsukuba.ac.jp>
	<201101241639.39345.vstinner@edenwall.com>
Message-ID: <20110124164912.GA9307@iskra.aviel.ru>

On Mon, Jan 24, 2011 at 04:39:39PM +0100, Victor Stinner wrote:
> I missed something: VFAT stores filenames as unicode (whereas FAT only 
> supports byte filenames). Well, VFAT stores filenames twice: as a 8+3 byte 
> strings and as a 255 unicode (UTF-16-LE) string (UTF-16-LE).
> 
> On which OS do you access this VFAT file system? On Windows, you have two 
> APIs: bytes (*A) and wide character (*W). If you use the wide character, there 
> is explicit encoding at all. Linux has two mount options to control unicode on 
> a VFAT filesystem: "codepage" for the byte filenames (use Shift JIS here) and 
> "iocharset" for the unicode filenames (I don't understand this option). 

   AFAIU, `codepage` is "remote charset" while `iocharset` is "local
charset". I.e., to mount windows-1251 filesystem to my linux with koi8-r
locale I use codepage=cp866,iocharset=koi8-r (cp866 is OEM encoding for
cp1251 ANSI).

Oleg.
-- 
     Oleg Broytman            http://phdru.name/            phd at phdru.name
           Programmers don't die, they just GOSUB without RETURN.

From phd at phdru.name  Mon Jan 24 18:18:00 2011
From: phd at phdru.name (Oleg Broytman)
Date: Mon, 24 Jan 2011 20:18:00 +0300
Subject: [Python-Dev] (no subject)
In-Reply-To: 
References: 
Message-ID: <20110124171800.GB9307@iskra.aviel.ru>

On Mon, Jan 24, 2011 at 04:39:54PM +0000, Stefan Spoettl wrote:
> So it may be that the Python interpreter isn't working correctly onlyon Ubuntu 10.10

   Then you should report the problem to the Ubuntu developers, right?
And it would be nice if you investigate deeper and send a proper mail -
with a subject, with a properly formatted text, not html.

http://www.catb.org/~esr/faqs/smart-questions.html
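
   When investigating, one thing worth ruling out first is the module
cache. Within a single interpreter session (for instance an IDE shell that
is not restarted between runs), a module that has already been imported is
served from sys.modules, so editing its source or deleting its .pyc
changes nothing until the module is reloaded. This is only a guess at the
cause, not something confirmed by the report. A minimal sketch in Python 3
syntax follows; the module name something_demo and the scratch directory
are made up for illustration (in Python 2.7 the builtin reload() plays the
role of importlib.reload()):

```python
import importlib
import os
import sys
import tempfile

# Hypothetical module written to a scratch directory for the demonstration.
workdir = tempfile.mkdtemp()
module_path = os.path.join(workdir, "something_demo.py")


def write_module(message):
    """(Re)write the demo module so that it exposes MESSAGE."""
    with open(module_path, "w") as source:
        source.write("MESSAGE = %r\n" % message)


write_module("Where is my bytecode?")
sys.path.insert(0, workdir)
importlib.invalidate_caches()

import something_demo
first = something_demo.MESSAGE

# Change the source on disk (different length, so cached bytecode is stale).
write_module("The source file was edited after the first import.")

# A second import statement does not re-read the file: the module object
# is simply looked up in sys.modules.
import something_demo
second = something_demo.MESSAGE

# Only an explicit reload re-executes the changed source.
importlib.reload(something_demo)
third = something_demo.MESSAGE

print(first)
print(second)
print(third)
```

   If the stale output disappears after restarting the shell, the
interpreter is behaving as designed and there is nothing to report.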

Oleg.
-- 
     Oleg Broytman            http://phdru.name/            phd at phdru.name
           Programmers don't die, they just GOSUB without RETURN.

From brett at python.org  Mon Jan 24 18:33:04 2011
From: brett at python.org (Brett Cannon)
Date: Mon, 24 Jan 2011 09:33:04 -0800
Subject: [Python-Dev] (no subject)
In-Reply-To: 
References: 
Message-ID: 

Bug reports should be filed at bugs.python.org

On Mon, Jan 24, 2011 at 08:39, Stefan Spoettl wrote:
> Using:
> Python 2.7.0+ (r27:82500, Sep 15 2010, 18:14:55)
> [GCC 4.4.5] on linux2
> (Ubuntu 10.10)
> Method to reproduce error:
> 1. Defining a module which is later imported by another:
> ---------------------------------------------------------------------
> class SomeThing:
>     def __init__(self):
>         self.variable = 'Where is my bytecode?'
>     def deliver(self):
>         return self.variable
>
> if __name__ == '__main__':
>     obj = SomeThing()
>     print obj.deliver()
> ---------------------------------------------------------------------
> 2. Run this module:
> Output of the Python Shell: Where is my bytecode?
> >>>
> 3. Defining the importing module:
> ---------------------------------------------------------------------
> class UseSomeThing:
>     def __init__(self, something):
>         self.anything = something
>     def giveanything(self):
>         return self.anything
>
> if __name__ == '__main__':
>     anything = UseSomeThing(SomeThing.SomeThing().deliver()).giveanything()
>     print anything
> ---------------------------------------------------------------------
> 4. Run this module:
> Output of the Python Shell: Where is my bytecode
> >>>
> (One can find SomeThing.pyc on the disc.)
> 5. Changing the imported module:
> ---------------------------------------------------------------------
> class SomeThing:
>     def __init__(self):
>         self.variable = 'What the hell is this? It could not be Python!'
>     def deliver(self):
>         return self.variable
>
> if __name__ == '__main__':
>     obj = SomeThing()
>     print obj.deliver()
> ---------------------------------------------------------------------
> 6. Run the changed module:
> Output of the Python Shell: What the hell is this? It could not be Python!
> >>>
> 7. Run the importing module again:
> Output of the Python Shell: Where is my bytecode?
> >>>
> 8. Deleting the bytecode of the imported module makes no effect!
> Remark: I think that I have observed yesterday late night a similar effect
> on Windows XP
> with Python 2.7.1 and Python 3.1.3. But when I have tried it out today in
> the morning the
> error hasn't appeared. So it may be that the Python interpreter isn't
> working correctly only
> on Ubuntu 10.10.
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> http://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe:
> http://mail.python.org/mailman/options/python-dev/brett%40python.org
>
>

From brett at python.org  Mon Jan 24 19:38:45 2011
From: brett at python.org (Brett Cannon)
Date: Mon, 24 Jan 2011 10:38:45 -0800
Subject: [Python-Dev] Beta version of the new devguide
In-Reply-To: 
References: 
	<20110123075621.468d07c9@dino>
	
	
	<20110124122958.589246ed@pitrou.net>
	
Message-ID: 

On Mon, Jan 24, 2011 at 07:33, Nick Coghlan  wrote:
> On Mon, Jan 24, 2011 at 9:29 PM, Antoine Pitrou  wrote:
>> On Mon, 24 Jan 2011 20:33:07 +1000
>> Nick Coghlan  wrote:
>>> On Mon, Jan 24, 2011 at 6:22 AM, Brett Cannon  wrote:
>>> >> In "Getting Set Up" it describes how to build a pydebug build. Is that
>>> >> really necessary for those who plan only to contribute by working on
>>> >> pure Python code?
>>> >>
>>> >
>>> > Yes, there is actually a laundry list of reasons even people only
>>> > working on the stdlib should use a pydebug build.
>>>
>>> And one big reason why I don't unless I have a specific need to check
>>> something with it - it makes the already quite long running time for
>>> the full test suite take even longer :)
>>
>> Please try the -j option to regrtest.
>
> While I must admit I'm still not in the habit of running tests in
> parallel, that's a substantial speed improvement regardless of build
> type, so a non-debug build is still noticeably faster.
>
> release (with -j4): 2 min 25 sec (3 min wall clock time)
> pydebug (with -j4): 4 min 43 sec (10 min wall clock time)
>

If you think that's slow, try running it under coverage single-threaded. =)

> Given that I typically *don't* need the extra info from a debug build
> to analyse problems and a full configure and rebuild cycle takes less
> time than a single pydebug test run, I'll happily stick with the much
> faster test execution that comes from using a release build.
>

I'm not going to drag on arguing this point, but there is more to
pydebug builds than some debug info when working in the C code. For
instance, pure Python code can still trigger problems indirectly in C
code which gets picked up by a pydebug build. You also have ResourceWarnings
now which are almost exclusively triggered by pure Python code.
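
As an illustration of the ResourceWarning point, here is a minimal sketch
of pure Python code that triggers one by leaking a file object. The
scratch file and variable names are made up for the example. As far as I
understand, release builds ignore this warning category by default while
pydebug builds show it, which is why the sketch turns the filter on
explicitly:

```python
import gc
import os
import tempfile
import warnings

# Scratch file used only for this demonstration.
fd, path = tempfile.mkstemp()
os.close(fd)

with warnings.catch_warnings(record=True) as caught:
    # Release builds ignore ResourceWarning by default, so enable it here.
    warnings.simplefilter("always")
    leaked = open(path, "rb")   # opened but never closed
    del leaked                  # dropping the last reference finalizes it
    gc.collect()                # for implementations without refcounting

resource_warnings = [w for w in caught
                     if issubclass(w.category, ResourceWarning)]
os.remove(path)
print(len(resource_warnings))
```

Nothing in that snippet touches C code directly, yet the warning comes
from the interpreter noticing the unclosed file.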

My point is there is more to a pydebug build than just direct
debugging support for C code. But if running the test suite w/o a
debug build is what it takes to get people to run the test suite I
will take that over not running it at all.

From brett at python.org  Mon Jan 24 19:43:01 2011
From: brett at python.org (Brett Cannon)
Date: Mon, 24 Jan 2011 10:43:01 -0800
Subject: [Python-Dev] Beta version of the new devguide
In-Reply-To: <20110124164601.6b984c6c@pitrou.net>
References: 
	<20110124164601.6b984c6c@pitrou.net>
Message-ID: 

On Mon, Jan 24, 2011 at 07:46, Antoine Pitrou  wrote:
> On Sat, 22 Jan 2011 17:08:00 -0800
> Brett Cannon  wrote:
>>
>> Two, what should the final URL be? Georg picked the current one and I
>> am happy with it.
>
> Ditto for me.
>
>> Three, where should it be linked from? docs.python.org homepage?

Either there and/or the www.python.org homepage.

>> Four, what to do with www.python.org/dev/? Redirect for all the pages?
>
> Right, this whole area (wpo/dev) looks obsolete to me. The devguide
> allows us to easily edit and improve development-related docs, which is
> great! It should be accessible easily from the main site. Perhaps
> "core development" should be renamed "contributing" and redirect to the
> devguide. Also, the submenu displayed below "core development" can be
> trimmed dramatically.

There actually shouldn't be anything at python.org/dev that is useful
which has not been rewritten or linked to from the devguide. So that
whole page can be heavily gutted to the point of probably being
nothing more than a link to the devguide and a link to PEP 0. But
I will hold off on the gutting until the devguide is "released";
probably end of the week.

>
> (then there's the question of whether the devguide should be
> exhaustive; should it contain reference-like material about
> all aspects of core development?)

That's where the balancing act comes in. If we get too exhaustive then
we have to constantly update the docs anytime we make a change. If we
leave it too loose then someone is going to come along and potentially
waste some time on something because they didn't realize what they
should have been doing.

From raymond.hettinger at gmail.com  Mon Jan 24 20:04:06 2011
From: raymond.hettinger at gmail.com (Raymond Hettinger)
Date: Mon, 24 Jan 2011 11:04:06 -0800
Subject: [Python-Dev] Keeping __init__.py empty for Python packages used for
	module grouping.
Message-ID: <854DFFA6-1BAD-41D7-BBC7-6906C606774A@gmail.com>

Looking at http://docs.python.org/dev/library/html.html#module-html it would appear that we've created a new module with a single trivial function.

In reality, there was already a python package, html, that served to group two loosely related modules, html.parser and html.entities.

ISTM, that if we're going to use python packages as "namespace containers" for categorizing modules, then the top level __init__ namespace should be left empty.

Before the placement of html.escape() becomes set in stone, I think we should consider putting it somewhere else.
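
For reference, a short sketch of how the package looks to a user with the
function at the top level. It only uses names that exist in 3.2
(html.escape, html.parser, html.entities); the sample string is arbitrary:

```python
import html
import html.entities
import html.parser

# The single function that lives in html/__init__.py.
escaped = html.escape('<b>"x" & y</b>')
print(escaped)

# The two pre-existing submodules the package was grouping.
print(html.entities.name2codepoint["amp"])
print(html.parser.HTMLParser.__name__)
```

With an empty __init__.py, the first call would instead have to be spelled
through whatever submodule ends up holding escape().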


Raymond


From g.brandl at gmx.net  Mon Jan 24 20:18:07 2011
From: g.brandl at gmx.net (Georg Brandl)
Date: Mon, 24 Jan 2011 20:18:07 +0100
Subject: [Python-Dev] Keeping __init__.py empty for Python packages used
 for module grouping.
In-Reply-To: <854DFFA6-1BAD-41D7-BBC7-6906C606774A@gmail.com>
References: <854DFFA6-1BAD-41D7-BBC7-6906C606774A@gmail.com>
Message-ID: 

On 24.01.2011 20:04, Raymond Hettinger wrote:
> Looking at http://docs.python.org/dev/library/html.html#module-html it would
> appear that we've created a new module with a single trivial function.
> 
> In reality, there was already a python package, html, that served to group
> two loosely related modules, html.parser and html.entities.
> 
> ISTM, that if we're going to use python packages as "namespace containers"
> for categorizing modules, then the top level __init__ namespace should be
> left empty.
> 
> Before the placement of html.escape() becomes set in stone, I think we should
> consider putting it somewhere else.

To be honest, I don't see the issue.  I don't see stdlib packages as
"namespace containers", but rather as a nice way of structuring functionality.
And remember that flat is better than nested -- why should escape() be put
away into a new submodule?

At least you'll need to let us know where you would rather put that function.

Georg


From raymond.hettinger at gmail.com  Mon Jan 24 20:46:45 2011
From: raymond.hettinger at gmail.com (Raymond Hettinger)
Date: Mon, 24 Jan 2011 11:46:45 -0800
Subject: [Python-Dev] Location of tests for packages
Message-ID: 

Right now, the tests for the unittest package are under the package directory instead of Lib/test where we have most of the other tests.

There are some other packages that do the same thing, each for their own reason.

I think we should develop a strong preference for tests going under Lib/test unless there is a very compelling reason.  We already have a similar preference for all Docs going under Doc/ and that has not proved to be an issue with any package maintainer.

* The Windows distro has an install option to exclude Lib/test.  The current situation with unittest works against it.
* The commingling of tests with the regular code is making it more difficult to grep code while excluding tests.
* Having packages create their own little worlds within the world is making it more difficult to find things.
* For regrtest to work, there still needs to be some file in Lib/test that dispatches to the alternate test directory.
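
The last point can be sketched roughly as follows. This is not the actual
file shipped in Lib/test; the helper name load_tests_from, the scratch
directory and the sample test module are assumptions made for the example.
It relies only on unittest.TestLoader.discover(), which exists in 2.7 and
3.2:

```python
import os
import tempfile
import unittest


def load_tests_from(package_test_dir, pattern="test*.py"):
    """Collect the tests that live next to a package instead of in Lib/test.

    A dispatching module under Lib/test would call something like this
    with the package's own test directory.
    """
    loader = unittest.TestLoader()
    return loader.discover(start_dir=package_test_dir, pattern=pattern)


# Stand-in for a package-local test directory (hypothetical content).
demo_dir = tempfile.mkdtemp()
with open(os.path.join(demo_dir, "test_pkgdemo_sample.py"), "w") as handle:
    handle.write(
        "import unittest\n"
        "\n"
        "class SampleTest(unittest.TestCase):\n"
        "    def test_ok(self):\n"
        "        self.assertEqual(1 + 1, 2)\n"
    )

suite = load_tests_from(demo_dir)
result = unittest.TextTestRunner(verbosity=0).run(suite)
print(result.testsRun, result.wasSuccessful())
```

So even with tests kept inside the package, one extra file per package
still has to live in Lib/test.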

This isn't a critical issue (nothing is broken) but we're a week from another release candidate, so the new Py3.2 package organization (unittest was flat in Py3.1 and its tests were under Lib/test) is about to become a de-facto decision that will be hard to undo.

I recommend moving it under Lib/test before everything is set in stone.


Raymond


P.S.  I've discussed this with Michael and his preference is against going back to the Py3.1 style where the tests were under Lib/test.  He thinks the current tree makes it easier to sync with Py2.7 and the unittest2 third-party module.  Also, he likes grepping the regular source and tests all at once.

From g.brandl at gmx.net  Mon Jan 24 20:26:00 2011
From: g.brandl at gmx.net (Georg Brandl)
Date: Mon, 24 Jan 2011 20:26:00 +0100
Subject: [Python-Dev] What's new 2.x in 3.x docs.
In-Reply-To: 
References: 	<60D93B3F-FF28-4BFF-AD77-F2316F5BAC53@gmail.com>
	
Message-ID: 

On 23.01.2011 02:48, Nick Coghlan wrote:
> On Sun, Jan 23, 2011 at 7:23 AM, Raymond Hettinger
>  wrote:
>> On Jan 22, 2011, at 11:04 AM, Terry Reedy wrote:
>>
>>> The 3.x docs mostly started fresh with 3.0. The major exception is the What's new section, which goes back to 2.0. The 2.x stuff comprises about 650KB in the repository and whatever that translates into in the distribution. I cannot imagine that anyone who only has 3.x and no 2.x version would have any interest in the 2.x history. And of course, the complete 2.x history will always be available with the latest 2.7.z. And the cover page for 3.x could even say so and include a link. So why not remove it from the 3.2 release (and have two separate pages for the online version)?
>>
>> I think there is value in the older whatsnew docs.  They provide a readable introduction to various features and nicely augment the plain docs which can be a little dry.
>>
>> +1 for keeping the links as-is.  Removing them takes away a resource and gains nothing.
> 
> They're also a useful resource when developing compatibility guides
> for projects that target older versions (including ones that support
> py3k via 2to3).
> 
> With the latest 3.x release always being at the top, I agree with
> Raymond that retaining the history is a better option.

Agreed.

Georg


From fdrake at acm.org  Mon Jan 24 21:14:44 2011
From: fdrake at acm.org (Fred Drake)
Date: Mon, 24 Jan 2011 15:14:44 -0500
Subject: [Python-Dev] Keeping __init__.py empty for Python packages used
 for module grouping.
In-Reply-To: <854DFFA6-1BAD-41D7-BBC7-6906C606774A@gmail.com>
References: <854DFFA6-1BAD-41D7-BBC7-6906C606774A@gmail.com>
Message-ID: 

On Mon, Jan 24, 2011 at 2:04 PM, Raymond Hettinger
 wrote:
> ISTM, that if we're going to use python packages as "namespace containers" for
> categorizing modules, then the top level __init__ namespace should be left empty.

This is only an issue if the separate components are distributed
separately; for the standard library, we're not using it as a
namespace package in the same sense that is done with (for example)
the "zope" package.


  -Fred

--
Fred L. Drake, Jr.
"A storm broke loose in my mind."  --Albert Einstein

From merwok at netwok.org  Mon Jan 24 21:21:07 2011
From: merwok at netwok.org (=?UTF-8?B?w4lyaWMgQXJhdWpv?=)
Date: Mon, 24 Jan 2011 21:21:07 +0100
Subject: [Python-Dev] Location of tests for packages
In-Reply-To: 
References: 
Message-ID: <4D3DDF33.9020702@netwok.org>

> Right now, the tests for the unittest package are under the
> package directory instead of Lib/test where we have most of the
> other tests.
> 
> There are some other packages that do the same thing, each for
> their own reason.

The corresponding bug report is #10572 (opened by Michael Foord).

R. David Murray was +1 for moving email tests, Barry deferred to him,
Brett was +0 for importlib, and I was ±0 for distutils.  Maintainers of
ctypes, json, lib2to3 and sqlite3 haven't yet expressed themselves.

Regards

From martin at v.loewis.de  Mon Jan 24 21:17:34 2011
From: martin at v.loewis.de (=?ISO-8859-15?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 24 Jan 2011 21:17:34 +0100
Subject: [Python-Dev] PEP 393: Flexible String Representation
Message-ID: <4D3DDE5E.4080807@v.loewis.de>

I have been thinking about Unicode representation for some time now.
This was triggered, on the one hand, by discussions with Glyph Lefkowitz
(who complained that his server app consumes too much memory), and Carl
Friedrich Bolz (who profiled Python applications to determine that
Unicode strings are among the top consumers of memory in Python).
On the other hand, this was triggered by the discussion on supporting
surrogates in the library better.

I'd like to propose PEP 393, which takes a different approach,
addressing both problems simultaneously: by getting a flexible
representation (one that can be either 1, 2, or 4 bytes), we can
support the full range of Unicode on all systems, but still use
only one byte per character for strings that are pure ASCII (which
will be the majority of strings for the majority of users).

You'll find the PEP at

http://www.python.org/dev/peps/pep-0393/

For convenience, I include it below.

Regards,
Martin

PEP: 393
Title: Flexible String Representation
Version: $Revision: 88168 $
Last-Modified: $Date: 2011-01-24 21:14:21 +0100 (Mo, 24. Jan 2011) $
Author: Martin v. Löwis
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 24-Jan-2010
Python-Version: 3.3
Post-History:

Abstract
========

The Unicode string type is changed to support multiple internal
representations, depending on the character with the largest Unicode
ordinal (1, 2, or 4 bytes). This will allow a space-efficient
representation in common cases, but give access to full UCS-4 on all
systems. For compatibility with existing APIs, several representations
may exist in parallel; over time, this compatibility should be phased
out.

Rationale
=========

There are two classes of complaints about the current implementation
of the unicode type: on systems only supporting UTF-16, users complain
that non-BMP characters are not properly supported. On systems using
UCS-4 internally (and also sometimes on systems using UCS-2), there is
a complaint that Unicode strings take up too much memory - especially
compared to Python 2.x, where the same code would often use ASCII
strings (i.e. ASCII-encoded byte strings). With the proposed approach,
ASCII-only Unicode strings will again use only one byte per character;
while still allowing efficient indexing of strings containing non-BMP
characters (as strings containing them will use 4 bytes per
character).

One problem with the approach is support for existing applications
(e.g. extension modules). For compatibility, redundant representations
may be computed. Applications are encouraged to phase out reliance on
a specific internal representation if possible. As interaction with
other libraries will often require some sort of internal
representation, the specification chooses UTF-8 as the recommended way
of exposing strings to C code.

For many strings (e.g. ASCII), multiple representations may actually
share memory (e.g. the shortest form may be shared with the UTF-8 form
if all characters are ASCII). With such sharing, the overhead of
compatibility representations is reduced.

Specification
=============

The Unicode object structure is changed to this definition::

  typedef struct {
    PyObject_HEAD
    Py_ssize_t length;
    void *str;
    Py_hash_t hash;
    int state;
    Py_ssize_t utf8_length;
    void *utf8;
    Py_ssize_t wstr_length;
    void *wstr;
  } PyUnicodeObject;

These fields have the following interpretations:

- length: number of code points in the string (result of sq_length)
- str: shortest-form representation of the unicode string; the lower
  two bits of the pointer indicate the specific form:
  01 => 1 byte (Latin-1); 10 => 2 byte (UCS-2); 11 => 4 byte (UCS-4);
  00 => null pointer

  The string is null-terminated (in its respective representation).
- hash, state: same as in Python 3.2
- utf8_length, utf8: UTF-8 representation (null-terminated)
- wstr_length, wstr: representation in platform's wchar_t
  (null-terminated). If wchar_t is 16-bit, this form may use surrogate
  pairs (in which case wstr_length differs from length).
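
The tagging of the str pointer can be modelled with plain integers. This
sketch assumes the 2-byte form is tagged 10, so that the three forms have
distinct tags matching the kind values 1, 2 and 3 used under String
Access; it also assumes allocations are at least 4-byte aligned, so the
two low bits of a real address are always zero. All names are invented
for the illustration::

```python
KIND_MASK = 0b11
KIND_NULL, KIND_1BYTE, KIND_2BYTE, KIND_4BYTE = 0b00, 0b01, 0b10, 0b11


def tag_pointer(address, kind):
    """Store the kind in the two unused low bits of an aligned address."""
    if address & KIND_MASK:
        raise ValueError("address must be 4-byte aligned")
    return address | kind


def untag_pointer(tagged):
    """Split a tagged value back into (address, kind)."""
    return tagged & ~KIND_MASK, tagged & KIND_MASK


tagged = tag_pointer(0x7F00A040, KIND_2BYTE)
address, kind = untag_pointer(tagged)
print(hex(tagged), hex(address), kind)
```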

All three representations are optional, although the str form is
considered the canonical representation which can be absent only
while the string is being created.

The Py_UNICODE type is still supported but deprecated. It is always
defined as a typedef for wchar_t, so the wstr representation can double
as Py_UNICODE representation.

The str and utf8 pointers point to the same memory if the string uses
only ASCII characters (using only Latin-1 is not sufficient). The str
and wstr pointers point to the same memory if the string happens to
fit exactly to the wchar_t type of the platform (i.e. uses some
BMP-not-Latin-1 characters if sizeof(wchar_t) is 2, and uses some
non-BMP characters if sizeof(wchar_t) is 4).

If the string is created directly with the canonical representation
(see below), this representation doesn't take a separate memory block,
but is allocated right after the PyUnicodeObject struct.

String Creation
---------------

The recommended way to create a Unicode object is to use the function
PyUnicode_New::

   PyObject* PyUnicode_New(Py_ssize_t size, Py_UCS4 maxchar);

Both parameters must denote the eventual size/range of the strings.
In particular, codecs using this API must compute both the number of
characters and the maximum character in advance. A string is
allocated according to the specified size and character range and is
null-terminated; the actual characters in it may be uninitialized.
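
For a codec this means a scanning pass before the allocation. A Python
sketch of such a pass for UTF-8 input follows; it is an illustration
under the assumption that the input is well-formed (no validation is
done), and the name scan_utf8 is made up. The two results correspond to
the size and maxchar arguments of PyUnicode_New::

```python
def scan_utf8(data):
    """Return (number of code points, largest code point) for UTF-8 bytes.

    Assumes well-formed input; a real decoder would also have to reject
    truncated or overlong sequences.
    """
    size = 0
    maxchar = 0
    index = 0
    while index < len(data):
        lead = data[index]
        if lead < 0x80:
            length, codepoint = 1, lead
        elif lead >> 5 == 0b110:
            length, codepoint = 2, lead & 0x1F
        elif lead >> 4 == 0b1110:
            length, codepoint = 3, lead & 0x0F
        elif lead >> 3 == 0b11110:
            length, codepoint = 4, lead & 0x07
        else:
            raise ValueError("unexpected byte at offset %d" % index)
        # Fold in the payload bits of the continuation bytes.
        for continuation in data[index + 1:index + length]:
            codepoint = (codepoint << 6) | (continuation & 0x3F)
        size += 1
        maxchar = max(maxchar, codepoint)
        index += length
    return size, maxchar


print(scan_utf8("a\u20ac".encode("utf-8")))
```

Only after this pass would the decoder allocate the string and fill in the
characters in a second pass.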

PyUnicode_FromString and PyUnicode_FromStringAndSize remain supported
for processing UTF-8 input; the input is decoded, and the UTF-8
representation is not yet set for the string.

PyUnicode_FromUnicode remains supported but is deprecated. If the
Py_UNICODE pointer is non-null, the str representation is set. If the
pointer is NULL, a properly-sized wstr representation is allocated,
which can be modified until PyUnicode_Finalize() is called (explicitly
or implicitly). Resizing a Unicode string remains possible until it
is finalized.

PyUnicode_Finalize() converts a string containing only a wstr
representation into the canonical representation. Unless wstr and str
can share the memory, the wstr representation is discarded after the
conversion.

String Access
-------------

The canonical representation can be accessed using two macros
PyUnicode_Kind and PyUnicode_Data. PyUnicode_Kind gives one of the
values PyUnicode_1BYTE (1), PyUnicode_2BYTE (2), or PyUnicode_4BYTE
(3). PyUnicode_Data gives the void pointer to the data, masking out
the pointer kind. All these functions call PyUnicode_Finalize
in case the canonical representation hasn't been computed yet.

A new function PyUnicode_AsUTF8 is provided to access the UTF-8
representation. It is thus identical to the existing
_PyUnicode_AsString, which is removed. The function will compute the
utf8 representation when first called. Since this representation will
consume memory until the string object is released, applications
should use the existing PyUnicode_AsUTF8String where possible
(which generates a new string object every time). API that implicitly
converts a string to a char* (such as the ParseTuple functions) will
use this function to compute a conversion.

PyUnicode_AsUnicode is deprecated; it computes the wstr representation
on first use.

String Operations
-----------------

Various convenience functions will be provided to deal with the
canonical representation, in particular with respect to concatenation
and slicing.

Stable ABI
----------

None of the functions in this PEP become part of the stable ABI.

Copyright
=========

This document has been placed in the public domain.

From fdrake at acm.org  Mon Jan 24 21:28:21 2011
From: fdrake at acm.org (Fred Drake)
Date: Mon, 24 Jan 2011 15:28:21 -0500
Subject: [Python-Dev] Location of tests for packages
In-Reply-To: 
References: 
Message-ID: 

On Mon, Jan 24, 2011 at 2:46 PM, Raymond Hettinger wrote:
> P.S.  I've discussed this with Michael and his preference is against
> going back to the Py3.1 style where the tests were under Lib/test.  He
> thinks the current tree makes it easier to sync with Py2.7 and the
> unittest2 third-party module.  Also, he likes grepping the regular source
> and tests all at once.

I'm with Michael on this.

-1 on pushing all the tests into Lib/test/.


  -Fred

--
Fred L. Drake, Jr.
"A storm broke loose in my mind."  --Albert Einstein

From martin at v.loewis.de  Mon Jan 24 21:28:58 2011
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 24 Jan 2011 21:28:58 +0100
Subject: [Python-Dev] Import and unicode: part two
In-Reply-To: <201101241639.39345.vstinner@edenwall.com>
References: <1295440442.432.18.camel@marge>
	<4D3D4A3F.1020502@v.loewis.de>	<87lj2a613p.fsf@uwakimon.sk.tsukuba.ac.jp>
	<201101241639.39345.vstinner@edenwall.com>
Message-ID: <4D3DE10A.6020902@v.loewis.de>

On 24.01.2011 16:39, Victor Stinner wrote:
> On Monday 24 January 2011 11:35:22, Stephen J. Turnbull wrote:
>> ... VFAT-formatted file systems and Shift JIS file names ...
> 
> I missed something: VFAT stores filenames as unicode (whereas FAT only 
> supports byte filenames). Well, VFAT stores filenames twice: as a 8+3 byte 
> strings and as a 255 unicode (UTF-16-LE) string (UTF-16-LE).

Stephen may not have meant VFAT. Instead, he might have meant FAT32,
or, more likely, exFAT. VFAT is patented by Microsoft, so vendors of
devices using flash memory cards often don't support VFAT.

In any case, file names are encoded in the OEM code page even on VFAT.

> On which OS do you access this VFAT file system? On Windows, you have two 
> APIs: bytes (*A) and wide character (*W). If you use the wide character, there 
> is explicit encoding at all.

Right ("no explicit encoding"). However, this is actually where things
can go wrong: Windows needs to guess the file system, and will guess it
uses the OEM code page. If the device writing the file system uses a
different OEM code page than the Windows installation reading it, you
get mojibake.
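
A small Python sketch of that failure mode, assuming for illustration a
device that writes Shift JIS names and a reader whose OEM code page is
cp437 (both choices are just examples, as is the file name):

```python
# Katakana for "test" plus an extension, as the device would name the file.
name = "\u30c6\u30b9\u30c8.txt"

# What the device writes to disk, using its own OEM code page.
on_disk = name.encode("shift_jis")

# What a reader sees when it guesses a different OEM code page.
misread = on_disk.decode("cp437")

# No bytes were lost, so undoing the wrong guess recovers the name.
recovered = misread.encode("cp437").decode("shift_jis")

print(ascii(misread))
print(recovered == name)
```

The bytes on disk are fine; only the guess about their meaning is wrong.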

This will actually happen with the *A APIs as well: they do *not* give
you the file name from disk. Instead, Windows converts the OEM
characters on disk to Unicode, and then the Unicode characters to the
ANSI code page.

> Linux has two mount options to control unicode on 
> a VFAT filesystem: "codepage" for the byte filenames (use Shift JIS here) and 
> "iocharset" for the unicode filenames (I don't understand this option). 
> Anyway, both systems support unicode filenames.

Linux doesn't support "unicode file names". Instead, it can support
UTF-8. As Oleg explains: you need one encoding for the bytes on disk
(to know what they mean, when converted to Unicode), and one encoding
to then convert the "abstract" unicode to bytes again to present to
the application. This is similar to how *A works on Windows.

The iocharset is needed even if the file system is known to use UTF-16
(say, NTFS, VFAT, or Joliet).
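
A minimal sketch of the two conversions described above, assuming a
Shift JIS (cp932) device and a UTF-8 application. The helper name
present_name and the sample file name are hypothetical; this models
neither the Linux mount options nor the Win32 *A calls, only the
decode-then-encode idea:

```python
def present_name(on_disk: bytes, disk_encoding: str, app_encoding: str) -> bytes:
    """Decode the on-disk bytes to abstract Unicode, then re-encode them
    in the encoding the application expects."""
    return on_disk.decode(disk_encoding).encode(app_encoding)


name = "\u65e5\u672c\u8a9e.txt"      # hypothetical Japanese file name
on_disk = name.encode("cp932")       # as a Shift JIS device might store it

# Correct guess for the on-disk code page: the name survives the round trip.
good = present_name(on_disk, "cp932", "utf-8")

# Wrong guess (cp437, a US OEM code page): every byte still decodes,
# but to the wrong characters, which is how mojibake arises.
bad = present_name(on_disk, "cp437", "utf-8")
```

The second call does not fail; it silently yields a different name,
which is why a wrong code page guess is hard to notice.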

Regards,
Martin

From barry at python.org  Mon Jan 24 21:39:27 2011
From: barry at python.org (Barry Warsaw)
Date: Mon, 24 Jan 2011 15:39:27 -0500
Subject: [Python-Dev] Location of tests for packages
In-Reply-To: 
References: 
Message-ID: <20110124153927.78810848@python.org>

On Jan 24, 2011, at 11:46 AM, Raymond Hettinger wrote:

>P.S.  I've discussed this with Michael and his preference is against going
>back to the Py3.1 style where the tests were under Lib/test.  He thinks the
>current tree makes it easier to sync with Py2.7 and the unittest2 third-party
>module.  Also, he likes grepping the regular source and tests all at once.

Which seem like compelling reasons to keep things the way they are for
unittest, in addition to the fact that we're already in RC for 3.2, so you
would need RM approval to make such a change this late in the process.

I agree that it's not ideal, but for certain packages that are also
distributed separately, it can be much easier to keep the tests with the code,
and I'm inclined to defer to the primary maintainer's preference.

-Barry

From tseaver at palladion.com  Mon Jan 24 22:59:12 2011
From: tseaver at palladion.com (Tres Seaver)
Date: Mon, 24 Jan 2011 16:59:12 -0500
Subject: [Python-Dev] Keeping __init__.py empty for Python packages used
 for module grouping.
In-Reply-To: 
References: <854DFFA6-1BAD-41D7-BBC7-6906C606774A@gmail.com>
	
Message-ID: 

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 01/24/2011 03:14 PM, Fred Drake wrote:
> On Mon, Jan 24, 2011 at 2:04 PM, Raymond Hettinger
>  wrote:
>> ISTM, that if we're going to use python packages as "namespace containers" for
>> categorizing modules, then the top level __init__ namespace should be left empty.
> 
> This is only an issue if the separate components are distributed
> separately; for the standard library, we're not using it as a
> namespace package in the same sense that is done with (for example)
> the "zope" package.

It might matter if we want to enable third-party package installation
into a namespace also used by the stdlib:  ISTR that the 'xml' package
had such installs at one point.

If that pattern is a goal, having all versions of the namespace's
__init__.py empty of anything but the __path__-munging majyk /
boilerplate is required to make such installs work regardless of the
order of PYTHONPATH.
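
A minimal sketch of that boilerplate, assuming the pkgutil.extend_path
idiom is the "__path__-munging" meant here. The package name demo_ns,
the module names and the temporary directories are hypothetical and
exist only for this demonstration:

```python
import importlib
import os
import sys
import tempfile

# The only code each copy of the namespace's __init__.py contains.
BOILERPLATE = (
    "from pkgutil import extend_path\n"
    "__path__ = extend_path(__path__, __name__)\n"
)


def make_portion(root, package, module, body):
    """Create root/package/__init__.py (boilerplate only) plus one module."""
    pkg_dir = os.path.join(root, package)
    os.makedirs(pkg_dir)
    with open(os.path.join(pkg_dir, "__init__.py"), "w") as f:
        f.write(BOILERPLATE)
    with open(os.path.join(pkg_dir, module + ".py"), "w") as f:
        f.write(body)


base = tempfile.mkdtemp()
first = os.path.join(base, "first")
second = os.path.join(base, "second")
make_portion(first, "demo_ns", "alpha", "VALUE = 'alpha'\n")
make_portion(second, "demo_ns", "beta", "VALUE = 'beta'\n")

# Two separate installs of the same namespace, both on the search path.
sys.path[:0] = [first, second]
importlib.invalidate_caches()

alpha = importlib.import_module("demo_ns.alpha")
beta = importlib.import_module("demo_ns.beta")
```

Because both copies of __init__.py are identical, it should not matter
which directory is found first; swapping the two sys.path entries is
expected to give the same result.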


Tres.
- -- 
===================================================================
Tres Seaver          +1 540-429-0999          tseaver at palladion.com
Palladion Software   "Excellence by Design"    http://palladion.com
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAk099jAACgkQ+gerLs4ltQ7e4gCfbYJE8d8bNrX19zrzC4xvfA9Y
KkQAnA7niExvMqXtUBD/XwzZZ9EzHcBm
=/Q/Y
-----END PGP SIGNATURE-----


From solipsis at pitrou.net  Mon Jan 24 23:03:07 2011
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Mon, 24 Jan 2011 23:03:07 +0100
Subject: [Python-Dev] Location of tests for packages
References: 
Message-ID: <20110124230307.3e7a3ee2@pitrou.net>

On Mon, 24 Jan 2011 11:46:45 -0800
Raymond Hettinger  wrote:
> 
> This isn't a critical issue (nothing is broken) but we're a week from another release candidate, so the new Py3.2 package organization (unittest was flat in Py3.1 and its tests were under Lib/test) is about to become a de-facto decision that will be hard to undo.

Well can we stop being melodramatic? Tests are not part of the API and
so they are free to move whenever we want. No need to hold a release
candidate for that.

Regards

Antoine.



From solipsis at pitrou.net  Mon Jan 24 23:12:33 2011
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Mon, 24 Jan 2011 23:12:33 +0100
Subject: [Python-Dev] PEP 393: Flexible String Representation
References: <4D3DDE5E.4080807@v.loewis.de>
Message-ID: <20110124231233.79bed8eb@pitrou.net>

On Mon, 24 Jan 2011 21:17:34 +0100
"Martin v. Löwis"  wrote:
> I have been thinking about Unicode representation for some time now.
> This was triggered, on the one hand, by discussions with Glyph Lefkowitz
> (who complained that his server app consumes too much memory), and Carl
> Friedrich Bolz (who profiled Python applications to determine that
> Unicode strings are among the top consumers of memory in Python).
> On the other hand, this was triggered by the discussion on supporting
> surrogates in the library better.
> 
> I'd like to propose PEP 393, which takes a different approach,
> addressing both problems simultaneously: by getting a flexible
> representation (one that can be either 1, 2, or 4 bytes), we can
> support the full range of Unicode on all systems, but still use
> only one byte per character for strings that are pure ASCII (which
> will be the majority of strings for the majority of users).

For this kind of experiment, I think a concrete attempt at implementing
(together with performance/memory savings numbers) would be much more
useful than an abstract proposal. It is hard to judge the concrete
effects of the changes you are proposing, even though they might (or
not) make sense in theory. For example, you are adding a lot of
constant overhead to every unicode object, even very small ones, which
might be detrimental. Also, accessing the unicode object's payload
can become quite a bit more cumbersome. Only implementing can tell how
much this is workable in practice.

Regards

Antoine.



From benjamin at python.org  Mon Jan 24 23:13:38 2011
From: benjamin at python.org (Benjamin Peterson)
Date: Mon, 24 Jan 2011 16:13:38 -0600
Subject: [Python-Dev] Location of tests for packages
In-Reply-To: <4D3DDF33.9020702@netwok.org>
References: 
	<4D3DDF33.9020702@netwok.org>
Message-ID: 

2011/1/24 Éric Araujo :
>> Right now, the tests for the unittest package are under the
>> package directory instead of Lib/test where we have most of the
>> other tests.
>>
>> There are some other packages that do the same thing, each for
>> their own reason.
>
> The corresponding bug report is #10572 (opened by Michael Foord).
>
> R. David Murray was +1 for moving email tests, Barry deferred to him,
> Brett was +0 for importlib, and I was ±0 for distutils.  Maintainers of
> ctypes, json, lib2to3 and sqlite3 haven't yet expressed themselves.

I prefer lib2to3 tests to stay in lib2to3/.



-- 
Regards,
Benjamin

From tjreedy at udel.edu  Mon Jan 24 23:44:36 2011
From: tjreedy at udel.edu (Terry Reedy)
Date: Mon, 24 Jan 2011 17:44:36 -0500
Subject: [Python-Dev] Keeping __init__.py empty for Python packages used
 for module grouping.
In-Reply-To: 
References: <854DFFA6-1BAD-41D7-BBC7-6906C606774A@gmail.com>
	
Message-ID: 

On 1/24/2011 2:18 PM, Georg Brandl wrote:
> On 24.01.2011 20:04, Raymond Hettinger wrote:
>> Looking at http://docs.python.org/dev/library/html.html#module-html
>> it would appear that we've created a new module with a single
>> trivial function.
>>
>> In reality, there was already a python package, html, that served
>> to group two loosely related modules, html.parser and
>> html.entities.
>>
>> ISTM, that if we're going to use python packages as "namespace
>> containers" for categorizing modules, then the top level __init__
>> namespace should be left empty.
>>
>> Before the placement of html.escape() becomes set in stone, I think
>> we should consider putting it somewhere else.
>
> To be honest, I don't see the issue.  I don't see stdlib packages as
> "namespace containers", but rather as a nice way of structuring
> functionality. And remember that flat is better than nested -- why
> should escape() be put away into a new submodule?
>
> At least you'll need to let us know where you would rather put that
> function.

I would put it in html.entities, which is also sparse, as it seems to me 
vaguely related.

-- 
Terry Jan Reedy


From martin at v.loewis.de  Tue Jan 25 00:04:03 2011
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Tue, 25 Jan 2011 00:04:03 +0100
Subject: [Python-Dev] Keeping __init__.py empty for Python packages used
 for module grouping.
In-Reply-To: 
References: <854DFFA6-1BAD-41D7-BBC7-6906C606774A@gmail.com>	
	
Message-ID: <4D3E0563.2050807@v.loewis.de>

> If that pattern is a goal, having all versions of the namespace's
> __init__.py empty of anything but the __path__-munging majyk /
> boilerplate is required to make such installs work regardless of the
> order of PYTHONPATH.

With PEP 382, having extensible packages won't conflict with having
a non-trivial __init__.py, and no __path__-munging will be necessary.

Regards,
Martin

From martin at v.loewis.de  Tue Jan 25 00:07:03 2011
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 25 Jan 2011 00:07:03 +0100
Subject: [Python-Dev] PEP 393: Flexible String Representation
In-Reply-To: <20110124231233.79bed8eb@pitrou.net>
References: <4D3DDE5E.4080807@v.loewis.de> <20110124231233.79bed8eb@pitrou.net>
Message-ID: <4D3E0617.7010001@v.loewis.de>

>> I'd like to propose PEP 393, which takes a different approach,
>> addressing both problems simultaneously: by getting a flexible
>> representation (one that can be either 1, 2, or 4 bytes), we can
>> support the full range of Unicode on all systems, but still use
>> only one byte per character for strings that are pure ASCII (which
>> will be the majority of strings for the majority of users).
> 
> For this kind of experiment, I think a concrete attempt at implementing
> (together with performance/memory savings numbers) would be much more
> useful than an abstract proposal.

I partially agree. An implementation is certainly needed, but there is
nothing wrong (IMO) with designing the change before implementing it.
Also, several people have offered to help with the implementation, so
we need to agree on a specification first (which is actually cheaper
than starting with the implementation only to find out that people
misunderstood each other).

Regards,
Martin

From brett at python.org  Tue Jan 25 00:09:01 2011
From: brett at python.org (Brett Cannon)
Date: Mon, 24 Jan 2011 15:09:01 -0800
Subject: [Python-Dev] Keeping __init__.py empty for Python packages used
 for module grouping.
In-Reply-To: 
References: <854DFFA6-1BAD-41D7-BBC7-6906C606774A@gmail.com>
	
Message-ID: 

On Mon, Jan 24, 2011 at 11:18, Georg Brandl  wrote:
> On 24.01.2011 20:04, Raymond Hettinger wrote:
>> Looking at http://docs.python.org/dev/library/html.html#module-html it would
>> appear that we've created a new module with a single trivial function.
>>
>> In reality, there was already a python package, html, that served to group
>> two loosely related modules, html.parser and html.entities.
>>
>> ISTM, that if we're going to use python packages as "namespace containers"
>> for categorizing modules, then the top level __init__ namespace should be
>> left empty.
>>
>> Before the placement of html.escape() becomes set in stone, I think we should
>> consider putting it somewhere else.
>
> To be honest, I don't see the issue.  I don't see stdlib packages as
> "namespace containers", but rather as a nice way of structuring functionality.
> And remember that flat is better than nested -- why should escape() be put
> away into a new submodule?

Importlib also acts as a precedent with importlib.import_module(). I
honestly don't feel the need to treat packages as a namespace
explicitly (but then again I also disagree with the argument that
__init__.py needs to be left empty).
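
A short sketch of the two precedents side by side: a function living
directly in a package's __init__ namespace (html.escape, as documented
for 3.2) and importlib.import_module() loading one of that package's
submodules. The sample string is made up for illustration:

```python
import html
import importlib

# A function defined at the top level of the html package.
escaped = html.escape("<b>spam & eggs</b>")

# importlib.import_module() also lives at the top level of its package.
entities = importlib.import_module("html.entities")
amp_codepoint = entities.name2codepoint["amp"]
```

Neither call requires the package's __init__ to be empty; the
submodules html.parser and html.entities remain importable as before.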

From martin at v.loewis.de  Tue Jan 25 00:14:21 2011
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Tue, 25 Jan 2011 00:14:21 +0100
Subject: [Python-Dev] Location of tests for packages
In-Reply-To: <20110124230307.3e7a3ee2@pitrou.net>
References: 
	<20110124230307.3e7a3ee2@pitrou.net>
Message-ID: <4D3E07CD.1020904@v.loewis.de>

>> This isn't a critical issue (nothing is broken) but we're a week
>> from another release candidate, so the new Py3.2 package
>> organization (unittest was flat in Py3.1 and its tests were under
>> Lib/test) is about to become a de-facto decision that will be hard
>> to undo.
> 
> Well can we stop being melodramatic? Tests are not part of the API
> and so they are free to move whenever we want. No need to hold a
> release candidate for that.

Of course there is. Any addition or removal of files at this point has
the chance of breaking the release process, which may fail to pick up
files, or break in trying to pick up files that it expected to be there.
This has happened *many* times during the alpha and beta releases of
3.2, so it's not at all a theoretical problem.

After the next release candidate, I'd prefer to see no changes
whatsoever to the tree (but it's Georg's decision, of course).

Regards,
Martin

From solipsis at pitrou.net  Tue Jan 25 00:20:45 2011
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Tue, 25 Jan 2011 00:20:45 +0100
Subject: [Python-Dev] PEP 393: Flexible String Representation
In-Reply-To: <4D3E0617.7010001@v.loewis.de>
References: <4D3DDE5E.4080807@v.loewis.de>
	<20110124231233.79bed8eb@pitrou.net>  <4D3E0617.7010001@v.loewis.de>
Message-ID: <1295911245.3704.13.camel@localhost.localdomain>

On Tuesday 25 January 2011 at 00:07 +0100, "Martin v. Löwis" wrote:
> >> I'd like to propose PEP 393, which takes a different approach,
> >> addressing both problems simultaneously: by getting a flexible
> >> representation (one that can be either 1, 2, or 4 bytes), we can
> >> support the full range of Unicode on all systems, but still use
> >> only one byte per character for strings that are pure ASCII (which
> >> will be the majority of strings for the majority of users).
> > 
> > For this kind of experiment, I think a concrete attempt at implementing
> > (together with performance/memory savings numbers) would be much more
> > useful than an abstract proposal.
> 
> I partially agree. An implementation is certainly needed, but there is
> nothing wrong (IMO) with designing the change before implementing it.
> Also, several people have offered to help with the implementation, so
> we need to agree on a specification first (which is actually cheaper
> than starting with the implementation only to find out that people
> misunderstood each other).

I'm not sure it's really cheaper. When implementing you will probably
find out that it makes more sense to change the meaning of some fields,
add or remove some, etc. You will also want to try various tweaks since
the whole point is to lighten the footprint of unicode strings in common
workloads.

So, the only criticism I have, intuitively, is that the unicode
structure seems to become a bit too large. For example, I'm not sure you
need a generic (pointer, size) pair in addition to the
representation-specific ones.

Incidentally, to slightly reduce the overhead of unicode objects,
there's this proposal: http://bugs.python.org/issue1943

Regards

Antoine.



From solipsis at pitrou.net  Tue Jan 25 00:21:48 2011
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Tue, 25 Jan 2011 00:21:48 +0100
Subject: [Python-Dev] Location of tests for packages
In-Reply-To: <4D3E07CD.1020904@v.loewis.de>
References: 
	<20110124230307.3e7a3ee2@pitrou.net>  <4D3E07CD.1020904@v.loewis.de>
Message-ID: <1295911308.3704.15.camel@localhost.localdomain>

On Tuesday 25 January 2011 at 00:14 +0100, "Martin v. Löwis" wrote:
> >> This isn't a critical issue (nothing is broken) but we're a week
> >> from another release candidate, so the new Py3.2 package
> >> organization (unittest was flat in Py3.1 and its tests were under
> >> Lib/test) is about to become a de-facto decision that will be hard
> >> to undo.
> > 
> > Well can we stop being melodramatic? Tests are not part of the API
> > and so they are free to move whenever we want. No need to hold a
> > release candidate for that.
> 
> Of course there is. Any addition or removal of files at this point has
> the chance of breaking the release process, which may fail to pick up
> files, or break in trying to pick up files that it expected to be there.
> This has happened *many* times during the alpha and beta releases of
> 3.2, so it's not at all a theoretical problem.

My point was that these changes can take place after 3.2 (both final and
rc).

Regards

Antoine.



From fuzzyman at voidspace.org.uk  Tue Jan 25 00:40:55 2011
From: fuzzyman at voidspace.org.uk (Michael Foord)
Date: Mon, 24 Jan 2011 23:40:55 +0000
Subject: [Python-Dev] Location of tests for packages
In-Reply-To: 
References: 
Message-ID: <4D3E0E07.1080005@voidspace.org.uk>

On 24/01/2011 19:46, Raymond Hettinger wrote:
> Right now, the tests for the unittest package are under the package directory instead of Lib/test where we have most of the other tests.
>
> There are some other packages that do the same thing, each for their own reason.
>
> I think we should develop a strong preference for tests going under Lib/test unless there is a very compelling reason.  We already have a similar preference for all Docs going under Doc/ and that has not proved to be an issue with any package maintainer.
>
> * The Windows distro has an install option to exclude Lib/test.  The current situation with unittest works against it.
> * The commingling of tests with the regular code is making it more difficult to grep code while excluding tests.
> * Having packages create their little worlds within world is making it more difficult to find things.
> * For regrtest to work, there still needs to be some file in Lib/test that dispatches to the alternate test directory.
>
> This isn't a critical issue (nothing is broken) but we're a week from another release candidate, so the new Py3.2 package organization (unittest was flat in Py3.1 and its tests were under Lib/test) is about to become a de-facto decision that will be hard to undo.

The tests are already under unittest in 2.7 so that change isn't "new". 
Moving the tests now makes it harder to maintain them (patches to 3.2 
won't apply to 2.7). This is discussed in issue 10572.

     http://bugs.python.org/issue10572

It isn't just unittest, it seems that all *test packages* are in their 
respective package and not Lib/test except for the json module where 
Raymond already moved the tests:

     distutils/tests
     email/test
     ctypes/test
     importlib/test
     lib2to3/tests
     sqlite3/test
     tkinter/test

So I'm a little confused as to why the focus on the *unittest* test suite.

Brett has expressed a willingness to move the importlib tests under 
Lib/test and R David Murray would *like* to move the email tests there 
(but hasn't). Barry is -0 and so am I. It generally makes a few things 
slightly harder for me but not much. If we make a general policy 
decision to move all package tests out of their packages and into 
Lib/test (and actually do it) then fine, but I'm not overjoyed with a 
unilateral decision that unittest is special in this regard... :-)

All the best,

Michael
> I recommend moving it under Lib/test before everything is set in stone.
>
>
> Raymond
>
>
> P.S.  I've discussed this with Michael and his preference is against going back to the Py3.1 style where the tests were under Lib/test.  He thinks the current tree makes it easier to sync with Py2.7 and the unittest2 third-party module.  Also, he likes grepping the regular source and tests all at once.


-- 
http://www.voidspace.org.uk/

May you do good and not evil
May you find forgiveness for yourself and forgive others
May you share freely, never taking more than you give.
-- the sqlite blessing http://www.sqlite.org/different.html


From fuzzyman at voidspace.org.uk  Tue Jan 25 01:19:02 2011
From: fuzzyman at voidspace.org.uk (Michael Foord)
Date: Tue, 25 Jan 2011 00:19:02 +0000
Subject: [Python-Dev] Location of tests for packages
In-Reply-To: <4D3E07CD.1020904@v.loewis.de>
References: 	<20110124230307.3e7a3ee2@pitrou.net>
	<4D3E07CD.1020904@v.loewis.de>
Message-ID: <4D3E16F6.8020402@voidspace.org.uk>

On 24/01/2011 23:14, "Martin v. Löwis" wrote:
>>> This isn't a critical issue (nothing is broken) but we're a week
>>> from another release candidate, so the new Py3.2 package
>>> organization (unittest was flat in Py3.1 and its tests were under
>>> Lib/test) is about to become a de-facto decision that will be hard
>>> to undo.
>> Well can we stop being melodramatic? Tests are not part of the API
>> and so they are free to move whenever we want. No need to hold a
>> release candidate for that.
> Of course there is. Any addition or removal of files at this point has
> the chance of breaking the release process, which may fail to pick up
> files, or break in trying to pick up files that it expected to be there.
> This has happened *many* times during the alpha and beta releases of
> 3.2, so it's not at all a theoretical problem.
>
> After the next release candidate, I'd prefer to see no changes
> whatsoever to the tree (but it's Georg's decision, of course).

What Antoine meant is that we could make the change for 3.2.1 and don't 
need to delay 3.2.

Michael

> Regards,
> Martin


-- 
http://www.voidspace.org.uk/

May you do good and not evil
May you find forgiveness for yourself and forgive others
May you share freely, never taking more than you give.
-- the sqlite blessing http://www.sqlite.org/different.html


From dmalcolm at redhat.com  Tue Jan 25 01:28:43 2011
From: dmalcolm at redhat.com (David Malcolm)
Date: Mon, 24 Jan 2011 19:28:43 -0500
Subject: [Python-Dev] PEP 393: Flexible String Representation
In-Reply-To: <4D3DDE5E.4080807@v.loewis.de>
References: <4D3DDE5E.4080807@v.loewis.de>
Message-ID: <1295915323.3219.44.camel@radiator.bos.redhat.com>

On Mon, 2011-01-24 at 21:17 +0100, "Martin v. Löwis" wrote:

... snip ...

> I'd like to propose PEP 393, which takes a different approach,
> addressing both problems simultaneously: by getting a flexible
> representation (one that can be either 1, 2, or 4 bytes), we can
> support the full range of Unicode on all systems, but still use
> only one byte per character for strings that are pure ASCII (which
> will be the majority of strings for the majority of users).

There was some discussion about this at PyCon 2010, where we referred to
it casually as "Pay-as-you-go unicode"
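
A rough sketch of the "pay-as-you-go" idea, assuming the width is
chosen from the highest code point in the string. The function names
char_width and payload_size are hypothetical, and this only models the
arithmetic, not the C structure proposed in the PEP:

```python
def char_width(text: str) -> int:
    """Smallest of 1, 2 or 4 bytes able to hold every code point in text."""
    highest = max(map(ord, text), default=0)
    if highest < 0x100:
        return 1      # fits in Latin-1
    if highest < 0x10000:
        return 2      # fits in the BMP (UCS-2)
    return 4          # needs UCS-4


def payload_size(text: str) -> int:
    """Bytes needed for the characters alone, excluding any header or
    terminator."""
    return len(text) * char_width(text)


ascii_width = char_width("spam")
euro_width = char_width("\u20ac")
astral_width = char_width("\U0001F600")
```

Under this model a pure-ASCII string costs one byte per character,
while a single non-BMP character widens the whole string to four bytes
per character.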

... snip ...

> - str: shortest-form representation of the unicode string; the lower
>   two bits of the pointer indicate the specific form:
>   01 => 1 byte (Latin-1); 11 => 2 byte (UCS-2); 11 => 4 byte (UCS-4);
Repetition of "11"; I'm guessing that the 2byte/UCS-2 should read "10",
so that they give the width of the char representation.

>   00 => null pointer

Naturally this assumes that all pointers are at least 4-byte aligned (so
that they can be masked off).  I assume that this is sane on every
platform that Python supports, but should it be spelled out explicitly
somewhere in the PEP?
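
To make the alignment assumption concrete, here is a small model that
uses plain integers in place of pointers. The names tag_pointer and
untag_pointer and the sample address are made up for illustration;
real code would do this in C on the pointer value:

```python
TAG_MASK = 0b11  # the two low-order bits that carry the form


def tag_pointer(address: int, tag: int) -> int:
    """Store a 2-bit tag in the low bits of a 4-byte-aligned address."""
    if address & TAG_MASK:
        raise ValueError("address is not 4-byte aligned")
    if not 0 <= tag <= TAG_MASK:
        raise ValueError("tag does not fit in two bits")
    return address | tag


def untag_pointer(tagged: int):
    """Split a tagged value back into (address, tag)."""
    return tagged & ~TAG_MASK, tagged & TAG_MASK


tagged = tag_pointer(0x1000, 0b10)
address, tag = untag_pointer(tagged)
```

The check in tag_pointer is exactly the assumption in question: an
address with either low bit already set could not be recovered after
tagging.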

> 
>   The string is null-terminated (in its respective representation).
> - hash, state: same as in Python 3.2
> - utf8_length, utf8: UTF-8 representation (null-terminated)
If this is to share its buffer with the "str" representation for the
Latin-1 case, then I take it this ptr will typically be (str & ~3) ?
i.e. only "str" has the low-order-bit type info.

> - wstr_length, wstr: representation in platform's wchar_t
>   (null-terminated). If wchar_t is 16-bit, this form may use surrogate
>   pairs (in which case wstr_length differs from length).
> 
> All three representations are optional, although the str form is
> considered the canonical representation which can be absent only
> while the string is being created.

Spelling out the meaning of "optional":
  does this mean that the relevant ptr is NULL; if so, if utf8 is null,
is utf8_length undefined, or is it some dummy value?  (i.e. is the
pointer the first thing to check before we know if utf8_length is
meaningful?); similar consideration for the wstr representation.


> The Py_UNICODE type is still supported but deprecated. It is always
> defined as a typedef for wchar_t, so the wstr representation can double
> as Py_UNICODE representation.
> 
> The str and utf8 pointers point to the same memory if the string uses
> only ASCII characters (using only Latin-1 is not sufficient). The str
...though the ptrs are non-equal for this case, as noted above, as "str"
has a 0x1 typecode.

> and wstr pointers point to the same memory if the string happens to
> fit exactly to the wchar_t type of the platform (i.e. uses some
> BMP-not-Latin-1 characters if sizeof(wchar_t) is 2, and uses some
> non-BMP characters if sizeof(wchar_t) is 4).
> 
> If the string is created directly with the canonical representation
> (see below), this representation doesn't take a separate memory block,
> but is allocated right after the PyUnicodeObject struct.

Is the idea to do pointer arithmetic when deleting the PyUnicodeObject
to determine if the ptr is in that location, and not delete it if it is,
or is there some other way of determining whether the pointers need
deallocating?  If the former, is this embedding an assumption that the
underlying allocator couldn't have allocated a buffer directly adjacent
to the PyUnicodeObject?  I know that GNU libc's malloc/free
implementation has gaps of two machine words between each allocation;
off the top of my head I'm not sure if the optimized Objects/obmalloc.c
allocator enforces such gaps.

... snip ...

Extra section:

GDB Debugging Hooks
-------------------
Tools/gdb/libpython.py contains debugging hooks that embed knowledge
about the internals of CPython's data types, including PyUnicodeObject
instances.  It will need to be slightly updated to track the change.

(I can do that change if need be; it shouldn't be too hard).



Hope this is helpful
Dave


From raymond.hettinger at gmail.com  Tue Jan 25 02:19:44 2011
From: raymond.hettinger at gmail.com (Raymond Hettinger)
Date: Mon, 24 Jan 2011 17:19:44 -0800
Subject: [Python-Dev] Location of tests for packages
In-Reply-To: <4D3E0E07.1080005@voidspace.org.uk>
References: 
	<4D3E0E07.1080005@voidspace.org.uk>
Message-ID: <749B0A1D-752F-4842-96D4-E73FEEFD5CEF@gmail.com>


On Jan 24, 2011, at 3:40 PM, Michael Foord wrote:
> It isn't just unittest, it seems that all *test packages* are in their respective package and not Lib/test except for the json module where Raymond already moved the tests:
> 
>    distutils/tests
>    email/test
>    ctypes/test
>    importlib/test
>    lib2to3/tests
>    sqlite3/test
>    tkinter/test
> 
> So I'm a little confused as to why the focus on the *unittest* test suite.


There's not a focus on unittest.  Importlib should also move under Lib/test
and when email is ready, it too should fully join the organization of
the overall project (Doc, Lib, Lib/test, Modules, Objects, Tools).

ISTM, ctypes and distutils could almost be viewed as separate projects.
We could ship Python without ctypes for example and we've got a policy
against implementing the rest of library using ctypes.  The same goes
for tkinter (it is not uncommon to have builds without it). And sqlite3 is 
close to being completely third-party maintained.

In contrast, the unittest module and importlib belong with the core distro.

So, I'm thinking that there were some precedents in cases where there
was a really good reason for separating the project (we don't even include
tkinter docs in our doc build), but that we should maintain a strong
preference for keeping the overall project organization intact.

ElementTree was fully folded into the project.  I think we should
follow that precedent and avoid balkanizing the python source
into many little project subtrees (worlds within a world).


Raymond

From a.badger at gmail.com  Tue Jan 25 04:26:09 2011
From: a.badger at gmail.com (Toshio Kuratomi)
Date: Mon, 24 Jan 2011 19:26:09 -0800
Subject: [Python-Dev] Import and unicode: part two
In-Reply-To: 
References: <1295440442.432.18.camel@marge>
	
	
	
	
	
	
Message-ID: <20110125032609.GC24080@unaka.lan>

On Thu, Jan 20, 2011 at 03:27:08PM -0500, Glyph Lefkowitz wrote:
> 
> On Jan 20, 2011, at 11:46 AM, Guido van Rossum wrote:
>     Same here. *Most* code will never be shared, or will only be shared
>     between users in the same community. When it goes wrong it's also a
>     learning opportunity. :-)
> 
> 
> Despite my usual proclivity for being contrarian, I find myself in agreement
> here.  Linux users with locales that don't specify UTF-8 frankly _should_ have
> to deal with all kinds of nastiness until they can transcode their filesystems.
>  MacOS and Windows both have a "right" answer here and your third-party tools
> shouldn't create mojibake in your filenames.
> 
However, if this is the consensus, it makes a lot more sense to pick utf-8
as *the* encoding for python module filenames on Linux.

Why UTF-8:

* UTF-8 can cover the whole range of unicode whereas most (all?) other
  locale friendly encodings cannot.
* UTF-8 is becoming a standard for Linux distributions whether or not Linux
  users are adopting it.
* Third party tools are gaining support for UTF-8 even when they aren't
  gaining support for generic encodings (If I read the spec on zip
  correctly, this is actually what's happening there).
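
A minimal sketch of the coverage point, assuming ISO-8859-1 as a
stand-in for a locale encoding that cannot represent the name. The
module name used here is hypothetical:

```python
name = "\u65e5\u672c\u8a9e"          # hypothetical non-ASCII module name

# UTF-8 can encode any Unicode string.
utf8_bytes = name.encode("utf-8")

# A single-byte locale encoding cannot represent this name at all.
try:
    name.encode("iso-8859-1")
    latin1_ok = True
except UnicodeEncodeError:
    latin1_ok = False

# Reading the UTF-8 bytes under that locale does not fail; it yields
# a different, wrong name.
misread = utf8_bytes.decode("iso-8859-1")
```

The last step is the awkward one: the mismatch surfaces as a module
that cannot be found under its expected name rather than as a decoding
error.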

Why not locale:
* Relying on locale is simply not portable.  If nothing prevents people from
  distributing a unicode filename then they will go ahead and do so.  If
  the result works (say, because it's utf-8 and 80% of the Linux userbase is
  using utf-8) then it will get packaged and distributed and people won't
  know that it's a problem until someone with a non-utf-8 locale decides to
  use it.
* Mixing of modules from different locales won't work.  Suppose that the
  system python installs the previous module.  The local site has other
  modules that it has installed using a different filename encoding.
  The users at the site will find that either one or the other of the two
  modules won't work.
* Because of the portability problems you have no choice but to tell people
  not to distribute python modules with non-ASCII names.  This makes the use
  of unicode names second class indefinitely (until the kernel devs decide
  that they're wrong to not enforce a filesystem encoding or Linux becomes
  irrelevant as a platform).
* If you can pick a set of encodings that are valid (utf-8 for Linux and
  MacOS, wide unicode for windows [I get the feeling from other parts of the
  conversation that Windows won't be so lucky, though]) tools to convert
  python names become easier to write.  If you restrict it far enough, you
  could even write tools/importers that automatically do the detection.

PS: Sorry for not replying immediately, the team I'm on is dealing with an
issue at my work and I'm also preparing for a conference later this week.

-Toshio

From fdrake at acm.org  Tue Jan 25 05:40:49 2011
From: fdrake at acm.org (Fred Drake)
Date: Mon, 24 Jan 2011 23:40:49 -0500
Subject: [Python-Dev] Keeping __init__.py empty for Python packages used
 for module grouping.
In-Reply-To: 
References: <854DFFA6-1BAD-41D7-BBC7-6906C606774A@gmail.com>
	
	
Message-ID: 

On Mon, Jan 24, 2011 at 4:59 PM, Tres Seaver  wrote:
> It might matter if we want to enable third-party package installation
> into a namespace also used by the stdlib:  ISTR that the 'xml' package
> had such installs at one point.

Almost, but not quite.

The xml package at one point allowed itself to be overridden by
another package (_xmlplus specifically), however that was defined.
Experience proved that this was a mistake.

"Namespace packages", as originally defined by setuptools and applied
for the hurry, zc, and zope packages (and many others), are a very
different thing than what was done for the xml/_xmlplus package, and
have proven significantly more useful and usable.

While I heartily approve of "namespace packages" of that sort, I see
no reason to support installing into the same package namespace as the
standard library.  The primary disadvantage I see is that it would be
too easy to foster confusion over what's in the standard library among
newcomers.


  -Fred

--
Fred L. Drake, Jr.
"A storm broke loose in my mind."  --Albert Einstein

From g.brandl at gmx.net  Tue Jan 25 08:23:42 2011
From: g.brandl at gmx.net (Georg Brandl)
Date: Tue, 25 Jan 2011 08:23:42 +0100
Subject: [Python-Dev] Location of tests for packages
In-Reply-To: <4D3E07CD.1020904@v.loewis.de>
References: 	<20110124230307.3e7a3ee2@pitrou.net>
	<4D3E07CD.1020904@v.loewis.de>
Message-ID: 

On 25.01.2011 00:14, "Martin v. Löwis" wrote:
>>> This isn't a critical issue (nothing is broken) but we're a week
>>> from another release candidate, so the new Py3.2 package
>>> organization (unittest was flat in Py3.1 and its tests were under
>>> Lib/test) is about to become a de-facto decision that will be hard
>>> to undo.
>> 
>> Well can we stop being melodramatic? Tests are not part of the API
>> and so they are free to move whenever we want. No need to hold a
>> release candidate for that.

Yes, let's postpone this for after the final release.

> Of course there is. Any addition or removal of files at this point has
> the chance of breaking the release process, which may fail to pick up
> files, or break in trying to pick up files that it expected to be there.
> This has happened *many* times during the alpha and beta releases of
> 3.2, so it's not at all a theoretical problem.
> 
> After the next release candidate, I'd prefer to see no changes
> whatsoever to the tree (but it's Georg's decision, of course).

I agree with both of you.  Ideally there shouldn't be any but cosmetic
changes after rc2, otherwise I'd be inclined to add an rc3 to the
release schedule.

Georg


From g.brandl at gmx.net  Tue Jan 25 08:26:56 2011
From: g.brandl at gmx.net (Georg Brandl)
Date: Tue, 25 Jan 2011 08:26:56 +0100
Subject: [Python-Dev] Location of tests for packages
In-Reply-To: <749B0A1D-752F-4842-96D4-E73FEEFD5CEF@gmail.com>
References: 	<4D3E0E07.1080005@voidspace.org.uk>
	<749B0A1D-752F-4842-96D4-E73FEEFD5CEF@gmail.com>
Message-ID: 

On 25.01.2011 02:19, Raymond Hettinger wrote:
> 
> On Jan 24, 2011, at 3:40 PM, Michael Foord wrote:
>> It isn't just unittest, it seems that all *test packages* are in their respective package and not Lib/test except for the json module where Raymond already moved the tests:
>> 
>>    distutils/tests
>>    email/test
>>    ctypes/test
>>    importlib/test
>>    lib2to3/tests
>>    sqlite3/test
>>    tkinter/test
>> 
>> So I'm a little confused as to why the focus on the *unittest* test suite.
> 
> 
> There's not a focus on unittest.  Importlib should also move under Lib/test
> and when email is ready, it too should fully join the organization of
> the overall project (Doc, Lib, Lib/test, Modules, Objects, Tools).

I'm +0 on moving all tests under Lib/test -- I think the respective
maintainers of the libraries in question should have the final word,
because...

> ISTM, ctypes and distutils could almost be viewed as separate projects.
> We could ship Python without ctypes for example and we've got a policy
> against implementing the rest of library using ctypes.  The same goes
> for tkinter (it is not uncommon to have builds without it). And sqlite3 is 
> close to being completely third-party maintained.

this weakens the argument of having a consistent organization of test
modules: if one or two are allowed to have the test suite intra-package,
it doesn't matter so much any more for others.

Georg


From stephen at xemacs.org  Tue Jan 25 09:29:12 2011
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Tue, 25 Jan 2011 17:29:12 +0900
Subject: [Python-Dev] Import and unicode: part two
In-Reply-To: <201101241639.39345.vstinner@edenwall.com>
References: <1295440442.432.18.camel@marge> <4D3D4A3F.1020502@v.loewis.de>
	<87lj2a613p.fsf@uwakimon.sk.tsukuba.ac.jp>
	<201101241639.39345.vstinner@edenwall.com>
Message-ID: <87fwsh5quf.fsf@uwakimon.sk.tsukuba.ac.jp>

As Nick points out, nobody really seems to think this is an
argument against your patch.  I'm going to bow out of this thread
after this post, as I'm clearly out of my technical depth.

Victor Stinner writes:

 > On Monday, 24 January 2011 11:35:22, Stephen J. Turnbull wrote:
 > > ... VFAT-formatted file systems and Shift JIS file names ...
 > 
 > I missed something: VFAT stores filenames as unicode (whereas FAT only 
 > supports byte filenames). Well, VFAT stores filenames twice: as an 8+3 byte 
 > string and as a 255-character unicode (UTF-16-LE) string.

I don't know what it is; I didn't have char-device-level access to the
file system, nor did I have the specs (it was a proprietary phone by a
Japanese OEM).  It *presented* filenames in Shift JIS when mounted on
Linux with the vfat filesystem (either "mount -t vfat /dev/sde1
/mnt/gadget" or "mount -t auto /dev/sde1 /mnt/gadget").  Maybe there
is some unusual layer to translate from Unicode there, I'm not
familiar with Linux kernel drivers and libc facilities (such
special-casing is a common pattern in programming for Japanese;
remember, the Japanese had to deal with these issues before there was
any standard for them).

 > On which OS do you access this VFAT file system? On Windows, you have two 
 > APIs: bytes (*A) and wide character (*W). If you use the wide character, there 
 > is no explicit encoding at all. Linux has two mount options to control unicode on 
 > a VFAT filesystem: "codepage" for the byte filenames (use Shift JIS here) and 
 > "iocharset" for the unicode filenames (I don't understand this
 > option). 

I didn't either, in fact this is the first I've heard of it, so I've
never tried it.

 > I suppose that Shift JIS is used to encode the filename in the 8+3 byte string 
 > form.

Could be, but I'm pretty sure these were long filenames, although
maybe they were just short enough (that is, I don't recall noticing
any truncation when mounted compared to the way they were presented on
the phone itself).  I don't use that phone anymore, it's in a box of
junk equipment somewhere....

From catch-all at masklinn.net  Tue Jan 25 10:22:41 2011
From: catch-all at masklinn.net (Xavier Morel)
Date: Tue, 25 Jan 2011 10:22:41 +0100
Subject: [Python-Dev] Import and unicode: part two
In-Reply-To: <20110125032609.GC24080@unaka.lan>
References: <1295440442.432.18.camel@marge>
	
	
	
	
	
	
	<20110125032609.GC24080@unaka.lan>
Message-ID: <7F2E941E-143A-461E-BEC5-D7545C6D877A@masklinn.net>

On 2011-01-25, at 04:26 , Toshio Kuratomi wrote:
> 
> * If you can pick a set of encodings that are valid (utf-8 for Linux and
>  MacOS

HFS+ uses UTF-16 in NFD (actually in an Apple-specific variant of NFD). Right here you've already broken Python modules on OSX.

And as far as I know, Linux software/FS generally use NFC (I've already seen this issue cause trouble)
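
A small Python 3 sketch of why the normalization form matters, independent
of which Unicode encoding the file system uses. The name "café" is just an
example; nothing here queries a real file system:

```python
import unicodedata

name = "caf\u00e9"  # "café" with a precomposed e-acute

nfc = unicodedata.normalize("NFC", name)  # composed form
nfd = unicodedata.normalize("NFD", name)  # decomposed form: e + combining acute

print(len(nfc), len(nfd))   # 4 5
print(nfc.encode("utf-8"))  # b'caf\xc3\xa9'
print(nfd.encode("utf-8"))  # b'cafe\xcc\x81'
print(nfc == nfd)           # False

# The two only compare equal once both are brought to the same form.
print(unicodedata.normalize("NFC", nfd) == nfc)  # True
```

A byte-for-byte or code-point-for-code-point comparison of a requested
module name against a directory entry would miss the file whenever the two
sides use different forms.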


From ncoghlan at gmail.com  Tue Jan 25 11:13:57 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 25 Jan 2011 20:13:57 +1000
Subject: [Python-Dev] tahoe-lafs
In-Reply-To: <59ACE054B23B6045ACC43DBD778609BD6867D1731B@UM-EMAIL02.um.umsystem.edu>
References: <59ACE054B23B6045ACC43DBD778609BD6867D17257@UM-EMAIL02.um.umsystem.edu>
	
	<59ACE054B23B6045ACC43DBD778609BD6867D1731B@UM-EMAIL02.um.umsystem.edu>
Message-ID: 

On Tue, Jan 25, 2011 at 2:18 AM, Earney, Billy C. wrote:

> I want to make it clear that I am in no way associated with the tahoe-lafs
> project.  I do not want my email to make that project look bad.  That was
> not my intention.
>

Good to know. I was also in a somewhat grumpy mood when I wrote my last
post, so take it with a grain of salt :)

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ncoghlan at gmail.com  Tue Jan 25 12:08:01 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 25 Jan 2011 21:08:01 +1000
Subject: [Python-Dev] PEP 393: Flexible String Representation
In-Reply-To: <4D3DDE5E.4080807@v.loewis.de>
References: <4D3DDE5E.4080807@v.loewis.de>
Message-ID: 

On Tue, Jan 25, 2011 at 6:17 AM, "Martin v. Löwis"  wrote:
> A new function PyUnicode_AsUTF8 is provided to access the UTF-8
> representation. It is thus identical to the existing
> _PyUnicode_AsString, which is removed. The function will compute the
> utf8 representation when first called. Since this representation will
> consume memory until the string object is released, applications
> should use the existing PyUnicode_AsUTF8String where possible
> (which generates a new string object every time). API that implicitly
> converts a string to a char* (such as the ParseTuple functions) will
> use this function to compute a conversion.

I'm not entirely clear as to what "this function" is referring to here.

I'm also dubious of the "PyUnicode_Finalize" name - "PyUnicode_Ready"
might be a better option (PyType_Ready seems a better analogy for a
"I've filled everything in, please calculate the derived fields now"
than Py_Finalize).

More generally, let me see if I understand the proposed structure correctly:

str: Always set once PyUnicode_Ready() has been called.
  Always points to the canonical representation of the string (as
indicated by PyUnicode_Kind)
length: Always set once PyUnicode_Ready() has been called. Specifies
the number of code points in the string.

wstr: Set only if PyUnicode_AsUnicode has been called on the string.
    If (sizeof(wchar_t) == 2 && PyUnicode_Kind() == PyUnicode_2BYTE)
or (sizeof(wchar_t) == 4 && PyUnicode_Kind() == PyUnicode_4BYTE), wstr
= str, otherwise wstr points to dedicated memory
wstr_length: Valid only if wstr != NULL
    If wstr_length != length, indicates presence of surrogate pairs in
a UCS-2 string (i.e. sizeof(wchar_t) == 2, PyUnicode_Kind() ==
PyUnicode_4BYTE).

utf8: Set only if PyUnicode_AsUTF8 has been called on the string.
    If string contents are pure ASCII, utf8 = str, otherwise utf8
points to dedicated memory.
utf8_length: Valid only if utf8 != NULL

One change I would propose is that rather than hiding flags in the low
order bits of the str pointer, we expand the use of the existing
"state" field to cover the representation information in addition to
the interning information. I would also suggest explicitly flagging
internally whether or not a 1 byte string is ASCII or Latin-1 along
the lines of:

/* Already existing string state constants */
#define SSTATE_NOT_INTERNED 0x00
#define SSTATE_INTERNED_MORTAL 0x01
#define SSTATE_INTERNED_IMMORTAL 0x02
/* New string state constants */
#define SSTATE_INTERN_MASK 0x03
#define SSTATE_KIND_ASCII 0x00
#define SSTATE_KIND_LATIN1 0x04
#define SSTATE_KIND_2BYTE 0x08
#define SSTATE_KIND_4BYTE 0x0C
#define SSTATE_KIND_MASK 0x0C


PyUnicode_Kind would then return PyUnicode_1BYTE for strings that were
flagged internally as either ASCII or LATIN1.
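
To show how the two masks would separate the interning bits from the kind
bits, here is a rough model in Python rather than C. The constant values are
the ones proposed above; the helper functions and the string return values
are hypothetical and only stand in for what PyUnicode_Kind might report:

```python
# Constants as proposed above (modelled in Python, not the real C header).
SSTATE_NOT_INTERNED = 0x00
SSTATE_INTERNED_MORTAL = 0x01
SSTATE_INTERNED_IMMORTAL = 0x02
SSTATE_INTERN_MASK = 0x03
SSTATE_KIND_ASCII = 0x00
SSTATE_KIND_LATIN1 = 0x04
SSTATE_KIND_2BYTE = 0x08
SSTATE_KIND_4BYTE = 0x0C
SSTATE_KIND_MASK = 0x0C


def intern_state(state):
    """Extract the interning bits (low two bits) from the state field."""
    return state & SSTATE_INTERN_MASK


def internal_kind(state):
    """Extract the representation bits (bits 2 and 3) from the state field."""
    return state & SSTATE_KIND_MASK


def public_kind(state):
    """Hypothetical stand-in for PyUnicode_Kind: ASCII and Latin-1 both
    report as the 1-byte kind."""
    kind = internal_kind(state)
    if kind in (SSTATE_KIND_ASCII, SSTATE_KIND_LATIN1):
        return "PyUnicode_1BYTE"
    if kind == SSTATE_KIND_2BYTE:
        return "PyUnicode_2BYTE"
    return "PyUnicode_4BYTE"


# A mortal-interned Latin-1 string: both facts fit in one small field.
state = SSTATE_INTERNED_MORTAL | SSTATE_KIND_LATIN1
print(hex(state))           # 0x5
print(intern_state(state))  # 1
print(public_kind(state))   # PyUnicode_1BYTE
```

Because the two masks do not overlap, either half can be read or updated
without disturbing the other.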

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From solipsis at pitrou.net  Tue Jan 25 12:26:03 2011
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Tue, 25 Jan 2011 12:26:03 +0100
Subject: [Python-Dev] r88178 -
 python/branches/py3k/Lib/test/crashers/underlying_dict.py
References: <20110125000028.94263EEBDB@mail.python.org>
Message-ID: <20110125122603.74e49f8c@pitrou.net>

On Tue, 25 Jan 2011 01:00:28 +0100 (CET)
benjamin.peterson  wrote:
> Author: benjamin.peterson
> Date: Tue Jan 25 01:00:28 2011
> New Revision: 88178
> 
> Log:
> another pretty crasher served up by pypy

Some comments would be nice. Right now it looks pretty close to
deliberately obfuscated code (especially with the call to
gc.get_referrers()).

Regards

Antoine.



From exarkun at twistedmatrix.com  Tue Jan 25 16:00:11 2011
From: exarkun at twistedmatrix.com (exarkun at twistedmatrix.com)
Date: Tue, 25 Jan 2011 15:00:11 -0000
Subject: [Python-Dev] Import and unicode: part two
In-Reply-To: <7F2E941E-143A-461E-BEC5-D7545C6D877A@masklinn.net>
References: <1295440442.432.18.camel@marge>
	
	
	
	
	
	
	<20110125032609.GC24080@unaka.lan>
	<7F2E941E-143A-461E-BEC5-D7545C6D877A@masklinn.net>
Message-ID: <20110125150011.1699.303521989.divmod.xquotient.358@localhost.localdomain>

On 09:22 am, catch-all at masklinn.net wrote:
>On 2011-01-25, at 04:26 , Toshio Kuratomi wrote:
>>
>>* If you can pick a set of encodings that are valid (utf-8 for Linux 
>>and
>>  MacOS
>
>HFS+ uses UTF-16 in NFD (actually in an Apple-specific variant of NFD). 
>Right here you've already broken Python modules on OSX.

Are you sure about the UTF-16 part?  Evidence strongly points towards 
UTF-8:

  $ python
  Python 2.6.1 (r261:67515, Feb 11 2010, 00:51:29)  [GCC 4.2.1 (Apple 
Inc. build 5646)] on darwin
  Type "help", "copyright", "credits" or "license" for more information.
  >>> import unicodedata, os
  >>> file(u'\N{SNOWMAN}', 'w').close()
  >>> os.listdir('.')
  ['\xe2\x98\x83']
  >>> unicodedata.name('\xe2\x98\x83'.decode('utf-8'))
  'SNOWMAN'
  >>>
Jean-Paul

From ncoghlan at gmail.com  Tue Jan 25 17:07:45 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 26 Jan 2011 02:07:45 +1000
Subject: [Python-Dev] [Python-checkins] r88155 -
	python/branches/py3k/Doc/whatsnew/3.2.rst
In-Reply-To: <20110124015149.4B8CFEEAE3@mail.python.org>
References: <20110124015149.4B8CFEEAE3@mail.python.org>
Message-ID: 

On Mon, Jan 24, 2011 at 11:51 AM, raymond.hettinger
 wrote:
> Author: raymond.hettinger
> Date: Mon Jan 24 02:51:49 2011
> New Revision: 88155
>
> Log:
> Add entries for dis, dbm, and ctypes.
>
>
> Modified:
>   python/branches/py3k/Doc/whatsnew/3.2.rst
>
> Modified: python/branches/py3k/Doc/whatsnew/3.2.rst
> ==============================================================================
> --- python/branches/py3k/Doc/whatsnew/3.2.rst   (original)
> +++ python/branches/py3k/Doc/whatsnew/3.2.rst   Mon Jan 24 02:51:49 2011
> @@ -1599,6 +1599,51 @@
>
>  (Contributed by Ron Adam; :issue:`2001`.)
>
> +dis
> +---

For the dis module there is also the change to dis.dis() itself from
issue 6507 - you can now pass source strings directly to dis without
needing to compile them first:

>>> dis.dis("1 + 2")
  1           0 LOAD_CONST               2 (3)
              3 RETURN_VALUE

> +The :mod:`dis` module gained two new functions for inspecting code,
> +:func:`~dis.code_info` and :func:`~dis.show_code`.  Both provide detailed code
> +object information for the supplied function, method, source code string or code
> +object.  The former returns a string and the latter prints it::
> +
> +    >>> import dis, random
> +    >>> show_code(random.choice)

Typo here - missing a "dis." at the start of the line.
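
For reference, a short sketch of how the three calls fit together with the
prefix in place. The exact text that gets printed varies between Python
versions, so no output is shown:

```python
import dis
import random

# code_info() returns the report as a string ...
info = dis.code_info(random.choice)

# ... while show_code() prints the same kind of report directly.
dis.show_code(random.choice)

# dis.dis() accepts a source string and compiles it on the fly.
dis.dis("1 + 2")
```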

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From a.badger at gmail.com  Tue Jan 25 17:35:25 2011
From: a.badger at gmail.com (Toshio Kuratomi)
Date: Tue, 25 Jan 2011 08:35:25 -0800
Subject: [Python-Dev] Import and unicode: part two
In-Reply-To: <7F2E941E-143A-461E-BEC5-D7545C6D877A@masklinn.net>
References: <1295440442.432.18.camel@marge>
	
	
	
	
	
	
	<20110125032609.GC24080@unaka.lan>
	<7F2E941E-143A-461E-BEC5-D7545C6D877A@masklinn.net>
Message-ID: <20110125163525.GE24080@unaka.lan>

On Tue, Jan 25, 2011 at 10:22:41AM +0100, Xavier Morel wrote:
> On 2011-01-25, at 04:26 , Toshio Kuratomi wrote:
> > 
> > * If you can pick a set of encodings that are valid (utf-8 for Linux and
> >  MacOS
> 
> HFS+ uses UTF-16 in NFD (actually in an Apple-specific variant of NFD). Right here you've already broken Python modules on OSX.
>
Others have been saying that Mac OSX's HFS+ uses UTF-8.  But the question is
not whether UTF-16 or UTF-8 is used by HFS+.  It's whether you can sensibly
decide on an encoding from the type of system that is being run on.  This
could be querying the filesystem or a check on sys.platform or some other
method.  I don't know what detection the current code does.

On Linux there's no defined encoding that will work; file names are just
bytes to the Linux kernel so based on people's argument that the convention
is and should be that filenames are utf-8 and anything else is
a misconfigured system -- python should mandate that its module filenames on
Linux are utf-8 rather than using the user's locale settings.
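
As a rough sketch of the sys.platform variant of that idea: the table and
the function name below are hypothetical, and this is not a description of
what the import code currently does.

```python
import sys

# Hypothetical policy table: platform prefix -> encoding mandated for
# module file names.  The entries only mirror the suggestion in this thread.
MODULE_FILENAME_ENCODINGS = (
    ("linux", "utf-8"),
    ("darwin", "utf-8"),
)


def module_filename_encoding(platform=sys.platform):
    """Pick an encoding from the platform name instead of the user's locale."""
    for prefix, encoding in MODULE_FILENAME_ENCODINGS:
        if platform.startswith(prefix):
            return encoding
    # No fixed policy for this platform: fall back to the interpreter's
    # file system encoding.
    return sys.getfilesystemencoding()


print(module_filename_encoding("linux2"))  # utf-8
print(module_filename_encoding("darwin"))  # utf-8
```

The point of such a table is that the answer no longer depends on the
locale of whoever happens to run the interpreter.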
> 
> And as far as I know, Linux software/FS generally use NFC (I've already seen this issue cause trouble)
> 
Linux FS's are bytes with a small blacklist (so you can't use the NULL byte
in a filename, for instance).  Linux software would be free to use any
normal form that they want.  If one software used NFC and another used NFD,
the FS would record two separate files with two separate filenames.  Other
programs might or might not display this correctly.

Example:
$ touch cafe
$ python
Python 2.7 (r27:82500, Sep 16 2010, 18:02:00) 
>>> import os
>>> import unicodedata
>>> a=u'café'
>>> b=unicodedata.normalize('NFC', a)
>>> c=unicodedata.normalize('NFD', a)
>>> open(b.encode('utf8'), 'w').close()
>>> open(c.encode('utf8'), 'w').close()
>>> os.listdir(u'.')
[u'people-etc-changes.txt', u'cafe\u0301', u'cafe', u'people-etc-changes.sha256sum', u'caf\xe9']
>>> os.listdir('.')
['people-etc-changes.txt', 'cafe\xcc\x81', 'cafe', 'people-etc-changes.sha256sum', 'caf\xc3\xa9']
>>> ^D

$ ls -al .
drwxrwxr-x.  2 badger badger  4096 Jan 25 07:46 .
drwxr-xr-x. 17 badger badger  4096 Jan 24 18:27 ..
-rw-rw-r--.  1 badger badger     0 Jan 25 07:45 cafe
-rw-rw-r--.  1 badger badger     0 Jan 25 07:46 cafe
-rw-rw-r--.  1 badger badger     0 Jan 25 07:46 café

$ ls -al cafe
-rw-rw-r--.  1 badger badger     0 Jan 25 07:45 cafe
$ ls -al cafe?
-rw-rw-r--.  1 badger badger     0 Jan 25 07:46 cafe

Now in this case, the decomposed form of the filename is being displayed
incorrectly and the shell treats the decomposed character as two characters
instead of one.  However, when you view these files in dolphin (the KDE file
manager) you properly see café repeated twice.
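
A tool that wants to spot this situation could group directory entries by
their normalized form. This is only a sketch in Python 3: the function name
is made up, and the listing is the one from the session above, typed in by
hand rather than read from disk:

```python
import unicodedata
from collections import defaultdict


def normalization_collisions(names):
    """Return groups of names that become identical after NFC normalization."""
    groups = defaultdict(list)
    for name in names:
        groups[unicodedata.normalize("NFC", name)].append(name)
    # Keep only the normalized names that more than one entry maps to.
    return {key: members for key, members in groups.items() if len(members) > 1}


listing = ["people-etc-changes.txt", "cafe\u0301", "cafe", "caf\u00e9"]
collisions = normalization_collisions(listing)
print(collisions)  # {'café': ['café', 'café']}; the two members differ in form
```

The plain ASCII "cafe" stays on its own, while the NFC and NFD spellings of
"café" end up in the same group even though the file system stores them as
two separate files.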

-Toshio

From brett at python.org  Tue Jan 25 18:38:36 2011
From: brett at python.org (Brett Cannon)
Date: Tue, 25 Jan 2011 09:38:36 -0800
Subject: [Python-Dev] Location of tests for packages
In-Reply-To: <749B0A1D-752F-4842-96D4-E73FEEFD5CEF@gmail.com>
References: 
	<4D3E0E07.1080005@voidspace.org.uk>
	<749B0A1D-752F-4842-96D4-E73FEEFD5CEF@gmail.com>
Message-ID: 

On Mon, Jan 24, 2011 at 17:19, Raymond Hettinger
 wrote:
>
> On Jan 24, 2011, at 3:40 PM, Michael Foord wrote:
>> It isn't just unittest, it seems that all *test packages* are in their respective package and not Lib/test except for the json module where Raymond already moved the tests:
>>
>>    distutils/tests
>>    email/test
>>    ctypes/test
>>    importlib/test
>>    lib2to3/tests
>>    sqlite3/test
>>    tkinter/test
>>
>> So I'm a little confused as to why the focus on the *unittest* test suite.
>
>
> There's not a focus on unittest.  Importlib should also move under Lib/test
> and when email is ready, it too should fully join the organization of
> the overall project (Doc, Lib, Lib/test, Modules, Objects, Tools).

Just to clarify my position since importlib keeps getting brought up
as an example, I'm fine with a move but I won't be putting the work in
to do the move if there is actually consensus to make this a
stdlib-wide policy. And I am assuming that the directory will be moved
wholesale to Lib/test/importlib (with proper fixes for any relative
imports) along with verification that importlib.test.__main__
continues to work (naming it test.importlib_tests seems rather
redundant compared to test.importlib).

While I'm for consistency, obviously a trend was started by ctypes and
sqlite3 that the rest of us who created full packages followed up to
this point. If we move some modules and not others purely because some
distros choose not to ship e.g., ctypes and sqlite3, that will get
annoying w/o some very clear explanation/delineation as to why some
packages have a special rule to follow (I'm guessing "packages that
have external dependencies" would be it).

From fijall at gmail.com  Tue Jan 25 19:11:43 2011
From: fijall at gmail.com (Maciej Fijalkowski)
Date: Tue, 25 Jan 2011 20:11:43 +0200
Subject: [Python-Dev] r88178 -
	python/branches/py3k/Lib/test/crashers/underlying_dict.py
In-Reply-To: <20110125122603.74e49f8c@pitrou.net>
References: <20110125000028.94263EEBDB@mail.python.org>
	<20110125122603.74e49f8c@pitrou.net>
Message-ID: 

On Tue, Jan 25, 2011 at 1:26 PM, Antoine Pitrou  wrote:
> On Tue, 25 Jan 2011 01:00:28 +0100 (CET)
> benjamin.peterson  wrote:
>> Author: benjamin.peterson
>> Date: Tue Jan 25 01:00:28 2011
>> New Revision: 88178
>>
>> Log:
>> another pretty crasher served up by pypy
>
> Some comments would be nice. Right now it looks pretty close to
> deliberately obfuscated code (especially with the call to
> gc.get_referrers()).
>
> Regards
>
> Antoine.
>

It gets to the dict of a class, circumventing dictproxy. It's not yet
clear why it segfaults.

From alexander.belopolsky at gmail.com  Tue Jan 25 19:16:07 2011
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Tue, 25 Jan 2011 13:16:07 -0500
Subject: [Python-Dev] Location of tests for packages
In-Reply-To: 
References: 
	<4D3E0E07.1080005@voidspace.org.uk>
	<749B0A1D-752F-4842-96D4-E73FEEFD5CEF@gmail.com>
	
Message-ID: 

On Tue, Jan 25, 2011 at 12:38 PM, Brett Cannon  wrote:
>.. If we move some modules and not others purely because some
> distros choose not to ship e.g., ctypes and sqlite3

I don't see why this is a problem.  Regrtest already has a mechanism
that allows skipping tests based on various criteria.  This mechanism
works for both packages and flat modules that can be optionally
installed.

FWIW, I am +0 on consolidating tests under Lib/test.  One of the
reasons that I have not seen mentioned is that it is well-known that
the test package is not part of the official stdlib API and can be
changed/restructured in backward incompatible ways. It is not obvious
whether the same applies to say lib2to3.tests or ctypes.test.

If you are interested to see what it takes to move tests from a
package, I moved json tests to Lib/test/json_tests in r86875.  It is
not hard, but does require some changes to imports.

From solipsis at pitrou.net  Tue Jan 25 19:21:32 2011
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Tue, 25 Jan 2011 19:21:32 +0100
Subject: [Python-Dev] r88178 -
 python/branches/py3k/Lib/test/crashers/underlying_dict.py
In-Reply-To: 
References: <20110125000028.94263EEBDB@mail.python.org>
	<20110125122603.74e49f8c@pitrou.net>
	
Message-ID: <1295979692.3716.8.camel@localhost.localdomain>

On Tuesday, 25 January 2011 at 20:11 +0200, Maciej Fijalkowski wrote:
> On Tue, Jan 25, 2011 at 1:26 PM, Antoine Pitrou  wrote:
> > On Tue, 25 Jan 2011 01:00:28 +0100 (CET)
> > benjamin.peterson  wrote:
> >> Author: benjamin.peterson
> >> Date: Tue Jan 25 01:00:28 2011
> >> New Revision: 88178
> >>
> >> Log:
> >> another pretty crasher served up by pypy
> >
> > Some comments would be nice. Right now it looks pretty close to
> > deliberately obfuscated code (especially with the call to
> > gc.get_referrers()).
> >
> > Regards
> >
> > Antoine.
> >
> 
> I gets to a dict of class circumventing dictproxy. It's yet unclear
> why it segfaults.

Perhaps the method cache? But why the comment "# should print 1"?
Shouldn't it print 2 instead?




From mal at egenix.com  Tue Jan 25 23:43:52 2011
From: mal at egenix.com (M.-A. Lemburg)
Date: Tue, 25 Jan 2011 23:43:52 +0100
Subject: [Python-Dev] PEP 393: Flexible String Representation
In-Reply-To: <4D3DDE5E.4080807@v.loewis.de>
References: <4D3DDE5E.4080807@v.loewis.de>
Message-ID: <4D3F5228.4010901@egenix.com>

I'll comment more on this later this week...

From my first impression, I'm
not too thrilled by the prospect of making the Unicode implementation
more complicated by having three different representations on each
object.

I also don't see how this could save a lot of memory. As an example
take a French text with say 10 million code points. This would end up
appearing in memory as 3 copies on Windows: one copy stored as UCS2 (20MB),
one as Latin-1 (10MB) and one as UTF-8 (probably around 15MB, depending
on how many accents are used). That's a saving of -10MB compared to
today's implementation :-)

"Martin v. Löwis" wrote:
> I have been thinking about Unicode representation for some time now.
> This was triggered, on the one hand, by discussions with Glyph Lefkowitz
> (who complained that his server app consumes too much memory), and Carl
> Friedrich Bolz (who profiled Python applications to determine that
> Unicode strings are among the top consumers of memory in Python).
> On the other hand, this was triggered by the discussion on supporting
> surrogates in the library better.
> 
> I'd like to propose PEP 393, which takes a different approach,
> addressing both problems simultaneously: by getting a flexible
> representation (one that can be either 1, 2, or 4 bytes), we can
> support the full range of Unicode on all systems, but still use
> only one byte per character for strings that are pure ASCII (which
> will be the majority of strings for the majority of users).
> 
> You'll find the PEP at
> 
> http://www.python.org/dev/peps/pep-0393/
> 
> For convenience, I include it below.

-- 
Marc-Andre Lemburg
eGenix.com


From solipsis at pitrou.net  Wed Jan 26 00:22:32 2011
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Wed, 26 Jan 2011 00:22:32 +0100
Subject: [Python-Dev] PEP 393: Flexible String Representation
References: <4D3DDE5E.4080807@v.loewis.de>
	<4D3F5228.4010901@egenix.com>
Message-ID: <20110126002232.1864cd6b@pitrou.net>


For the record:

> I also don't see how this could save a lot of memory. As an example
> take a French text with say 10mio code points. This would end up
> appearing in memory as 3 copies on Windows: one copy stored as UCS2 (20MB),
> one as Latin-1 (10MB) and one as UTF-8 (probably around 15MB, depending
> on how many accents are used).

Typical French text seems to have 5% non-ASCII characters. So the
number of UTF-8 bytes needed to represent a French text would only be
5% higher than the number of code points.
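
A back-of-the-envelope check of that figure, as a sketch in Python 3. The 5%
share is the estimate quoted above, and the assumption that every non-ASCII
character in French text takes two bytes in UTF-8 holds for the accented
Latin letters but is still an assumption about the text:

```python
def utf8_size_estimate(code_points, non_ascii_percent, bytes_per_non_ascii=2):
    """Estimate the UTF-8 size in bytes of a text with the given share of
    non-ASCII characters (ASCII characters take one byte each)."""
    non_ascii = code_points * non_ascii_percent // 100
    ascii_only = code_points - non_ascii
    return ascii_only + non_ascii * bytes_per_non_ascii


code_points = 10_000_000
size = utf8_size_estimate(code_points, 5)
print(size)                # 10500000
print(size / code_points)  # 1.05

# Sanity check on a tiny real sample: "élève" has 5 code points,
# two of which are accented.
sample = "\u00e9l\u00e8ve"
print(len(sample), len(sample.encode("utf-8")))  # 5 7
```

So the UTF-8 form of the 10 million code point example would be about
10.5MB rather than 15MB.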

Anyway, it's quite obvious that Martin's goal is that only one
representation gets created most of the time. To quote the draft:

"All three representations are optional, although the str form is
considered the canonical representation which can be absent only
while the string is being created."

Regards

Antoine.



From martin at v.loewis.de  Wed Jan 26 00:23:45 2011
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Wed, 26 Jan 2011 00:23:45 +0100
Subject: [Python-Dev] r88178 -
	python/branches/py3k/Lib/test/crashers/underlying_dict.py
In-Reply-To: <20110125122603.74e49f8c@pitrou.net>
References: <20110125000028.94263EEBDB@mail.python.org>
	<20110125122603.74e49f8c@pitrou.net>
Message-ID: <4D3F5B81.6050401@v.loewis.de>

> Some comments would be nice. Right now it looks pretty close to
> deliberately obfuscated code (especially with the call to
> gc.get_referrers()).

That call tries to get at the class dictionary, rather than just
the dict_proxy that you get from A.__dict__. There should be
two referrers to thingy: the class dict, and the module dict.
The class dict will have a __module__ key.

I agree the program should print 2, though.
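
For readers who have not seen the trick, here is a sketch of just the lookup
step, written for Python 3. It deliberately stops before modifying the dict,
since that is the part that crashes. The names thingy and A follow the
description above; everything else is illustrative and not a copy of the
checked-in file:

```python
import gc

thingy = object()


class A(object):
    x = thingy

    def f(self):
        return 1


# A.__dict__ only hands out a read-only proxy, not the dict itself.
proxy = A.__dict__

# The real class dict is one of the containers referring to thingy.
# It is told apart from the module dict by its __module__ key.
class_dicts = [
    ref for ref in gc.get_referrers(thingy)
    if isinstance(ref, dict) and "__module__" in ref and ref.get("x") is thingy
]
real_dict = class_dicts[0]

print(isinstance(proxy, dict))      # False
print(isinstance(real_dict, dict))  # True
print("f" in real_dict)             # True
```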

Regards,
Martin

From solipsis at pitrou.net  Wed Jan 26 00:24:12 2011
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Wed, 26 Jan 2011 00:24:12 +0100
Subject: [Python-Dev] PEP 393: Flexible String Representation
References: <4D3DDE5E.4080807@v.loewis.de>
	
Message-ID: <20110126002412.60002036@pitrou.net>

On Tue, 25 Jan 2011 21:08:01 +1000
Nick Coghlan  wrote:
> 
> One change I would propose is that rather than hiding flags in the low
> order bits of the str pointer, we expand the use of the existing
> "state" field to cover the representation information in addition to
> the interning information.

+1, by the way. The "state" field has many bits available (even if we
decide to make it a char rather than an int).

Regards

Antoine.



From ncoghlan at gmail.com  Wed Jan 26 02:30:27 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 26 Jan 2011 11:30:27 +1000
Subject: [Python-Dev] Location of tests for packages
In-Reply-To: 
References: 
	<4D3E0E07.1080005@voidspace.org.uk>
	<749B0A1D-752F-4842-96D4-E73FEEFD5CEF@gmail.com>
	
	
Message-ID: 

On Wed, Jan 26, 2011 at 4:16 AM, Alexander Belopolsky
 wrote:
> FWIW, I am +0 on consolidating tests under Lib/test.  One of the
> reasons that I have not seen mentioned is that it is well-known that
> the test package is not part of the official stdlib API and can be
> changed/restructured in backward incompatible ways. It is not obvious
> whether the same applies to say lib2to3.tests or ctypes.test.

I am +0 for the same reason as Alexander. The test subpackages should
either be moved under the test package, or, for packages with PyPI
distributed backports for previous versions, they should be prefixed
with a leading underscore to make it clear that they're private
implementation details and backwards compatibility guarantees don't
apply.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ncoghlan at gmail.com  Wed Jan 26 02:41:31 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 26 Jan 2011 11:41:31 +1000
Subject: [Python-Dev] [Python-checkins] r88197 -
	python/branches/py3k/Lib/email/generator.py
In-Reply-To: <20110126003919.A9236EEC96@mail.python.org>
References: <20110126003919.A9236EEC96@mail.python.org>
Message-ID: 

On Wed, Jan 26, 2011 at 10:39 AM, victor.stinner
 wrote:
> Author: victor.stinner
> Date: Wed Jan 26 01:39:19 2011
> New Revision: 88197
>
> Log:
> Fix BytesGenerator._handle_text() if the message has no payload (None)

Folks, for the peace of mind of python-checkins watchers, please
remember to mention the reviewer's name when checking in fixes during
the RC period (the last one I checked had been reviewed by Georg on
the issue tracker, but it's hard to check without even an issue number
to look up).

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From digitalxero at gmail.com  Wed Jan 26 02:50:30 2011
From: digitalxero at gmail.com (Dj Gilcrease)
Date: Tue, 25 Jan 2011 20:50:30 -0500
Subject: [Python-Dev] PEP 393: Flexible String Representation
In-Reply-To: <4D3F5228.4010901@egenix.com>
References: <4D3DDE5E.4080807@v.loewis.de> <4D3F5228.4010901@egenix.com>
Message-ID: 

On Tue, Jan 25, 2011 at 5:43 PM, M.-A. Lemburg  wrote:
> I also don't see how this could save a lot of memory. As an example
> take a French text with say 10mio code points. This would end up
> appearing in memory as 3 copies on Windows: one copy stored as UCS2 (20MB),
> one as Latin-1 (10MB) and one as UTF-8 (probably around 15MB, depending
> on how many accents are used). That's a saving of -10MB compared to
> today's implementation :-)

If I am reading the pep right, which I may not be as I am no expert on
unicode, the new implementation would actually give a 10MB saving
since the wchar field is optional, so only the str (Latin-1) and utf8
fields would need to be stored. How it decides not to store one field
or another would need to be clarified in the pep if I am right.
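
The two estimates can be compared with a quick calculation. The sketch
below only redoes the arithmetic with the figures quoted above (10 million
code points, roughly 15MB of UTF-8). The 35MB baseline for today's
implementation is an assumption inferred from the "-10MB" remark (UCS2
buffer plus a cached UTF-8 copy), and all variable names are made up for
illustration.

```python
# Back-of-the-envelope estimate for a 10-million-code-point French text.
# All sizes are in MB and follow the numbers quoted in the thread.
CODE_POINTS_M = 10           # millions of code points

ucs2 = CODE_POINTS_M * 2     # wchar_t copy on Windows: 2 bytes per code point
latin1 = CODE_POINTS_M * 1   # compact Latin-1 copy: 1 byte per code point
utf8 = 15                    # estimate from the thread; depends on accent density

# Assumed baseline: today's UCS2 buffer plus a cached UTF-8 copy.
today = ucs2 + utf8

all_fields = ucs2 + latin1 + utf8   # every field populated
without_wstr = latin1 + utf8        # optional wchar_t field never populated

print("today:", today, "MB")
print("all fields populated:", all_fields, "MB, saving", today - all_fields, "MB")
print("wstr left empty:", without_wstr, "MB, saving", today - without_wstr, "MB")
```

Which of the two outcomes applies depends entirely on whether anything
ever asks for the wchar_t representation.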

From brett at python.org  Wed Jan 26 03:07:38 2011
From: brett at python.org (Brett Cannon)
Date: Tue, 25 Jan 2011 18:07:38 -0800
Subject: [Python-Dev] [Python-checkins] r88197 -
	python/branches/py3k/Lib/email/generator.py
In-Reply-To: <20110126003919.A9236EEC96@mail.python.org>
References: <20110126003919.A9236EEC96@mail.python.org>
Message-ID: 

This broke the buildbots (R. David Murray thinks you may have
forgotten to call super() in the 'payload is None' branch). Are you
getting code reviews and fully running the test suite before
committing? We are in RC.

On Tue, Jan 25, 2011 at 16:39, victor.stinner
 wrote:
> Author: victor.stinner
> Date: Wed Jan 26 01:39:19 2011
> New Revision: 88197
>
> Log:
> Fix BytesGenerator._handle_text() if the message has no payload (None)
>
> Modified:
>   python/branches/py3k/Lib/email/generator.py
>
> Modified: python/branches/py3k/Lib/email/generator.py
> ==============================================================================
> --- python/branches/py3k/Lib/email/generator.py (original)
> +++ python/branches/py3k/Lib/email/generator.py Wed Jan 26 01:39:19 2011
> @@ -377,8 +377,11 @@
>     def _handle_text(self, msg):
>         # If the string has surrogates the original source was bytes, so
>         # just write it back out.
> -        if _has_surrogates(msg._payload):
> -            self.write(msg._payload)
> +        payload = msg.get_payload()
> +        if payload is None:
> +            return
> +        if _has_surrogates(payload):
> +            self.write(payload)
>         else:
>             super(BytesGenerator,self)._handle_text(msg)
>
> _______________________________________________
> Python-checkins mailing list
> Python-checkins at python.org
> http://mail.python.org/mailman/listinfo/python-checkins
>

From stephen at xemacs.org  Wed Jan 26 03:24:54 2011
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Wed, 26 Jan 2011 11:24:54 +0900
Subject: [Python-Dev] Import and unicode: part two
In-Reply-To: <20110125163525.GE24080@unaka.lan>
References: <1295440442.432.18.camel@marge>
	
	
	
	
	
	
	<20110125032609.GC24080@unaka.lan>
	<7F2E941E-143A-461E-BEC5-D7545C6D877A@masklinn.net>
	<20110125163525.GE24080@unaka.lan>
Message-ID: <874o8w5rm1.fsf@uwakimon.sk.tsukuba.ac.jp>

Toshio Kuratomi writes:

 > On Linux there's no defined encoding that will work; file names are just
 > bytes to the Linux kernel so based on people's argument that the convention
 > is and should be that filenames are utf-8 and anything else is
 > a misconfigured system -- python should mandate that its module filenames on
 > Linux are utf-8 rather than using the user's locale settings.

This isn't going to work where I live (Tsukuba).  At the national
university alone there are hundreds of pre-existing *nix systems whose
filesystems were often configured a decade or more ago.  Even if the
hardware and OS have been upgraded, the filesystems are usually
migrated as-is, with OS configuration tweaks to accommodate them.  Many
of them use EUC-JP (and servers often Shift JIS).  That means that you
won't be able to read module names with ls, and that will make Python
unacceptable for this purpose.  I imagine that in Russia the same is
true for the various Cyrillic encodings.
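
To make the concern concrete, here is a minimal sketch of how one name
turns into different bytes under the encodings mentioned above. The module
name is hypothetical and `try_decode` is a helper invented for this
illustration; only the standard codecs are used.

```python
# One module name, three byte sequences, depending on the filesystem encoding.
name = "日本語"  # hypothetical non-ASCII module name

encoded = {enc: name.encode(enc) for enc in ("utf-8", "euc_jp", "shift_jis")}
for enc, raw in encoded.items():
    print(enc, raw)

def try_decode(raw, encoding):
    """Return the decoded name, or None if the bytes are invalid in `encoding`."""
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError:
        return None

# A name written on an EUC-JP filesystem, read back under a UTF-8 assumption.
print(try_decode(encoded["euc_jp"], "utf-8"))
```

A tool that assumes UTF-8 cannot even decode the EUC-JP bytes, which is
the situation a UTF-8 mandate would create on such systems.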

I really don't think there is anything that can be done here except to
warn people that "Kids, these stunts are performed by highly-trained
professionals.  Don't try this at home!"  Of course they will anyway,
but at least they will have been warned in sufficiently strong terms
that they might pay attention and be able to recover when they run
into bizarre import exceptions.

Oh, yeah, don't forget to apply Victor's patch, which allows Python to
keep the promises it can make about consistency.


From a.badger at gmail.com  Wed Jan 26 06:33:56 2011
From: a.badger at gmail.com (Toshio Kuratomi)
Date: Tue, 25 Jan 2011 21:33:56 -0800
Subject: [Python-Dev] Import and unicode: part two
In-Reply-To: <874o8w5rm1.fsf@uwakimon.sk.tsukuba.ac.jp>
References: 
	
	
	
	
	
	<20110125032609.GC24080@unaka.lan>
	<7F2E941E-143A-461E-BEC5-D7545C6D877A@masklinn.net>
	<20110125163525.GE24080@unaka.lan>
	<874o8w5rm1.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <20110126053356.GH24080@unaka.lan>

On Wed, Jan 26, 2011 at 11:24:54AM +0900, Stephen J. Turnbull wrote:
> Toshio Kuratomi writes:
> 
>  > On Linux there's no defined encoding that will work; file names are just
>  > bytes to the Linux kernel so based on people's argument that the convention
>  > is and should be that filenames are utf-8 and anything else is
>  > a misconfigured system -- python should mandate that its module filenames on
>  > Linux are utf-8 rather than using the user's locale settings.
> 
> This isn't going to work where I live (Tsukuba).  At the national
> university alone there are hundreds of pre-existing *nix systems whose
> filesystems were often configured a decade or more ago.  Even if the
> hardware and OS have been upgraded, the filesystems are usually
> migrated as-is, with OS configuration tweaks to accommodate them.  Many
> of them use EUC-JP (and servers often Shift JIS).  That means that you
> won't be able to read module names with ls, and that will make Python
> unacceptable for this purpose.  I imagine that in Russia the same is
> true for the various Cyrillic encodings.
> 
Sure ... but with these systems, neither read-modules-as-locale nor
read-modules-as-utf-8 is a good solution, correct?  Especially if
the OS does get upgraded but the filesystems with user data (and user
created modules) are migrated as-is, you'll run into situations where system
installed modules are in utf-8 and user created modules are shift-jis and so
something will always be broken.

The only way to make sure that modules work is to restrict them to ASCII-only
on the filesystem.  But because unicode module names are seen as
a necessary feature, the question is which way forward is going to lead to
the least brokenness.  Which could be locale... but from the python2
locale-related bugs that I get to look at, I doubt it.

> I really don't think there is anything that can be done here except to
> warn people that "Kids, these stunts are performed by highly-trained
> professionals.  Don't try this at home!"  Of course they will anyway,
> but at least they will have been warned in sufficiently strong terms
> that they might pay attention and be able to recover when they run
> into bizarre import exceptions.
> 
So on the subject of warnings... I think a reason it's better to pick an
encoding for the platform/filesystem rather than to use locale is because
people will get an error or a warning at the appropriate time if that's the
case -- the first time they attempt to create and import a module with
a filename that's not encoded in the correct encoding for the platform.
It's all very well to say: "We wrote in the documentation on
http://docs.python.org/distutils/introduction.html#Choosing-a-name that only
ASCII names should be used when distributing python modules" but if the
interpreter doesn't complain when people use a non-ASCII filename we all
know that they aren't going to look in the documentation; they'll try it and
if it works they'll learn that habit.  
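
The kind of early complaint described above could look roughly like the
following. This is only a sketch: `check_module_name` is a hypothetical
helper, CPython performs no such check on import, and the choice of
`UnicodeWarning` is an assumption.

```python
import warnings

def check_module_name(name):
    """Warn when a module name would need a non-ASCII filename.

    Hypothetical helper: the interpreter itself does not do this.
    Returns True for ASCII-only names, False otherwise.
    """
    try:
        name.encode("ascii")
    except UnicodeEncodeError:
        warnings.warn(
            "module name %r is not ASCII; its filename depends on the "
            "filesystem encoding and may not import elsewhere" % name,
            UnicodeWarning,
            stacklevel=2,
        )
        return False
    return True
```

The point of warning at creation or import time is that the author sees
the message on their own machine, before the module is distributed.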

-Toshio

From stephen at xemacs.org  Wed Jan 26 09:58:36 2011
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Wed, 26 Jan 2011 17:58:36 +0900
Subject: [Python-Dev] Import and unicode: part two
In-Reply-To: <20110126053356.GH24080@unaka.lan>
References: 
	
	
	
	
	
	<20110125032609.GC24080@unaka.lan>
	<7F2E941E-143A-461E-BEC5-D7545C6D877A@masklinn.net>
	<20110125163525.GE24080@unaka.lan>
	<874o8w5rm1.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20110126053356.GH24080@unaka.lan>
Message-ID: <87wrls3utf.fsf@uwakimon.sk.tsukuba.ac.jp>

Toshio Kuratomi writes:

 > Sure ... but with these systems, neither read-modules-as-locale nor
 > read-modules-as-utf-8 is a good solution, correct?

Good solution, no, but I believe that read-modules-as-locale *should*
work to a great extent.  AFAIK Python 3 reads Python programs as str
(ie, converting to Unicode -- if it doesn't, it *should*).

 > Especially if the OS does get upgraded but the filesystems with
 > user data (and user created modules) are migrated as-is, you'll run
 > into situations where system installed modules are in utf-8 and
 > user created modules are shift-jis and so something will always be
 > broken.

I don't know what you mean by "system-installed modules".  If you're
talking about Python itself, it's not a problem.  Python doesn't have
any Japanese-named modules in any encoding.

On the other hand, *everything* that involves scripting (shell
scripts, make, etc) related to those filesystems will be broken
*unless* the system, after upgrade but before going live, is converted
to have an appropriate locale encoding.  So I don't really see a
problem here.

The problem is portability across systems, and that is a problem that
only the third-party transports can really deal with.  tar and unzip
need to be taught how to change file names to the locale, etc.

 > The only way to make sure that modules work is to restrict them to ASCII-only
 > on the filesystem.  But because unicode module names are seen as
 > a necessary feature, the question is which way forward is going to lead to
 > the least brokenness.  Which could be locale... but from the python2
 > locale-related bugs that I get to look at, I doubt it.

AFAICS this is going to be site-specific.  End of story.  Or, if you
prefer, "maru-nage".

IMHO, Python 2 locale bugs are unlikely to be a good guide to Python 3
locale bugs because in Python 2 most people just ignore locale and use
"native" strings (~= bytes in Python 3), and that typically "just
works".  In Python 3 that just *doesn't* work any more because you get
a UnicodeError on import, etc, etc.

IMHO, YMMV, and all that.  I know *of* such systems (there remain
quite a few here used by student and research labs), but the ones I
maintain were easy to convert to UTF-8 because I don't export file
systems (except my private files for my own use); everything is
mediated by Apache and Zope, and browsers are happy to cope if I
change from EUC-JP to UTF-8 and then flip the Apache switch to change
default encodings.



From victor.stinner at haypocalc.com  Wed Jan 26 10:40:34 2011
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Wed, 26 Jan 2011 10:40:34 +0100
Subject: [Python-Dev] Import and unicode: part two
In-Reply-To: <20110125032609.GC24080@unaka.lan>
References: <1295440442.432.18.camel@marge>
	
	
	
	
	
	
	<20110125032609.GC24080@unaka.lan>
Message-ID: <1296034834.25379.18.camel@marge>

On Monday 24 January 2011 at 19:26 -0800, Toshio Kuratomi wrote:
> Why not locale:
> * Relying on locale is simply not portable. (...)
> * Mixing of modules from different locales won't work. (...)

I don't understand what you are talking about.

When you import a module, the module name becomes a filename. On
Windows, you can reuse the Unicode name directly as a filename. On the
other OSes, you have to encode the name to filesystem encoding. During
Python 3.2 development, we tried to be able to use a filesystem encoding
different than the locale encoding (PYTHONFSENCODING environment
variable), but it doesn't work, simply because Python is not alone in the
OS. Apart from Python, all programs speak the same "language": the locale
encoding. To give you an example: if you create a module with a name
encoded in UTF-8 while the locale encoding is something else, your file
browser will display mojibake.
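
A minimal sketch of that mojibake, assuming a tool that decodes filenames
as Latin-1 while the name was written as UTF-8 (the module name is
hypothetical):

```python
# A module name written to disk as UTF-8 but displayed by a tool that
# assumes a Latin-1 locale.
module_name = "café"                     # hypothetical module name
on_disk = module_name.encode("utf-8")    # bytes stored in the directory entry
shown = on_disk.decode("latin-1")        # what a Latin-1 file browser renders

print(on_disk)   # b'caf\xc3\xa9'
print(shown)     # cafÃ©
```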

I don't understand the relation between the local filesystem encoding
and the portability. I suppose that you are talking about the
distribution of a module to other computers. Here the question is how
the filenames are stored during the transfer. The user is free to use
any tool, and to try to find one that handles Unicode correctly :-) But that
is no longer Python's problem.

Each computer uses a different locale encoding. You have to use it to
cooperate with other programs and avoid mojibake. But I don't understand
why you write that "Mixing of modules from different locales won't
work". If you use a tool storing filenames in your locale encoding (eg.
TAR file format... and sometimes the ZIP format), the problem comes from
your tool and you should use another tool.

I created http://bugs.python.org/issue10972 to work around ZIP tools that
assume ZIP files use the locale encoding instead of cp437: this
issue adds an option to force the use of the Unicode flag (and so
store filenames in UTF-8), even though I initially created the issue to
work around a bootstrap issue (#10955).
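
For readers unfamiliar with the Unicode flag: it is bit 11 (0x800) of a
ZIP member's general purpose flags. The sketch below shows how the
standard zipfile module behaves without any option, as far as I know: the
flag is set only when the name cannot be encoded as ASCII. The option to
force it, proposed in the issue, is not shown, and `utf8_flag_set` is a
helper made up for this illustration.

```python
import io
import zipfile

def utf8_flag_set(filename):
    """Write one empty member to an in-memory ZIP, read it back, and
    return (filename as read, whether flag bit 0x800 is set)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr(filename, b"")
    with zipfile.ZipFile(buf) as zf:
        info = zf.infolist()[0]
        return info.filename, bool(info.flag_bits & 0x800)

print(utf8_flag_set("spam.py"))
print(utf8_flag_set("café.py"))
```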

Victor


From victor.stinner at haypocalc.com  Wed Jan 26 10:57:28 2011
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Wed, 26 Jan 2011 10:57:28 +0100
Subject: [Python-Dev] [Python-checkins] r88197 -
 python/branches/py3k/Lib/email/generator.py
In-Reply-To: 
References: <20110126003919.A9236EEC96@mail.python.org>
	
Message-ID: <1296035848.25379.27.camel@marge>

Hi,

On Tuesday 25 January 2011 at 18:07 -0800, Brett Cannon wrote:
> This broke the buildbots (R. David Murray thinks you may have
> forgotten to call super() in the 'payload is None' branch). Are you
> getting code reviews and fully running the test suite before
> committing? We are in RC.
> (...)
> > -        if _has_surrogates(msg._payload):
> > -            self.write(msg._payload)
> > +        payload = msg.get_payload()
> > +        if payload is None:
> > +            return
> > +        if _has_surrogates(payload):
> > +            self.write(payload)

I didn't realize that such a minor change could do anything harmful: the
parent method (Generator._handle_text) has exactly the same test. If
msg._payload is None, calling the parent method with None does nothing. But
_has_surrogates() doesn't support None.

The problem is not the test for None, but replacing msg._payload with
msg.get_payload(). I thought that get_payload() was a dummy getter
reading self._payload, but I was completely wrong :-)
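
The difference between the private attribute and the public getter can be
seen with a small experiment. This is a sketch that assumes the email
package behaves as I understand it: bytes input is stored with surrogate
escapes in the private `_payload` attribute, while `get_payload()`
re-decodes such a string using the declared charset. The sample message
and the `has_surrogates` helper are made up for illustration (the real
helper is the private `_has_surrogates`).

```python
import email

raw = b"Content-Type: text/plain; charset=utf-8\n\ncaf\xc3\xa9\n"
msg = email.message_from_bytes(raw)

stored = msg._payload        # private attribute: non-ASCII bytes kept as surrogates
public = msg.get_payload()   # public getter: re-decoded with the declared charset

def has_surrogates(s):
    """True if the string contains surrogate-escaped bytes (U+DC80..U+DCFF)."""
    return any("\udc80" <= ch <= "\udcff" for ch in s)

print(has_surrogates(stored), has_surrogates(public))
```

Because the getter hides the surrogates, a check written against
`get_payload()` no longer detects that the original source was bytes.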

I was stupid not to run at least test_email, sorry. And no, I didn't ask
for a review, because I thought that such a minor change could not be
harmful.

FYI the commit is related indirectly to #9124 (Mailbox module should use
binary I/O, not text I/O).

Victor


From martin at v.loewis.de  Wed Jan 26 11:12:02 2011
From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=)
Date: Wed, 26 Jan 2011 11:12:02 +0100
Subject: [Python-Dev] Import and unicode: part two
In-Reply-To: <1296034834.25379.18.camel@marge>
References: <1295440442.432.18.camel@marge>							<20110125032609.GC24080@unaka.lan>
	<1296034834.25379.18.camel@marge>
Message-ID: <4D3FF372.3040007@v.loewis.de>

On 26.01.2011 at 10:40, Victor Stinner wrote:
> On Monday 24 January 2011 at 19:26 -0800, Toshio Kuratomi wrote:
>> Why not locale:
>> * Relying on locale is simply not portable. (...)
>> * Mixing of modules from different locales won't work. (...)
> 
> I don't understand what you are talking about.

I think by "portability", he means "moving files from one computer to
another". He argues that if Python would mandate UTF-8 for all file
names on Unix, moving files in such a way would support portability,
whereas using the locale's filename might not (if the locale uses a
different charset on the target system).

While this is technically true, I don't think it's a helpful way of
thinking: by mandating that file names are UTF-8 when accessed from
Python, we make the actual files inaccessible on both the source and
the target system.

> I don't understand the relation between the local filesystem encoding
> and the portability. I suppose that you are talking about the
> distribution of a module to other computers. Here the question is how
> the filenames are stored during the transfer. The user is free to use
> any tool, and to try to find one that handles Unicode correctly :-) But that
> is no longer Python's problem.

There are cases where there is no real "transfer", in the sense in which
you are using the word. For example, with NFS, you can access the very
same file simultaneously on two systems, with no file name conversion
(unless you are using NFSv4, and unless your NFSv4 implementations
support the UTF-8 mandate in NFS well).

Also, if two users of the same machine have different locale settings,
the same file name might be interpreted differently.

Regards,
Martin

From phd at phdru.name  Wed Jan 26 12:02:31 2011
From: phd at phdru.name (Oleg Broytman)
Date: Wed, 26 Jan 2011 14:02:31 +0300
Subject: [Python-Dev] Import and unicode: part two
In-Reply-To: <4D3FF372.3040007@v.loewis.de>
References: <1295440442.432.18.camel@marge>
	
	
	
	
	
	
	<20110125032609.GC24080@unaka.lan>
	<1296034834.25379.18.camel@marge> <4D3FF372.3040007@v.loewis.de>
Message-ID: <20110126110231.GB27259@iskra.aviel.ru>

On Wed, Jan 26, 2011 at 11:12:02AM +0100, "Martin v. Löwis" wrote:
> There are cases where there is no real "transfer", in the sense in which
> you are using the word. For example, with NFS, you can access the very
> same file simultaneously on two systems, with no file name conversion
> (unless you are using NFSv4, and unless your NFSv4 implementations
> support the UTF-8 mandate in NFS well).
> 
> Also, if two users of the same machine have different locale settings,
> the same file name might be interpreted differently.

   I have a solution for all these problems, with a price, of course.
Let's use utf8+base64. Base64 uses a very restricted subset of ASCII and
filenames will never be misinterpreted, whatever the filesystem encoding
is. The price is that users lose standard OS tools like ls and find.
   I am partially joking, of course, but only partially.
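
A rough sketch of what that scheme would amount to. The two function
names are hypothetical, and using the URL-safe base64 alphabet is an
assumption added here, since the standard alphabet contains "/", which
cannot appear in a filename.

```python
import base64

def name_to_filename(module_name):
    """Encode a module name as UTF-8, then as URL-safe base64 (pure ASCII)."""
    return base64.urlsafe_b64encode(module_name.encode("utf-8")).decode("ascii")

def filename_to_name(filename):
    """Inverse of name_to_filename()."""
    return base64.urlsafe_b64decode(filename.encode("ascii")).decode("utf-8")

stored = name_to_filename("café")
print(stored, "->", filename_to_name(stored))
```

The result is independent of any locale, but as noted, nobody can read
it in a directory listing.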

Oleg.
-- 
     Oleg Broytman            http://phdru.name/            phd at phdru.name
           Programmers don't die, they just GOSUB without RETURN.

From victor.stinner at haypocalc.com  Wed Jan 26 12:57:16 2011
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Wed, 26 Jan 2011 12:57:16 +0100
Subject: [Python-Dev] Import and unicode: part two
In-Reply-To: <4D3FF372.3040007@v.loewis.de>
References: <1295440442.432.18.camel@marge>
	
	
	
	
	
	
	<20110125032609.GC24080@unaka.lan> <1296034834.25379.18.camel@marge>
	<4D3FF372.3040007@v.loewis.de>
Message-ID: <1296043036.25379.41.camel@marge>

On Wednesday 26 January 2011 at 11:12 +0100, "Martin v. Löwis" wrote:
> There are cases where there is no real "transfer", in the sense in which
> you are using the word. For example, with NFS, you can access the very
> same file simultaneously on two systems, with no file name conversion
> (unless you are using NFSv4, and unless your NFSv4 implementations
> support the UTF-8 mandate in NFS well).

Python encodes the module name to the locale encoding to create a
filename. If the locale encoding is not the encoding used on the NFS
server, it doesn't work, but I don't think that Python has to work around
this issue. If a user plays with non-ASCII module names, (s)he has to
understand that (s)he will have to fight against badly configured
systems and tools unable to handle Unicode correctly. We might warn
him/her in the documentation.

If NFSv3 doesn't reencode filenames for each client and the clients
don't reencode filenames, all clients have to use the same locale
encoding as the server. Otherwise, I don't see how it can work.

> Also, if two users of the same machine have different locale settings,
> the same file name might be interpreted differently.

Apart from Mac OS X and Windows, no kernel supports Unicode and so all users
of the same computer have to use the same locale encoding, or they will
not be able to share non-ASCII filenames.

--

Again, I don't think that Python should do anything special to
work around these issues.

(Hardcoding the module filename encoding to UTF-8 doesn't work for all the
reasons explained in other emails.)

Victor


From ncoghlan at gmail.com  Wed Jan 26 13:30:37 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 26 Jan 2011 22:30:37 +1000
Subject: [Python-Dev] PEP 393: Flexible String Representation
In-Reply-To: 
References: <4D3DDE5E.4080807@v.loewis.de> <4D3F5228.4010901@egenix.com>
	
Message-ID: 

On Wed, Jan 26, 2011 at 11:50 AM, Dj Gilcrease  wrote:
> On Tue, Jan 25, 2011 at 5:43 PM, M.-A. Lemburg  wrote:
>> I also don't see how this could save a lot of memory. As an example
>> take a French text with say 10mio code points. This would end up
>> appearing in memory as 3 copies on Windows: one copy stored as UCS2 (20MB),
>> one as Latin-1 (10MB) and one as UTF-8 (probably around 15MB, depending
>> on how many accents are used). That's a saving of -10MB compared to
>> today's implementation :-)
>
> If I am reading the pep right, which I may not be as I am no expert on
> unicode, the new implementation would actually give a 10MB saving
> since the wchar field is optional, so only the str (Latin-1) and utf8
> fields would need to be stored. How it decides not to store one field
> or another would need to be clarified in the pep if I am right.

The PEP actually does define that already:

PyUnicode_AsUTF8 populates the utf8 field of the existing string,
while PyUnicode_AsUTF8String creates a *new* string with that field
populated.

PyUnicode_AsUnicode will populate the wstr field (but doing so
generally shouldn't be necessary).

For a UCS4 build, my reading of the PEP puts the memory savings for a
100 code point string as follows:

Current size: 400 bytes (regardless of max code point)

New initial size (max code point < 256): 100 bytes (75% saving)
New initial size (max code point < 65536): 200 bytes (50% saving)
New initial size (max code point >= 65536): 400 bytes (no saving)

For each of the "new" strings, they may consume additional storage if
the utf8 or wstr fields get populated. The maximum possible size would
be a UCS4 string (max code point >= 65536) on a sizeof(wchar_t) == 2
system with the utf8 string populated. In such cases, you would
consume at least 700 bytes, plus whatever additional memory is needed
to encode the non-BMP characters into UTF-8 and UTF-16.
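
The figures above can be reproduced with a few lines of arithmetic. This
is a sketch of the reading given here, not of the PEP's implementation:
it counts character data only (no object header), and the function names
are made up for illustration.

```python
def char_width(max_code_point):
    """Bytes per code point in the compact representation."""
    if max_code_point < 256:
        return 1
    if max_code_point < 65536:
        return 2
    return 4

def initial_size(length, max_code_point):
    """Size in bytes of the character data for a freshly created string."""
    return length * char_width(max_code_point)

LENGTH = 100
current = LENGTH * 4   # current UCS4 build: always 4 bytes per code point

for max_cp in (0xFF, 0xFFFF, 0x10FFFF):
    new = initial_size(LENGTH, max_cp)
    print(hex(max_cp), new, "bytes,", 100 * (current - new) // current, "% saving")

# Lower bound for the worst case: UCS4 data, plus a wchar_t copy at
# 2 bytes per unit, plus a UTF-8 copy at 1 byte per code point.
worst_case_floor = initial_size(LENGTH, 0x10000) + LENGTH * 2 + LENGTH * 1
print(worst_case_floor)
```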

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ncoghlan at gmail.com  Wed Jan 26 13:34:43 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 26 Jan 2011 22:34:43 +1000
Subject: [Python-Dev] [Python-checkins] r88197 -
	python/branches/py3k/Lib/email/generator.py
In-Reply-To: <1296035848.25379.27.camel@marge>
References: <20110126003919.A9236EEC96@mail.python.org>
	
	<1296035848.25379.27.camel@marge>
Message-ID: 

On Wed, Jan 26, 2011 at 7:57 PM, Victor Stinner
 wrote:
> I was stupid not to run at least test_email, sorry. And no, I didn't ask
> for a review, because I thought that such a minor change could not be
> harmful.

During the RC period, *everything* that touches the code base should
be reviewed by a second committer before checkin, and sanctioned by
the RM as well. This applies even for apparently trivial changes.

Docs checkins are slightly less strict (especially Raymond finishing
off the What's New), but even there it's preferable to be cautious in
the run up to a final release.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From p.f.moore at gmail.com  Wed Jan 26 13:49:44 2011
From: p.f.moore at gmail.com (Paul Moore)
Date: Wed, 26 Jan 2011 12:49:44 +0000
Subject: [Python-Dev] PEP 393: Flexible String Representation
In-Reply-To: 
References: <4D3DDE5E.4080807@v.loewis.de> <4D3F5228.4010901@egenix.com>
	
	
Message-ID: 

On 26 January 2011 12:30, Nick Coghlan  wrote:
> The PEP actually does define that already:
>
> PyUnicode_AsUTF8 populates the utf8 field of the existing string,
> while PyUnicode_AsUTF8String creates a *new* string with that field
> populated.
>
> PyUnicode_AsUnicode will populate the wstr field (but doing so
> generally shouldn't be necessary).

AIUI, another point is that the PEP deprecates the use of the calls
that populate the utf8 and wstr fields, in favour of the calls that
expect the caller to manage the extra memory (PyUnicode_AsUTF8String
rather than PyUnicode_AsUTF8, ??? rather than PyUnicode_AsUnicode). So
in the long term, the extra fields should never be populated -
although this could take some time as extensions have to be recoded.
Ultimately, the extra fields and older APIs could even be removed.

So any space cost (which I concede could be non-trivial in some cases)
is expected to be short-term.

Paul.

From foom at fuhm.net  Wed Jan 26 14:24:15 2011
From: foom at fuhm.net (James Y Knight)
Date: Wed, 26 Jan 2011 08:24:15 -0500
Subject: [Python-Dev] Import and unicode: part two
In-Reply-To: <1296034834.25379.18.camel@marge>
References: <1295440442.432.18.camel@marge>
	
	
	
	
	
	
	<20110125032609.GC24080@unaka.lan>
	<1296034834.25379.18.camel@marge>
Message-ID: 

On Jan 26, 2011, at 4:40 AM, Victor Stinner wrote:
> During
> Python 3.2 development, we tried to be able to use a filesystem encoding
> different than the locale encoding (PYTHONFSENCODING environment
> variable): but it doesn't work simply because Python is not alone in the
> OS. Except Python, all programs speak the same "language": the locale
> encoding. Let's try to give you an example: if you create a module with a
> name encoded to UTF-8, your file browser will display mojibake.

Is that really true? I'm pretty sure GTK+ treats all filenames as UTF-8 no matter what the locale says. (over-rideable by G_FILENAME_ENCODING or G_BROKEN_FILENAMES)

James

From victor.stinner at haypocalc.com  Wed Jan 26 17:47:10 2011
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Wed, 26 Jan 2011 17:47:10 +0100
Subject: [Python-Dev] Import and unicode: part two
In-Reply-To: 
References: <1295440442.432.18.camel@marge>
	
	
	
	
	
	
	<20110125032609.GC24080@unaka.lan> <1296034834.25379.18.camel@marge>
	
Message-ID: <1296060430.2672.76.camel@marge>

On Wednesday 26 January 2011 at 08:24 -0500, James Y Knight wrote:
> On Jan 26, 2011, at 4:40 AM, Victor Stinner wrote:
> > During
> > Python 3.2 development, we tried to be able to use a filesystem encoding
> > different than the locale encoding (PYTHONFSENCODING environment
> > variable): but it doesn't work simply because Python is not alone in the
> > OS. Except Python, all programs speak the same "language": the locale
> > encoding. Let's try to give you an example: if you create a module with a
> > name encoded to UTF-8, your file browser will display mojibake.
> 
> Is that really true? I'm pretty sure GTK+ treats all filenames as
> UTF-8 no matter what the locale says. (over-rideable by
> G_FILENAME_ENCODING or G_BROKEN_FILENAMES)

Not exactly. Gtk+ uses the glib library, and to encode/decode filenames,
the glib library uses:

 - UTF-8 on Windows
 - G_FILENAME_ENCODING environment variable if set (comma-separated list
of encodings)
 - UTF-8 if G_BROKEN_FILENAMES env var is set
 - or the locale encoding

glib has no type to store a filename: a filename is a raw byte string
(char*). It has a nice function to work around mojibake issues:
g_filename_display_name(). This function tries to decode the filename
from each encoding of the filename encoding list; if all decodings
fail, it uses UTF-8 and escapes undecodable bytes.
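
The fallback strategy described for g_filename_display_name() can be
sketched in Python as follows. This is not glib's code: `display_name` is
a made-up function, and replacing undecodable bytes with U+FFFD stands in
for whatever escaping glib actually applies.

```python
def display_name(raw, encodings):
    """Decode a raw filename for display, trying each encoding in order.

    If none of the candidate encodings can decode the bytes, fall back to
    UTF-8 and substitute U+FFFD for whatever cannot be decoded.
    """
    for encoding in encodings:
        try:
            return raw.decode(encoding)
        except UnicodeDecodeError:
            continue
    return raw.decode("utf-8", errors="replace")

# b"\xe9" is not valid UTF-8, so the second candidate is used.
print(display_name(b"caf\xe9", ["utf-8", "latin-1"]))
```

Note that the result is only suitable for display; the raw bytes still
have to be kept to open the file.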

So yes, if you set G_FILENAME_ENCODING you can fix mojibake issues. But
you have to pass the raw bytes filenames to other libraries and
programs.

The problem with PYTHONFSENCODING is that sys.getfilesystemencoding() is
not only used for the filenames, but also for the command line arguments
and the environment variables.
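
A small sketch of where that encoding shows up from Python code. It uses
os.fsencode() and os.fsdecode(), which exist since Python 3.2; the
`roundtrip` helper is invented for this illustration.

```python
import os
import sys

def roundtrip(name):
    """Encode a str to bytes and back using the filesystem encoding."""
    raw = os.fsencode(name)    # str -> bytes via sys.getfilesystemencoding()
    return os.fsdecode(raw)    # bytes -> str with the same encoding

print("filesystem encoding:", sys.getfilesystemencoding())
print(roundtrip("module"))     # ASCII names survive any of these encodings

# sys.argv and os.environ hold str objects that were decoded from bytes with
# this same encoding, so changing it for filenames alone is not possible.
```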

For more information about glib, see g_filename_to_utf8(),
g_filename_display_name() and g_get_filename_charsets() documentation:

http://library.gnome.org/devel/glib/2.26/glib-Character-Set-Conversion.html

Victor


From brett at python.org  Wed Jan 26 18:43:52 2011
From: brett at python.org (Brett Cannon)
Date: Wed, 26 Jan 2011 09:43:52 -0800
Subject: [Python-Dev] [Python-checkins] r88197 -
	python/branches/py3k/Lib/email/generator.py
In-Reply-To: 
References: <20110126003919.A9236EEC96@mail.python.org>
	
	<1296035848.25379.27.camel@marge>
	
Message-ID: 

On Wed, Jan 26, 2011 at 04:34, Nick Coghlan  wrote:
> On Wed, Jan 26, 2011 at 7:57 PM, Victor Stinner
>  wrote:
>> I was stupid not to run at least test_email, sorry. And no, I didn't ask
>> for a review, because I thought that such a minor change could not be
>> harmful.
>
> During the RC period, *everything* that touches the code base should
> be reviewed by a second committer before checkin, and sanctioned by
> the RM as well. This applies even for apparently trivial changes.

Especially as this is not the first slip-up; Raymond had a
copy-and-paste slip that broke the buildbots. Luckily he was in
#python-dev when it happened and it was noticed fast enough that he fixed
it in under a minute.

So yes, even stuff we would all consider minor **must** have a review.
Time to update the devguide I think.

-Brett

From g.brandl at gmx.net  Wed Jan 26 19:08:36 2011
From: g.brandl at gmx.net (Georg Brandl)
Date: Wed, 26 Jan 2011 19:08:36 +0100
Subject: [Python-Dev] [Python-checkins] r88197 -
	python/branches/py3k/Lib/email/generator.py
In-Reply-To: <1296035848.25379.27.camel@marge>
References: <20110126003919.A9236EEC96@mail.python.org>	
	<1296035848.25379.27.camel@marge>
Message-ID: 

On 26.01.2011 at 10:57, Victor Stinner wrote:
> Hi,
> 
> On Tuesday 25 January 2011 at 18:07 -0800, Brett Cannon wrote:
>> This broke the buildbots (R. David Murray thinks you may have
>> forgotten to call super() in the 'payload is None' branch). Are you
>> getting code reviews and fully running the test suite before
>> committing? We are in RC.
>> (...)
>> > -        if _has_surrogates(msg._payload):
>> > -            self.write(msg._payload)
>> > +        payload = msg.get_payload()
>> > +        if payload is None:
>> > +            return
>> > +        if _has_surrogates(payload):
>> > +            self.write(payload)
> 
> I didn't realize that such a minor change could do anything harmful:

That's why the rule is that *every change needs to be reviewed*, not
*every change that doesn't look harmful needs to be reviewed*.

(This is true only for code changes, of course.  Doc changes rarely have
hidden bugs, nor are they embarrassing when a bug slips into the release.
And I get the "test suite" (building the docs) results twice a day and
can fix problems myself.)

> the
> parent method (Generator._handle_text) has exactly the same test. If
> msg._payload is None, calling the parent method with None does nothing. But
> _has_surrogates() doesn't support None.
> 
> The problem is not the test for None, but replacing msg._payload with
> msg.get_payload(). I thought that get_payload() was a dummy getter
> reading self._payload, but I was completely wrong :-)
>
> I was stupid not to run at least test_email, sorry. And no, I didn't ask
> for a review, because I thought that such a minor change could not be
> harmful.

I hope you know better now :)  *Always* run the test suite *before* even
asking for review.
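
The None problem mentioned above can be sketched in a few lines. This is only a stand-in: has_surrogates() and write_payload() are hypothetical names written for illustration, not the helpers in email.generator, and they assume the check works by trying to encode the text.

```python
def has_surrogates(text):
    """Return True if text contains lone surrogates and cannot be encoded as UTF-8."""
    try:
        text.encode("utf-8")
        return False
    except UnicodeEncodeError:
        return True


def write_payload(payload, write):
    """Write payload only if it is a str containing surrogates; report whether it was written."""
    # Guard first: the surrogate check only works on str,
    # so None has to be handled before it is called.
    if payload is None:
        return False
    if has_surrogates(payload):
        write(payload)
        return True
    return False
```

Without the guard, passing None to a check like this fails with AttributeError rather than returning False, which is why the order of the two tests matters.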

Georg


From foom at fuhm.net  Wed Jan 26 19:25:49 2011
From: foom at fuhm.net (James Y Knight)
Date: Wed, 26 Jan 2011 13:25:49 -0500
Subject: [Python-Dev] Import and unicode: part two
In-Reply-To: <1296060430.2672.76.camel@marge>
References: <1295440442.432.18.camel@marge>
	
	
	
	
	
	
	<20110125032609.GC24080@unaka.lan>
	<1296034834.25379.18.camel@marge>
	
	<1296060430.2672.76.camel@marge>
Message-ID: <7A5FC2EF-45CC-4838-8919-00E9563AF9D3@fuhm.net>


On Jan 26, 2011, at 11:47 AM, Victor Stinner wrote:
> Not exactly. Gtk+ uses the glib library, and to encode/decode filenames,
> the glib library uses:
> 
> - UTF-8 on Windows
> - G_FILENAME_ENCODING environment variable if set (comma-separated list
> of encodings)
> - UTF-8 if G_BROKEN_FILENAMES env var is set
> - or the locale encoding


But the documentation says:

> On Unix, the character sets are determined by consulting the environment variables G_FILENAME_ENCODING and G_BROKEN_FILENAMES. On Windows, the character set used in the GLib API is always UTF-8 and said environment variables have no effect.
> 
> G_FILENAME_ENCODING may be set to a comma-separated list of character set names. The special token "@locale" is taken to mean the character set for the current locale. If G_FILENAME_ENCODING is not set, but G_BROKEN_FILENAMES is, the character set of the current locale is taken as the filename encoding. If neither environment variable is set, UTF-8 is taken as the filename encoding, but the character set of the current locale is also put in the list of encodings.

Which indicates to me that (unless you override the behavior with env vars) it encodes filenames in UTF-8 regardless of the locale, and attempts decoding in UTF-8 primarily. And that only when the filename doesn't make sense in UTF-8, it will also try decoding it in the locale encoding.
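
The rules in the quoted documentation can be sketched as a small function. This is an illustration of the documented behaviour on Unix only, not glib's code; filename_charsets() and its parameters are hypothetical names, and dropping duplicate entries is an assumption of this sketch.

```python
def filename_charsets(environ, locale_charset):
    """Ordered charsets to try for filenames on Unix, following the quoted documentation."""
    spec = environ.get("G_FILENAME_ENCODING")
    if spec:
        # Comma-separated list; "@locale" stands for the locale's charset.
        names = [part.strip() for part in spec.split(",") if part.strip()]
        charsets = [locale_charset if name == "@locale" else name for name in names]
    elif "G_BROKEN_FILENAMES" in environ:
        charsets = [locale_charset]
    else:
        # Neither variable set: UTF-8 first, the locale charset as a fallback.
        charsets = ["UTF-8", locale_charset]
    # Drop duplicates while keeping the first occurrence (assumption of this sketch).
    unique = []
    for charset in charsets:
        if charset not in unique:
            unique.append(charset)
    return unique
```

With an empty environment and an ISO-8859-1 locale this gives UTF-8 first and ISO-8859-1 second, which matches the reading above.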

James

From andy-python at hammerhartes.de  Wed Jan 26 22:09:15 2011
From: andy-python at hammerhartes.de (=?ISO-8859-1?Q?Andreas_St=FChrk?=)
Date: Wed, 26 Jan 2011 22:09:15 +0100
Subject: [Python-Dev] r88178 -
	python/branches/py3k/Lib/test/crashers/underlying_dict.py
In-Reply-To: 
References: <20110125000028.94263EEBDB@mail.python.org>
	<20110125122603.74e49f8c@pitrou.net>
	
Message-ID: 

> It gets to the dict of a class, circumventing dictproxy. It's not yet clear
> why it segfaults.

The crash and the output "1" both happen because updating
the class dictionary directly doesn't invalidate the method cache.
When the new value for "f" is assigned to the dict, the old "f" gets
garbage collected (because the method cache uses borrowed references),
but there is still an entry in the cache for the (now
garbage-collected) function. When "a.f" is executed next, the entry of
the cache is used and a new method is created. When that method gets
called, it returns "1" and when the interpreter tries to garbage
collect the new method on interpreter finalization, it segfaults
because the referenced "f" is already collected.
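
For contrast, here is a minimal sketch of the supported paths, which do not reach the underlying dict. It assumes a Python 3 interpreter, where the class __dict__ is a read-only proxy, and relies on my understanding that assignment through the type invalidates cached lookups; it does not reproduce the crasher itself.

```python
class A:
    def f(self):
        return 1


a = A()
first = a.f()  # the first lookup may populate the method cache

# The class __dict__ is exposed as a read-only proxy, so direct updates are refused.
try:
    A.__dict__["f"] = lambda self: 2
except TypeError:
    blocked = True
else:
    blocked = False

# Assigning through the type is the supported path; the next lookup sees the new function.
A.f = lambda self: 3
result = a.f()
```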

Regards,
Andreas

From martin at v.loewis.de  Wed Jan 26 22:10:48 2011
From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=)
Date: Wed, 26 Jan 2011 22:10:48 +0100
Subject: [Python-Dev] Import and unicode: part two
In-Reply-To: <1296043036.25379.41.camel@marge>
References: <1295440442.432.18.camel@marge>							<20110125032609.GC24080@unaka.lan>
	<1296034834.25379.18.camel@marge>	<4D3FF372.3040007@v.loewis.de>
	<1296043036.25379.41.camel@marge>
Message-ID: <4D408DD8.8030505@v.loewis.de>

> If NFSv3 doesn't reencode filenames for each client and the clients
> don't reencode filenames, all clients have to use the same locale
> encoding as the server. Otherwise, I don't see how it can work.

In practice, users accept that they get mojibake - their editors can
still open the files, and they can double-click them in a file browser
just fine. So it doesn't really need to work, and users can still use
it.

> Again, I don't think that Python should do anything special to
> work around these issues.

I agree, and I'm certainly in favor of keeping the current code base.
Just make sure you understand the reasoning of those opposing.

Regards,
Martin

From a.badger at gmail.com  Thu Jan 27 01:47:08 2011
From: a.badger at gmail.com (Toshio Kuratomi)
Date: Wed, 26 Jan 2011 16:47:08 -0800
Subject: [Python-Dev] Import and unicode: part two
In-Reply-To: <4D3FF372.3040007@v.loewis.de>
References: <1295440442.432.18.camel@marge>
	
	
	
	
	
	
	<20110125032609.GC24080@unaka.lan>
	<1296034834.25379.18.camel@marge> <4D3FF372.3040007@v.loewis.de>
Message-ID: <20110127004708.GI24080@unaka.lan>

On Wed, Jan 26, 2011 at 11:12:02AM +0100, "Martin v. Löwis" wrote:
> On 26.01.2011 10:40, Victor Stinner wrote:
> > On Monday, 24 January 2011 at 19:26 -0800, Toshio Kuratomi wrote:
> >> Why not locale:
> >> * Relying on locale is simply not portable. (...)
> >> * Mixing of modules from different locales won't work. (...)
> > 
> > I don't understand what you are talking about.
> 
> I think by "portability", he means "moving files from one computer to
> another". He argues that if Python would mandate UTF-8 for all file
> names on Unix, moving files in such a way would support portability,
> whereas using the locale's filename might not (if the locale uses a
> different charset on the target system).
> 
> While this is technically true, I don't think it's a helpful way of
> thinking: by mandating that file names are UTF-8 when accessed from
> Python, we make the actual files inaccessible on both the source and
> the target system.
> 
> > I don't understand the relation between the local filesystem encoding
> > and the portability. I suppose that you are talking about the
> > distribution of a module to other computers. Here the question is how
> > the filenames are stored during the transfer. The user is free to use
> > any tool, and try to find a tool handling Unicode correctly :-) But it's
> > no longer Python's problem.
> 
> There are cases where there is no real "transfer", in the sense in which
> you are using the word. For example, with NFS, you can access the very
> same file simultaneously on two systems, with no file name conversion
> (unless you are using NFSv4, and unless your NFSv4 implementations
> support the UTF-8 mandate in NFS well).
> 
> Also, if two users of the same machine have different locale settings,
> the same file name might be interpreted differently.
> 
Thanks Martin, I think that you understand my view even if you don't share
it.

There's one further case that I am worried about that has no real
"transfer".  Since people here seem to think that unicode module names are
the future (for instance, the comments about redefining the C locale to
include utf-8 and the comments about archiving tools needing to support
encoding bits), there are eventually going to be unicode modules that become
dependencies of other modules and programs.  These will need to be installed
on systems.  Linux distributions that ship these will need to choose
a filesystem encoding for the filenames of these.  Likely the sensible thing
for them to do is to use utf-8 since all the ones I can think of default to
utf-8.  But, as Stephen and Victor have pointed out, users change their
locale settings to things that aren't utf-8 and save their modules using
filenames in that encoding.  When they update their OS to a version that has
utf-8 python module names, they will find that they have to make a choice.
They can either change their locale settings to a utf-8 encoding and have
the system installed modules work or they can leave their encoding on their
non-utf-8 encoding and have the modules that they've created on-site work.

This is not a good position to put users of these systems in.

-Toshio

From nyamatongwe at gmail.com  Thu Jan 27 02:37:52 2011
From: nyamatongwe at gmail.com (Neil Hodgson)
Date: Thu, 27 Jan 2011 12:37:52 +1100
Subject: [Python-Dev] Import and unicode: part two
In-Reply-To: <20110127004708.GI24080@unaka.lan>
References: <1295440442.432.18.camel@marge>
	
	
	
	
	
	
	<20110125032609.GC24080@unaka.lan>
	<1296034834.25379.18.camel@marge> <4D3FF372.3040007@v.loewis.de>
	<20110127004708.GI24080@unaka.lan>
Message-ID: 

Toshio Kuratomi:

> When they update their OS to a version that has
> utf-8 python module names, they will find that they have to make a choice.
> They can either change their locale settings to a utf-8 encoding and have
> the system installed modules work or they can leave their encoding on their
> non-utf-8 encoding and have the modules that they've created on-site work.

   When switching to a UTF-8 locale, they can also change the file
names of their modules to be encoded in UTF-8. It would be fairly easy
to write a script that identifies non-ASCII file names in a directory
and offers to transcode their names from their current encoding to
UTF-8.
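
A minimal sketch of such a script's core, assuming the old encoding is known and the names are handled as bytes. plan_renames() is a hypothetical helper; it only builds the list of proposed renames and leaves the actual renaming (and the "offers to" prompt) to the caller.

```python
def plan_renames(names, source_encoding):
    """Return (old, new) byte-name pairs for non-ASCII names that need transcoding to UTF-8."""
    plan = []
    for name in names:
        if all(byte < 0x80 for byte in name):
            continue  # pure ASCII, nothing to do
        try:
            name.decode("utf-8")
            continue  # already valid UTF-8, leave it alone
        except UnicodeDecodeError:
            pass
        try:
            new_name = name.decode(source_encoding).encode("utf-8")
        except UnicodeDecodeError:
            continue  # not decodable in the assumed encoding; needs a human
        plan.append((name, new_name))
    return plan
```

The names would come from os.listdir() called with a bytes path, so that the file system's raw byte strings are seen, and each accepted pair would then be passed to os.rename().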

   Neil

From v+python at g.nevcal.com  Thu Jan 27 03:43:11 2011
From: v+python at g.nevcal.com (Glenn Linderman)
Date: Wed, 26 Jan 2011 18:43:11 -0800
Subject: [Python-Dev] Import and unicode: part two
In-Reply-To: <20110127004708.GI24080@unaka.lan>
References: <1295440442.432.18.camel@marge>							<20110125032609.GC24080@unaka.lan>	<1296034834.25379.18.camel@marge>
	<4D3FF372.3040007@v.loewis.de> <20110127004708.GI24080@unaka.lan>
Message-ID: <4D40DBBF.2010808@g.nevcal.com>

On 1/26/2011 4:47 PM, Toshio Kuratomi wrote:
> There's one further case that I am worried about that has no real
> "transfer".  Since people here seem to think that unicode module names are
> the future (for instance, the comments about redefining the C locale to
> include utf-8 and the comments about archiving tools needing to support
> encoding bits), there are eventually going to be unicode modules that become
> dependencies of other modules and programs.  These will need to be installed
> on systems.  Linux distributions that ship these will need to choose
> a filesystem encoding for the filenames of these.  Likely the sensible thing
> for them to do is to use utf-8 since all the ones I can think of default to
> utf-8.  But, as Stephen and Victor have pointed out, users change their
> locale settings to things that aren't utf-8 and save their modules using
> filenames in that encoding.  When they update their OS to a version that has
> utf-8 python module names, they will find that they have to make a choice.
> They can either change their locale settings to a utf-8 encoding and have
> the system installed modules work or they can leave their encoding on their
> non-utf-8 encoding and have the modules that they've created on-site work.
>
> This is not a good position to put users of these systems in.

The way this case should work is that programs that install files 
(installation is a form of transfer) should transform their names from 
the encoding used in the transfer medium to the encoding of the 
filesystem on which they are installed.

Python3 should access the files, transforming the names from the 
encoding of the filesystem on which they are installed to Unicode for 
use by the program.

I think Python3 is trying to do its part, and Victor is trying to make 
that more robust on more platforms, specifically Windows.

The programs that install files, which may include programs that install 
Python files (I don't know), may or may not be doing their part, but 
clearly there are cases where they do not.

Systems that have different encodings for names on the same or different 
file systems need to have a way to obtain the encoding for the file 
names, so they can be properly decoded.  If they don't have such a way, 
they are broken.

=====
The rest of this is an attempt to describe the problem of Linux and 
other systems which use byte strings instead of character strings as 
file names.  No problem, as long as programs allow byte strings as file 
names.  Python3 does not, for the import statement, thus the problem is 
relevant for discussion here, as has been ongoing.
=====

Since file names are defined to be byte strings, there is no way to 
obtain the encoding for file names, so they cannot always be decoded, 
and sometimes not properly decoded, because no one knows which encoding 
was used to create them, _if any_.

Hence, Linux programs that use character strings as file names 
internally and expect them to match the byte strings in the file system 
are promoting a fiction: that there is a transformation (encoding) from 
character strings to byte strings that will match.

When using ASCII character strings, they can be transformed to bytes 
using a simple transformation: identity... but that isn't necessarily 
correct, if the files were created using EBCDIC (unlikely on Linux 
systems, but not impossible, since Linux files are byte strings).

When using non-ASCII character strings, the fiction promoted is even 
bigger, and the transformation even harder.  Any 8-bit character 
encoding can pretend that identity is the correct transformation, but 
the result is mojibake if it isn't.  Unicode and other multi-byte encodings 
have an even harder job, because there can be 8-bit sequences that are 
not legal for some transformations, but are legal for others.  This is 
when the fiction is exposed!
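
A small sketch of both failure modes, using an example name chosen only for illustration: decoding with the wrong 8-bit encoding silently yields mojibake, while decoding with a multi-byte encoding can fail outright.

```python
name = "caf\u00e9.py"                      # "café.py"
utf8_bytes = name.encode("utf-8")          # the e-acute becomes two bytes
latin1_bytes = name.encode("latin-1")      # the e-acute becomes one byte

# An 8-bit encoding accepts any byte string, so the wrong choice "works"
# but produces mojibake.
mojibake = utf8_bytes.decode("latin-1")

# UTF-8 rejects byte sequences that are not legal for it, exposing the mismatch.
try:
    latin1_bytes.decode("utf-8")
    exposed = False
except UnicodeDecodeError:
    exposed = True
```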

As the recent description of glib points out, when the file names are 
read as bytes, and shown to the user for selection, possibly using some 
mojibake-generating transformation to characters, the user has a 
fighting chance to pick the right file, less chance if the 
transformation is lossy ('?' substitutions, etc.) and/or the names are 
redundant in their lossless characters.

However, when the specification of the name is in characters (such as 
for Python import, or file names specified as character constants in any 
application system that provides/permits such), and there are large 
numbers of transformations that could be used to convert characters to 
bytes, the problem is harder, and error-prone... programs that want to 
promote the fiction of using characters for filenames must work harder.  
It seems that Python on Linux is such a program.

One technique is to have conventions agreed on by applications and users 
to limit the number of encodings used on a particular system to one 
(optimal) or a few.  The latter requires understanding that files created 
in one encoding may not be accessible by systems that use a different 
one... until they are renamed.  Subsets of applications and users can 
then happily share files with others of their encoding, and with the 
subset of files that can be decoded successfully by their encoding, even 
though the decoding is not correct (often ASCII, or a few mojibake characters 
learned for cross-subset usage).  When multiple encodings are used 
without such conventions, chaos results.

Another technique that would be amusing is to use Base64 (as Oleg 
suggested), URL-encoding, or some other mapping that transforms 
non-ASCII names to ASCII character sequences and the identity mapping to 
obtain bytes, and then Python could ship such files to any system, as 
long as it always included that mapping as one of the encodings it would 
try to find files.  This would probably be the most powerful solution, 
but would only need to be applied to those systems that do not use 
characters for filenames.  It could, in fact, be applied on any system 
that uses a subset of characters for filenames, and hence transcends the 
need for Unicode support in a file system to use Unicode names in 
Python3 import statements.  It would likely be problematical for use 
with 3rd-party libraries, however.
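
A minimal sketch of the URL-encoding variant, assuming module names are mapped to percent-encoded ASCII file names. module_to_filename() and filename_to_module() are hypothetical helpers, not anything Python's import system provides.

```python
from urllib.parse import quote, unquote


def module_to_filename(module_name):
    """Map a Unicode module name to a pure-ASCII file name."""
    # Non-ASCII characters are percent-encoded from their UTF-8 bytes.
    return quote(module_name, safe="") + ".py"


def filename_to_module(filename):
    """Recover the Unicode module name from a percent-encoded file name."""
    return unquote(filename[:-len(".py")])
```

Because the result is pure ASCII, the identity mapping to bytes is safe on any file system that stores ASCII names, at the cost of file names that are hard to read.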

Another technique would be to try each possible encoding in turn, in 
some defined order, searching the filesystem for the resulting byte string 
as a file name, possibly matching files that shouldn't have been matched.  
To limit that search, such programs could allow configuration of a 
smaller ordered list of encodings to be tried, and a 
specific one to be used for the creation of new files; this opens up the 
possibility of not trying the "right" encoding for some rogue file name.
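
A sketch of that search, assuming a configured, ordered list of encodings and a directory listing obtained as bytes. find_module_file() is a hypothetical helper written for illustration; Python's import machinery does not work this way.

```python
def find_module_file(module_name, encodings, directory_entries):
    """Return (byte_name, encoding) for the first encoding whose bytes exist, else None."""
    for encoding in encodings:
        try:
            candidate = (module_name + ".py").encode(encoding)
        except UnicodeEncodeError:
            continue  # this encoding cannot express the name at all
        if candidate in directory_entries:
            return candidate, encoding
    return None
```

The order of the list decides which file wins when more than one candidate exists, which is where the risk of matching the wrong file comes from.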

This would be an issue and implementation for Linux systems, but would 
not need to be used on systems such as MacOS (which defines a particular 
encoding) or Windows (which defines a particular encoding) etc.  When 
mounting filesystems that use byte string file names on systems with a 
defined encoding, it should be the responsibility of the mounting system 
to do such transformations, and possibly have such configurations, and 
possibly have mappings or renaming facilities, and possibly prohibit 
access to files whose names cannot be transformed (of course, one can 
always punt by configuring latin-1 or other encodings that can match any 
byte string, but that produces mojibake, and then there is no surety 
that particular files will appear to have the name that programs expect).

Of course, Victor's patch addresses Windows issues.  Windows has 
defined encodings, so it is just a matter of using the proper APIs to see 
them, and the patch should be accepted.

It sounds like the current situation on Linux is that Python can access 
the subset of files that match the locale encoding for which it is run.  
It sounds like it would be inappropriate for Python to begin shipping 
files with non-ASCII names as part of its Linux distribution, unless 
facilities are created or tools used to remap non-ASCII names to the 
local locale encoding.  Locales that are not ASCII supersets (in 
character repertoire, not encoding) could not be supported.  Locales 
that do not support all the characters used in files shipped with Python 
could not be supported.  Since locales vary wildly in their available 
non-ASCII names, that limits Python either to shipping ASCII names only, 
or restricting the locales that are supported to those that support the 
characters used.

I suppose that Victor's patch would point out most or all of the places 
where such transformations would have to be implemented, if it is 
important to support systems having byte string file names whose users 
cannot agree to use a single encoding for transforming to/from characters.

From greg at krypto.org  Thu Jan 27 06:50:30 2011
From: greg at krypto.org (Gregory P. Smith)
Date: Wed, 26 Jan 2011 21:50:30 -0800
Subject: [Python-Dev] PEP 393: Flexible String Representation
In-Reply-To: <1295911245.3704.13.camel@localhost.localdomain>
References: <4D3DDE5E.4080807@v.loewis.de> <20110124231233.79bed8eb@pitrou.net>
	<4D3E0617.7010001@v.loewis.de>
	<1295911245.3704.13.camel@localhost.localdomain>
Message-ID: 

On Mon, Jan 24, 2011 at 3:20 PM, Antoine Pitrou wrote:
> On Tuesday, 25 January 2011 at 00:07 +0100, "Martin v. Löwis" wrote:
>> >> I'd like to propose PEP 393, which takes a different approach,
>> >> addressing both problems simultaneously: by getting a flexible
>> >> representation (one that can be either 1, 2, or 4 bytes), we can
>> >> support the full range of Unicode on all systems, but still use
>> >> only one byte per character for strings that are pure ASCII (which
>> >> will be the majority of strings for the majority of users).
>> >
>> > For this kind of experiment, I think a concrete attempt at implementing
>> > (together with performance/memory savings numbers) would be much more
>> > useful than an abstract proposal.
>>
>> I partially agree. An implementation is certainly needed, but there is
>> nothing wrong (IMO) with designing the change before implementing it.
>> Also, several people have offered to help with the implementation, so
>> we need to agree on a specification first (which is actually cheaper
>> than starting with the implementation only to find out that people
>> misunderstood each other).
>
> I'm not sure it's really cheaper. When implementing you will probably
> find out that it makes more sense to change the meaning of some fields,
> add or remove some, etc. You will also want to try various tweaks since
> the whole point is to lighten the footprint of unicode strings in common
> workloads.

Yep.  This is only a proposal, an implementation will allow all of
that to be experimented with.

I frequently see code today, even in Python 2.x, that suffers
greatly from unicode vs string use (due to APIs in some code that
return unicode objects unnecessarily when the data is really all
ASCII text).  Python 3.x only increases this, as the default for so
many things passes through unicode even for programs that may not need
it.

>
> So, the only criticism I have, intuitively, is that the unicode
> structure seems to become a bit too large. For example, I'm not sure you
> need a generic (pointer, size) pair in addition to the
> representation-specific ones.

I believe the intent of this PEP is for the existing in-memory
structure to remain compatible with already compiled binary
extension modules without having to recompile them or change the APIs
they are using.

Personally I don't care at all about preserving that level of binary
compatibility, it has been convenient in the past but is rarely the
right thing to do.  Of course I'd personally like to see PyObject
nuked and revisited, it is too large and is probably not cache line
efficient.

>
> Incidentally, to slightly reduce the overhead of the unicode objects,
> there's this proposal: http://bugs.python.org/issue1943

Interesting.  But that aims more at cpu performance than memory
overhead.  What I see is programs that predominantly process ascii
data yet waste memory on a 2-4x data explosion of the internal
representation.  This PEP aims to address that larger target.
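
The 2-4x figure is plain arithmetic on the per-character width. As a rough illustration, the codecs below stand in for fixed-width internal buffers of 1, 2 and 4 bytes per character; they are not the interpreter's actual internal storage, and object headers are ignored.

```python
text = "a" * 1000                           # pure-ASCII data

one_byte = len(text.encode("latin-1"))      # 1 byte per character
two_byte = len(text.encode("utf-16-le"))    # 2 bytes per character
four_byte = len(text.encode("utf-32-le"))   # 4 bytes per character

ratios = (two_byte // one_byte, four_byte // one_byte)
```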

-gps

From lukasz at langa.pl  Thu Jan 27 11:00:19 2011
From: lukasz at langa.pl (=?UTF-8?B?xYF1a2FzeiBMYW5nYQ==?=)
Date: Thu, 27 Jan 2011 11:00:19 +0100
Subject: [Python-Dev] Why do we bundle lib2to3 with Python? Was: Location of
 tests for packages
Message-ID: <4D414233.4000906@langa.pl>


On 2011-01-24 23:13, Benjamin Peterson wrote:
>  I prefer lib2to3 tests to stay in lib2to3/.

On a related note, I had trouble myself with using outdated 2to3 and
heard complaints about that at least a couple of times. What do we gain
from bundling 2to3 with Python?

-- 
Best regards,
Łukasz Langa


From solipsis at pitrou.net  Thu Jan 27 15:57:18 2011
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Thu, 27 Jan 2011 15:57:18 +0100
Subject: [Python-Dev] PEP 393: Flexible String Representation
In-Reply-To: 
References: <4D3DDE5E.4080807@v.loewis.de>
	<20110124231233.79bed8eb@pitrou.net> <4D3E0617.7010001@v.loewis.de>
	<1295911245.3704.13.camel@localhost.localdomain>
	
Message-ID: <1296140238.3685.0.camel@localhost.localdomain>

On Wednesday, 26 January 2011 at 21:50 -0800, Gregory P. Smith wrote:
> >
> > Incidentally, to slightly reduce the overhead of the unicode objects,
> > there's this proposal: http://bugs.python.org/issue1943
> 
> Interesting.  But that aims more at cpu performance than memory
> overhead.  What I see is programs that predominantly process ascii
> data yet waste memory on a 2-4x data explosion of the internal
> representation.  This PEP aims to address that larger target.

Right, but we should keep in mind that many unicode strings will not be
very large, and so the constant overhead of unicode objects is not
necessarily negligible.

Regards

Antoine.



From brett at python.org  Thu Jan 27 18:22:47 2011
From: brett at python.org (Brett Cannon)
Date: Thu, 27 Jan 2011 09:22:47 -0800
Subject: [Python-Dev] Why do we bundle lib2to3 with Python? Was:
 Location of tests for packages
In-Reply-To: <4D414233.4000906@langa.pl>
References: <4D414233.4000906@langa.pl>
Message-ID: 

2011/1/27 Łukasz Langa:
>
> On 2011-01-24 23:13, Benjamin Peterson wrote:
>>
>>  I prefer lib2to3 tests to stay in lib2to3/.
>
> On a related note, I had trouble myself with using outdated 2to3 and
> heard complaints about that at least a couple of times. What do we gain
> from bundling 2to3 with Python?

Same thing we get when we bundle anything with Python: one less
dependency for people to download. Obviously this shouldn't be as much
of an issue once Python 3.2 is out.

From stefan_ml at behnel.de  Thu Jan 27 20:06:10 2011
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Thu, 27 Jan 2011 20:06:10 +0100
Subject: [Python-Dev] PEP 393: Flexible String Representation
In-Reply-To: <4D3DDE5E.4080807@v.loewis.de>
References: <4D3DDE5E.4080807@v.loewis.de>
Message-ID: 

"Martin v. Löwis", 24.01.2011 21:17:
> The Py_UNICODE type is still supported but deprecated. It is always
> defined as a typedef for wchar_t, so the wstr representation can double
> as Py_UNICODE representation.

It's too bad this isn't initialised by default, though. Py_UNICODE is the 
only representation that can be used efficiently from C code and Cython 
relies on it for fast text processing. This proposal will therefore likely 
have a pretty negative performance impact on extensions written in Cython 
as the compiler could no longer expect this representation to be available 
instantaneously.

Stefan


From martin at v.loewis.de  Thu Jan 27 21:26:13 2011
From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=)
Date: Thu, 27 Jan 2011 21:26:13 +0100
Subject: [Python-Dev] Import and unicode: part two
In-Reply-To: 
References: <1295440442.432.18.camel@marge>							<20110125032609.GC24080@unaka.lan>	<1296034834.25379.18.camel@marge>	<4D3FF372.3040007@v.loewis.de>	<20110127004708.GI24080@unaka.lan>
	
Message-ID: <4D41D4E5.5050603@v.loewis.de>

>    When switching to a UTF-8 locale, they can also change the file
> names of their modules to be encoded in UTF-8. It would be fairly easy
> to write a script that identifies non-ASCII file names in a directory
> and offers to transcode their names from their current encoding to
> UTF-8.

In fact, convmv (http://j3e.de/linux/convmv/) does exactly that;
it comes as a Debian package also.

Regards,
Martin

From foom at fuhm.net  Thu Jan 27 21:26:15 2011
From: foom at fuhm.net (James Y Knight)
Date: Thu, 27 Jan 2011 15:26:15 -0500
Subject: [Python-Dev] PEP 393: Flexible String Representation
In-Reply-To: 
References: <4D3DDE5E.4080807@v.loewis.de> 
Message-ID: <999F03D8-C1D7-4A3D-BC7A-3A195CFD9CE5@fuhm.net>

On Jan 27, 2011, at 2:06 PM, Stefan Behnel wrote:
> "Martin v. Löwis", 24.01.2011 21:17:
>> The Py_UNICODE type is still supported but deprecated. It is always
>> defined as a typedef for wchar_t, so the wstr representation can double
>> as Py_UNICODE representation.
> 
> It's too bad this isn't initialised by default, though. Py_UNICODE is the only representation that can be used efficiently from C code and Cython relies on it for fast text processing. This proposal will therefore likely have a pretty negative performance impact on extensions written in Cython as the compiler could no longer expect this representation to be available instantaneously.

But the whole point of the exercise is so that it doesn't have to store a 4byte-per-char representation when a 1byte-per-char rep would do. If cython wants to work most efficiently with this proposal, it should learn to deal with the three possible raw representations.
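
A minimal sketch of what "dealing with three representations" starts from: picking the narrowest fixed width that can hold every character of a string. The thresholds 0xFF and 0xFFFF are an assumption of this sketch rather than something stated in this thread, and storage_width() is a hypothetical name.

```python
def storage_width(text):
    """Bytes per character needed for a fixed-width buffer holding text."""
    largest = max(map(ord, text), default=0)
    if largest <= 0xFF:
        return 1   # fits a one-byte-per-character buffer
    if largest <= 0xFFFF:
        return 2   # needs two bytes per character
    return 4       # code points beyond the BMP need four bytes
```

Code consuming such strings would branch once on the width and then run a loop specialised for that element size.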

James

From brett at python.org  Thu Jan 27 21:38:45 2011
From: brett at python.org (Brett Cannon)
Date: Thu, 27 Jan 2011 12:38:45 -0800
Subject: [Python-Dev] getting stable URLs for major.minor versions
Message-ID: 

Because of all the writing I have been doing lately, I have been
pulling up a lot of URLs pointing to various Python releases based
around minor versions (e.g., Python 2.7, not specifically 2.7.1). What
has been somewhat annoying is that there are no URLs which act as a
redirect to the latest release of a minor version. For instance, it
would be great if http://www.python.org/2.7 redirected to the Python
2.7.1 page. Linking to the 2.7.0 release page seems off since it is
out of date, but linking to 2.7.1 also seems silly as that will become
out of date as the newest release of Python 2.7 at some point as well.

Can we consider coming up with some URL scheme where people can link
to a version of Python that always redirects to the newest release?
Bonus points if we extend this to major versions, too. =) I am asking
here since the RMs will have to be okay with doing this as part of the
release plan.

To get the ball rolling, I say we make http://www.python.org/version/2.7
and http://www.python.org/version/2 redirect to the 2.7.1 release
page, etc. Personally I would rather have http://www.python.org/2.7
redirect to 2.7.1, but since that already redirects to 2.7.0 I doubt
people would be okay with the change.
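
A sketch of the lookup such a redirect would need: given a major or major.minor prefix, pick the newest matching release. latest_release() and the release list passed to it are hypothetical; how the result maps to a release page URL is left open.

```python
def latest_release(prefix, releases):
    """Return the newest release matching a major or major.minor prefix, or None."""
    wanted = tuple(int(part) for part in prefix.split("."))
    versions = [tuple(int(part) for part in release.split(".")) for release in releases]
    matching = [version for version in versions if version[:len(wanted)] == wanted]
    if not matching:
        return None
    # Tuples compare element by element, so (2, 7, 1) sorts after (2, 7).
    return ".".join(str(part) for part in max(matching))
```

With releases 2.6.6, 2.7, 2.7.1 and 3.1.3 known, both "2.7" and "2" resolve to 2.7.1, and the table only needs updating when a new release is added.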

From martin at v.loewis.de  Thu Jan 27 22:05:38 2011
From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=)
Date: Thu, 27 Jan 2011 22:05:38 +0100
Subject: [Python-Dev] PEP 393: Flexible String Representation
In-Reply-To: <1295911245.3704.13.camel@localhost.localdomain>
References: <4D3DDE5E.4080807@v.loewis.de>	<20110124231233.79bed8eb@pitrou.net>
	<4D3E0617.7010001@v.loewis.de>
	<1295911245.3704.13.camel@localhost.localdomain>
Message-ID: <4D41DE22.2000100@v.loewis.de>

> So, the only criticism I have, intuitively, is that the unicode
> structure seems to become a bit too large. For example, I'm not sure you
> need a generic (pointer, size) pair in addition to the
> representation-specific ones.

It's not really a generic pointer, but rather a variable-sized pointer.
It may not fit into any of the other representations (e.g. if there is
a four-byte wchar_t, then a two-byte representation would fit neither
into the UTF-8 pointer nor into the wchar_t pointer).

> Incidentally, to slightly reduce the overhead of the unicode objects,
> there's this proposal: http://bugs.python.org/issue1943

I wonder what aspects of this patch and discussion should be integrated
into the PEP. The notion of allocating the memory in the same block is
already considered in the PEP; what else might be relevant?
Input is welcome!

Regards,
Martin

From martin at v.loewis.de  Thu Jan 27 22:07:58 2011
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu, 27 Jan 2011 22:07:58 +0100
Subject: [Python-Dev] PEP 393: Flexible String Representation
In-Reply-To: 
References: <4D3DDE5E.4080807@v.loewis.de>
	<20110124231233.79bed8eb@pitrou.net>	<4D3E0617.7010001@v.loewis.de>	<1295911245.3704.13.camel@localhost.localdomain>
	
Message-ID: <4D41DEAE.3080501@v.loewis.de>

> I believe the intent of this PEP is for the existing in-memory
> structure to remain compatible with already compiled binary
> extension modules without having to recompile them or change the APIs
> they are using.

No, binary compatibility is not achieved. ABI-conforming modules will
continue to work even under this change, but only because access to the
unicode object internal representation is not available to the
restricted ABI.

> Personally I don't care at all about preserving that level of binary
> compatibility, it has been convenient in the past but is rarely the
> right thing to do.  Of course I'd personally like to see PyObject
> nuked and revisited, it is too large and is probably not cache line
> efficient.

That's a different PEP :-)

Regards,
Martin

From v+python at g.nevcal.com  Thu Jan 27 22:06:18 2011
From: v+python at g.nevcal.com (Glenn Linderman)
Date: Thu, 27 Jan 2011 13:06:18 -0800
Subject: [Python-Dev] PEP 393: Flexible String Representation
In-Reply-To: <999F03D8-C1D7-4A3D-BC7A-3A195CFD9CE5@fuhm.net>
References: <4D3DDE5E.4080807@v.loewis.de> 
	<999F03D8-C1D7-4A3D-BC7A-3A195CFD9CE5@fuhm.net>
Message-ID: <4D41DE4A.20303@g.nevcal.com>

On 1/27/2011 12:26 PM, James Y Knight wrote:
> On Jan 27, 2011, at 2:06 PM, Stefan Behnel wrote:
>> "Martin v. Löwis", 24.01.2011 21:17:
>>> The Py_UNICODE type is still supported but deprecated. It is always
>>> defined as a typedef for wchar_t, so the wstr representation can double
>>> as Py_UNICODE representation.
>> It's too bad this isn't initialised by default, though. Py_UNICODE is the only representation that can be used efficiently from C code and Cython relies on it for fast text processing. This proposal will therefore likely have a pretty negative performance impact on extensions written in Cython as the compiler could no longer expect this representation to be available instantaneously.
> But the whole point of the exercise is so that it doesn't have to store a 4byte-per-char representation when a 1byte-per-char rep would do. If cython wants to work most efficiently with this proposal, it should learn to deal with the three possible raw representations.

C was doing fast text processing on char long before Py_UNICODE existed, 
or wchar_t.

From martin at v.loewis.de  Thu Jan 27 22:16:54 2011
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Thu, 27 Jan 2011 22:16:54 +0100
Subject: [Python-Dev] PEP 393: Flexible String Representation
In-Reply-To: <1295915323.3219.44.camel@radiator.bos.redhat.com>
References: <4D3DDE5E.4080807@v.loewis.de>
	<1295915323.3219.44.camel@radiator.bos.redhat.com>
Message-ID: <4D41E0C6.3090102@v.loewis.de>

> Repetition of "11"; I'm guessing that the 2byte/UCS-2 should read "10",
> so that they give the width of the char representation.

Thanks, fixed.

>>   00 => null pointer
> 
> Naturally this assumes that all pointers are at least 4-byte aligned (so
> that they can be masked off).  I assume that this is sane on every
> platform that Python supports, but should it be spelled out explicitly
> somewhere in the PEP?

I'll change the PEP to move the type indicator into the state field, so
that issue becomes irrelevant.

>>   The string is null-terminated (in its respective representation).
>> - hash, state: same as in Python 3.2
>> - utf8_length, utf8: UTF-8 representation (null-terminated)
> If this is to share its buffer with the "str" representation for the
> Latin-1 case, then I take it this ptr will typically be (str & ~4) ?
> i.e. only "str" has the low-order-bit type info.

Yes, the other pointers are aligned. Notice that the case in which
sharing occurs is only ASCII, though (for Latin-1, some characters
require two bytes in UTF-8).
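
The ASCII versus Latin-1 distinction can be checked from Python itself.
The following is only a rough sketch at the level of encoded bytes, not of
the proposed C structure; the helper name can_share_utf8 is made up for
illustration.

```python
def can_share_utf8(text):
    """Return True if the Latin-1 bytes of text are also its UTF-8 bytes."""
    return text.encode("latin-1") == text.encode("utf-8")

ascii_text = "resume"
latin1_text = "r\u00e9sum\u00e9"  # U+00E9 fits in Latin-1 but is not ASCII

# Pure ASCII: both encodings yield the same bytes, so one buffer could serve both.
print(can_share_utf8(ascii_text))

# Non-ASCII Latin-1: each U+00E9 takes one byte in Latin-1 but two in UTF-8.
print(can_share_utf8(latin1_text))
print(len(latin1_text.encode("latin-1")), len(latin1_text.encode("utf-8")))
```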

> Spelling out the meaning of "optional":
>   does this mean that the relevant ptr is NULL; if so, if utf8 is null,
> is utf8_length undefined, or is it some dummy value?

I've clarified this: I propose length is undefined (unless there is a
good reason to clear it).

>> If the string is created directly with the canonical representation
>> (see below), this representation doesn't take a separate memory block,
>> but is allocated right after the PyUnicodeObject struct.
> 
> Is the idea to do pointer arithmetic when deleting the PyUnicodeObject
> to determine if the ptr is in that location, and not delete it if it is,
> or is there some other way of determining whether the pointers need
> deallocating?

Correct.

> If the former, is this embedding an assumption that the
> underlying allocator couldn't have allocated a buffer directly adjacent
> to the PyUnicodeObject.  I know that GNU libc's malloc/free
> implementation has gaps of two machine words between each allocation;
> off the top of my head I'm not sure if the optimized Objects/obmalloc.c
> allocator enforces such gaps.

No, it doesn't... So I guess I reserve another bit in the state for that.

> GDB Debugging Hooks
> -------------------
> Tools/gdb/libpython.py contains debugging hooks that embed knowledge
> about the internals of CPython's data types, including PyUnicodeObject
> instances.  It will need to be slightly updated to track the change.

Thanks, added.

Regards,
Martin

From fdrake at acm.org  Thu Jan 27 22:16:59 2011
From: fdrake at acm.org (Fred Drake)
Date: Thu, 27 Jan 2011 16:16:59 -0500
Subject: [Python-Dev] getting stable URLs for major.minor versions
In-Reply-To: 
References: 
Message-ID: 

On Thu, Jan 27, 2011 at 3:38 PM, Brett Cannon  wrote:
> Linking to the 2.7.0 release page seems off since it is
> out of date, but linking to 2.7.1 also seems silly as that will become
> out of date as the newest release of Python 2.7 at some point as well.

I'd love to see something like this as well.  Part of the problem is
that when we want URLs to specific versions (which might even mean
2.7.0), we use the version number as released, and... there's really
not a 2.7.0.  I'd love for us to include ".0" in the actual release
number, instead of calling it just 2.7.  Then we could much more
easily handle this for docs, downloads, and anywhere else we want to
multi-plex multiple versions.


  -Fred

--
Fred L. Drake, Jr.
"A storm broke loose in my mind."  --Albert Einstein

From skip at pobox.com  Thu Jan 27 22:21:22 2011
From: skip at pobox.com (skip at pobox.com)
Date: Thu, 27 Jan 2011 15:21:22 -0600
Subject: [Python-Dev] getting stable URLs for major.minor versions
In-Reply-To: 
References: 
Message-ID: <19777.57810.481593.401954@montanaro.dyndns.org>

    Brett> Bonus points if we extend this to major versions, too. =)

I know you added a smiley, but just wanted to point out that since Python 2
and 3 are really different languages, referring 2.4 users to 3.3 might be a
bad idea.  (I imagine it wouldn't be hard to generalize from micro to minor
though. )

Skip

From stefan_ml at behnel.de  Thu Jan 27 22:24:34 2011
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Thu, 27 Jan 2011 22:24:34 +0100
Subject: [Python-Dev] PEP 393: Flexible String Representation
In-Reply-To: <999F03D8-C1D7-4A3D-BC7A-3A195CFD9CE5@fuhm.net>
References: <4D3DDE5E.4080807@v.loewis.de> 
	<999F03D8-C1D7-4A3D-BC7A-3A195CFD9CE5@fuhm.net>
Message-ID: 

James Y Knight, 27.01.2011 21:26:
> On Jan 27, 2011, at 2:06 PM, Stefan Behnel wrote:
>> "Martin v. Löwis", 24.01.2011 21:17:
>>> The Py_UNICODE type is still supported but deprecated. It is always
>>> defined as a typedef for wchar_t, so the wstr representation can
>>> double as Py_UNICODE representation.
>>
>> It's too bad this isn't initialised by default, though. Py_UNICODE is
>> the only representation that can be used efficiently from C code and
>> Cython relies on it for fast text processing. This proposal will
>> therefore likely have a pretty negative performance impact on
>> extensions written in Cython as the compiler could no longer expect
>> this representation to be available instantaneously.
>
> But the whole point of the exercise is so that it doesn't have to store
> a 4byte-per-char representation when a 1byte-per-char rep would do.

I am well aware of that. But I'm arguing that the current simpler internal 
representation has had its advantages for CPython as a platform.


> If cython wants to work most efficiently with this proposal, it should
> learn to deal with the three possible raw representations.

I agree. After all, CPython is lucky to have it available. It wouldn't be 
the first time that we duplicate looping code based on the input type. 
However, like the looping code, it will also complicate all indexing code 
at runtime as it always needs to test which of the representations is 
current before it can read a character. Currently, all of this is a compile 
time decision. This will necessarily have a performance impact.
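
To make the point about indexing concrete, here is a small Python model of
it. This is an illustration only, not the PEP's C code: pack and read_char
are hypothetical helpers, and "kind" stands for the number of bytes per
character (1, 2 or 4). Every single read has to consult the kind before it
can locate and decode the character.

```python
import sys

def pack(text, kind):
    """Store text as a flat buffer with `kind` bytes per code point."""
    return b"".join(ord(ch).to_bytes(kind, sys.byteorder) for ch in text)

def read_char(buf, kind, index):
    """Read one code point; the character width is only known at run time."""
    if kind not in (1, 2, 4):
        raise ValueError("kind must be 1, 2 or 4")
    start = index * kind
    return int.from_bytes(buf[start:start + kind], sys.byteorder)

# The same text in all three widths; the reader must be told which one it is.
for kind in (1, 2, 4):
    buf = pack("abc", kind)
    print(kind, len(buf), chr(read_char(buf, kind, 1)))
```

With a single fixed Py_UNICODE width, the multiplication by kind and the
width check would be constants known to the C compiler.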

Stefan


From martin at v.loewis.de  Thu Jan 27 22:37:32 2011
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Thu, 27 Jan 2011 22:37:32 +0100
Subject: [Python-Dev] PEP 393: Flexible String Representation
In-Reply-To: 
References: <4D3DDE5E.4080807@v.loewis.de>
	
Message-ID: <4D41E59C.9060903@v.loewis.de>

Am 25.01.2011 12:08, schrieb Nick Coghlan:
> On Tue, Jan 25, 2011 at 6:17 AM, "Martin v. Löwis"  wrote:
>> A new function PyUnicode_AsUTF8 is provided to access the UTF-8
>> representation. It is thus identical to the existing
>> _PyUnicode_AsString, which is removed. The function will compute the
>> utf8 representation when first called. Since this representation will
>> consume memory until the string object is released, applications
>> should use the existing PyUnicode_AsUTF8String where possible
>> (which generates a new string object every time). API that implicitly
>> converts a string to a char* (such as the ParseTuple functions) will
>> use this function to compute a conversion.
> 
> I'm not entirely clear as to what "this function" is referring to here.

PyUnicode_AsUTF8 (i.e. the one where you don't need to release the
memory). I made this explicit now.

> I'm also dubious of the "PyUnicode_Finalize" name - "PyUnicode_Ready"
> might be a better option (PyType_Ready seems a better analogy for a
> "I've filled everything in, please calculate the derived fields now"
> than Py_Finalize).

Ok, changed (when I was pondering this PEP, this once occurred
to me also, but I forgot when I typed it in).

> 
> More generally, let me see if I understand the proposed structure correctly:
> 
> str: Always set once PyUnicode_Ready() has been called.
>   Always points to the canonical representation of the string (as
> indicated by PyUnicode_Kind)
> length: Always set once PyUnicode_Ready() has been called. Specifies
> the number of code points in the string.

Correct.

> wstr: Set only if PyUnicode_AsUnicode has been called on the string.

Might also be set when the string was created through
PyUnicode_FromUnicode and PyUnicode_Ready hasn't been called.

>     If (sizeof(wchar_t) == 2 && PyUnicode_Kind() == PyUnicode_2BYTE)
> or (sizeof(wchar_t) == 4 && PyUnicode_Kind() == PyUnicode_4BYTE), wstr
> = str, otherwise wstr points to dedicated memory
> wstr_length: Valid only if wstr != NULL
>     If wstr_length != length, indicates presence of surrogate pairs in
> a UCS-2 string (i.e. sizeof(wchar_t) == 2, PyUnicode_Kind() ==
> PyUnicode_4BYTE).

Correct.
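
The wstr_length != length case can be reproduced at the Python level by
counting UTF-16 code units. This is a sketch that assumes an interpreter
where len() counts code points (CPython 3.3 or later); utf16_units is a
made-up helper name.

```python
def utf16_units(text):
    """Number of 16-bit code units needed to store text as UTF-16."""
    return len(text.encode("utf-16-le")) // 2

bmp_text = "\u20ac"          # EURO SIGN, inside the Basic Multilingual Plane
astral_text = "\U0001D11E"   # MUSICAL SYMBOL G CLEF, outside the BMP

# Inside the BMP the two counts agree.
print(len(bmp_text), utf16_units(bmp_text))

# Outside the BMP one code point becomes a surrogate pair, so a 2-byte
# wchar_t buffer would be longer than the code point count.
print(len(astral_text), utf16_units(astral_text))
```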

> utf8: Set only if PyUnicode_AsUTF8 has been called on the string.
>     If string contents are pure ASCII, utf8 = str, otherwise utf8
> points to dedicated memory.
> utf8_length: Valid only if utf8 != NULL

Correct.

> One change I would propose is that rather than hiding flags in the low
> order bits of the str pointer, we expand the use of the existing
> "state" field to cover the representation information in addition to
> the interning information.

Thanks for the idea; done.

> I would also suggest explicitly flagging
> internally whether or not a 1 byte string is ASCII or Latin-1 along
> the lines of:

Not sure about that. It would complicate PyUnicode_Kind.

Instead, I'd rather fill out utf8 right away if we can use sharing
(e.g. when the string is created with a max value <128, or
PyUnicode_Ready has determined that).

So I keep it for the moment as reserved (but would use it when
str is NULL, as I'd have to fill in some value, anyway).

Regards,
Martin

From martin at v.loewis.de  Thu Jan 27 22:42:39 2011
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Thu, 27 Jan 2011 22:42:39 +0100
Subject: [Python-Dev] PEP 393: Flexible String Representation
In-Reply-To: <4D3F5228.4010901@egenix.com>
References: <4D3DDE5E.4080807@v.loewis.de> <4D3F5228.4010901@egenix.com>
Message-ID: <4D41E6CF.1020206@v.loewis.de>

> From my first impression, I'm
> not too thrilled by the prospect of making the Unicode implementation
> more complicated by having three different representations on each
> object.

Thanks, added as a concern.

> I also don't see how this could save a lot of memory. As an example
> take a French text with, say, 10 million code points. This would end up
> appearing in memory as 3 copies on Windows: one copy stored as UCS2 (20MB),
> one as Latin-1 (10MB) and one as UTF-8 (probably around 15MB, depending
> on how many accents are used). That's a saving of -10MB compared to
> today's implementation :-)

As others have pointed out: that's not how it works. It actually *will*
save memory, since the alternative representations are optional.

Regards,
Martin

From martin at v.loewis.de  Thu Jan 27 22:47:03 2011
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Thu, 27 Jan 2011 22:47:03 +0100
Subject: [Python-Dev] PEP 393: Flexible String Representation
In-Reply-To: 
References: <4D3DDE5E.4080807@v.loewis.de> 
Message-ID: <4D41E7D7.5060708@v.loewis.de>

Am 27.01.2011 20:06, schrieb Stefan Behnel:
> "Martin v. Löwis", 24.01.2011 21:17:
>> The Py_UNICODE type is still supported but deprecated. It is always
>> defined as a typedef for wchar_t, so the wstr representation can double
>> as Py_UNICODE representation.
> 
> It's too bad this isn't initialised by default, though. Py_UNICODE is
> the only representation that can be used efficiently from C code and
> Cython relies on it for fast text processing.

That's not true. The str representation can also be used efficiently from C.

> This proposal will
> therefore likely have a pretty negative performance impact on extensions
> written in Cython as the compiler could no longer expect this
> representation to be available instantaneously.

In any case, I've added this concern.

Regards,
Martin

From martin at v.loewis.de  Thu Jan 27 22:54:25 2011
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Thu, 27 Jan 2011 22:54:25 +0100
Subject: [Python-Dev] getting stable URLs for major.minor versions
In-Reply-To: 
References: 
Message-ID: <4D41E991.7030002@v.loewis.de>

Am 27.01.2011 21:38, schrieb Brett Cannon:
> Because of all the writing I have been doing lately, I have been
> pulling up a lot of URLs pointing to various Python releases based
> around minor versions (e.g., Python 2.7, not specifically 2.7.1). What
> has been somewhat annoying is that there are no URLs which act as a
> redirect to the latest release of a minor version. For instance, it
> would be great if http://www.python.org/2.7 redirected to the Python
> 2.7.1 page.

The tradition is that /X.Y actually points to download/releases/X.Y.
These redirects haven't been added for 2.7, but are present for all
earlier releases, and 3.1. So unless there are strong objections,
I'll add the missing redirects soon.

> Get the ball rolling, I say we make http://www.python.org/version/2.7
> and http://www.python.org/version/2 redirect to the 2.7.1 release
> page, etc. Personally I would rather have http://www.python.org/2.7
> redirect to 2.7.1, but since that already redirects to 2.7.0 I doubt
> people would be okay with the change.

How about http://www.python.org/2.7.x redirecting to the latest 2.7.x
release? Likewise 2.x and 3.x.
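
The X.Y.x and X.x aliases amount to a small lookup. The following sketch is
not the actual python.org redirect configuration; the tables and the
resolve function are invented for illustration and only contain the release
numbers mentioned in this thread.

```python
# Latest micro release per minor series, as named in this thread.
LATEST_MICRO = {"2.7": "2.7.1", "3.1": "3.1.3"}
# Newest minor series per major version (assumption for this sketch).
NEWEST_MINOR = {"2": "2.7", "3": "3.1"}

def resolve(alias):
    """Map an alias such as '2.7.x' or '3.x' to a concrete release number."""
    parts = alias.split(".")
    if len(parts) < 2 or parts[-1] != "x":
        raise ValueError("alias must end in '.x': %r" % alias)
    prefix = ".".join(parts[:-1])
    # 'X.x' first resolves to the newest minor series of that major version.
    prefix = NEWEST_MINOR.get(prefix, prefix)
    return LATEST_MICRO[prefix]

for alias in ("2.7.x", "2.x", "3.x"):
    print(alias, "->", resolve(alias))
```

Only the two tables would need updating when a new release comes out; the
alias URLs themselves stay stable.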

Regards,
Martin

From brett at python.org  Thu Jan 27 22:40:25 2011
From: brett at python.org (Brett Cannon)
Date: Thu, 27 Jan 2011 13:40:25 -0800
Subject: [Python-Dev] getting stable URLs for major.minor versions
In-Reply-To: <19777.57810.481593.401954@montanaro.dyndns.org>
References: 
	<19777.57810.481593.401954@montanaro.dyndns.org>
Message-ID: 

On Thu, Jan 27, 2011 at 13:21,   wrote:
>     Brett> Bonus points if we extend this to major versions, too. =)
>
> I know you added a smiley, but just wanted to point out that since Python 2
> and 3 are really different languages, referring 2.4 users to 3.3 might be a
> bad idea.  (I imagine it wouldn't be hard to generalize from micro to minor
> though. )

I don't get what you are worried about: http://www.python.org/2 would
refer to 2.7.1 while http://www.python.org/3 would refer to 3.1.3.

I added the smiley as I doubt many people worry about linking to
Python 2 vs. Python 3 as generically as I have lately.

>
> Skip
>

From martin at v.loewis.de  Thu Jan 27 23:01:42 2011
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Thu, 27 Jan 2011 23:01:42 +0100
Subject: [Python-Dev] PEP 393: Flexible String Representation
In-Reply-To: 
References: <4D3DDE5E.4080807@v.loewis.de>
		<999F03D8-C1D7-4A3D-BC7A-3A195CFD9CE5@fuhm.net>
	
Message-ID: <4D41EB46.7060105@v.loewis.de>

> I agree. After all, CPython is lucky to have it available. It wouldn't
> be the first time that we duplicate looping code based on the input
> type. However, like the looping code, it will also complicate all
> indexing code at runtime as it always needs to test which of the
> representations is current before it can read a character. Currently,
> all of this is a compile time decision. This will necessarily have a
> performance impact.

That's most certainly the case. That's one of the reasons to discuss
this through a PEP, rather than just coming up with a patch: if people
object to it too much because of the impact on execution speed, it may
get rejected. Of course, that would make those unhappy who complain
about the memory consumption.

This is a classical time-space-tradeoff, favoring space reduction
over time reduction.

I fully understand that the actual impact can only be observed when
an implementation is available, and applications have made a reasonable
effort to work with the implementation efficiently (or perhaps not,
which would show the impact on unmodified implementations).

This is something that works much better in PyPy: the actual string
operations are written in RPython, and the tracing JIT would generate
all versions of the code that are relevant for the different
representations (IIUC, this approach is only planned for PyPy, yet).

I hope that C macros can help reduce the maintenance burden.

Regards,
Martin

From greg at krypto.org  Thu Jan 27 23:05:51 2011
From: greg at krypto.org (Gregory P. Smith)
Date: Thu, 27 Jan 2011 14:05:51 -0800
Subject: [Python-Dev] PEP 393: Flexible String Representation
In-Reply-To: <4D41E6CF.1020206@v.loewis.de>
References: <4D3DDE5E.4080807@v.loewis.de> <4D3F5228.4010901@egenix.com>
	<4D41E6CF.1020206@v.loewis.de>
Message-ID: 

BTW, has anyone looked at what other languages with a native unicode
type do for their implementations if any of them attempt to conserve
ram?

From brett at python.org  Thu Jan 27 22:57:24 2011
From: brett at python.org (Brett Cannon)
Date: Thu, 27 Jan 2011 13:57:24 -0800
Subject: [Python-Dev] getting stable URLs for major.minor versions
In-Reply-To: <4D41E991.7030002@v.loewis.de>
References: 
	<4D41E991.7030002@v.loewis.de>
Message-ID: 

On Thu, Jan 27, 2011 at 13:54, "Martin v. Löwis"  wrote:
> Am 27.01.2011 21:38, schrieb Brett Cannon:
>> Because of all the writing I have been doing lately, I have been
>> pulling up a lot of URLs pointing to various Python releases based
>> around minor versions (e.g., Python 2.7, not specifically 2.7.1). What
>> has been somewhat annoying is that there are no URLs which act as a
>> redirect to the latest release of a minor version. For instance, it
>> would be great if http://www.python.org/2.7 redirected to the Python
>> 2.7.1 page.
>
> The tradition is that /X.Y actually points to download/releases/X.Y.
> These redirects haven't been added for 2.7, but are present for all
> earlier releases, and 3.1. So unless there are strong objections,
> I'll add the missing redirects soon.

That would be great. I keep bumping up against the missing 2.7 redirect.

>
>> Get the ball rolling, I say we make http://www.python.org/version/2.7
>> and http://www.python.org/version/2 redirect to the 2.7.1 release
>> page, etc. Personally I would rather have http://www.python.org/2.7
>> redirect to 2.7.1, but since that already redirects to 2.7.0 I doubt
>> people would be okay with the change.
>
> How about http://www.python.org/2.7.x redirecting to the latest 2.7.x
> release? Likewise 2.x and 3.x.

Works for me! Short and elegant.

From alexander.belopolsky at gmail.com  Thu Jan 27 23:25:44 2011
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Thu, 27 Jan 2011 17:25:44 -0500
Subject: [Python-Dev] getting stable URLs for major.minor versions
In-Reply-To: <4D41E991.7030002@v.loewis.de>
References: 
	<4D41E991.7030002@v.loewis.de>
Message-ID: 

On Thu, Jan 27, 2011 at 4:54 PM, "Martin v. Löwis"  wrote:
..
> How about http://www.python.org/2.7.x redirecting to the latest 2.7.x
> release? Likewise 2.x and 3.x.

Whatever we do, let's use this opportunity to  unify redirect rules
for http://www.python.org/X.Y and http://docs.python.org/X.Y.  For a
related discussion, see http://bugs.python.org/issue10446.

From solipsis at pitrou.net  Thu Jan 27 23:30:08 2011
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Thu, 27 Jan 2011 23:30:08 +0100
Subject: [Python-Dev] PEP 393: Flexible String Representation
In-Reply-To: <4D41DE22.2000100@v.loewis.de>
References: <4D3DDE5E.4080807@v.loewis.de>
	<20110124231233.79bed8eb@pitrou.net>  <4D3E0617.7010001@v.loewis.de>
	<1295911245.3704.13.camel@localhost.localdomain>
	<4D41DE22.2000100@v.loewis.de>
Message-ID: <1296167408.3693.1.camel@localhost.localdomain>


> > Incidentally, to slightly reduce the overhead of the unicode objects,
> > there's this proposal: http://bugs.python.org/issue1943
> 
> I wonder what aspects of this patch and discussion should be integrated
> into the PEP. The notion of allocating the memory in the same block is
> already considered in the PEP; what else might be relevant?

Ok, I'm sorry for not reading the PEP carefully enough, then.
The patch does a couple of other tweaks such as making "state" a char
rather than an int, and changing the freelist algorithm. But the latter
doesn't need to be spelled out in a PEP anyway.

Regards

Antoine.



From martin at v.loewis.de  Thu Jan 27 23:40:29 2011
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Thu, 27 Jan 2011 23:40:29 +0100
Subject: [Python-Dev] getting stable URLs for major.minor versions
In-Reply-To: 
References: 	<4D41E991.7030002@v.loewis.de>
	
Message-ID: <4D41F45D.5020105@v.loewis.de>

> Whatever we do, let's use this opportunity to  unify redirect rules
> for http://www.python.org/X.Y and http://docs.python.org/X.Y.  For a
> related discussion, see http://bugs.python.org/issue10446.

TLDR; somebody should summarize it and specify what exactly needs to
be changed.

I'm only going to change the release redirects now.

Regards,
Martin

From alexander.belopolsky at gmail.com  Thu Jan 27 23:50:07 2011
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Thu, 27 Jan 2011 17:50:07 -0500
Subject: [Python-Dev] getting stable URLs for major.minor versions
In-Reply-To: <4D41F45D.5020105@v.loewis.de>
References: 
	<4D41E991.7030002@v.loewis.de>
	
	<4D41F45D.5020105@v.loewis.de>
Message-ID: 

On Thu, Jan 27, 2011 at 5:40 PM, "Martin v. Löwis"  wrote:
>> Whatever we do, let's use this opportunity to  unify redirect rules
>> for http://www.python.org/X.Y and http://docs.python.org/X.Y.  For a
>> related discussion, see http://bugs.python.org/issue10446.
>
> TLDR; somebody should summarize it and specify what exactly needs to
> be changed.
>

AFAICT, http://docs.python.org/X.Y links consistently point to
http://docs.python.org/release/X.Y.Z, where Z is the last micro
release of X.Y major.minor series.  I don't see any reason to change
anything at the moment, but if http://www.python.org will grow X.Y.x
redirects, it would be nice to have the same under
http://docs.python.org/release/ if not under http://docs.python.org/.

From stefan_ml at behnel.de  Thu Jan 27 23:53:40 2011
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Thu, 27 Jan 2011 23:53:40 +0100
Subject: [Python-Dev] PEP 393: Flexible String Representation
In-Reply-To: <4D3DDE5E.4080807@v.loewis.de>
References: <4D3DDE5E.4080807@v.loewis.de>
Message-ID: 

"Martin v. Löwis", 24.01.2011 21:17:
> If the string is created directly with the canonical representation
> (see below), this representation doesn't take a separate memory block,
> but is allocated right after the PyUnicodeObject struct.

Does this mean it's supposed to become a PyVarObject? Antoine proposed 
that, too. Apart from breaking (more or less) all existing C subtyping 
code, this will also make it harder to subtype it in new code. I don't like 
that idea at all.

Stefan


From martin at v.loewis.de  Fri Jan 28 00:56:03 2011
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Fri, 28 Jan 2011 00:56:03 +0100
Subject: [Python-Dev] getting stable URLs for major.minor versions
In-Reply-To: 
References: 	<4D41E991.7030002@v.loewis.de>
	
Message-ID: <4D420613.8030603@v.loewis.de>

> Works for me! Short and elegant.

Done!

http://www.python.org/2.6.x
http://www.python.org/2.x
http://www.python.org/3.1.x
http://www.python.org/3.x

Regards,
Martin

From martin at v.loewis.de  Fri Jan 28 01:02:32 2011
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Fri, 28 Jan 2011 01:02:32 +0100
Subject: [Python-Dev] PEP 393: Flexible String Representation
In-Reply-To: 
References: <4D3DDE5E.4080807@v.loewis.de> 
Message-ID: <4D420798.1050004@v.loewis.de>

Am 27.01.2011 23:53, schrieb Stefan Behnel:
> "Martin v. Löwis", 24.01.2011 21:17:
>> If the string is created directly with the canonical representation
>> (see below), this representation doesn't take a separate memory block,
>> but is allocated right after the PyUnicodeObject struct.
> 
> Does this mean it's supposed to become a PyVarObject?

What do you mean by "become"? Will it be declared as such? No.

> Antoine proposed
> that, too. Apart from breaking (more or less) all existing C subtyping
> code, this will also make it harder to subtype it in new code. I don't
> like that idea at all.

Why will it break all existing subtyping code? See the PEP: Only objects
created through PyUnicode_New will be affected - I don't think this can
include objects of a subtype.

Regards,
Martin

From eliben at gmail.com  Fri Jan 28 05:55:22 2011
From: eliben at gmail.com (Eli Bendersky)
Date: Fri, 28 Jan 2011 06:55:22 +0200
Subject: [Python-Dev] fcmp() in test.support
Message-ID: 

I'm working on improving the .rst documentation of test.support (Issue
11015), and came upon the undocumented "fcmp" function that's being
exported from test.support, along with a "FUZZ" constant.

As I search through the tests (py3k trunk), I see fcmp() is being used
only in two places in a fairly trivial way:
1. test_float: where it can be directly replaced by assertAlmostEqual
from unittest
2. test_builtin: where the assertion can also be easily rewritten in
terms of assertAlmostEqual

Although fcmp seems to provide extra functionality over
assertAlmostEqual, the above makes me think it should probably be
removed altogether, or added to unittest if it's still deemed
important.
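
For reference, the kind of rewrite meant here looks roughly like the
following. It is a generic sketch of assertAlmostEqual usage, not the
actual assertions from test_float or test_builtin; the class and the run
helper are made up for illustration.

```python
import unittest

class FloatComparison(unittest.TestCase):
    def test_default_places(self):
        # By default the difference is rounded to 7 decimal places.
        self.assertAlmostEqual(0.1 + 0.2, 0.3)

    def test_explicit_places(self):
        # A looser comparison: only 3 decimal places have to agree.
        self.assertAlmostEqual(1.0, 1.0004, places=3)
        self.assertNotAlmostEqual(1.0, 1.01, places=3)

def run():
    """Run the test case above and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(FloatComparison)
    return unittest.TextTestRunner(verbosity=0).run(suite)

result = run()
print(result.testsRun, result.wasSuccessful())
```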

+/- ?
Eli

From stefan_ml at behnel.de  Fri Jan 28 07:20:26 2011
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Fri, 28 Jan 2011 07:20:26 +0100
Subject: [Python-Dev] PEP 393: Flexible String Representation
In-Reply-To: <4D420798.1050004@v.loewis.de>
References: <4D3DDE5E.4080807@v.loewis.de> 
	<4D420798.1050004@v.loewis.de>
Message-ID: 

"Martin v. Löwis", 28.01.2011 01:02:
> Am 27.01.2011 23:53, schrieb Stefan Behnel:
>> "Martin v. Löwis", 24.01.2011 21:17:
>>> If the string is created directly with the canonical representation
>>> (see below), this representation doesn't take a separate memory block,
>>> but is allocated right after the PyUnicodeObject struct.
>>
>> Does this mean it's supposed to become a PyVarObject?
>
> What do you mean by "become"? Will it be declared as such? No.
>
>> Antoine proposed
>> that, too. Apart from breaking (more or less) all existing C subtyping
>> code, this will also make it harder to subtype it in new code. I don't
>> like that idea at all.
>
> Why will it break all existing subtyping code? See the PEP: Only objects
> created through PyUnicode_New will be affected - I don't think this can
> include objects of a subtype.

Ok, that's fine then.

Stefan


From eliben at gmail.com  Fri Jan 28 07:52:55 2011
From: eliben at gmail.com (Eli Bendersky)
Date: Fri, 28 Jan 2011 08:52:55 +0200
Subject: [Python-Dev] Beta version of the new devguide
In-Reply-To: 
References: 
Message-ID: 

On Sun, Jan 23, 2011 at 03:08, Brett Cannon  wrote:
> http://docs.python.org/devguide/
>
> If you are a core developer and have a correction you want to make you
> can simply check out the devguide yourself (link is in the Resources
> section of the devguide) and make the corrections yourself. Otherwise
> reply here (you can email me directly but I already have instances of
> multiple people telling me about the same spelling mistake so it's
> nice to have it public so people know when I have been informed).

Brett,
A couple of concerns regarding the "Getting Set Up" page:

1)

"Do note that CPython will notice that it is being run from a source
checkout. This means that if you edit Python source code in your
checkout the changes will be picked up by the interpreter for
immediate testing. "

I'm not sure what this means. Does CPython really know it's being run
from a source checkout as opposed to a source tarball? By editing
"Python source code" you mean the standard libraries/tests? To be
"picked up by the interpreter" you then need to run it from the root
of the checkout (after build) but this is also true for source
tarballs.

2)

"The core CPython interpreter only needs a C compiler to build itself;"

I find this confusing since the CPython interpreter doesn't build
itself. A developer builds it with a C compiler / makefile. Some tools
indeed "build themselves" in some kind of a bootstrap process (i.e.
gcc, AFAIK).


I apologize in advance if this is too nit-picky ;-)
Eli

From fweimer at bfk.de  Fri Jan 28 10:35:19 2011
From: fweimer at bfk.de (Florian Weimer)
Date: Fri, 28 Jan 2011 09:35:19 +0000
Subject: [Python-Dev] PEP 393: Flexible String Representation
In-Reply-To:  (Stefan Behnel's message of "Thu\,
	27 Jan 2011 20\:06\:10 +0100")
References: <4D3DDE5E.4080807@v.loewis.de> 
Message-ID: <821v3xl6aw.fsf@mid.bfk.de>

* Stefan Behnel:

> "Martin v. Löwis", 24.01.2011 21:17:
>> The Py_UNICODE type is still supported but deprecated. It is always
>> defined as a typedef for wchar_t, so the wstr representation can double
>> as Py_UNICODE representation.
>
> It's too bad this isn't initialised by default, though. Py_UNICODE is
> the only representation that can be used efficiently from C code

Is this really true?  I don't think I've seen any C API which actually
uses wchar_t, beyond that what is provided by libc.  UTF-8 and even
UTF-16 are much, much more common.

-- 
Florian Weimer                
BFK edv-consulting GmbH       http://www.bfk.de/
Kriegsstra?e 100              tel: +49-721-96201-1
D-76133 Karlsruhe             fax: +49-721-96201-99

From stefan_ml at behnel.de  Fri Jan 28 11:30:33 2011
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Fri, 28 Jan 2011 11:30:33 +0100
Subject: [Python-Dev] PEP 393: Flexible String Representation
In-Reply-To: <821v3xl6aw.fsf@mid.bfk.de>
References: <4D3DDE5E.4080807@v.loewis.de> 
	<821v3xl6aw.fsf@mid.bfk.de>
Message-ID: 

Florian Weimer, 28.01.2011 10:35:
> * Stefan Behnel:
>> "Martin v. Löwis", 24.01.2011 21:17:
>>> The Py_UNICODE type is still supported but deprecated. It is always
>>> defined as a typedef for wchar_t, so the wstr representation can double
>>> as Py_UNICODE representation.
>>
>> It's too bad this isn't initialised by default, though. Py_UNICODE is
>> the only representation that can be used efficiently from C code
>
> Is this really true?  I don't think I've seen any C API which actually
> uses wchar_t, beyond that what is provided by libc.  UTF-8 and even
> UTF-16 are much, much more common.

They are also much harder to use, unless you are really only interested in 
7-bit ASCII data - which is the case for most C libraries, so I believe 
that's what you meant here. However, this is the CPython runtime with 
built-in Unicode support, not the C runtime where it comes as an add-on at 
best, and where Unicode processing without being Unicode aware is common.

The nice thing about Py_UNICODE is that it basically gives you native 
Unicode code points directly, without needing to decode UTF-8 byte runs and 
the like. In Cython, it allows you to do things like this:

     def test_for_those_characters(unicode s):
         for c in s:
             # warning: randomly chosen Unicode escapes ahead
             if c in u"\u0356\u1012\u3359\u4567":
                 return True
         else:
             return False

The loop runs in plain C, using the somewhat obvious implementation with a 
loop over Py_UNICODE characters and a switch statement for the comparison. 
This would look a *lot* more ugly with UTF-8 encoded byte strings.

Regarding Cython specifically, the above will still be *possible* under the 
proposal, given that the memory layout of the strings will still represent 
the Unicode code points. It will just be trickier to implement in Cython's 
type system as there is no longer a (user visible) C type representation 
for those code units. It can be any of uchar, ushort16 or uint32, neither 
of which is necessarily a 'native' representation of a Unicode character in 
CPython. While I'm somewhat confident that I'll find a way to fix this in 
Cython, my point is just that this adds a certain level of complexity to C 
code using the new memory layout that simply wasn't there before.
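
To illustrate the extra decision that C code now has to make, here is a rough
plain-Python sketch of the width choice. It assumes the thresholds from the
PEP draft (one byte up to U+00FF, two bytes up to U+FFFF, four bytes
otherwise); the function name is made up for this example and is not part of
any API.

```python
def code_unit_width(s):
    """Return the assumed number of bytes per code point for s (1, 2 or 4)."""
    # Hypothetical helper: pick the narrowest fixed-width unit that can
    # hold the largest code point in the string.
    highest = max(map(ord, s), default=0)
    if highest < 0x100:
        return 1   # fits in an unsigned char
    if highest < 0x10000:
        return 2   # fits in a 16-bit unit
    return 4       # needs a 32-bit unit


for sample in ("abc", "\u0356", "\U0001F600"):
    print(repr(sample), code_unit_width(sample))
```

Code that used to loop over a single Py_UNICODE type would need one such
branch (or three specialised loops) per string.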

Stefan


From skip at pobox.com  Fri Jan 28 11:50:37 2011
From: skip at pobox.com (skip at pobox.com)
Date: Fri, 28 Jan 2011 04:50:37 -0600
Subject: [Python-Dev] getting stable URLs for major.minor versions
In-Reply-To: 
References: 
	<19777.57810.481593.401954@montanaro.dyndns.org>
	
Message-ID: <19778.40829.489205.651604@montanaro.dyndns.org>


    Brett> I don't get what you are worried about: http://www.python.org/2
    Brett> would refer to 2.7.1 while http://www.python.org/3 would refer to
    Brett> 3.1.3.

In my world, 2 == major, 7 == minor, 1 == micro.  I interpreted your
reference to "major" as implying .../2 would refer to .../3.  I thought the
smiley was because you didn't really expect people to do that.

S

From fweimer at bfk.de  Fri Jan 28 15:27:39 2011
From: fweimer at bfk.de (Florian Weimer)
Date: Fri, 28 Jan 2011 14:27:39 +0000
Subject: [Python-Dev] PEP 393: Flexible String Representation
In-Reply-To:  (Stefan Behnel's message of "Fri\,
	28 Jan 2011 11\:30\:33 +0100")
References: <4D3DDE5E.4080807@v.loewis.de> 
	<821v3xl6aw.fsf@mid.bfk.de> 
Message-ID: <82r5bxhzms.fsf@mid.bfk.de>

* Stefan Behnel:

> The nice thing about Py_UNICODE is that it basically gives you native
> Unicode code points directly, without needing to decode UTF-8 byte
> runs and the like. In Cython, it allows you to do things like this:
>
>     def test_for_those_characters(unicode s):
>         for c in s:
>             # warning: randomly chosen Unicode escapes ahead
>             if c in u"\u0356\u1012\u3359\u4567":
>                 return True
>         else:
>             return False
>
> The loop runs in plain C, using the somewhat obvious implementation
> with a loop over Py_UNICODE characters and a switch statement for the
> comparison. This would look a *lot* more ugly with UTF-8 encoded byte
> strings.

Not really, because UTF-8 is quite search-friendly.  (The if would
have to invoke a memmem()-like primitive.)  Random subscripts are
problematic.
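
A rough sketch of that search-friendliness, in Python rather than C. In valid
UTF-8 the encoding of one character can only match at a character boundary,
so a byte-level substring search (bytes.find() here, standing in for a
memmem()-like primitive) finds the same characters as a code point scan. The
function name is made up for the example.

```python
TARGETS = "\u0356\u1012\u3359\u4567"  # same randomly chosen escapes as above


def contains_any_utf8(data, targets=TARGETS):
    """Return True if the UTF-8 bytes `data` contain any character of `targets`."""
    # Encode each target once and search for it at the byte level; UTF-8 is
    # self-synchronizing, so a match cannot start in the middle of a character.
    needles = [ch.encode("utf-8") for ch in targets]
    return any(data.find(needle) != -1 for needle in needles)


text = "abc\u1012def"
print(contains_any_utf8(text.encode("utf-8")))     # True
print(contains_any_utf8("plain".encode("utf-8")))  # False
```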

However, why would one want to write loops like the above?  Don't you
have to take combining characters (comprising multiple codepoints)
into account most of the time when you look at individual characters?
Then UTF-32 does not offer much of a simplification.
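
A small example of what I mean, using only the standard unicodedata module.
The sample strings are my own choice for illustration.

```python
import unicodedata

composed = "\u00e9"      # LATIN SMALL LETTER E WITH ACUTE, one code point
decomposed = "e\u0301"   # "e" followed by COMBINING ACUTE ACCENT

# The two strings render the same but differ at the code point level.
print(len(composed), len(decomposed))
print(composed == decomposed)

# Normalizing to NFC makes them compare equal.
print(unicodedata.normalize("NFC", decomposed) == composed)

# A per-code-point loop sees the combining mark as a separate item with a
# non-zero combining class.
print([unicodedata.combining(c) for c in decomposed])
```

A loop over fixed-width code points treats the accent as its own
"character", so the width of the code unit does not remove this problem.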

-- 
Florian Weimer                
BFK edv-consulting GmbH       http://www.bfk.de/
Kriegsstraße 100              tel: +49-721-96201-1
D-76133 Karlsruhe             fax: +49-721-96201-99

From stefan_ml at behnel.de  Fri Jan 28 16:22:37 2011
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Fri, 28 Jan 2011 16:22:37 +0100
Subject: [Python-Dev] PEP 393: Flexible String Representation
In-Reply-To: <82r5bxhzms.fsf@mid.bfk.de>
References: <4D3DDE5E.4080807@v.loewis.de>
		<821v3xl6aw.fsf@mid.bfk.de>
	 <82r5bxhzms.fsf@mid.bfk.de>
Message-ID: 

Florian Weimer, 28.01.2011 15:27:
> * Stefan Behnel:
>
>> The nice thing about Py_UNICODE is that it basically gives you native
>> Unicode code points directly, without needing to decode UTF-8 byte
>> runs and the like. In Cython, it allows you to do things like this:
>>
>>      def test_for_those_characters(unicode s):
>>          for c in s:
>>              # warning: randomly chosen Unicode escapes ahead
>>              if c in u"\u0356\u1012\u3359\u4567":
>>                  return True
>>          else:
>>              return False
>>
>> The loop runs in plain C, using the somewhat obvious implementation
>> with a loop over Py_UNICODE characters and a switch statement for the
>> comparison. This would look a *lot* more ugly with UTF-8 encoded byte
>> strings.
>
> Not really, because UTF-8 is quite search-friendly.  (The if would
> have to invoke a memmem()-like primitive.)  Random subscripts are
> problematic.
>
> However, why would one want to write loops like the above?  Don't you
> have to take combining characters (comprising multiple codepoints)
> into account most of the time when you look at individual characters?
> Then UTF-32 does not offer much of a simplification.

Hmm, I think this discussion is pointless. Regardless of the memory layout, 
you can always go down to the byte level and use an efficient 
(multi-)substring search algorithm. (which is obviously helped if you know 
the layout at compile time *wink*)

Bad example, I guess.

Stefan


From techtonik at gmail.com  Fri Jan 28 17:12:39 2011
From: techtonik at gmail.com (anatoly techtonik)
Date: Fri, 28 Jan 2011 18:12:39 +0200
Subject: [Python-Dev] Finally fix installer to add Python to %PATH% on
	Windows
Message-ID: 

Hi, I'd like to

You probably know that after installation on Windows system it is
possible to call Python from Explorer's Run dialog (Win-R). It is
because Python path is added to App Paths registry key and Windows
Explorer shell checks this key when looking for executable.

But Python doesn't work from cmd session and, more importantly,
*Python doesn't work from .bat files*. It is because cmd shell doesn't
know about App Paths and relies on system PATH to find executable. As
far as I remember, there is no function in Python stdlib either, that
executes processes and does lookups in App Paths.

I never paid much attention to this fact, because I put several .bat
files for every 25, 26, 27, 31 and 32 version of Python into PATH
manually. But when bootstrap script for build environment of Native
Client (NaCl) said that I have no Python available and started to
install its own, I've asked myself - "How come? There are 5! possible
versions of Python on my system." It appeared that the following .bat
file doesn't work:

---cut mypy.bat--
python.exe
---cut mypy.bat--

C:\>mypy.bat

C:\>python.exe
'python.exe' is not recognized as an internal or external command,
operable program or batch file.
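
The lookup that fails here can be approximated in Python. This is a
simplified sketch of a PATH search and not cmd's actual algorithm (cmd also
checks the current directory and the extensions in PATHEXT); the function
name and its parameters are only illustrative.

```python
import os


def find_on_path(name, path_value, sep=os.pathsep):
    """Return the first existing file called `name` in `path_value`, or None."""
    for directory in path_value.split(sep):
        if not directory:
            continue  # skip empty entries such as a trailing separator
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate):
            return candidate
    return None


# If no directory in PATH contains python.exe the search comes back empty,
# which mirrors the "not recognized" message above.
print(find_on_path("python.exe", os.environ.get("PATH", "")))
```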


I've seen about 7 requests to add Python into %PATH% in installer. All
closed with no result, but with some fear and uncertainty. Martin
feared that MSI installers are not able to remove entry from PATH and
even if they can, they may kill the whole PATH instead of removing
just one entry.

To prove or dispel these fears, I've just installed/uninstalled
Mercurial from mercurial-1.7.3-1-x86.msi and App Engine from
GoogleAppEngine-1.4.1.msi several times. Both add entries to PATH and
both remove them without any further problems. Should we finally add
this to 3.2 installer for Python?
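
For the record, removing a single entry without touching the rest of PATH is
not hard in principle. The sketch below is not how MSI does it; it only
illustrates the idea, and the function name and sample paths are made up.

```python
def remove_path_entry(path_value, entry, sep=";"):
    """Return `path_value` without `entry`, leaving all other entries alone."""
    # Windows paths are case-insensitive and may carry a trailing backslash,
    # so compare on a normalized form but keep the original spelling.
    def normalize(p):
        return p.strip().rstrip("\\").lower()

    target = normalize(entry)
    kept = [p for p in path_value.split(sep) if normalize(p) != target]
    return sep.join(kept)


before = r"C:\Windows\system32;C:\Python32\;C:\Program Files\Mercurial"
print(remove_path_entry(before, r"C:\Python32"))
```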

-- 
anatoly t.

From brian.curtin at gmail.com  Fri Jan 28 17:29:07 2011
From: brian.curtin at gmail.com (Brian Curtin)
Date: Fri, 28 Jan 2011 10:29:07 -0600
Subject: [Python-Dev] Finally fix installer to add Python to %PATH% on
	Windows
In-Reply-To: 
References: 
Message-ID: 

On Fri, Jan 28, 2011 at 10:12, anatoly techtonik wrote:

> Hi, I'd like to
>
> You probably know that after installation on Windows system it is
> possible to call Python from Explorer's Run dialog (Win-R). It is
> because Python path is added to App Paths registry key and Windows
> Explorer shell checks this key when looking for executable.
>
> But Python doesn't work from cmd session and, more importantly,
> *Python doesn't work from .bat files*. It is because cmd shell doesn't
> know about App Paths and relies on system PATH to find executable. As
> far as I remember, there is no function in Python stdlib either, that
> executes processes and does lookups in App Paths.
>
> I never paid much attention to this fact, because I put several .bat
> files for every 25, 26, 27, 31 and 32 version of Python into PATH
> manually. But when bootstrap script for build environment of Native
> Client (NaCl) said that I have no Python available and started to
> install its own, I've asked myself - "How come? There are 5! possible
> versions of Python on my system." It appeared that the following .bat
> file doesn't work:
>
> ---cut mypy.bat--
> python.exe
> ---cut mypy.bat--
>
> C:\>mypy.bat
>
> C:\>python.exe
> 'python.exe' is not recognized as an internal or external command,
> operable program or batch file.
>
>
> I've seen about 7 requests to add Python into %PATH% in installer. All
> closed with no result, but with some fear and uncertainty. Martin
> feared that MSI installers are not able to remove entry from PATH and
> even if they can, they may kill the whole PATH instead of removing
> just one entry.
>
> To prove or dispel these fears, I've just installed/uninstalled
> Mercurial from mercurial-1.7.3-1-x86.msi and App Engine from
> GoogleAppEngine-1.4.1.msi several times. Both add entries to PATH and
> both remove them without any further problems. Should we finally add
> this to 3.2 installer for Python?
>
> --
> anatoly t.


Definitely not for 3.2, but this is something I'd like to look into for 3.3.

Recently I've talked to two Python trainers/educators and the major gripe
their attendees see is that you can't just sit down and type "python" and
have something work. For multi-Python installs, we'll have to define what
that "something" is, but I think it should be possible for the installer to
optionally put Python into the path, and to also remove itself on uninstall.

One of said trainers is running a course inside my company right now and the
training room VMs they are running on do not have the path setup. Some users
were puzzled as to why "python foo.py" doesn't work, but executing "foo.py"
does (via file association).

One quick-and-dirty solution was to create a "Command Shell" shortcut in the
start menu which would just be a batch file that adds Python to the path for
that cmd session. It would be kind of similar to the "Python (command line)"
shortcut, which uses pythonw.exe. I think we can do better than this,
though.

From mail at timgolden.me.uk  Fri Jan 28 17:37:34 2011
From: mail at timgolden.me.uk (Tim Golden)
Date: Fri, 28 Jan 2011 16:37:34 +0000
Subject: [Python-Dev] Finally fix installer to add Python to %PATH% on
 Windows
In-Reply-To: 
References: 
	
Message-ID: <4D42F0CE.8080406@timgolden.me.uk>

On 28/01/2011 16:29, Brian Curtin wrote:
> On Fri, Jan 28, 2011 at 10:12, anatoly techtonik wrote:
>> I've seen about 7 requests to add Python into %PATH% in installer. All
>> closed with no result, but with some fear and uncertainty. Martin
>> feared that MSI installers are not able to remove entry from PATH and
>> even if they can, they may kill the whole PATH instead of removing
>> just one entry.
>>
>> To prove or dispel these fears, I've just installed/uninstalled
>> Mercurial from mercurial-1.7.3-1-x86.msi and App Engine from
>> GoogleAppEngine-1.4.1.msi several times. Both add entries to PATH and
>> both remove them without any further problems. Should we finally add
>> this to 3.2 installer for Python?
>>
>> --
>> anatoly t.
>
>
> Definitely not for 3.2, but this is something I'd like to look into for 3.3.
>
> Recently I've talked to two Python trainers/educators and the major gripe
> their attendees see is that you can't just sit down and type "python" and
> have something work. For multi-Python installs, we'll have to define what
> that "something" is, but I think it should be possible for the installer to
> optionally put Python into the path, and to also remove itself on uninstall.

I don't think, ultimately, that there is any insurmountable technical
objection. There are misgivings but they could undoubtedly be overcome
or overridden. But it would require someone to patch the MSI builder
so it added the functionality and -- I think -- offered it as an option
which could be enabled or disabled.

TJG

From status at bugs.python.org  Fri Jan 28 18:07:04 2011
From: status at bugs.python.org (Python tracker)
Date: Fri, 28 Jan 2011 18:07:04 +0100 (CET)
Subject: [Python-Dev] Summary of Python tracker Issues
Message-ID: <20110128170704.C6AB61CCED@psf.upfronthosting.co.za>


ACTIVITY SUMMARY (2011-01-21 - 2011-01-28)
Python tracker at http://bugs.python.org/

To view or respond to any of the issues listed below, click on the issue.
Do NOT respond to this message.

Issues counts and deltas:
  open    2567 (+40)
  closed 20262 (+34)
  total  22829 (+74)

Open issues with patches: 1085 


Issues opened (54)
==================

#10042: total_ordering stack overflow
http://bugs.python.org/issue10042  reopened by eric.araujo

#10501: make_buildinfo regression with unquoted path
http://bugs.python.org/issue10501  reopened by eli.bendersky

#10708: Misc/porting should be folded into the development FAQ or the 
http://bugs.python.org/issue10708  reopened by pitrou

#10976: json.loads() throws TypeError on bytes object
http://bugs.python.org/issue10976  opened by hhas

#10977: Concrete object C API needs abstract path for subclasses of bu
http://bugs.python.org/issue10977  opened by rhettinger

#10978: Add optional argument to Semaphore.release for releasing multi
http://bugs.python.org/issue10978  opened by rhettinger

#10979: setUpClass exception causes explosion with "-b"
http://bugs.python.org/issue10979  opened by brandon-rhodes

#10980: http.server Header Unicode Bug
http://bugs.python.org/issue10980  opened by aronacher

#10983: Errors in http.client.HTTPConnection class (python3)
http://bugs.python.org/issue10983  opened by nooB

#10984: argparse add_mutually_exclusive_group should accept existing a
http://bugs.python.org/issue10984  opened by gotgenes

#10988: Descriptor protocol documentation for super bindings is incorr
http://bugs.python.org/issue10988  opened by Joshua.Arnold

#10989: ssl.SSLContext(True).load_verify_locations(None, True) segfaul
http://bugs.python.org/issue10989  opened by haypo

#10990: tests mutating sys.gettrace() w/o re-instating previous state
http://bugs.python.org/issue10990  opened by brett.cannon

#10991: trace fails when test imported a temporary file
http://bugs.python.org/issue10991  opened by brett.cannon

#10992: tests failing when run under coverage
http://bugs.python.org/issue10992  opened by brett.cannon

#10994: implementation details in sys module
http://bugs.python.org/issue10994  opened by fijall

#10998: Remove last traces of -Q / sys.flags.division_warning / Py_Div
http://bugs.python.org/issue10998  opened by eric.araujo

#10999: os.chflags refers to stat constants, but the constants are not
http://bugs.python.org/issue10999  opened by r.david.murray

#11001: Various obvious errors in cookies documentation
http://bugs.python.org/issue11001  opened by spookylukey

#11003: os.system should be deprecated in favour of subprocess module
http://bugs.python.org/issue11003  opened by Jakob.Bowyer

#11005: Assertion error on RLock._acquire_restore
http://bugs.python.org/issue11005  opened by haypo

#11006: warnings with subprocess and pipe2
http://bugs.python.org/issue11006  opened by pitrou

#11007: stack tracebacks should give the relevant class name
http://bugs.python.org/issue11007  opened by stickwithjosh

#11009: urllib.splituser is not documented
http://bugs.python.org/issue11009  opened by techtonik

#11011: More functools functions
http://bugs.python.org/issue11011  opened by Jason.Baker

#11012: Add log1p(), exp1m(), gamma(), and lgamma() to cmath
http://bugs.python.org/issue11012  opened by rhettinger

#11015: Bring test.support docs up to date
http://bugs.python.org/issue11015  opened by ncoghlan

#11021: email MIME-Version headers for each part in multipart message
http://bugs.python.org/issue11021  opened by david.caro

#11022: locale.setlocale() doesn't change I/O codec, os.environ does
http://bugs.python.org/issue11022  opened by sdaoden

#11023: pep 227 missing text
http://bugs.python.org/issue11023  opened by aisaac

#11025: Distutils2 install command without setup.py or setup.cfg creat
http://bugs.python.org/issue11025  opened by Boris.FELD

#11027: Implement sectionxform in configparser
http://bugs.python.org/issue11027  opened by Kunjesh.Kaushik

#11028: Implement the setup.py -> setup.cfg in mkcfg
http://bugs.python.org/issue11028  opened by alaintty

#11029: Crash, 2.7.1, Tkinter and threads and line drawing
http://bugs.python.org/issue11029  opened by PythonInTheGrass

#11030: regrtest - allow for relative path with --coverdir
http://bugs.python.org/issue11030  opened by sandro.tosi

#11031: regrtest - --testdir, new command-line option to specify alter
http://bugs.python.org/issue11031  opened by sandro.tosi

#11032: _string: formatter_field_name_split() and formatter_parser doe
http://bugs.python.org/issue11032  opened by haypo

#11033: ElementTree.fromstring doesn't work with Unicode
http://bugs.python.org/issue11033  opened by Peter.Cai

#11034: Build problem on Windows with MSVC++ Express 2008
http://bugs.python.org/issue11034  opened by eli.bendersky

#11035: Segmentation fault
http://bugs.python.org/issue11035  opened by Dmitry.Groshev

#11037: How distutils2 handle namespaces
http://bugs.python.org/issue11037  opened by sdouche

#11038: Some commands should stop if Name and Version are missing
http://bugs.python.org/issue11038  opened by gawel

#11040: After registering a project to PyPI, classifiers fields aren't
http://bugs.python.org/issue11040  opened by Julien.Miotte

#11041: On the distutils2 documentation, 'requires-python' shouldn't b
http://bugs.python.org/issue11041  opened by Julien.Miotte

#11042: [PyPI CSS] Adding project urls onto a project page using regis
http://bugs.python.org/issue11042  opened by Julien.Miotte

#11043: On GNU/Linux (Ubuntu) distutils2.mkcfg shouldn't create an exe
http://bugs.python.org/issue11043  opened by Julien.Miotte

#11044: The description-file isn't handled by distutils2
http://bugs.python.org/issue11044  opened by Julien.Miotte

#11045: shutil._make_tarball
http://bugs.python.org/issue11045  opened by tarek

#11046: darwin/MacOS X setup.py hack
http://bugs.python.org/issue11046  opened by sdaoden

#10997: Duplicate entries in IDLE "Recent Files" menu item on OS X
http://bugs.python.org/issue10997  opened by ned.deily

#11016: Add S_ISDOOR to the stat module
http://bugs.python.org/issue11016  opened by pitrou

#11024: imaplib: Time2Internaldate() returns localized strings
http://bugs.python.org/issue11024  opened by spaetz

#11036: Allow multiple files in the description-file metadata
http://bugs.python.org/issue11036  opened by gawel

#11047: Bad description for a change
http://bugs.python.org/issue11047  opened by Oren_Held



Most recent 15 issues with no replies (15)
==========================================

#11047: Bad description for a change
http://bugs.python.org/issue11047

#11044: The description-file isn't handled by distutils2
http://bugs.python.org/issue11044

#11043: On GNU/Linux (Ubuntu) distutils2.mkcfg shouldn't create an exe
http://bugs.python.org/issue11043

#11042: [PyPI CSS] Adding project urls onto a project page using regis
http://bugs.python.org/issue11042

#11041: On the distutils2 documentation, 'requires-python' shouldn't b
http://bugs.python.org/issue11041

#11040: After registering a project to PyPI, classifiers fields aren't
http://bugs.python.org/issue11040

#11038: Some commands should stop if Name and Version are missing
http://bugs.python.org/issue11038

#11037: How distutils2 handle namespaces
http://bugs.python.org/issue11037

#11036: Allow multiple files in the description-file metadata
http://bugs.python.org/issue11036

#11033: ElementTree.fromstring doesn't work with Unicode
http://bugs.python.org/issue11033

#11031: regrtest - --testdir, new command-line option to specify alter
http://bugs.python.org/issue11031

#11030: regrtest - allow for relative path with --coverdir
http://bugs.python.org/issue11030

#11028: Implement the setup.py -> setup.cfg in mkcfg
http://bugs.python.org/issue11028

#11023: pep 227 missing text
http://bugs.python.org/issue11023

#11012: Add log1p(), exp1m(), gamma(), and lgamma() to cmath
http://bugs.python.org/issue11012



Most recent 15 issues waiting for review (15)
=============================================

#11047: Bad description for a change
http://bugs.python.org/issue11047

#11034: Build problem on Windows with MSVC++ Express 2008
http://bugs.python.org/issue11034

#11032: _string: formatter_field_name_split() and formatter_parser doe
http://bugs.python.org/issue11032

#11031: regrtest - --testdir, new command-line option to specify alter
http://bugs.python.org/issue11031

#11030: regrtest - allow for relative path with --coverdir
http://bugs.python.org/issue11030

#11024: imaplib: Time2Internaldate() returns localized strings
http://bugs.python.org/issue11024

#11015: Bring test.support docs up to date
http://bugs.python.org/issue11015

#11011: More functools functions
http://bugs.python.org/issue11011

#11001: Various obvious errors in cookies documentation
http://bugs.python.org/issue11001

#10999: os.chflags refers to stat constants, but the constants are not
http://bugs.python.org/issue10999

#10998: Remove last traces of -Q / sys.flags.division_warning / Py_Div
http://bugs.python.org/issue10998

#10997: Duplicate entries in IDLE "Recent Files" menu item on OS X
http://bugs.python.org/issue10997

#10992: tests failing when run under coverage
http://bugs.python.org/issue10992

#10990: tests mutating sys.gettrace() w/o re-instating previous state
http://bugs.python.org/issue10990

#10989: ssl.SSLContext(True).load_verify_locations(None, True) segfaul
http://bugs.python.org/issue10989



Top 10 most discussed issues (10)
=================================

#10990: tests mutating sys.gettrace() w/o re-instating previous state
http://bugs.python.org/issue10990  22 msgs

#9124: Mailbox module should use binary I/O, not text I/O
http://bugs.python.org/issue9124  21 msgs

#10848: Move test.regrtest from getopt to argparse
http://bugs.python.org/issue10848  19 msgs

#11034: Build problem on Windows with MSVC++ Express 2008
http://bugs.python.org/issue11034  12 msgs

#5863: bz2.BZ2File should accept other file-like objects.
http://bugs.python.org/issue5863  11 msgs

#11027: Implement sectionxform in configparser
http://bugs.python.org/issue11027  11 msgs

#11016: Add S_ISDOOR to the stat module
http://bugs.python.org/issue11016  11 msgs

#10954: No warning for csv.writer API change
http://bugs.python.org/issue10954  10 msgs

#10994: implementation details in sys module
http://bugs.python.org/issue10994  10 msgs

#11022: locale.setlocale() doesn't change I/O codec, os.environ does
http://bugs.python.org/issue11022   9 msgs



Issues closed (34)
==================

#4177: Crash in MIMEText on FreeBSD
http://bugs.python.org/issue4177  closed by haypo

#5097: asyncore.dispatcher_with_send undocumented
http://bugs.python.org/issue5097  closed by giampaolo.rodola

#5831: Doc mistake : threading.Timer is *not* a class
http://bugs.python.org/issue5831  closed by eric.araujo

#10948: Trouble with dir_util created dir cache
http://bugs.python.org/issue10948  closed by eric.araujo

#10949: logging.RotatingFileHandler not robust enough
http://bugs.python.org/issue10949  closed by vinay.sajip

#10952: Don't normalize module names to NFKC?
http://bugs.python.org/issue10952  closed by haypo

#10955: Possible regression with stdlib in zipfile
http://bugs.python.org/issue10955  closed by haypo

#10957: Python developer FAQ grammar error
http://bugs.python.org/issue10957  closed by brett.cannon

#10960: os.stat() does not mention that it follow symlinks by default
http://bugs.python.org/issue10960  closed by r.david.murray

#10970: "string".encode('base64') is not the same as base64.b64encode(
http://bugs.python.org/issue10970  closed by terry.reedy

#10973: OS X 10.6 IDLE, tkinter: Cocoa Tk 8.5 crash when composite cha
http://bugs.python.org/issue10973  closed by ned.deily

#10974: IDLE 3.x can crash decoding recent file list
http://bugs.python.org/issue10974  closed by ned.deily

#10975: #10961: Pydoc touchups in new 3.2 Web server (issue4090042)
http://bugs.python.org/issue10975  closed by eric.araujo

#10981: argparse: options starting with -- match substrings
http://bugs.python.org/issue10981  closed by david.caro

#10982: asyncore timeouts do not work correctly
http://bugs.python.org/issue10982  closed by giampaolo.rodola

#10985: test_sys triggers a fatal python error when run under coverage
http://bugs.python.org/issue10985  closed by brett.cannon

#10986: traceback's rendering behavior while throwing custom exception
http://bugs.python.org/issue10986  closed by benjamin.peterson

#10987: _pickle doesn't handle recursion limits properly
http://bugs.python.org/issue10987  closed by pitrou

#10993: HTTPSConnection does not close when call close() method
http://bugs.python.org/issue10993  closed by tanakorn

#10995: mailbox.py open() calls don't set encoding
http://bugs.python.org/issue10995  closed by r.david.murray

#10996: Typo in What's New in 3.2
http://bugs.python.org/issue10996  closed by rhettinger

#11000: Doc: ast.parse parses source, not just expressions
http://bugs.python.org/issue11000  closed by terry.reedy

#11002: 'Upload' link on Files page is broken
http://bugs.python.org/issue11002  closed by eric.araujo

#11004: AssertionError on collections.deque().count(1)
http://bugs.python.org/issue11004  closed by rhettinger

#11008: logging.dictConfig not documented as new in version 2.7
http://bugs.python.org/issue11008  closed by vinay.sajip

#11010: Unicode BOM left in loaded text
http://bugs.python.org/issue11010  closed by loewis

#11013: Build of  2.7 svn fails in readline
http://bugs.python.org/issue11013  closed by brett.cannon

#11014: 'filter' argument for Tarfile.add needs to be a keyword-only a
http://bugs.python.org/issue11014  closed by rhettinger

#11017: optparse: error: invalid integer value
http://bugs.python.org/issue11017  closed by eric.araujo

#11018: typo in test_bz2
http://bugs.python.org/issue11018  closed by pitrou

#11019: BytesGenerator fails if the Message body is None
http://bugs.python.org/issue11019  closed by r.david.murray

#11020: Pyclbr broken because of missing 2-to-3 conversion
http://bugs.python.org/issue11020  closed by rhettinger

#11026: Distutils2 install command fail with python 2.5/2.7
http://bugs.python.org/issue11026  closed by Boris.FELD

#11039: Use of 'L' specifier is inconsistent when printing long intege
http://bugs.python.org/issue11039  closed by eric.smith

From merwok at netwok.org  Fri Jan 28 18:43:16 2011
From: merwok at netwok.org (=?UTF-8?B?w4lyaWMgQXJhdWpv?=)
Date: Fri, 28 Jan 2011 18:43:16 +0100
Subject: [Python-Dev] Finally fix installer to add Python to %PATH% on
 Windows
In-Reply-To: 
References: 
Message-ID: <4D430034.7060207@netwok.org>

Hello

See http://bugs.python.org/issue3561 (rejected by Martin).

Regards

From brett at python.org  Fri Jan 28 19:05:47 2011
From: brett at python.org (Brett Cannon)
Date: Fri, 28 Jan 2011 10:05:47 -0800
Subject: [Python-Dev] Beta version of the new devguide
In-Reply-To: 
References: 
	
Message-ID: 

On Thu, Jan 27, 2011 at 22:52, Eli Bendersky  wrote:
> On Sun, Jan 23, 2011 at 03:08, Brett Cannon  wrote:
>> http://docs.python.org/devguide/
>>
>> If you are a core developer and have a correction you want to make you
>> can simply check out the devguide yourself (link is in the Resources
>> section of the devguide) and make the corrections yourself. Otherwise
>> reply here (you can email me directly but I already have instances of
>> multiple people telling me about the same spelling mistake so it's
>> nice to have it public so people know when I have been informed).
>
> Brett,
> A couple of concerns regarding the "Getting Set Up" page:
>
> 1)
>
> "Do note that CPython will notice that it is being run from a source
> checkout. This means that it if you edit Python source code in your
> checkout the changes will be picked up by the interpreter for
> immediate testing. "
>
> I'm not sure what this means. Does CPython really know it's being run
> from a source checkout as opposed to a source tarball?

Technically yes because of sys.subversion, but otherwise not really.
But then again the distinction is so minimal I'm not going to bother
rephrasing it to make it clear.

> By editing
> "Python source code" you mean the standard libraries/tests?

I'll make it "Python's".

> To be
> "picked up by the interpreter" you then need to run it from the root
> of the checkout (after build) but this is also true for source
> tarballs.

Once again, not an important distinction.

>
> 2)
>
> "The core CPython interpreter only needs a C compiler to build itself;"
>
> I find this confusing since the CPython interpreter doesn't build
> itself. A developer builds it with a C compiler / makefile. Some tools
> indeed "build themselves" in some kind of a bootstrap process (i.e.
> gcc, AFAIK).

True. I'll rephrase.

>
>
> I apologize in advance if this is too nit-picky ;-)

Sure, but at least you said it nicely. =)

-Brett

> Eli

From brett at python.org  Fri Jan 28 19:09:12 2011
From: brett at python.org (Brett Cannon)
Date: Fri, 28 Jan 2011 10:09:12 -0800
Subject: [Python-Dev] fcmp() in test.support
In-Reply-To: 
References: 
Message-ID: 

On Thu, Jan 27, 2011 at 20:55, Eli Bendersky  wrote:
> I'm working on improving the .rst documentation of test.support (Issue
> 11015), and came upon the undocumented "fcmp" function that's being
> exported from test.support, along with a "FUZZ"constant.
>
> As I search through the tests (py3k trunk), I see fcmp() is being used
> only in two places in a fairly trivial way:
> 1. test_float: where it can be directly replaced by assertAlmostEqual
> from unittest
> 2. test_builtin: where the assertion can also be easily rewritten in
> terms of assertAlmostEqual
>
> Although fcmp seems to provide extra functionality over
> assertAlmostEqual, the above makes me think it should probably be
> removed altogether, or added to unittest if it's still deemed
> important.
>
> +/- ?

I say drop it if it can be done so safely.

From raymond.hettinger at gmail.com  Fri Jan 28 19:51:19 2011
From: raymond.hettinger at gmail.com (Raymond Hettinger)
Date: Fri, 28 Jan 2011 10:51:19 -0800
Subject: [Python-Dev] fcmp() in test.support
In-Reply-To: 
References: 
	
Message-ID: <99C803CF-798E-45DD-A163-F597C38A5E89@gmail.com>


On Jan 28, 2011, at 10:09 AM, Brett Cannon wrote:

> On Thu, Jan 27, 2011 at 20:55, Eli Bendersky  wrote:
>> I'm working on improving the .rst documentation of test.support (Issue
>> 11015), and came upon the undocumented "fcmp" function that's being
>> exported from test.support, along with a "FUZZ"constant.
>> 
>> As I search through the tests (py3k trunk), I see fcmp() is being used
>> only in two places in a fairly trivial way:
>> 1. test_float: where it can be directly replaced by assertAlmostEqual
>> from unittest
>> 2. test_builtin: where the assertion can also be easily rewritten in
>> terms of assertAlmostEqual
>> 
>> Although fcmp seems to provide extra functionality over
>> assertAlmostEqual, the above makes me think it should probably be
>> removed altogether, or added to unittest if it's still deemed
>> important.
>> 
>> +/- ?
> 
> I say drop it if it can be done so safely.

Yes, please remove fcmp() altogether.
Like you said, the usage is trivial.

If you're feeling bold, replace them with assertEqual(),
the tests look like they produce exact values even
in floating point.


Raymond


------------------------------


~/py32 $ ack "fcmp" --python
Doc/tools/pygments/lexers/asm.py
261:             r'|lshr|ashr|and|or|xor|icmp|fcmp'

Lib/test/support.py
36:    "fcmp", "is_jython", "TESTFN", "HOST", "FUZZ", "SAVEDCWD", "temp_cwd",
354:def fcmp(x, y): # fuzzy comparison function
364:            outcome = fcmp(x[i], y[i])

Lib/test/test_builtin.py
13:from test.support import fcmp, TESTFN, unlink,  run_unittest, check_warnings
397:        self.assertTrue(not fcmp(divmod(3.25, 1.0), (3.0, 0.25)))
398:        self.assertTrue(not fcmp(divmod(-3.25, 1.0), (-4.0, 0.75)))
399:        self.assertTrue(not fcmp(divmod(3.25, -1.0), (-4.0, -0.75)))
400:        self.assertTrue(not fcmp(divmod(-3.25, -1.0), (3.0, -0.25)))

Lib/test/test_float.py
91:        self.assertEqual(support.fcmp(float("  .25e-1  "), .025), 0)


From fuzzyman at voidspace.org.uk  Fri Jan 28 20:21:08 2011
From: fuzzyman at voidspace.org.uk (Michael Foord)
Date: Fri, 28 Jan 2011 19:21:08 +0000
Subject: [Python-Dev] Finally fix installer to add Python to %PATH% on
 Windows
In-Reply-To: 
References: 
	
Message-ID: <4D431724.4010002@voidspace.org.uk>

On 28/01/2011 16:29, Brian Curtin wrote:
> On Fri, Jan 28, 2011 at 10:12, anatoly techtonik wrote:
>
>     Hi, I'd like to
>
>     You probably know that after installation on Windows system it is
>     possible to call Python from Explorer's Run dialog (Win-R). It is
>     because Python path is added to App Paths registry key and Windows
>     Explorer shell checks this key when looking for executable.
>
>     But Python doesn't work from cmd session and, more importantly,
>     *Python doesn't work from .bat files*. It is because cmd shell doesn't
>     know about App Paths and relies on system PATH to find executable. As
>     far as I remember, there is no function in Python stdlib either, that
>     executes processes and does lookups in App Paths.
>
>     I never paid much attention to this fact, because I put several .bat
>     files for every 25, 26, 27, 31 and 32 version of Python into PATH
>     manually. But when bootstrap script for build environment of Native
>     Client (NaCl) said that I have no Python available and started to
>     install its own, I've asked myself - "How come? There are 5! possible
>     versions of Python on my system." It appeared that the following .bat
>     file doesn't work:
>
>     ---cut mypy.bat--
>     python.exe
>     ---cut mypy.bat--
>
>     C:\>mypy.bat
>
>     C:\>python.exe
>     'python.exe' is not recognized as an internal or external command,
>     operable program or batch file.
>
>
>     I've seen about 7 requests to add Python into %PATH% in installer. All
>     closed with no result, but with some fear and uncertainty. Martin
>     feared that MSI installers are not able to remove entry from PATH and
>     even if they can, they may kill the whole PATH instead of removing
>     just one entry.
>
>     To prove or dispel these fears, I've just installed/uninstalled
>     Mercurial from mercurial-1.7.3-1-x86.msi and App Engine from
>     GoogleAppEngine-1.4.1.msi several times. Both add entries to PATH and
>     both remove them without any further problems. Should we finally add
>     this to 3.2 installer for Python?
>
>     --
>     anatoly t.
>
>
> Definitely not for 3.2, but this is something I'd like to look into 
> for 3.3.
>
> Recently I've talked to two Python trainers/educators and the major 
> gripe their attendees see is that you can't just sit down and type 
> "python" and have something work. For multi-Python installs, we'll 
> have to define what that "something" is, but I think it should be 
> possible for the installer to optionally put Python into the path, and 
> to also remove itself on uninstall.
>

I've helped quite a few "python newbies" on Windows who are also 
surprised / frustrated on learning that "python" on the command line 
doesn't work after installing python.

All the best,

Michael Foord

> One of said trainers is running a course inside my company right now 
> and the training room VMs they are running on do not have the path 
> setup. Some users were puzzled as to why "python foo.py" doesn't work, 
> but executing "foo.py" does (via file association).
>
> One quick-and-dirty solution was to create a "Command Shell" shortcut 
> in the start menu which would just be a batch file that adds Python to 
> the path for that cmd session. It would be kind of similar to the 
> "Python (command line)" shortcut, which uses pythonw.exe. I think we 
> can do better than this, though.
>
>


-- 
http://www.voidspace.org.uk/

May you do good and not evil
May you find forgiveness for yourself and forgive others
May you share freely, never taking more than you give.
-- the sqlite blessing http://www.sqlite.org/different.html


From raymond.hettinger at gmail.com  Fri Jan 28 20:29:02 2011
From: raymond.hettinger at gmail.com (Raymond Hettinger)
Date: Fri, 28 Jan 2011 11:29:02 -0800
Subject: [Python-Dev] Finally fix installer to add Python to %PATH% on
	Windows
In-Reply-To: <4D431724.4010002@voidspace.org.uk>
References: 
	
	<4D431724.4010002@voidspace.org.uk>
Message-ID: <7DA37C12-D3DA-49B3-996A-017CF304BC5C@gmail.com>


On Jan 28, 2011, at 11:21 AM, Michael Foord wrote:

> On 28/01/2011 16:29, Brian Curtin wrote:
>> 
>> Recently I've talked to two Python trainers/educators and the major gripe their attendees see is that you can't just sit down and type "python" and have something work. For multi-Python installs, we'll have to define what that "something" is, but I think it should be possible for the installer to optionally put Python into the path, and to also remove itself on uninstall.
>> 
> 
> I've helped quite a few "python newbies" on Windows who are also surprised / frustrated on learning that "python" on the command line doesn't work after installing python.

At the very least, we should add some prominent instructions for getting the command line version up and running.


Raymond


From fuzzyman at voidspace.org.uk  Fri Jan 28 20:34:12 2011
From: fuzzyman at voidspace.org.uk (Michael Foord)
Date: Fri, 28 Jan 2011 19:34:12 +0000
Subject: [Python-Dev] fcmp() in test.support
In-Reply-To: 
References: 
Message-ID: <4D431A34.9030401@voidspace.org.uk>

On 28/01/2011 04:55, Eli Bendersky wrote:
> I'm working on improving the .rst documentation of test.support (Issue
> 11015), and came upon the undocumented "fcmp" function that's being
> exported from test.support, along with a "FUZZ" constant.
>

This module shouldn't really be documented at all. It exists to support 
the Python test framework and we don't want to have to support users or 
make API stability guarantees. Plus most of it is rather old. Please 
don't document more stuff in this module.

> As I search through the tests (py3k trunk), I see fcmp() is being used
> only in two places in a fairly trivial way:
> 1. test_float: where it can be directly replaced by assertAlmostEqual
> from unittest
> 2. test_builtin: where the assertion can also be easily rewritten in
> terms of assertAlmostEqual
>
> Although fcmp seems to provide extra functionality over
> assertAlmostEqual, the above makes me think it should probably be
> removed altogether, or added to unittest if it's still deemed
> important.
>
Yes, get rid of it.

Michael Foord

> +/- ?
> Eli


-- 
http://www.voidspace.org.uk/

May you do good and not evil
May you find forgiveness for yourself and forgive others
May you share freely, never taking more than you give.
-- the sqlite blessing http://www.sqlite.org/different.html


From eliben at gmail.com  Fri Jan 28 20:59:23 2011
From: eliben at gmail.com (Eli Bendersky)
Date: Fri, 28 Jan 2011 21:59:23 +0200
Subject: [Python-Dev] fcmp() in test.support
In-Reply-To: <4D431A34.9030401@voidspace.org.uk>
References: 
	<4D431A34.9030401@voidspace.org.uk>
Message-ID: 

>> I'm working on improving the .rst documentation of test.support (Issue
>> 11015), and came upon the undocumented "fcmp" function that's being
>> exported from test.support, along with a "FUZZ" constant.
>>

The documentation of the test module clearly states right at the top:

""
Note

The test package is meant for internal use by Python only. It is
documented for the benefit of the core developers of Python. Any use
of this package outside of Python's standard library is discouraged as
code mentioned here can change or be removed without notice between
releases of Python.
""

Given that disclaimer, I don't think it's a bad idea to document more
parts of test.support. People adding new tests should be aware of some
of the tools that already exist there, and only some of which are
documented. Just my 2c here.

Maybe Nick will want to chip in here since he opened issue 11015.

Eli

From lists at cheimes.de  Fri Jan 28 21:34:05 2011
From: lists at cheimes.de (Christian Heimes)
Date: Fri, 28 Jan 2011 21:34:05 +0100
Subject: [Python-Dev] Finally fix installer to add Python to %PATH% on
	Windows
In-Reply-To: <7DA37C12-D3DA-49B3-996A-017CF304BC5C@gmail.com>
References: 		<4D431724.4010002@voidspace.org.uk>
	<7DA37C12-D3DA-49B3-996A-017CF304BC5C@gmail.com>
Message-ID: 

On 28.01.2011 20:29, Raymond Hettinger wrote:
> At the very least, we should add some prominent instructions for getting the command line version up and running.

/me pops out of Guido's time machine and says: "execute
Tools/scripts/win_add2path.py"

I'm -1 on adding Python to %PATH%. The private MSVCRT DLLs may lead to
unexpected side effects and it doesn't scale at all. What about people
with more than one Python installation? I suggest that we add a single
user specific directory or a global directory to %PATH% for all
installations. Then the Python installer or 3rd party modules can drop
executables like python3.3.exe or plip-3.3.exe into this directory. A
.bat file won't do good because .bat files must be called with "call
python33.bat" from another .bat file or the first one gets terminated.

We can even use a single and simple executable as template for all tasks:

 * get registry key from resource section of the executable
 * use the registry key to lookup the location and name of pythonXX.dll
 * load DLL
 * get optional dotted module name for resource section
 * either fire up interpreter as shell, with **argv or -m
dotted.module.name **argv

Done ;)

Christian


From brian.curtin at gmail.com  Fri Jan 28 21:46:38 2011
From: brian.curtin at gmail.com (Brian Curtin)
Date: Fri, 28 Jan 2011 14:46:38 -0600
Subject: [Python-Dev] Finally fix installer to add Python to %PATH% on
	Windows
In-Reply-To: 
References: 
	
	<4D431724.4010002@voidspace.org.uk>
	<7DA37C12-D3DA-49B3-996A-017CF304BC5C@gmail.com>
	
Message-ID: 

On Fri, Jan 28, 2011 at 14:34, Christian Heimes  wrote:

> On 28.01.2011 20:29, Raymond Hettinger wrote:
> > At the very least, we should add some prominent instructions for getting
> the command line version up and running.
>
> /me pops out of Guido's time machine and says: "execute
> Tools/scripts/win_add2path.py"
>
> I'm -1 on adding Python to %PATH%. The private MSVCRT DLLs may lead to
> unexpected side effects and it doesn't scale at all. What about people
> with more than one Python installation?


The same "problem" exists when it comes to file associations. The last
installer you've run wins the battle. Since setting file associations is
optional, and only one association can exist, I don't see why we can't do
the same for the PATH.
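
The concern raised earlier in the thread, that an uninstaller might clobber the whole PATH instead of removing one entry, comes down to editing a semicolon-separated string carefully. Below is a minimal sketch of that edit. The function names add_entry and remove_entry are hypothetical; this is not what the MSI installer or Tools/scripts/win_add2path.py does, and it works on a plain string rather than the registry.

```python
import ntpath  # Windows path rules, importable on every platform


def _norm(entry):
    """Normalise one PATH entry for comparison (case, slashes, quotes)."""
    return ntpath.normcase(ntpath.normpath(entry.strip().strip('"')))


def add_entry(path_value, directory):
    """Append directory to a PATH string unless it is already present."""
    entries = [e for e in path_value.split(";") if e]
    if _norm(directory) not in [_norm(e) for e in entries]:
        entries.append(directory)
    return ";".join(entries)


def remove_entry(path_value, directory):
    """Drop only the entries matching directory; keep everything else."""
    target = _norm(directory)
    kept = [e for e in path_value.split(";") if e and _norm(e) != target]
    return ";".join(kept)
```

With these helpers, adding C:\Python32 and later removing it returns the original value, and the other entries keep their order.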

From martin at v.loewis.de  Fri Jan 28 22:49:08 2011
From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=)
Date: Fri, 28 Jan 2011 22:49:08 +0100
Subject: [Python-Dev] PEP 393: Flexible String Representation
In-Reply-To: 
References: <4D3DDE5E.4080807@v.loewis.de>
		<821v3xl6aw.fsf@mid.bfk.de>
	
Message-ID: <4D4339D4.5060906@v.loewis.de>

> The nice thing about Py_UNICODE is that is basically gives you native
> Unicode code points directly, without needing to decode UTF-8 byte runs
> and the like. In Cython, it allows you to do things like this:
> 
>     def test_for_those_characters(unicode s):
>         for c in s:
>             # warning: randomly chosen Unicode escapes ahead
>             if c in u"\u0356\u1012\u3359\u4567":
>                 return True
>         else:
>             return False
> 
> The loop runs in plain C, using the somewhat obvious implementation with
> a loop over Py_UNICODE characters and a switch statement for the
> comparison. This would look a *lot* more ugly with UTF-8 encoded byte
> strings.

And indeed, when Cython is updated to 3.3, it shouldn't access the UTF-8
representation for such a loop. Instead, it should access the str
representation, and might compile this to code like

#define Cython_CharAt(data, kind, pos) ((kind) == LATIN1 ? \
             ((unsigned char*)(data))[pos] : (kind) == UCS2 ? \
             ((unsigned short*)(data))[pos] : \
             ((Py_UCS4*)(data))[pos])

     void *data = PyUnicode_Data(s);
     int kind = PyUnicode_Kind(s);
     /* code points to search for */
     Py_UCS4 tmp[] = {0x356, 0x1012, 0x3359, 0x4567};
     for(int pos=0; pos < PyUnicode_Size(s); pos++){
       Py_UCS4 c = Cython_CharAt(data, kind, pos);
       for (int k=0; k<4; k++)
         if (c == tmp[k])
              return 1;
     }
     return 0;
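
The same dispatch can be modelled in pure Python, as a rough sketch only. The names choose_width, pack, char_at and contains_any are hypothetical and are not part of the PEP or of any CPython API. The model stores each code point in 1, 2 or 4 bytes depending on the largest code point in the string, then reads characters back by position.

```python
import sys


def choose_width(text):
    """Bytes per code point: 1 (Latin-1), 2 (UCS-2) or 4 (UCS-4)."""
    top = max(map(ord, text), default=0)
    if top < 0x100:
        return 1
    if top < 0x10000:
        return 2
    return 4


def pack(text):
    """Return (width, buffer) with every code point stored in width bytes."""
    width = choose_width(text)
    data = b"".join(ord(ch).to_bytes(width, sys.byteorder) for ch in text)
    return width, data


def char_at(data, width, pos):
    """Read the code point at index pos, like the Cython_CharAt macro."""
    start = pos * width
    return int.from_bytes(data[start:start + width], sys.byteorder)


def contains_any(text, wanted=(0x356, 0x1012, 0x3359, 0x4567)):
    """Model of the loop above: is any wanted code point in text?"""
    width, data = pack(text)
    return any(char_at(data, width, pos) in wanted
               for pos in range(len(text)))
```

Indexing stays O(1) in every case because the width is fixed per string, which is the property the C loop relies on.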

> Regarding Cython specifically, the above will still be *possible* under
> the proposal, given that the memory layout of the strings will still
> represent the Unicode code points. It will just be trickier to implement
> in Cython's type system as there is no longer a (user visible) C type
> representation for those code units.

There is: Py_UCS4 remains available.

> It can be any of uchar, ushort16 or
> uint32, neither of which is necessarily a 'native' representation of a
> Unicode character in CPython.

There won't be a "native" representation anymore - that's the whole
point of the PEP.

> While I'm somewhat confident that I'll
> find a way to fix this in Cython, my point is just that this adds a
> certain level of complexity to C code using the new memory layout that
> simply wasn't there before.

Understood. However, I think it is easier than you think it is.

Regards,
Martin

From josiah.carlson at gmail.com  Sat Jan 29 01:54:08 2011
From: josiah.carlson at gmail.com (Josiah Carlson)
Date: Fri, 28 Jan 2011 16:54:08 -0800
Subject: [Python-Dev] PEP 393: Flexible String Representation
In-Reply-To: <4D3DDE5E.4080807@v.loewis.de>
References: <4D3DDE5E.4080807@v.loewis.de>
Message-ID: 

Pardon me for this drive-by posting, but this thread smells a lot like this
old thread (don't be afraid to read it all, there are some good points in
there; not directed at you Martin, but at all readers/posters in this
thread)...

http://mail.python.org/pipermail/python-3000/2006-September/003795.html

I'm not averse to faster and/or more memory efficient unicode representations (I
would be quite happy with them, actually). I do see the usefulness of having
non-utf-8 representations, and caching them is a good idea, though I wonder
if that is a "good for Python itself to cache", or "good for the application
to cache".

The evil side of me says that we should just provide an API available in
Python/C for "give me the representation of unicode string X using the
2byte/4byte code points", and have it just return the appropriate
array.array() value (useful for passing to other APIs, or for those who need
to do manual manipulation of code-points), or whatever structure is deemed
to be appropriate.

The less evil side of me says that going with what the PEP offers isn't a
bad idea, and might just be a good idea.

I'll defer my vote to Martin.

Regards,
 - Josiah

On Mon, Jan 24, 2011 at 12:17 PM, "Martin v. Löwis" wrote:

> I have been thinking about Unicode representation for some time now.
> This was triggered, on the one hand, by discussions with Glyph Lefkowitz
> (who complained that his server app consumes too much memory), and Carl
> Friedrich Bolz (who profiled Python applications to determine that
> Unicode strings are among the top consumers of memory in Python).
> On the other hand, this was triggered by the discussion on supporting
> surrogates in the library better.
>
> I'd like to propose PEP 393, which takes a different approach,
> addressing both problems simultaneously: by getting a flexible
> representation (one that can be either 1, 2, or 4 bytes), we can
> support the full range of Unicode on all systems, but still use
> only one byte per character for strings that are pure ASCII (which
> will be the majority of strings for the majority of users).
>
> You'll find the PEP at
>
> http://www.python.org/dev/peps/pep-0393/
>
> For convenience, I include it below.
>
> Regards,
> Martin
>
> PEP: 393
> Title: Flexible String Representation
> Version: $Revision: 88168 $
> Last-Modified: $Date: 2011-01-24 21:14:21 +0100 (Mo, 24. Jan 2011) $
> Author: Martin v. Löwis
> Status: Draft
> Type: Standards Track
> Content-Type: text/x-rst
> Created: 24-Jan-2010
> Python-Version: 3.3
> Post-History:
>
> Abstract
> ========
>
> The Unicode string type is changed to support multiple internal
> representations, depending on the character with the largest Unicode
> ordinal (1, 2, or 4 bytes). This will allow a space-efficient
> representation in common cases, but give access to full UCS-4 on all
> systems. For compatibility with existing APIs, several representations
> may exist in parallel; over time, this compatibility should be phased
> out.
>
> Rationale
> =========
>
> There are two classes of complaints about the current implementation
> of the unicode type: on systems only supporting UTF-16, users complain
> that non-BMP characters are not properly supported. On systems using
> UCS-4 internally (and also sometimes on systems using UCS-2), there is
> a complaint that Unicode strings take up too much memory - especially
> compared to Python 2.x, where the same code would often use ASCII
> strings (i.e. ASCII-encoded byte strings). With the proposed approach,
> ASCII-only Unicode strings will again use only one byte per character;
> while still allowing efficient indexing of strings containing non-BMP
> characters (as strings containing them will use 4 bytes per
> character).
>
> One problem with the approach is support for existing applications
> (e.g. extension modules). For compatibility, redundant representations
> may be computed. Applications are encouraged to phase out reliance on
> a specific internal representation if possible. As interaction with
> other libraries will often require some sort of internal
> representation, the specification chooses UTF-8 as the recommended way
> of exposing strings to C code.
>
> For many strings (e.g. ASCII), multiple representations may actually
> share memory (e.g. the shortest form may be shared with the UTF-8 form
> if all characters are ASCII). With such sharing, the overhead of
> compatibility representations is reduced.
>
> Specification
> =============
>
> The Unicode object structure is changed to this definition::
>
>  typedef struct {
>    PyObject_HEAD
>    Py_ssize_t length;
>    void *str;
>    Py_hash_t hash;
>    int state;
>    Py_ssize_t utf8_length;
>    void *utf8;
>    Py_ssize_t wstr_length;
>    void *wstr;
>  } PyUnicodeObject;
>
> These fields have the following interpretations:
>
> - length: number of code points in the string (result of sq_length)
> - str: shortest-form representation of the unicode string; the lower
>  two bits of the pointer indicate the specific form:
>  01 => 1 byte (Latin-1); 10 => 2 byte (UCS-2); 11 => 4 byte (UCS-4);
>  00 => null pointer
>
>  The string is null-terminated (in its respective representation).
> - hash, state: same as in Python 3.2
> - utf8_length, utf8: UTF-8 representation (null-terminated)
> - wstr_length, wstr: representation in platform's wchar_t
>  (null-terminated). If wchar_t is 16-bit, this form may use surrogate
>  pairs (in which case wstr_length differs from length).
>
> All three representations are optional, although the str form is
> considered the canonical representation which can be absent only
> while the string is being created.
>
> The Py_UNICODE type is still supported but deprecated. It is always
> defined as a typedef for wchar_t, so the wstr representation can double
> as Py_UNICODE representation.
>
> The str and utf8 pointers point to the same memory if the string uses
> only ASCII characters (using only Latin-1 is not sufficient). The str
> and wstr pointers point to the same memory if the string happens to
> fit exactly to the wchar_t type of the platform (i.e. uses some
> BMP-not-Latin-1 characters if sizeof(wchar_t) is 2, and uses some
> non-BMP characters if sizeof(wchar_t) is 4).
>
> If the string is created directly with the canonical representation
> (see below), this representation doesn't take a separate memory block,
> but is allocated right after the PyUnicodeObject struct.
>
> String Creation
> ---------------
>
> The recommended way to create a Unicode object is to use the function
> PyUnicode_New::
>
>   PyObject* PyUnicode_New(Py_ssize_t size, Py_UCS4 maxchar);
>
> Both parameters must denote the eventual size/range of the strings.
> In particular, codecs using this API must compute both the number of
> characters and the maximum character in advance. A string is
> allocated according to the specified size and character range and is
> null-terminated; the actual characters in it may be uninitialized.
>
> PyUnicode_FromString and PyUnicode_FromStringAndSize remain supported
> for processing UTF-8 input; the input is decoded, and the UTF-8
> representation is not yet set for the string.
>
> PyUnicode_FromUnicode remains supported but is deprecated. If the
> Py_UNICODE pointer is non-null, the str representation is set. If the
> pointer is NULL, a properly-sized wstr representation is allocated,
> which can be modified until PyUnicode_Finalize() is called (explicitly
> or implicitly). Resizing a Unicode string remains possible until it
> is finalized.
>
> PyUnicode_Finalize() converts a string containing only a wstr
> representation into the canonical representation. Unless wstr and str
> can share the memory, the wstr representation is discarded after the
> conversion.
>
> String Access
> -------------
>
> The canonical representation can be accessed using two macros
> PyUnicode_Kind and PyUnicode_Data. PyUnicode_Kind gives one of the
> values PyUnicode_1BYTE (1), PyUnicode_2BYTE (2), or PyUnicode_4BYTE
> (3). PyUnicode_Data gives the void pointer to the data, masking out
> the pointer kind. All these functions call PyUnicode_Finalize
> in case the canonical representation hasn't been computed yet.
>
> A new function PyUnicode_AsUTF8 is provided to access the UTF-8
> representation. It is thus identical to the existing
> _PyUnicode_AsString, which is removed. The function will compute the
> utf8 representation when first called. Since this representation will
> consume memory until the string object is released, applications
> should use the existing PyUnicode_AsUTF8String where possible
> (which generates a new string object every time). API that implicitly
> converts a string to a char* (such as the ParseTuple functions) will
> use this function to compute a conversion.
>
> PyUnicode_AsUnicode is deprecated; it computes the wstr representation
> on first use.
>
> String Operations
> -----------------
>
> Various convenience functions will be provided to deal with the
> canonical representation, in particular with respect to concatenation
> and slicing.
>
> Stable ABI
> ----------
>
> None of the functions in this PEP become part of the stable ABI.
>
> Copyright
> =========
>
> This document has been placed in the public domain.
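
The sharing rules in the quoted Specification (str and utf8 share memory only for pure ASCII; str and wstr share memory only when the string's width equals sizeof(wchar_t)) can be checked with a small model. This is a sketch under assumptions: kind_width, shares_utf8 and shares_wstr are hypothetical names, and the functions only model the rule on Python strings; they say nothing about how an implementation would lay out memory.

```python
def kind_width(text):
    """Bytes per code point of the shortest form: 1, 2 or 4."""
    top = max(map(ord, text), default=0)
    return 1 if top < 0x100 else 2 if top < 0x10000 else 4


def shares_utf8(text):
    """True if the 1-byte form and the UTF-8 form are byte-identical."""
    # Latin-1 characters above 0x7F take two bytes in UTF-8, so only
    # pure ASCII strings qualify.
    return all(ord(ch) < 0x80 for ch in text)


def shares_wstr(text, wchar_size):
    """True if the shortest form has exactly the platform wchar_t width."""
    return kind_width(text) == wchar_size
```

For example, "café" fits the 1-byte form but cannot share its buffer with UTF-8, because "é" encodes to two bytes there.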

From stefan_ml at behnel.de  Sat Jan 29 07:33:54 2011
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Sat, 29 Jan 2011 07:33:54 +0100
Subject: [Python-Dev] PEP 393: Flexible String Representation
In-Reply-To: <4D4339D4.5060906@v.loewis.de>
References: <4D3DDE5E.4080807@v.loewis.de>		<821v3xl6aw.fsf@mid.bfk.de>	
	<4D4339D4.5060906@v.loewis.de>
Message-ID: 

"Martin v. Löwis", 28.01.2011 22:49:
> And indeed, when Cython is updated to 3.3, it shouldn't access the UTF-8
> representation for such a loop. Instead, it should access the str
> representation

Sure.


>> Regarding Cython specifically, the above will still be *possible* under
>> the proposal, given that the memory layout of the strings will still
>> represent the Unicode code points. It will just be trickier to implement
>> in Cython's type system as there is no longer a (user visible) C type
>> representation for those code units.
>
> There is: Py_UCS4 remains available.

Thanks for that pointer. I had always thought that all "*UCS4*" names were 
platform specific and had completely missed that type. It's a lot nicer 
than Py_UNICODE because it allows users to fold surrogate pairs back into 
the character value.

It's completely missing from the docs, BTW. Google doesn't give me a single 
mention for all of docs.python.org, even though it existed at least since 
(and likely long before) Cython's oldest supported runtime Python 2.3.

If I had known about that type earlier, I could have ended up making that 
the native Unicode character type in Cython instead of bothering with 
Py_UNICODE. But this can still be changed I think. Since type inference was 
available before native Py_UNICODE support, it's unlikely that users will 
have Py_UNICODE written in their code explicitly. So I can make the switch 
under the hood.

Just to explain, a native CPython C type is much better than an arbitrary 
integer type, because it allows Cython to apply specific coercion rules 
from and to Python object types. Like Py_UNICODE today, Py_UCS4 would 
obviously coerce from and to a 1 character Unicode string, but it could 
additionally handle surrogate pair splitting and combining automatically on 
current 16-bit Unicode builds so that you'd get a Unicode string with two 
code points on coercion to Python.
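
The surrogate pair splitting and combining mentioned here follows the standard UTF-16 arithmetic. A minimal sketch, with hypothetical function names that are not Cython or CPython API:

```python
def split_surrogates(code_point):
    """Split a non-BMP code point into a (high, low) UTF-16 surrogate pair."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only non-BMP code points need a surrogate pair")
    offset = code_point - 0x10000
    # Top 10 bits go into the high surrogate, bottom 10 into the low one.
    return 0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF)


def combine_surrogates(high, low):
    """Fold a (high, low) surrogate pair back into a single code point."""
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a valid high/low surrogate pair")
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)
```

On a 16-bit build, coercing a Py_UCS4 value above 0xFFFF to a Python string would produce the two code units returned by split_surrogates, and the reverse coercion would apply combine_surrogates.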


>> While I'm somewhat confident that I'll
>> find a way to fix this in Cython, my point is just that this adds a
>> certain level of complexity to C code using the new memory layout that
>> simply wasn't there before.
>
> Understood. However, I think it is easier than you think it is.

Let's see about the implications once there is an implementation.

Stefan


From stefan_ml at behnel.de  Sat Jan 29 08:47:38 2011
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Sat, 29 Jan 2011 08:47:38 +0100
Subject: [Python-Dev] PEP 393: Flexible String Representation
In-Reply-To: <4D3DDE5E.4080807@v.loewis.de>
References: <4D3DDE5E.4080807@v.loewis.de>
Message-ID: 

"Martin v. Löwis", 24.01.2011 21:17:
> I have been thinking about Unicode representation for some time now.
> This was triggered, on the one hand, by discussions with Glyph Lefkowitz
> (who complained that his server app consumes too much memory), and Carl
> Friedrich Bolz (who profiled Python applications to determine that
> Unicode strings are among the top consumers of memory in Python).
> On the other hand, this was triggered by the discussion on supporting
> surrogates in the library better.
>
> I'd like to propose PEP 393, which takes a different approach,
> addressing both problems simultaneously: by getting a flexible
> representation (one that can be either 1, 2, or 4 bytes), we can
> support the full range of Unicode on all systems, but still use
> only one byte per character for strings that are pure ASCII (which
> will be the majority of strings for the majority of users).
>
> You'll find the PEP at
>
> http://www.python.org/dev/peps/pep-0393/
>
> [...]
> Stable ABI
> ----------
>
> None of the functions in this PEP become part of the stable ABI.

I think that's only part of the truth. This PEP can potentially have an 
impact on the stable ABI in the sense that the build-time size of 
Py_UNICODE may no longer be important for extensions that work on unicode 
buffers in the future as long as they only use the 'str' pointer and not 
'wstr'.

Stefan


From stefan_ml at behnel.de  Sat Jan 29 08:48:18 2011
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Sat, 29 Jan 2011 08:48:18 +0100
Subject: [Python-Dev] PEP 393: Flexible String Representation
In-Reply-To: <4D3DDE5E.4080807@v.loewis.de>
References: <4D3DDE5E.4080807@v.loewis.de>
Message-ID: 

"Martin v. Löwis", 24.01.2011 21:17:
> I have been thinking about Unicode representation for some time now.
> This was triggered, on the one hand, by discussions with Glyph Lefkowitz
> (who complained that his server app consumes too much memory), and Carl
> Friedrich Bolz (who profiled Python applications to determine that
> Unicode strings are among the top consumers of memory in Python).
> On the other hand, this was triggered by the discussion on supporting
> surrogates in the library better.
>
> I'd like to propose PEP 393, which takes a different approach,
> addressing both problems simultaneously: by getting a flexible
> representation (one that can be either 1, 2, or 4 bytes), we can
> support the full range of Unicode on all systems, but still use
> only one byte per character for strings that are pure ASCII (which
> will be the majority of strings for the majority of users).
>
> You'll find the PEP at
>
> http://www.python.org/dev/peps/pep-0393/

After much discussion, I'm +1 for this PEP. Implementation and benchmarks 
are pending, but there are strong indicators that it will bring relief for 
the memory overhead of most applications without leading to a major 
degradation performance-wise. Not for Python code anyway, and I'll try to 
make sure Cython extensions won't notice much when switching to CPython 3.3.

Martin, this is a smart way of doing it.

Stefan


From martin at v.loewis.de  Sat Jan 29 10:05:59 2011
From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=)
Date: Sat, 29 Jan 2011 10:05:59 +0100
Subject: [Python-Dev] PEP 393: Flexible String Representation
In-Reply-To: 
References: <4D3DDE5E.4080807@v.loewis.de> 
Message-ID: <4D43D877.6090701@v.loewis.de>

>> None of the functions in this PEP become part of the stable ABI.
> 
> I think that's only part of the truth. This PEP can potentially have an
> impact on the stable ABI in the sense that the build-time size of
> Py_UNICODE may no longer be important for extensions that work on
> unicode buffers in the future as long as they only use the 'str' pointer
> and not 'wstr'.

Py_UNICODE isn't part of the stable ABI, so it wasn't important for
extensions using the stable ABI before - so really no change here.

Regards,
Martin

From stefan_ml at behnel.de  Sat Jan 29 11:00:48 2011
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Sat, 29 Jan 2011 11:00:48 +0100
Subject: [Python-Dev] PEP 393: Flexible String Representation
In-Reply-To: <4D43D877.6090701@v.loewis.de>
References: <4D3DDE5E.4080807@v.loewis.de> 
	<4D43D877.6090701@v.loewis.de>
Message-ID: 

"Martin v. Löwis", 29.01.2011 10:05:
>>> None of the functions in this PEP become part of the stable ABI.
>>
>> I think that's only part of the truth. This PEP can potentially have an
>> impact on the stable ABI in the sense that the build-time size of
>> Py_UNICODE may no longer be important for extensions that work on
>> unicode buffers in the future as long as they only use the 'str' pointer
>> and not 'wstr'.
>
> Py_UNICODE isn't part of the stable ABI, so it wasn't important for
> extensions using the stable ABI before - so really no change here.

I know, that's not what I meant. But this PEP would enable a C API that 
provides direct access to the underlying buffer. Just as is currently 
provided for the Py_UNICODE array, but with a stable ABI because the buffer 
type won't change based on build time options.

OTOH, one could argue that this is already partly provided by the generic 
buffer API.

Stefan


From ncoghlan at gmail.com  Sat Jan 29 13:49:01 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 29 Jan 2011 22:49:01 +1000
Subject: [Python-Dev] fcmp() in test.support
In-Reply-To: <4D431A34.9030401@voidspace.org.uk>
References: 
	<4D431A34.9030401@voidspace.org.uk>
Message-ID: 

On Sat, Jan 29, 2011 at 5:34 AM, Michael Foord
 wrote:
> This module shouldn't really be documented at all. It exists to support the
> Python test framework and we don't want to have to support users or make API
> stability guarantees. Plus most of it is rather old. Please don't document
> more stuff in this module.

As Eli noted, we explicitly disclaim all stability guarantees when it
comes to this module.

Documenting it properly is intended to make it easier to write high
quality tests using the utilities we have developed over the years
without having to read the source code.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ncoghlan at gmail.com  Sat Jan 29 13:53:44 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 29 Jan 2011 22:53:44 +1000
Subject: [Python-Dev] PEP 393: Flexible String Representation
In-Reply-To: 
References: <4D3DDE5E.4080807@v.loewis.de> 
	<4D43D877.6090701@v.loewis.de> 
Message-ID: 

On Sat, Jan 29, 2011 at 8:00 PM, Stefan Behnel  wrote:
> OTOH, one could argue that this is already partly provided by the generic
> buffer API.

Which won't be part of the stable ABI until 3.3 - there are some
discrepancies between PEP 3118 and the actual implementation that we
need to sort out first.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From solipsis at pitrou.net  Sat Jan 29 14:21:14 2011
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Sat, 29 Jan 2011 14:21:14 +0100
Subject: [Python-Dev] PEP 393: Flexible String Representation
References: <4D3DDE5E.4080807@v.loewis.de> 
	<4D43D877.6090701@v.loewis.de> 
Message-ID: <20110129142114.0c6d04c1@pitrou.net>

On Sat, 29 Jan 2011 11:00:48 +0100
Stefan Behnel  wrote:
> 
> I know, that's not what I meant. But this PEP would enable a C API that 
> provides direct access to the underlying buffer. Just as is currently 
> provided for the Py_UNICODE array, but with a stable ABI because the buffer 
> type won't change based on build time options.
> 
> OTOH, one could argue that this is already partly provided by the generic 
> buffer API.

Unicode objects don't provide the buffer API (and chances are they never
will).
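
The statement above can be checked from Python itself. This is only a sketch, assuming a Python 3 interpreter: asking for a memoryview of a str fails, while the encoded bytes do expose a buffer.

```python
text = "abc"

# A str object does not export the buffer interface, so this raises TypeError.
try:
    memoryview(text)
except TypeError as exc:
    outcome = "str: no buffer (" + type(exc).__name__ + ")"
else:
    outcome = "str: buffer available"

# The encoded form is a bytes object, which does support the buffer API.
data = text.encode("utf-8")
view = memoryview(data)

print(outcome)
print(view.tobytes())
```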

Regards

Antoine.



From stefan_ml at behnel.de  Sat Jan 29 18:03:23 2011
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Sat, 29 Jan 2011 18:03:23 +0100
Subject: [Python-Dev] PEP 393: Flexible String Representation
In-Reply-To: <4D3DDE5E.4080807@v.loewis.de>
References: <4D3DDE5E.4080807@v.loewis.de>
Message-ID: 

"Martin v. Löwis", 24.01.2011 21:17:
> I'd like to propose PEP 393, which takes a different approach,
> addressing both problems simultaneously: by getting a flexible
> representation (one that can be either 1, 2, or 4 bytes), we can
> support the full range of Unicode on all systems, but still use
> only one byte per character for strings that are pure ASCII (which
> will be the majority of strings for the majority of users).
>
> You'll find the PEP at
>
> http://www.python.org/dev/peps/pep-0393/
>[...]
> The Py_UNICODE type is still supported but deprecated. It is always
> defined as a typedef for wchar_t, so the wstr representation can double
> as Py_UNICODE representation.

What about the character property functions?

http://docs.python.org/py3k/c-api/unicode.html#unicode-character-properties

Will they be adapted to accept Py_UCS4 instead of Py_UNICODE?

Stefan


From alexander.belopolsky at gmail.com  Sat Jan 29 18:12:19 2011
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Sat, 29 Jan 2011 12:12:19 -0500
Subject: [Python-Dev] PEP 393: Flexible String Representation
In-Reply-To: 
References: <4D3DDE5E.4080807@v.loewis.de>
	
Message-ID: 

On Sat, Jan 29, 2011 at 12:03 PM, Stefan Behnel  wrote:
..
> What about the character property functions?
>
> http://docs.python.org/py3k/c-api/unicode.html#unicode-character-properties
>
> Will they be adapted to accept Py_UCS4 instead of Py_UNICODE?

They have been already.  See revision 84177.  Docs should be fixed.

From victor.stinner at haypocalc.com  Sun Jan 30 09:56:18 2011
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Sun, 30 Jan 2011 09:56:18 +0100
Subject: [Python-Dev] Issue #11051: system calls per import
Message-ID: <1296377778.24415.4.camel@marge>

Hi,

Antoine Pitrou noticed that Python 3.2 tries a lot of filenames to load
a module:
http://bugs.python.org/issue11051

Python 3.1 does already test many filenames, but with Python 3.2, it is
even worse.

For each directory in sys.path, it tries 9 suffixes: '',
'.cpython-32m.so', 'module.cpython-32m.so', '.abi3.so',
'module.abi3.so', '.so', 'module.so', '.py', '.pyc'.

I don't understand why it tests so much .so suffixes. And why it does
test with and without "module".
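
To make the cost concrete, here is a rough sketch of the probing described above. The suffix list is copied from this message; candidate_paths is a hypothetical helper written for illustration and is not the interpreter's actual import code.

```python
import os

# Suffixes as listed in the message; '' is the probe for a package directory.
SUFFIXES_32 = [
    '', '.cpython-32m.so', 'module.cpython-32m.so', '.abi3.so',
    'module.abi3.so', '.so', 'module.so', '.py', '.pyc',
]


def candidate_paths(name, search_path):
    """Yield, in order, every filename that could be probed for `name`."""
    for directory in search_path:
        for suffix in SUFFIXES_32:
            yield os.path.join(directory, name + suffix)


# Worst case: the module is absent from all three (hypothetical) directories.
paths = list(candidate_paths("spam", ["/a", "/b", "/c"]))
print(len(paths), "probes for one failed import")
```

With three directories on the path, a module that is not found costs 27 probes in this model, which is the growth the message is concerned about.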

Victor



From martin at v.loewis.de  Sun Jan 30 10:40:52 2011
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 30 Jan 2011 10:40:52 +0100
Subject: [Python-Dev] Issue #11051: system calls per import
In-Reply-To: <1296377778.24415.4.camel@marge>
References: <1296377778.24415.4.camel@marge>
Message-ID: <4D453224.7020809@v.loewis.de>

> Python 3.1 does already test many filenames, but with Python 3.2, it is
> even worse.
> 
> For each directory in sys.path, it tries 9 suffixes: '',
> '.cpython-32m.so', 'module.cpython-32m.so', '.abi3.so',
> 'module.abi3.so', '.so', 'module.so', '.py', '.pyc'.
> 
> I don't understand why it tests so much .so suffixes. And why it does
> test with and without "module".

The many extensions have been specified in PEP 3149. The PEP also specifies

# This "tag" will appear between the module base name and the operation
# file system extension for shared libraries.

which apparently meant that the existing mechanism is extended to add
the tag.

The support for both the "short extension" (i.e. ".so") and "long
extension" (i.e. "module.so") goes back to r4297 (Python 1.1),
when the short extension was added as an alternative to the long
extension. The original module suffix was defined in r3518 when
dynamic extension modules got supported, as either "module.so"
(SUN_SHLIB) or "module.o" (dl_loadmod, apparently Irix).

Regards,
Martin

From georg at python.org  Sun Jan 30 10:25:05 2011
From: georg at python.org (Georg Brandl)
Date: Sun, 30 Jan 2011 10:25:05 +0100
Subject: [Python-Dev] Issue #11051: system calls per import
In-Reply-To: <1296377778.24415.4.camel@marge>
References: <1296377778.24415.4.camel@marge>
Message-ID: <4D452E71.6070401@python.org>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 30.01.2011 09:56, Victor Stinner wrote:
> Hi,
> 
> Antoine Pitrou noticed that Python 3.2 tries a lot of filenames to load
> a module:
> http://bugs.python.org/issue11051
> 
> Python 3.1 does already test many filenames, but with Python 3.2, it is
> even worse.
> 
> For each directory in sys.path, it tries 9 suffixes: '',
> '.cpython-32m.so', 'module.cpython-32m.so', '.abi3.so',
> 'module.abi3.so', '.so', 'module.so', '.py', '.pyc'.

'' is not really a suffix, but a test for a package directory.

> I don't understand why it tests so much .so suffixes.

Because of PEP 3149 and PEP 384.

> And why it does test with and without "module".

Because it always did (there's a thing called backwards compatibility.)

This is of course probably the obvious one to start a deprecation process.

Georg
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.17 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAk1FLnEACgkQN9GcIYhpnLApaACdGDe9qVlZNVHRF92yTqYnYFIp
hjIAn34YqvMy8fy7pcz0qAlS/WhRWR4G
=1b9C
-----END PGP SIGNATURE-----

From ncoghlan at gmail.com  Sun Jan 30 13:52:25 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 30 Jan 2011 22:52:25 +1000
Subject: [Python-Dev] Issue #11051: system calls per import
In-Reply-To: <4D452E71.6070401@python.org>
References: <1296377778.24415.4.camel@marge>
	<4D452E71.6070401@python.org>
Message-ID: 

On Sun, Jan 30, 2011 at 7:25 PM, Georg Brandl  wrote:
>> And why it does test with and without "module".
>
> Because it always did (there's a thing called backwards compatibility.)
>
> This is of course probably the obvious one to start a deprecation process.

But why do we check the long suffix for the *new* extension module
naming variants from PEP 3149 and PEP 384? Those are completely new,
so there's no backwards compatibility argument there.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From victor.stinner at haypocalc.com  Sun Jan 30 17:35:45 2011
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Sun, 30 Jan 2011 17:35:45 +0100
Subject: [Python-Dev] Issue #11051: system calls per import
In-Reply-To: 
References: <1296377778.24415.4.camel@marge> <4D452E71.6070401@python.org>
	
Message-ID: <1296405345.24507.9.camel@marge>

On Sunday, 30 January 2011 at 22:52 +1000, Nick Coghlan wrote:
> On Sun, Jan 30, 2011 at 7:25 PM, Georg Brandl  wrote:
> >> And why it does test with and without "module".
> >
> > Because it always did (there's a thing called backwards compatibility.)
> >
> > This is of course probably the obvious one to start a deprecation process.
> 
> But why do we check the long suffix for the *new* extension module
> naming variants from PEP 3149 and PEP 384? Those are completely new,
> so there's no backwards compatibility argument there.

My implicit question was: can we limit the number of tested suffixes? I
see two candidates: remove 'module.cpython-32m.so' ('.cpython-32m.so'
should be enough) and 'module.abi3.so' ('.abi3.so' should be enough).

And the real question is: should we change that before 3.2 final? If we
don't change that in 3.2, it will be harder to change it later (but it
is still possible).

Limiting the number of suffixes is maybe not the right solution to limit
the number of system calls at startup. We can imagine alternatives:

 * remember the last filename when loading a module and retry this
filename first
 * specify that a module is a Python system module and should only be
loaded from "system directories"
 * specify the module type (directory, .py file, dynamic library, ...)
when loading a module
 * (or at least remember the module type and retry this type first)
 * etc.

We should find a compromise between speed (limit the number of system
calls) and the usability of Python modules.

Victor


From georg at python.org  Sun Jan 30 17:50:32 2011
From: georg at python.org (Georg Brandl)
Date: Sun, 30 Jan 2011 17:50:32 +0100
Subject: [Python-Dev] Issue #11051: system calls per import
In-Reply-To: <1296405345.24507.9.camel@marge>
References: <1296377778.24415.4.camel@marge>
	<4D452E71.6070401@python.org>	
	<1296405345.24507.9.camel@marge>
Message-ID: <4D4596D8.8040908@python.org>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 30.01.2011 17:35, Victor Stinner wrote:
> On Sunday, 30 January 2011 at 22:52 +1000, Nick Coghlan wrote:
>> On Sun, Jan 30, 2011 at 7:25 PM, Georg Brandl  wrote:
>>>> And why it does test with and without "module".
>>>
>>> Because it always did (there's a thing called backwards compatibility.)
>>>
>>> This is of course probably the obvious one to start a deprecation process.
>>
>> But why do we check the long suffix for the *new* extension module
>> naming variants from PEP 3149 and PEP 384? Those are completely new,
>> so there's no backwards compatibility argument there.
> 
> My implicit question was: can we limit the number of tested suffixes? I
> see two candidates: remove 'module.cpython-32m.so' ('.cpython-32m.so'
> should be enough) and 'module.abi3.so' ('.abi3.so' should be enough).
> 
> And the real question is: should we change that before 3.2 final?

We most definitely shouldn't.

Georg
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.17 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAk1FltgACgkQN9GcIYhpnLDquwCfZH+jtM6nsXz4Iyi2XrhpDKBH
+6IAnA4Be/CWQhiQ9hq1VqGH2ent7say
=e1d5
-----END PGP SIGNATURE-----

From alexander.belopolsky at gmail.com  Sun Jan 30 17:54:51 2011
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Sun, 30 Jan 2011 11:54:51 -0500
Subject: [Python-Dev] Issue #11051: system calls per import
In-Reply-To: <1296405345.24507.9.camel@marge>
References: <1296377778.24415.4.camel@marge> <4D452E71.6070401@python.org>
	
	<1296405345.24507.9.camel@marge>
Message-ID: 

On Sun, Jan 30, 2011 at 11:35 AM, Victor Stinner
 wrote:
..
> We should find a compromise between speed (limit the number of system
> calls) and the usability of Python modules.

Do you have measurements that show python spending significant time on
failing open calls?

From p.f.moore at gmail.com  Sun Jan 30 20:37:40 2011
From: p.f.moore at gmail.com (Paul Moore)
Date: Sun, 30 Jan 2011 19:37:40 +0000
Subject: [Python-Dev] Stable buildbots
In-Reply-To: 
References: <20101113133712.60e9be27@pitrou.net>
	
	
	
	<4CEB7E12.1070201@snakebite.org>
	
Message-ID: 

On 23 November 2010 23:18, David Bolen  wrote:
> Trent Nelson  writes:
>
>> That's interesting.  (That kill_python.exe doesn't kill the wedged
>> processes, but pskill does.)  kill_python is pretty simple, it just
>> calls TerminateProcess() after acquiring a handle with the relevant
>> PROCESS_TERMINATE access right.  (...)
>>
>> Are you calling pskill with the -t flag? i.e. kill process and all
>> dependents?  That might be the ticket, especially if killing the child
>> process that wedged select() is waiting on causes it to return, and
>> thus, makes it killable.
>
> Nope, just "pskill python_d".  Haven't bothered to check the pskill
> source but I'm assuming it's just a basic TerminateProcess. Ideally my
> quickest workaround would just be to replace the kill_python in the
> buildbot tools script with that command but of course they could get
> updated on checkouts and I'm not arguing it's generally appropriate enough
> to belong in the source.

After a long, long time (:-(), I'm finally getting a chance to look at
this. I've patched buildbot as mentioned earlier in the thread, but I
don't see where I should put the pskill command to make it work. At
the moment, I have scheduled tasks to pskill python_d and
vsjitdebugger. The python_d one runs daily and the debugger one
hourly. (I daren't kill python_d too often, or I'll start killing
in-progress tests, I assume). The vsjitdebugger one is there because I
think it solves the CRT popup issue (I'll add the autoit script as
well, but as I'm running as a service, I'm not sure the popup will
always be visible for the autoit script to pick up...)

Presumably, you're inserting a pskill command somewhere into the
actual build process. I don't know much about buildbot, but I thought
that was controlled by the master and/or the Python build scripts,
neither of which I can change.

If I want to add a pskill command just after a build/test has run
(i.e., about where kill_python runs at the moment) how do I do that?

Thanks,
Paul.

From martin at v.loewis.de  Sun Jan 30 20:43:57 2011
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 30 Jan 2011 20:43:57 +0100
Subject: [Python-Dev] Issue #11051: system calls per import
In-Reply-To: 
References: <1296377778.24415.4.camel@marge>
	<4D452E71.6070401@python.org>		<1296405345.24507.9.camel@marge>
	
Message-ID: <4D45BF7D.70405@v.loewis.de>

On 30.01.2011 17:54, Alexander Belopolsky wrote:
> On Sun, Jan 30, 2011 at 11:35 AM, Victor Stinner
>  wrote:
> ..
>> We should find a compromise between speed (limit the number of system
>> calls) and the usability of Python modules.
> 
> Do you have measurements that show python spending significant time on
> failing open calls?

No; past measurements always showed that this is insignificant, probably
thanks to operating system caching the relevant directory blocks (so
it doesn't really matter whether you make one or ten lookups per
directory; my guess is that it matters more if you look into ten
directories instead of one).
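
For anyone who wants to repeat such a measurement, the following is one possible way to time failing lookups in a single directory. It is a sketch under stated assumptions: failed_lookups is a hypothetical helper, os.stat stands in for whatever calls the importer really makes, and the numbers depend heavily on the filesystem and on caching.

```python
import os
import tempfile
import timeit

# Suffix list taken from the start of this thread.
SUFFIXES = ['', '.cpython-32m.so', 'module.cpython-32m.so', '.abi3.so',
            'module.abi3.so', '.so', 'module.so', '.py', '.pyc']


def failed_lookups(directory, names):
    """Stat each name in `directory` and count how many do not exist."""
    misses = 0
    for name in names:
        try:
            os.stat(os.path.join(directory, name))
        except OSError:
            misses += 1
    return misses


with tempfile.TemporaryDirectory() as tmp:
    names = ["spam" + suffix for suffix in SUFFIXES]
    misses = failed_lookups(tmp, names)          # empty directory: all miss
    passes = 1000
    seconds = timeit.timeit(lambda: failed_lookups(tmp, names), number=passes)

print(misses, "misses per pass,", seconds / passes, "seconds per pass")
```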

Regards,
Martin

From db3l.net at gmail.com  Sun Jan 30 21:50:36 2011
From: db3l.net at gmail.com (David Bolen)
Date: Sun, 30 Jan 2011 15:50:36 -0500
Subject: [Python-Dev] Stable buildbots
References: <20101113133712.60e9be27@pitrou.net>
	
	
	
	<4CEB7E12.1070201@snakebite.org>
	
	
Message-ID: 

Paul Moore  writes:

> Presumably, you're inserting a pskill command somewhere into the
> actual build process. I don't know much about buildbot, but I thought
> that was controlled by the master and/or the Python build scripts,
> neither of which I can change.
>
> If I want to add a pskill command just after a build/test has run
> (i.e., about where kill_python runs at the moment) how do I do that?

I haven't been able to - as you say there's no good way to hook into
the build process in real time as the changes have to be external or
they'll get zapped on the next checkout.  I suppose you could rapidly
try to monitor the output of the build slave log file, but then you
risk killing a process from a next step if you miss something or are
too slow.  And I've had cases (after long periods of continuous
runtime) where the build slave log stops being generated even while
the slave is running fine.

Anyway, in the absence of changes to the build tree, I finally gave up
and now run an external script (see below) that whacks any python_d
process it finds running for more than 2 hours (arbitrary choice).  I
considered trying to dig deeper to identify processes with no logical
test parent (more similar to the build kill_python itself), but
decided it was too much effort for the minimal extra gain.  So not
terribly different from your once a day pskill, though as you say if
you arbitrarily kill all python_d processes at any given point in
time, you risk interrupting an active test.

So the AutoIt script covers pop-ups and the script below cleans up
hung processes.  On the subject of pop-ups, I'm not sure but if you
find your service not showing them try enabling the "Allow service to
interact with the desktop" option in the service definition.  In my
experience though if a service can't perform a UI interaction, the
interaction just fails, so I wouldn't expect the process to get stuck
in that case.

Anyway, in my case the kill script itself is Cygwin/bash based, but
using the pstools tools, and conceptually just kills (pskill) any
python_d process identified as having been running for 2 or more hours
of wall time (via pslist):

          - - - - - - - - - - - - - - - - - - - - - - - - -
#!/bin/sh
#
# kill_python.sh
#
# Quick 'n dirty script to watch for python_d processes that exceed a few
# hours of runtime, then kill them, assuming they're hung
#

PROC="python_d"
TIMEOUT="2"

while [ 1 ]; do

    echo "`date` Checking..."

    PIDS=`pslist 2>&1 | grep "^$PROC" | awk -v TIMEOUT=$TIMEOUT '{split($NF,fields,":"); if (int(fields[1]) >= int(TIMEOUT)) {print $2}}'`

    if [ "$PIDS" ]; then
	echo ===== `date`
	for pid in $PIDS; do
	    pslist $pid 2>&1 | grep "^$PROC"
	    pskill $pid
	done
	echo =====
    fi

    sleep 300
done
          - - - - - - - - - - - - - - - - - - - - - - - - -
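
The filtering step of the script above (the grep and awk pipeline) can also be expressed in Python, which may be easier to adapt. This is only a sketch: stale_pids is a hypothetical helper, and the sample text is an invented stand-in for pslist output that merely follows what the awk expression relies on, namely the PID in the second column and an elapsed time of the form H:MM:SS in the last column.

```python
def stale_pids(pslist_output, proc="python_d", timeout_hours=2):
    """Return PIDs of `proc` whose elapsed hours reach `timeout_hours`."""
    pids = []
    for line in pslist_output.splitlines():
        if not line.startswith(proc):          # same role as grep "^$PROC"
            continue
        fields = line.split()
        hours = int(fields[-1].split(":")[0])  # leading part of last column
        if hours >= timeout_hours:
            pids.append(fields[1])             # second column holds the PID
    return pids


# Invented sample lines, not real pslist output.
SAMPLE = """\
python_d  1234  8  1  100  20000  0:00:01.000  2:15:30.000
python_d  5678  8  1  100  20000  0:00:01.000  0:45:10.000
python    9999  8  1  100  20000  0:00:01.000  5:00:00.000
"""

print(stale_pids(SAMPLE))
```

Only the first sample line qualifies: the second has run for less than two hours and the third belongs to a different process name.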

It's a kludge, but as you say, for us to impose this on the build
slave side requires it to be outside of the build tree.  I've been
running it for about a month now and it seems to be doing the job.  I
run a similar script on OSX (my Tiger slave also sometimes sees stuck
processes, though they just burn CPU rather than interfere with
tests), but there I can identify stranded python_d processes if they
are owned by init, so the script can react more quickly.

I'm pretty sure the best long term fix is to move the kill processing
into the clean script (as per issue 9973) rather than where it
currently is in the build script, but so far I don't think the idea
has been able to attract the interest of anyone who can actually
commit such a change.  (See also the Dec continuation of this thread -
http://www.mail-archive.com/python-dev@python.org/msg54389.html)

I had also created issue 10641 from when I thought I found a problem
with kill_python, but that turned out incorrect, and in subsequent
tests kill_python in the build tree always worked, so the core issue
seems to always be the failure to run it at all as opposed to it not
working.

For now though, these two external "monitors" seem to have helped
contain the number of manual operations I have to do on my two Windows
slaves.  (Though recently I've begun seeing two new sorts of pop-ups
under Windows 7 but both related to memory, so I think I just need to
give my VM a little more memory)

-- David


From solipsis at pitrou.net  Sun Jan 30 22:17:24 2011
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Sun, 30 Jan 2011 22:17:24 +0100
Subject: [Python-Dev] Stable buildbots
References: <20101113133712.60e9be27@pitrou.net>
	
	
	
	<4CEB7E12.1070201@snakebite.org>
	
	
	
Message-ID: <20110130221724.40e9cb4d@pitrou.net>


Hello,

> I'm pretty sure the best long term fix is to move the kill processing
> into the clean script (as per issue 9973) rather than where it
> currently is in the build script, but so far I don't think the idea
> has been able to attract the interest of anyone who can actually
> commit such a change.

Thanks for bringing this to my attention. I've added a comment on that
issue. If you say this should improve things, there's probably no
reason not to commit such a patch.

Regards

Antoine.



From p.f.moore at gmail.com  Sun Jan 30 22:46:25 2011
From: p.f.moore at gmail.com (Paul Moore)
Date: Sun, 30 Jan 2011 21:46:25 +0000
Subject: [Python-Dev] Stable buildbots
In-Reply-To: 
References: <20101113133712.60e9be27@pitrou.net>
	
	
	
	<4CEB7E12.1070201@snakebite.org>
	
	
	
Message-ID: 

On 30 January 2011 20:50, David Bolen  wrote:
> I haven't been able to - as you say there's no good way to hook into
> the build process in real time as the changes have to be external or
> they'll get zapped on the next checkout.  I suppose you could rapidly
> try to monitor the output of the build slave log file, but then you
> risk killing a process from a next step if you miss something or are
> too slow.  And I've had cases (after long periods of continuous
> runtime) where the build slave log stops being generated even while
> the slave is running fine.

OK, sounds like I hadn't missed anything, then, which is good in some sense :-)

> For now though, these two external "monitors" seem to have helped
> contain the number of manual operations I have to do on my two Windows
> slaves.  (Though recently I've begun seeing two new sorts of pop-ups
> under Windows 7 but both related to memory, so I think I just need to
> give my VM a little more memory)

Yes, my (somewhat more simplistic) kill scripts had done some good as well.

Having said that, http://bugs.python.org/issue9931 is currently
stopping my buildslave (at least if I run it as a service), so it's a
bit of a moot point at the moment...

(One thing that might be good is if there were a means in the
buildslave architecture to deliberately disable a test temporarily, if
it's known to fail - I know ignoring errors isn't a good thing in
general, but OTOH, having a slave effectively dead for months because
of a known issue isn't a lot of help, either :-()

Thanks for the reply.

Paul.

From greg.ewing at canterbury.ac.nz  Sun Jan 30 22:23:45 2011
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Mon, 31 Jan 2011 10:23:45 +1300
Subject: [Python-Dev] Issue #11051: system calls per import
In-Reply-To: <1296405345.24507.9.camel@marge>
References: <1296377778.24415.4.camel@marge> <4D452E71.6070401@python.org>
	
	<1296405345.24507.9.camel@marge>
Message-ID: <4D45D6E1.6030906@canterbury.ac.nz>

Victor Stinner wrote:

> Limiting the number of suffixes is maybe not the right solution to limit
> the number of system calls at startup. We can imagine alternatives:
> 
>  * remember the last filename when loading a module and retry this
> filename first
>  * specify that a module is a Python system module and should only be
> loaded from "system directories"
>  * specify the module type (directory, .py file, dynamic library, ...)
> when loading a module
>  * (or at least remember the module type and retry this type first)
>  * etc.

Maybe also

    * Read and cache the directory contents and search it ourselves
      instead of making a system call for every possible name.
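
A minimal sketch of what such a cache could look like, assuming one os.listdir() call per directory and in-memory set lookups afterwards. DirectoryCache and its attributes are hypothetical names invented for this illustration; nothing like it is implied to exist in the interpreter.

```python
import os
import tempfile


class DirectoryCache:
    """List each directory once, then answer name lookups from memory."""

    def __init__(self):
        self._entries = {}
        self.listings = 0      # how many directories were actually read

    def contains(self, directory, filename):
        if directory not in self._entries:
            try:
                names = frozenset(os.listdir(directory))
            except OSError:    # missing or unreadable directory
                names = frozenset()
            self._entries[directory] = names
            self.listings += 1
        return filename in self._entries[directory]


with tempfile.TemporaryDirectory() as tmp:
    open(os.path.join(tmp, "spam.py"), "w").close()
    cache = DirectoryCache()
    probes = ("spam", "spam.so", "spammodule.so", "spam.py", "spam.pyc")
    found = [name for name in probes if cache.contains(tmp, name)]

print(found, cache.listings)
```

The trade-off of this design is staleness: a file created after the directory was listed is not seen until the cache entry is discarded, and the whole directory is read even when only one name is needed.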

-- 
Greg

From raymond.hettinger at gmail.com  Mon Jan 31 05:26:31 2011
From: raymond.hettinger at gmail.com (Raymond Hettinger)
Date: Sun, 30 Jan 2011 20:26:31 -0800
Subject: [Python-Dev] [Python-checkins] r88273 -
	python/branches/py3k/Doc/whatsnew/3.2.rst
In-Reply-To: <20110131042140.E8C47EE991@mail.python.org>
References: <20110131042140.E8C47EE991@mail.python.org>
Message-ID: 


On Jan 30, 2011, at 8:21 PM, eli.bendersky wrote:

Please use the open tracker item and do not edit the document directly.


Raymond



From martin at v.loewis.de  Mon Jan 31 08:33:13 2011
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 31 Jan 2011 08:33:13 +0100
Subject: [Python-Dev] Issue #11051: system calls per import
In-Reply-To: <4D45D6E1.6030906@canterbury.ac.nz>
References: <1296377778.24415.4.camel@marge>
	<4D452E71.6070401@python.org>		<1296405345.24507.9.camel@marge>
	<4D45D6E1.6030906@canterbury.ac.nz>
Message-ID: <4D4665B9.9000108@v.loewis.de>

> Maybe also
> 
>    * Read and cache the directory contents and search it ourselves
>      instead of making a system call for every possible name.

I wouldn't do that - I would expect that this is actually slower than
making the system calls, because the system might get away with not
reading the entire directory (whereas it will have to when we explicitly
ask for that).

Regards,
Martin

From guido at python.org  Mon Jan 31 09:08:25 2011
From: guido at python.org (Guido van Rossum)
Date: Mon, 31 Jan 2011 00:08:25 -0800
Subject: [Python-Dev] Issue #11051: system calls per import
In-Reply-To: <4D4665B9.9000108@v.loewis.de>
References: <1296377778.24415.4.camel@marge> <4D452E71.6070401@python.org>
	
	<1296405345.24507.9.camel@marge> <4D45D6E1.6030906@canterbury.ac.nz>
	<4D4665B9.9000108@v.loewis.de>
Message-ID: 

On Sun, Jan 30, 2011 at 11:33 PM, "Martin v. Löwis"  wrote:
>> Maybe also
>>
>>    * Read and cache the directory contents and search it ourselves
>>      instead of making a system call for every possible name.
>
> I wouldn't do that - I would expect that this is actually slower than
> making the system calls, because the system might get away with not
> reading the entire directory (whereas it will have to when we explicitly
> ask for that).

Hm. Long (very long) ago I had to implement just that, and it was much
faster. But this was over NFS. Still, I think the directory would have
to be truly enormous before reading its contents (which doesn't access
all the inodes) is slower than statting a few dozen of its entries. At
least on most *nix filesystems.

Another thing to consider: on App Engine (which despite all its
architectural weirdness uses a -- mostly -- standard Linux filesystem
for the Python code of the app) someone measured that importing from a
zipfile is much faster than importing from the filesystem. I would
imagine this extends to other contexts too, and it makes sense because
the zipfile directory gets cached in memory so no stat() calls are
necessary.
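
For reference, importing from a zipfile needs nothing more than the archive on sys.path. The sketch below only shows the mechanism and makes no claim about speed; the archive name bundle.zip and the module name zipped_demo are invented for the example.

```python
import importlib
import os
import sys
import tempfile
import zipfile

with tempfile.TemporaryDirectory() as tmp:
    archive = os.path.join(tmp, "bundle.zip")

    # Build a tiny archive that contains a single module.
    with zipfile.ZipFile(archive, "w") as zf:
        zf.writestr("zipped_demo.py", "ANSWER = 42\n")

    # Putting the archive itself on sys.path lets the zip importer serve it.
    sys.path.insert(0, archive)
    try:
        module = importlib.import_module("zipped_demo")
    finally:
        sys.path.remove(archive)

    origin = module.__file__
    answer = module.ANSWER

print(answer, origin)
```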

(Basically I am biased to believe that stat() is a pretty slow system
call -- this may just be old NFS lore though.)

-- 
--Guido van Rossum (python.org/~guido)

From j.bos-interpay at xs4all.nl  Mon Jan 31 10:17:51 2011
From: j.bos-interpay at xs4all.nl (Jurjen N.E. Bos)
Date: Mon, 31 Jan 2011 10:17:51 +0100
Subject: [Python-Dev] Byte code arguments from two to one byte: did anyone
	try this?
Message-ID: 

I tried to find any research on this subject, but I couldn't find any,
so I'll be daring and vulnerable and just try it out to see what your  
thoughts
are.
I single stepped a simple loop in Python to see where the efficiency  
bottlenecks are.
I was impressed by the optimizations already in there, but I still  
dare to suggest an optimization that from my estimates might shave  
off a few cycles, speeding up Python about 5%.
The idea is simple: change the byte code argument values from two  
bytes to one.
Implications are:
- code changes are relatively simple, see below
- fewer memory reads, which are becoming more and more expensive
- saves three instructions for every opcode with args (i.e. most of  
them)


Code changes are, as far as I could find:
compile.c:
assemble_emit must produce extended opcodes
     for all cases of more than 8 bits instead of 16

ceval.c:
NEXTARG and PEEKARG need adjustment
EXTENDED_ARG needs adjustment
     (this will be a four byte instruction, which is ugly, I agree)

peephole.c:
GETARG, SETARG, need adjustment
also GETJUMPTGT, CODESIZE
routine tuple_of_constants, fold_binops_on_constants, PyCode_Optimize
     are dependent on instruction length, which will be 2 instead of 3
(search for the digit 3 will find all cases, as far as I checked)
you probably will have to write a macro for codestr[i+3]
there is a check for code length >32700, but I think this one might  
stay,
maybe if a few extra checks are added.

dis:
minor adjustments


Estimation of speed impact:
about 80% of the instructions seem to have an argument, and I never  
saw an argument >255 while looking at bytecode, so they are probably  
not frequent.

The NEXTARG macro expands on my Macbook to:

mov    -408(%ebp),%edx        (next_instr)
movzbl 2(%edx),%eax           (*second byte)
shl    $0x8,%eax              (*shift)
movzbl 1(%edx),%edx           (first byte)
add    %edx,%eax              (*combine)

and the starred instructions will vanish.
The main loop is approximately 40 instructions, so a saving of three  
instructions is significant. I don't dare to claim 3/40 = 7.5% savings,
but I think 5% may be realistic.
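
The difference between the two layouts, and the arithmetic behind the estimate, can be sketched in a few lines of Python. This is an illustration only: the function names are made up, the opcode value is a placeholder, and the 7.5% figure is simply the ratio of instruction counts quoted above, not a measurement.

```python
def next_arg_two_bytes(code, i):
    """Current layout: opcode at i, low byte at i+1, high byte at i+2."""
    return code[i + 1] | (code[i + 2] << 8)


def next_arg_one_byte(code, i):
    """Proposed layout: opcode at i, a single argument byte at i+1."""
    return code[i + 1]


OPCODE = 124  # placeholder opcode number, used only to fill the first byte

current = bytes([OPCODE, 3, 0])    # argument 3 stored in two bytes
proposed = bytes([OPCODE, 3])      # the same argument stored in one byte

print(next_arg_two_bytes(current, 0), next_arg_one_byte(proposed, 0))

# Upper bound on the saving: 3 instructions removed out of roughly 40.
upper_bound = 3 / 40
print("{:.1%}".format(upper_bound))
```

The one-byte form drops the load, shift and add for the high byte, which corresponds to the three starred instructions in the listing above; arguments larger than 255 would need the extended form.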

Did anyone try this already? If not, I might take up the gauntlet
and try it myself, but I never did this before...


- Jurjen

PS I also saw that some scratch variables, mainly v and x, are  
carefully stored back in memory by the compiler at the end of the big  
interpreter loop, while their value isn't used anymore, of course.
A few carefully placed braces might tell the compiler how useless  
this is and
save another few percent.


From stefan_ml at behnel.de  Mon Jan 31 11:10:13 2011
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Mon, 31 Jan 2011 11:10:13 +0100
Subject: [Python-Dev] Byte code arguments from two to one byte: did
	anyone try this?
In-Reply-To: 
References: 
Message-ID: 

Jurjen N.E. Bos, 31.01.2011 10:17:
> I single stepped a simple loop in Python to see where the efficiency
> bottlenecks are.

What version of CPython did you try that with? The latest py3k branch?

Stefan


From georg at python.org  Mon Jan 31 11:32:02 2011
From: georg at python.org (Georg Brandl)
Date: Mon, 31 Jan 2011 11:32:02 +0100
Subject: [Python-Dev] [RELEASED] Python 3.2 rc 2
Message-ID: <4D468FA2.4040704@python.org>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On behalf of the Python development team, I'm quite happy to announce
the second release candidate of Python 3.2.

Python 3.2 is a continuation of the efforts to improve and stabilize the
Python 3.x line.  Since the final release of Python 2.7, the 2.x line
will only receive bugfixes, and new features are developed for 3.x only.

Since PEP 3003, the Moratorium on Language Changes, is in effect, there
are no changes in Python's syntax and built-in types in Python 3.2.
Development efforts concentrated on the standard library and support for
porting code to Python 3.  Highlights are:

* numerous improvements to the unittest module
* PEP 3147, support for .pyc repository directories
* PEP 3149, support for version tagged dynamic libraries
* PEP 3148, a new futures library for concurrent programming
* PEP 384, a stable ABI for extension modules
* PEP 391, dictionary-based logging configuration
* an overhauled GIL implementation that reduces contention
* an extended email package that handles bytes messages
* a much improved ssl module with support for SSL contexts and certificate
  hostname matching
* a sysconfig module to access configuration information
* additions to the shutil module, among them archive file support
* many enhancements to configparser, among them mapping protocol support
* improvements to pdb, the Python debugger
* countless fixes regarding bytes/string issues; among them full support
  for a bytes environment (filenames, environment variables)
* many consistency and behavior fixes for numeric operations

For a more extensive list of changes in 3.2, see

    http://docs.python.org/3.2/whatsnew/3.2.html

To download Python 3.2 visit:

    http://www.python.org/download/releases/3.2/

Please consider trying Python 3.2 with your code and reporting any bugs
you may notice to:

    http://bugs.python.org/


Enjoy!

- -- 
Georg Brandl, Release Manager
georg at python.org
(on behalf of the entire python-dev team and 3.2's contributors)

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)

iEYEARECAAYFAk1Gj6IACgkQN9GcIYhpnLC53wCfcZhc6bxbc+fsmi+PAJxM6npr
Hh4An3QRdeyKHm+L3CqVk+EX02PxNx2r
=sTu6
-----END PGP SIGNATURE-----

From steve at pearwood.info  Mon Jan 31 11:31:53 2011
From: steve at pearwood.info (Steven D'Aprano)
Date: Mon, 31 Jan 2011 21:31:53 +1100
Subject: [Python-Dev] Byte code arguments from two to one byte: did
 anyone try this?
In-Reply-To: 
References: 
Message-ID: <4D468F99.8070001@pearwood.info>

Jurjen N.E. Bos wrote:
> I was impressed by the optimizations already in there, but I still dare 
> to suggest an optimization that from my estimates might shave off a few 
> cycles, speeding up Python about 5%.
> The idea is simple: change the byte code argument values from two bytes 
> to one.


Interesting. Have you seen Cesare Di Mauro's WPython project, which 
takes the opposite strategy?

http://code.google.com/p/wpython2/



-- 
Steven

From jussi.enkovaara at csc.fi  Mon Jan 31 11:50:27 2011
From: jussi.enkovaara at csc.fi (Jussi Enkovaara)
Date: Mon, 31 Jan 2011 12:50:27 +0200
Subject: [Python-Dev] Issue #11051: system calls per import
In-Reply-To: <4D45BF7D.70405@v.loewis.de>
References: <1296377778.24415.4.camel@marge>	<4D452E71.6070401@python.org>		<1296405345.24507.9.camel@marge>	
	<4D45BF7D.70405@v.loewis.de>
Message-ID: <4D4693F3.8030200@csc.fi>

On 2011-01-30 21:43, "Martin v. Löwis" wrote:
> Am 30.01.2011 17:54, schrieb Alexander Belopolsky:
>> On Sun, Jan 30, 2011 at 11:35 AM, Victor Stinner
>>   wrote:
>> ..
>>> We should find a compromise between speed (limit the number of system
>>> calls) and the usability of Python modules.
>>
>> Do you have measurements that show python spending significant time on
>> failing open calls?
>
> No; past measurements always showed that this is insignificant, probably
> thanks to operating system caching the relevant directory blocks (so
> it doesn't really matter whether you make one or ten lookups per
> directory; my guess is that it matters more if you look into ten
> directories instead of one).

Dear Python developers,
I would like you to be aware of one particular problem related to system calls 
on massively parallel systems. We are developing Python-based simulation software, 
GPAW (https://wiki.fysik.dtu.dk/gpaw/), and have tested it with up to tens of thousands 
of CPU cores. The program uses MPI, so thousands of Python interpreters are 
launched at start-up time. As all these interpreters execute the same import 
statements, the huge number of (IO-related) system calls puts extreme pressure on 
the file system, and as a result just starting the Python interpreter(s) can take ~45 
minutes with ~30 000 CPU cores!

Currently, we have tried to work around the problem either by installing Python and 
the required additional modules (NumPy and GPAW) on a ramdisk, or by modifying the 
CPython source (at the moment version 2.6) in such a way that only a single process 
performs the system calls and uses MPI to broadcast the results to the other processes 
(preliminary work in progress).

As a related problem, dynamic linking can also be quite expensive (or even not 
available on some systems), and in some cases we have made a small hack to CPython 
to enable statically linked packages (simple modules can of course be included 
relatively easily in a static Python build).

I do not expect that these problems can be solved easily for the general CPython 
interpreter, especially as massively parallel supercomputers are quite a small niche 
of Python usage. However, I think it is good to be aware of the problems that a 
large number of system calls causes in this more specialized kind of Python usage.

Best regards,
Jussi
-- 
Jussi Enkovaara, Application Scientist, High Performance Computing, CSC
PO. BOX 405 02101 Espoo, Finland, Tel +358 9 457 2935, fax +358 9 457 2302
CSC - IT Center for Science, www.csc.fi, e-mail: jussi.enkovaara at csc.fi

From Jurjen.Bos at hetnet.nl  Mon Jan 31 12:59:49 2011
From: Jurjen.Bos at hetnet.nl (Jurjen N.E. Bos)
Date: Mon, 31 Jan 2011 12:59:49 +0100
Subject: [Python-Dev] Followup: Byte code arguments from two to one byte:
	did	anyone try this?
In-Reply-To: 
References: 
Message-ID: 

> What version of CPython did you try that with? The latest py3k branch?

I had a quick look at 3.2, 2.5 and 2.7 and got the impression that
the savings are larger if the interpreter loop is faster: the fewer
instructions there are, the bigger the difference three instructions
make.

The NEXTARG macro is the same in all three versions:
#define NEXTARG()       (next_instr += 2, (next_instr[-1]<<8) + next_instr[-2])
and the compiler compiles this to two separate fetches.

I found out my compiler (gcc) will make better code if we use a short.
It produces a "movswl" instruction to do both fetches at the same  
time, if I force it to.
That saves two instructions already.

This would imply that on little-endian machines, this would already  
save a few percent changing just 1 line of code in ceval.c:
#define NEXTARG()       (next_instr += 2, *(short *)&next_instr[-2])
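
To see what the cast changes, here is a small Python model of the two fetches. It is only a sketch, not the ceval.c code: `struct` stands in for the pointer cast, and the function names are made up for illustration. It suggests that a plain (signed) `short` matches the byte-wise version only for arguments below 0x8000, so an `unsigned short` looks like the safer choice, since two-byte arguments can go up to 0xFFFF before EXTENDED_ARG is needed.

```python
import struct

def nextarg_bytes(code, i):
    """Model of the existing macro: (next_instr[-1] << 8) + next_instr[-2]."""
    return (code[i + 1] << 8) + code[i]

def nextarg_short(code, i):
    """Model of *(short *)&next_instr[-2] on a little-endian machine (signed)."""
    return struct.unpack_from("<h", code, i)[0]

def nextarg_ushort(code, i):
    """Same fetch through an unsigned short (an assumed variant of the macro)."""
    return struct.unpack_from("<H", code, i)[0]

small = bytes([0x34, 0x12])   # argument 0x1234, low byte first
large = bytes([0x00, 0x80])   # argument 0x8000, low byte first

for name, arg in (("small", small), ("large", large)):
    print(name, nextarg_bytes(arg, 0), nextarg_short(arg, 0), nextarg_ushort(arg, 0))
```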

- Jurjen

From Jurjen.Bos at hetnet.nl  Mon Jan 31 13:28:39 2011
From: Jurjen.Bos at hetnet.nl (Jurjen N.E. Bos)
Date: Mon, 31 Jan 2011 13:28:39 +0100
Subject: [Python-Dev] short fetch for NEXTARG macro (was: one byte byte code
	arguments)
Message-ID: <86A291E9-5B01-478F-8FB3-20A422534EEB@hetnet.nl>

I just did it: my first python source code hack.
I replaced the NEXTARG and PEEKARG macros in ceval.c using a cast to  
short pointer, and lo and behold, a crude measurement indicates one  
to two percent speed increase.
That isn't much, but it is virtually for free!

Here are the macros I used:
#define NEXTARG() (next_instr +=2, *(short*)&next_instr[-2])
#define PEEKARG() (*(short*)&next_instr[1])

- Jurjen

From solipsis at pitrou.net  Mon Jan 31 13:43:00 2011
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Mon, 31 Jan 2011 13:43:00 +0100
Subject: [Python-Dev] Issue #11051: system calls per import
References: <1296377778.24415.4.camel@marge> <4D452E71.6070401@python.org>
	
	<1296405345.24507.9.camel@marge>
	<4D45D6E1.6030906@canterbury.ac.nz> <4D4665B9.9000108@v.loewis.de>
	
Message-ID: <20110131134300.2babc577@pitrou.net>

On Mon, 31 Jan 2011 00:08:25 -0800
Guido van Rossum  wrote:
> 
> (Basically I am biased to believe that stat() is a pretty slow system
> call -- this may just be old NFS lore though.)

I don't know about NFS, but starting a Python interpreter located on a
Samba share from a Windows VM is quite slow too.
I think Martin is right for the common case: on a local filesystem on a
modern Unix, stat() is certainly very fast. Remote or
distributed filesystems seem to be more of a problem.

Regards

Antoine.



From solipsis at pitrou.net  Mon Jan 31 13:45:26 2011
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Mon, 31 Jan 2011 13:45:26 +0100
Subject: [Python-Dev] short fetch for NEXTARG macro (was: one byte byte
 code arguments)
References: <86A291E9-5B01-478F-8FB3-20A422534EEB@hetnet.nl>
Message-ID: <20110131134526.7a3af3fb@pitrou.net>

On Mon, 31 Jan 2011 13:28:39 +0100
"Jurjen N.E. Bos"  wrote:
> I just did it: my first python source code hack.
> I replaced the NEXTARG and PEEKARG macros in ceval.c using a cast to  
> short pointer, and lo and behold, a crude measurement indicates one  
> to two percent speed increase.
> That isn't much, but it is virtually for free!
> 
> Here are the macros I used:
> #define NEXTARG() (next_instr +=2, *(short*)&next_instr[-2])
> #define PEEKARG() (*(short*)&next_instr[1])

Some architectures forbid unaligned access, so this can't be used as-is.
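
A quick way to see why the arguments cannot all be aligned is to walk a byte code string and record where each two-byte argument starts. The sketch below assumes the layout of this era (one-byte opcode, two-byte argument for opcodes >= HAVE_ARGUMENT); the opcode numbers are only illustrative, and the offsets are relative to the start of the code string, not real addresses.

```python
HAVE_ARGUMENT = 90  # opcodes at or above this value carry a two-byte argument

def argument_offsets(code):
    """Return the offset of the first argument byte of every instruction with an argument."""
    offsets = []
    i = 0
    while i < len(code):
        if code[i] >= HAVE_ARGUMENT:
            offsets.append(i + 1)   # argument follows the opcode byte
            i += 3                  # opcode + two argument bytes
        else:
            i += 1                  # opcode without argument
    return offsets

# Illustrative stream: LOAD_FAST 0; LOAD_CONST 1; POP_TOP; LOAD_FAST 0; RETURN_VALUE
code = bytes([124, 0, 0, 100, 1, 0, 1, 124, 0, 0, 83])
offsets = argument_offsets(code)
misaligned = [off for off in offsets if off % 2]
print(offsets, misaligned)
```

Because three-byte instructions flip the parity of the following argument offset, some arguments land on odd addresses whatever the alignment of the buffer itself.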

Regards

Antoine.



From cesare.di.mauro at gmail.com  Mon Jan 31 13:59:16 2011
From: cesare.di.mauro at gmail.com (Cesare Di Mauro)
Date: Mon, 31 Jan 2011 13:59:16 +0100
Subject: [Python-Dev] short fetch for NEXTARG macro (was: one byte byte
 code arguments)
In-Reply-To: <20110131134526.7a3af3fb@pitrou.net>
References: <86A291E9-5B01-478F-8FB3-20A422534EEB@hetnet.nl>
	<20110131134526.7a3af3fb@pitrou.net>
Message-ID: 

2011/1/31 Antoine Pitrou 

> On Mon, 31 Jan 2011 13:28:39 +0100
> "Jurjen N.E. Bos"  wrote:
> > I just did it: my first python source code hack.
> > I replaced the NEXTARG and PEEKARG macros in ceval.c using a cast to
> > short pointer, and lo and behold, a crude measurement indicates one
> > to two percent speed increase.
> > That isn't much, but it is virtually for free!
> >
> > Here are the macros I used:
> > #define NEXTARG() (next_instr +=2, *(short*)&next_instr[-2])
> > #define PEEKARG() (*(short*)&next_instr[1])
>
> Some architectures forbid unaligned access, so this can't be used as-is.
>
> Regards
>
> Antoine.
>
>
WPython already addressed it (
http://code.google.com/p/wpython2/source/browse/Python/ceval.c?repo=wpython11):

#ifdef WORDS_BIGENDIAN
#define NEXTOPCODE() oparg = *next_instr++; \
opcode = oparg >> 8; oparg &= 0xff
#else
#define NEXTOPCODE() oparg = *next_instr++; \
opcode = oparg & 0xff; oparg >>= 8
#endif

Short alignment is also guaranteed due to wordcodes (
http://wpython2.googlecode.com/files/Beyond%20Bytecode%20-%20A%20Wordcode-based%20Python.pdf, p. 12).
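
The effect of the two NEXTOPCODE variants can be modelled in a few lines of Python. This is a sketch of the macro logic quoted above only, not WPython's actual code; `struct` plays the role of the 16-bit load, and the opcode and argument values are arbitrary.

```python
import struct

def next_opcode(word, big_endian):
    """Split one 16-bit word into (opcode, oparg), mirroring the two macro variants."""
    if big_endian:
        return word >> 8, word & 0xFF
    return word & 0xFF, word >> 8

# In memory the opcode byte comes first, followed by a one-byte argument.
raw = bytes([124, 5])
word_le = struct.unpack("<H", raw)[0]   # value a little-endian CPU would load
word_be = struct.unpack(">H", raw)[0]   # value a big-endian CPU would load

print(next_opcode(word_le, big_endian=False))
print(next_opcode(word_be, big_endian=True))
```

Both variants recover the same pair from the same bytes in memory, which is the point of the #ifdef.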

Cesare

From tjreedy at udel.edu  Mon Jan 31 14:23:29 2011
From: tjreedy at udel.edu (Terry Reedy)
Date: Mon, 31 Jan 2011 08:23:29 -0500
Subject: [Python-Dev] Byte code arguments from two to one byte: did
	anyone try this?
In-Reply-To: <4D468F99.8070001@pearwood.info>
References: 
	<4D468F99.8070001@pearwood.info>
Message-ID: 

On 1/31/2011 5:31 AM, Steven D'Aprano wrote:
> Jurjen N.E. Bos wrote:
>> I was impressed by the optimizations already in there, but I still
>> dare to suggest an optimization that from my estimates might shave off
>> a few cycles, speeding up Python about 5%.
>> The idea is simple: change the byte code argument values from two
>> bytes to one.
>
>
> Interesting. Have you seen Cesare Di Mauro's WPython project, which
> takes the opposite strategy?
>
> http://code.google.com/p/wpython2/

The two strategies could be mixed. Some 'word codes' could consist of a 
bytecode + byte arg, and others a real word code. Maybe WPython does 
that already. Might end up being slower though.

-- 
Terry Jan Reedy


From cesare.di.mauro at gmail.com  Mon Jan 31 14:30:57 2011
From: cesare.di.mauro at gmail.com (Cesare Di Mauro)
Date: Mon, 31 Jan 2011 14:30:57 +0100
Subject: [Python-Dev] Byte code arguments from two to one byte: did
 anyone try this?
In-Reply-To: 
References: 
	<4D468F99.8070001@pearwood.info> 
Message-ID: 

2011/1/31 Terry Reedy 

> On 1/31/2011 5:31 AM, Steven D'Aprano wrote:
>
>> Jurjen N.E. Bos wrote:
>>
>>> I was impressed by the optimizations already in there, but I still
>>> dare to suggest an optimization that from my estimates might shave off
>>> a few cycles, speeding up Python about 5%.
>>> The idea is simple: change the byte code argument values from two
>>> bytes to one.
>>>
>>
>>
>> Interesting. Have you seen Cesare Di Mauro's WPython project, which
>> takes the opposite strategy?
>>
>> http://code.google.com/p/wpython2/
>>
>
> The two strategies could be mixed. Some 'word codes' could consist of a
> bytecode + byte arg, and others a real word code. Maybe WPython does that
> already. Might end up being slower though.
>
> --
>  Terry Jan Reedy


Yes, WPython already does it (
http://wpython2.googlecode.com/files/Beyond%20Bytecode%20-%20A%20Wordcode-based%20Python.pdf, p. 7),
but on average it was faster (p. 28).

Cesare


From foom at fuhm.net  Mon Jan 31 15:29:46 2011
From: foom at fuhm.net (James Y Knight)
Date: Mon, 31 Jan 2011 09:29:46 -0500
Subject: [Python-Dev] short fetch for NEXTARG macro (was: one byte byte
	code arguments)
In-Reply-To: <20110131134526.7a3af3fb@pitrou.net>
References: <86A291E9-5B01-478F-8FB3-20A422534EEB@hetnet.nl>
	<20110131134526.7a3af3fb@pitrou.net>
Message-ID: <5BC68B65-92CA-4A2B-B0C4-8AAE764A0D0B@fuhm.net>


On Jan 31, 2011, at 7:45 AM, Antoine Pitrou wrote:

> On Mon, 31 Jan 2011 13:28:39 +0100
> "Jurjen N.E. Bos"  wrote:
>> I just did it: my first python source code hack.
>> I replaced the NEXTARG and PEEKARG macros in ceval.c using a cast to  
>> short pointer, and lo and behold, a crude measurement indicates one  
>> to two percent speed increase.
>> That isn't much, but it is virtually for free!
>> 
>> Here are the macros I used:
>> #define NEXTARG() (next_instr +=2, *(short*)&next_instr[-2])
>> #define PEEKARG() (*(short*)&next_instr[1])
> 
> Some architectures forbid unaligned access, so this can't be used as-is.

It could perhaps be #ifdef'd in on x86/x86-64, though, which is by far the most common architecture to run python on.

James

From barry at python.org  Mon Jan 31 17:11:30 2011
From: barry at python.org (Barry Warsaw)
Date: Mon, 31 Jan 2011 11:11:30 -0500
Subject: [Python-Dev] Issue #11051: system calls per import
In-Reply-To: <1296405345.24507.9.camel@marge>
References: <1296377778.24415.4.camel@marge> <4D452E71.6070401@python.org>
	
	<1296405345.24507.9.camel@marge>
Message-ID: <20110131111130.1beefdc7@python.org>

On Jan 30, 2011, at 05:35 PM, Victor Stinner wrote:

>And the real question is: should we change that before 3.2 final? If we
>don't change that in 3.2, it will be harder to change it later (but it
>is still possible).

I don't see how you possibly can without re-entering beta.  Mucking with the
import machinery *at all* does not seem prudent in the last RC. ;)

FWIW, I recall this being discussed at the time of the PEPs and we decided not
to narrow the search patterns down.  I'd have to go through my archives for
the details, but I think it would be better to officially deprecate the
'module' form so that they can be removed in a future version.

-Barry

From techtonik at gmail.com  Mon Jan 31 19:19:44 2011
From: techtonik at gmail.com (techtonik at gmail.com)
Date: Mon, 31 Jan 2011 18:19:44 +0000
Subject: [Python-Dev] MSI: Remove dependency from win32com.client module
	(issue4080047)
Message-ID: <90e6ba6e85f2cbfc00049b2875bf@google.com>

Reviewers: ,



Please review this at http://codereview.appspot.com/4080047/

Affected files:
   M     Tools/msi/msi.py
   M     Tools/msi/msilib.py


Index: Tools/msi/msi.py
===================================================================
--- Tools/msi/msi.py	(revision 88279)
+++ Tools/msi/msi.py	(working copy)
@@ -4,7 +4,6 @@
  import msilib, schema, sequence, os, glob, time, re, shutil, zipfile
  from msilib import Feature, CAB, Directory, Dialog, Binary, add_data
  import uisample
-from win32com.client import constants
  from distutils.spawn import find_executable
  from uuids import product_codes
  import tempfile
@@ -1360,7 +1359,7 @@

      # Step 2: Add CAB files
      i = msilib.MakeInstaller()
-    db = i.OpenDatabase(msi, constants.msiOpenDatabaseModeTransact)
+    db = i.OpenDatabase(msi, msilib.msiOpenDatabaseModeTransact)

      v = db.OpenView("SELECT LastSequence FROM Media")
      v.Execute(None)
Index: Tools/msi/msilib.py
===================================================================
--- Tools/msi/msilib.py	(revision 88279)
+++ Tools/msi/msilib.py	(working copy)
@@ -4,7 +4,6 @@
  import win32com.client.gencache
  import win32com.client
  import pythoncom, pywintypes
-from win32com.client import constants
  import re, string, os, sets, glob, subprocess, sys, _winreg, struct

  try:
@@ -29,6 +28,18 @@
  knownbits = datasizemask | type_valid | type_localizable | \
              typemask | type_nullable | type_key

+# Constants from Windows Installer SDK
+msiOpenDatabaseModeReadOnly = 0
+msiOpenDatabaseModeTransact = 1
+msiOpenDatabaseModeDirect = 2
+msiOpenDatabaseModeCreate = 3
+msiColumnInfoNames = 0
+msiColumnInfoTypes = 1
+msiReadStreamInteger = 0
+msiReadStreamBytes = 1
+msiViewModifyInsert = 1
+msidbFileAttributesVital = 512
+
  # Summary Info Property IDs
  PID_CODEPAGE=1
  PID_TITLE=2
@@ -141,8 +152,7 @@

  def gen_schema(destpath, schemapath):
      d = MakeInstaller()
-    schema = d.OpenDatabase(schemapath,
-            win32com.client.constants.msiOpenDatabaseModeReadOnly)
+    schema = d.OpenDatabase(schemapath, msiOpenDatabaseModeReadOnly)

      # XXX ORBER BY
      v=schema.OpenView("SELECT * FROM _Columns")
@@ -196,8 +206,7 @@
  def gen_sequence(destpath, msipath):
      dir = os.path.dirname(destpath)
      d = MakeInstaller()
-    seqmsi = d.OpenDatabase(msipath,
-            win32com.client.constants.msiOpenDatabaseModeReadOnly)
+    seqmsi = d.OpenDatabase(msipath, msiOpenDatabaseModeReadOnly)

      v = seqmsi.OpenView("SELECT * FROM _Tables");
      v.Execute(None)
@@ -212,7 +221,7 @@
          f.write("%s = [\n" % table)
          v1 = seqmsi.OpenView("SELECT * FROM `%s`" % table)
          v1.Execute(None)
-        info = v1.ColumnInfo(constants.msiColumnInfoTypes)
+        info = v1.ColumnInfo(msiColumnInfoTypes)
          while 1:
              r = v1.Fetch()
              if not r:break
@@ -226,7 +235,7 @@
                      rec.append(r.StringData(i))
                  elif info.StringData(i)[0]=="v":
                      size = r.DataSize(i)
-                    bytes = r.ReadStream(i, size, constants.msiReadStreamBytes)
+                    bytes = r.ReadStream(i, size, msiReadStreamBytes)
                      bytes = bytes.encode("latin-1") # binary data represented "as-is"
                      if table == "Binary":
                          fname = rec[0]+".bin"
@@ -275,7 +284,7 @@
                  r.SetStream(i+1, field.name)
              else:
                  raise TypeError, "Unsupported type %s" % field.__class__.__name__
-        v.Modify(win32com.client.constants.msiViewModifyInsert, r)
+        v.Modify(msiViewModifyInsert, r)
          r.ClearData()
      v.Close()

@@ -298,8 +307,7 @@
      ProductCode = ProductCode.upper()
      d = MakeInstaller()
      # Create the database
-    db = d.OpenDatabase(name,
-         win32com.client.constants.msiOpenDatabaseModeCreate)
+    db = d.OpenDatabase(name, msiOpenDatabaseModeCreate)
      # Create the tables
      for t in schema.tables:
          t.create(db)
@@ -538,7 +546,7 @@
          short = self.make_short(file)
          full = "%s|%s" % (short, file)
          filesize = os.stat(absolute).st_size
-        # constants.msidbFileAttributesVital
+        # msidbFileAttributesVital
          # Compressed omitted, since it is the database default
          # could add r/o, system, hidden
          attributes = 512



From brett at python.org  Mon Jan 31 19:38:57 2011
From: brett at python.org (Brett Cannon)
Date: Mon, 31 Jan 2011 10:38:57 -0800
Subject: [Python-Dev] Issue #11051: system calls per import
In-Reply-To: <20110131134300.2babc577@pitrou.net>
References: <1296377778.24415.4.camel@marge> <4D452E71.6070401@python.org>
	
	<1296405345.24507.9.camel@marge> <4D45D6E1.6030906@canterbury.ac.nz>
	<4D4665B9.9000108@v.loewis.de>
	
	<20110131134300.2babc577@pitrou.net>
Message-ID: 

On Mon, Jan 31, 2011 at 04:43, Antoine Pitrou  wrote:
> On Mon, 31 Jan 2011 00:08:25 -0800
> Guido van Rossum  wrote:
>>
>> (Basically I am biased to believe that stat() is a pretty slow system
>> call -- this may just be old NFS lore though.)
>
> I don't know about NFS, but starting a Python interpreter located on a
> Samba share from a Windows VM is quite slow too.
> I think Martin is right for the common case: on a local filesystem on a
> modern Unix, stat() is certainly very fast. Remote or
> distributed filesystems seem to be more of a problem.

I should mention that I have considered implementing a caching finder
and loader for filesystems in importlib for people to optionally
install to use for themselves. The real trick, though, is should it
only cache hits, misses, or both? Regardless, though, it would be a
very simple mixin or subclass to implement if there is demand for this
sort of thing.
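
To make the hits-versus-misses question concrete, here is a minimal sketch of the caching idea. It is not importlib code: `DirectoryCache` and its methods are hypothetical names, and a real finder would also need a policy for when to invalidate. It caches whole directory listings, so it effectively caches both hits and misses, and the demo shows the price of caching misses: a file created later stays invisible until the cache is invalidated.

```python
import os
import tempfile

class DirectoryCache:
    """Cache directory listings so repeated lookups need no further system calls."""

    def __init__(self):
        self._listings = {}

    def contains(self, directory, filename):
        listing = self._listings.get(directory)
        if listing is None:
            try:
                listing = frozenset(os.listdir(directory))  # one listing per directory
            except OSError:
                listing = frozenset()                       # unreadable or missing directory
            self._listings[directory] = listing
        return filename in listing

    def invalidate(self, directory=None):
        """Forget one cached directory, or everything if no directory is given."""
        if directory is None:
            self._listings.clear()
        else:
            self._listings.pop(directory, None)

cache = DirectoryCache()
with tempfile.TemporaryDirectory() as d:
    open(os.path.join(d, "spam.py"), "w").close()
    hit = cache.contains(d, "spam.py")          # found, listing is now cached
    open(os.path.join(d, "eggs.py"), "w").close()
    stale_miss = cache.contains(d, "eggs.py")   # cached listing predates eggs.py
    cache.invalidate(d)
    fresh = cache.contains(d, "eggs.py")        # listing is re-read after invalidation

print(hit, stale_miss, fresh)
```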

And as for the zipfile being faster, that's true (I have incomplete
benchmarks in importlib that you can use if people want to measure
this stuff themselves, although you will need to tweak them to run
against a zipfile).

From amauryfa at gmail.com  Mon Jan 31 19:58:49 2011
From: amauryfa at gmail.com (Amaury Forgeot d'Arc)
Date: Mon, 31 Jan 2011 19:58:49 +0100
Subject: [Python-Dev] MSI: Remove dependency from win32com.client module
	(issue4080047)
In-Reply-To: <90e6ba6e85f2cbfc00049b2875bf@google.com>
References: <90e6ba6e85f2cbfc00049b2875bf@google.com>
Message-ID: 

Hi,

2011/1/31  :
> Reviewers: ,
>
> Please review this at http://codereview.appspot.com/4080047/
[...]

It looks good, but did you create an item in the issue tracker?

-- 
Amaury Forgeot d'Arc

From georg.brandl at gmail.com  Mon Jan 31 20:05:29 2011
From: georg.brandl at gmail.com (georg.brandl at gmail.com)
Date: Mon, 31 Jan 2011 19:05:29 +0000
Subject: [Python-Dev] MSI: Remove dependency from win32com.client module
	(issue4080047)
Message-ID: <000325574bc270e521049b2919d9@google.com>

Is there a bugs.python.org issue for this?

http://codereview.appspot.com/4080047/

From techtonik at gmail.com  Mon Jan 31 21:45:45 2011
From: techtonik at gmail.com (techtonik at gmail.com)
Date: Mon, 31 Jan 2011 20:45:45 +0000
Subject: [Python-Dev] MSI: Remove dependency from win32com.client module
	(issue4080047)
Message-ID: <20cf30434772fe60a6049b2a7f9a@google.com>

There is no b.p.o issue as it's not a bug, but a tiny copy/paste patch
to clean up the code a bit while I am trying to understand how to add
Python to the PATH.

I see no reason for b.p.o bureaucracy. Mercurial-style workflow [1] is
more beneficial to development as it doesn't require switching from
console to browser for submitting changes. This way tiny changes can be
integrated/updated more rapidly.

1.
http://mercurial.selenic.com/wiki/ContributingChanges#The_basics:_patches_by_email


http://codereview.appspot.com/4080047/

From brian.curtin at gmail.com  Mon Jan 31 21:49:42 2011
From: brian.curtin at gmail.com (Brian Curtin)
Date: Mon, 31 Jan 2011 14:49:42 -0600
Subject: [Python-Dev] MSI: Remove dependency from win32com.client module
	(issue4080047)
In-Reply-To: <20cf30434772fe60a6049b2a7f9a@google.com>
References: <20cf30434772fe60a6049b2a7f9a@google.com>
Message-ID: 

On Mon, Jan 31, 2011 at 14:45,  wrote:

> There is no b.p.o issue as it's not a bug, but a tiny copy/paste patch
> to clean up the code a bit while I am trying to understand how to add
> Python to the PATH.
>
> I see no reason for b.p.o bureaucracy. Mercurial-style workflow [1] is
> more beneficial to development as it doesn't require switching from
> console to browser for submitting changes. This way tiny changes can be
> integrated/updated more rapidly.
>
> 1.
>
> http://mercurial.selenic.com/wiki/ContributingChanges#The_basics:_patches_by_email
>
>
> http://codereview.appspot.com/4080047/


Please create an issue.

From solipsis at pitrou.net  Mon Jan 31 21:54:06 2011
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Mon, 31 Jan 2011 21:54:06 +0100
Subject: [Python-Dev] MSI: Remove dependency from win32com.client module
 (issue4080047)
References: <20cf30434772fe60a6049b2a7f9a@google.com>
Message-ID: <20110131215406.5c597a50@pitrou.net>

On Mon, 31 Jan 2011 20:45:45 +0000
techtonik at gmail.com wrote:
> I see no reason for b.p.o bureaucracy. Mercurial-style workflow [1] is
> more beneficial to development as it doesn't require switching from
> console to browser for submitting changes.

Ok, why don't you contribute to Mercurial instead?



From g.brandl at gmx.net  Mon Jan 31 21:58:43 2011
From: g.brandl at gmx.net (Georg Brandl)
Date: Mon, 31 Jan 2011 21:58:43 +0100
Subject: [Python-Dev] MSI: Remove dependency from win32com.client module
	(issue4080047)
In-Reply-To: <20cf30434772fe60a6049b2a7f9a@google.com>
References: <20cf30434772fe60a6049b2a7f9a@google.com>
Message-ID: 

Am 31.01.2011 21:45, schrieb techtonik at gmail.com:
> There is no b.p.o issue as it's not a bug, but a tiny copy/paste patch
> to clean up the code a bit while I am trying to understand how to add
> Python to the PATH.
> 
> I see no reason for b.p.o bureaucracy. Mercurial-style workflow [1] is
> more beneficial to development as it doesn't require switching from
> console to browser for submitting changes. This way tiny changes can be
> integrated/updated more rapidly.

The tracker is not bureaucracy, it's how our development process works.
I know that Mercurial uses a different process, with patches always going
to the mailing list and being reviewed there, but that would be way too
much volume for python-dev considering our number of patches.

BTW, you should be able to send emails to report at bugs.python.org in order
to create new issues, and attachments will automatically become attached
to the bug reports.

Georg


From ethan at stoneleaf.us  Mon Jan 31 22:09:16 2011
From: ethan at stoneleaf.us (Ethan Furman)
Date: Mon, 31 Jan 2011 13:09:16 -0800
Subject: [Python-Dev] MSI: Remove dependency from win32com.client module
 (issue4080047)
In-Reply-To: <20cf30434772fe60a6049b2a7f9a@google.com>
References: <20cf30434772fe60a6049b2a7f9a@google.com>
Message-ID: <4D4724FC.1040705@stoneleaf.us>

techtonik at gmail.com wrote:
> I see no reason for b.p.o bureaucracy.

It provides a place for discussion, and makes it easier to coordinate 
multiple efforts.

~Ethan~

From techtonik at gmail.com  Mon Jan 31 22:09:03 2011
From: techtonik at gmail.com (anatoly techtonik)
Date: Mon, 31 Jan 2011 23:09:03 +0200
Subject: [Python-Dev] Finally fix installer to add Python to %PATH% on
	Windows
In-Reply-To: 
References: 
	
	<4D431724.4010002@voidspace.org.uk>
	<7DA37C12-D3DA-49B3-996A-017CF304BC5C@gmail.com>
	
Message-ID: 

On Fri, Jan 28, 2011 at 10:34 PM, Christian Heimes  wrote:
> Am 28.01.2011 20:29, schrieb Raymond Hettinger:
>> At the very least, we should add some prominent instructions for getting the command line version up and running.
>
> /me pops out of Guido's time machine and says: "execute
> Tools/scripts/win_add2path.py"
>
> I'm -1 on adding Python to %PATH%. The private MSVCRT DLLs may lead to
> unexpected side effects and it doesn't scale at all.

Can you explain that part? There aren't any MSVCRT DLLs in my
Python26+ installation directories.

> What about people
> with more than one Python installation? I suggest that we add a single
> user specific directory or a global directory to %PATH% for all
> installations. Then the Python installer or 3rd party modules can drop
> executables like python3.3.exe or plip-3.3.exe into this directory.

python33.exe, but the user story about people with more than one Python
installation is a different one.

> A
> .bat file won't do good because .bat files must be called with "call
> python33.bat" from another .bat file or the first one gets terminated.

Wow. I've spent so many years in Windows console and didn't know that. Thanks.

> We can even use a single and simple executable as template for all tasks:
>
>  * get registry key from resource section of the executable
>  * use the registry key to lookup the location and name of pythonXX.dll
>  * load DLL
>  * get optional dotted module name for resource section
>  * either fire up interpreter as shell, with **argv or -m
> dotted.module.name **argv
>
> Done ;)

Actually, I would like to see the code that dynamically finds
a pythonXX.dll that is available on the system and loads it into
memory. This would be extremely useful for writing 3rd party
application plugins in Python: plugins that only work when Python
is installed, but for which it doesn't really matter which Python
version is there. But that is another story as well.
-- 
anatoly t.

From techtonik at gmail.com  Mon Jan 31 22:13:47 2011
From: techtonik at gmail.com (anatoly techtonik)
Date: Mon, 31 Jan 2011 23:13:47 +0200
Subject: [Python-Dev] Finally fix installer to add Python to %PATH% on
	Windows
In-Reply-To: 
References: 
	
	<4D431724.4010002@voidspace.org.uk>
	<7DA37C12-D3DA-49B3-996A-017CF304BC5C@gmail.com>
	
	
Message-ID: 

Ok. Here is the patch. I used Orca to reverse-engineer the installer tables
of the Mercurial MSI and inserted a similar entry for Python.

Also available for review at: http://codereview.appspot.com/4023055
-- 
anatoly t.
-------------- next part --------------
Index: Tools/msi/msi.py
===================================================================
--- Tools/msi/msi.py	(revision 88279)
+++ Tools/msi/msi.py	(working copy)
@@ -463,6 +463,11 @@
              ("CompileGrammar", "COMPILEALL", 6802),
             ])
 
+    # Add target dir to PATH
+    add_data(db, "Environment",
+            [("Environmnent", "=-*PATH", "[~];[TARGETDIR]", "python.exe"), 
+            ])
+
     #####################################################################
     # Standard dialogs: FatalError, UserExit, ExitDialog
     fatal=PyDialog(db, "FatalError", x, y, w, h, modal, title,

From brian.curtin at gmail.com  Mon Jan 31 22:24:33 2011
From: brian.curtin at gmail.com (Brian Curtin)
Date: Mon, 31 Jan 2011 15:24:33 -0600
Subject: [Python-Dev] Finally fix installer to add Python to %PATH% on
	Windows
In-Reply-To: 
References: 
	
	<4D431724.4010002@voidspace.org.uk>
	<7DA37C12-D3DA-49B3-996A-017CF304BC5C@gmail.com>
	
	
	
Message-ID: 

On Mon, Jan 31, 2011 at 15:13, anatoly techtonik wrote:

> Ok. Here is the patch. I used Orca to reverse installer tables of
> Mercurial MSI and inserted similar entry for Python.
>
> Also available for review at: http://codereview.appspot.com/4023055
> --
> anatoly t.


That's the easy part. It doesn't cover any of the real issues with doing
this.

From techtonik at gmail.com  Mon Jan 31 22:43:28 2011
From: techtonik at gmail.com (anatoly techtonik)
Date: Mon, 31 Jan 2011 23:43:28 +0200
Subject: [Python-Dev] Finally fix installer to add Python to %PATH% on
	Windows
In-Reply-To: 
References: 
	
	<4D431724.4010002@voidspace.org.uk>
	<7DA37C12-D3DA-49B3-996A-017CF304BC5C@gmail.com>
	
	
	
	
Message-ID: 

On Mon, Jan 31, 2011 at 11:24 PM, Brian Curtin  wrote:
> On Mon, Jan 31, 2011 at 15:13, anatoly techtonik 
> wrote:
>>
>> Ok. Here is the patch. I used Orca to reverse installer tables of
>> Mercurial MSI and inserted similar entry for Python.
>>
>> Also available for review at: http://codereview.appspot.com/4023055
>> --
>> anatoly t.
>
> That's the easy part. It doesn't cover any of the real issues with doing
> this.

Please be more specific. It will also help if you integrate this part
while it's still hot.
--
anatoly t.

From brian.curtin at gmail.com  Mon Jan 31 22:49:49 2011
From: brian.curtin at gmail.com (Brian Curtin)
Date: Mon, 31 Jan 2011 15:49:49 -0600
Subject: [Python-Dev] Finally fix installer to add Python to %PATH% on
	Windows
In-Reply-To: 
References: 
	
	<4D431724.4010002@voidspace.org.uk>
	<7DA37C12-D3DA-49B3-996A-017CF304BC5C@gmail.com>
	
	
	
	
	
Message-ID: 

On Mon, Jan 31, 2011 at 15:43, anatoly techtonik wrote:

> On Mon, Jan 31, 2011 at 11:24 PM, Brian Curtin 
> wrote:
> > On Mon, Jan 31, 2011 at 15:13, anatoly techtonik 
> > wrote:
> >>
> >> Ok. Here is the patch. I used Orca to reverse installer tables of
> >> Mercurial MSI and inserted similar entry for Python.
> >>
> >> Also available for review at: http://codereview.appspot.com/4023055
> >> --
> >> anatoly t.
> >
> > That's the easy part. It doesn't cover any of the real issues with doing
> > this.
>
> Please be more specific. It will also help if you integrate this part
> while it's still hot.
> --
> anatoly t.
>

There are numerous comments in the various PATH-related issues on the issue
tracker, and many of them are duplicated in this very thread.

From techtonik at gmail.com  Mon Jan 31 22:50:18 2011
From: techtonik at gmail.com (anatoly techtonik)
Date: Mon, 31 Jan 2011 23:50:18 +0200
Subject: [Python-Dev] Mercurial style patch submission (Was: MSI: Remove
 dependency from win32com.client module (issue4080047))
Message-ID: 

On Mon, Jan 31, 2011 at 10:54 PM, Antoine Pitrou  wrote:
> On Mon, 31 Jan 2011 20:45:45 +0000
> techtonik at gmail.com wrote:
>> I see no reason for b.p.o bureaucracy. Mercurial-style workflow [1] is
>> more beneficial to development as it doesn't require switching from
>> console to browser for submitting changes.
>
> Ok, why don't you contribute to Mercurial instead?

If you don't want to receive a stupid answer, why don't you read the
link and say what you don't like in this approach in a constructive
manner?

http://mercurial.selenic.com/wiki/ContributingChanges#The_basics:_patches_by_email
-- 
anatoly t.

From techtonik at gmail.com  Mon Jan 31 22:58:20 2011
From: techtonik at gmail.com (anatoly techtonik)
Date: Mon, 31 Jan 2011 23:58:20 +0200
Subject: [Python-Dev] MSI: Remove dependency from win32com.client module
	(issue4080047)
In-Reply-To: <4D4724FC.1040705@stoneleaf.us>
References: <20cf30434772fe60a6049b2a7f9a@google.com>
	<4D4724FC.1040705@stoneleaf.us>
Message-ID: 

On Mon, Jan 31, 2011 at 11:09 PM, Ethan Furman  wrote:
> techtonik at gmail.com wrote:
>>
>> I see no reason for b.p.o bureaucracy.
>
> It provides a place for discussion, and makes it easier to coordinate
> multiple efforts.

A code review system provides a better space for discussion if we are
speaking about simple code cleanup. To me, polluting the tracker with
issues that are neither bugs nor feature requests only makes the bug
triaging process and search more cumbersome.
-- 
anatoly t.

From techtonik at gmail.com  Mon Jan 31 23:05:12 2011
From: techtonik at gmail.com (anatoly techtonik)
Date: Tue, 1 Feb 2011 00:05:12 +0200
Subject: [Python-Dev] MSI: Remove dependency from win32com.client module
	(issue4080047)
In-Reply-To: 
References: <20cf30434772fe60a6049b2a7f9a@google.com>
	
Message-ID: 

On Mon, Jan 31, 2011 at 10:58 PM, Georg Brandl  wrote:
> Am 31.01.2011 21:45, schrieb techtonik at gmail.com:
>> There is no b.p.o issue as it's not a bug, but a tiny copy/paste patch
>> to clean up the code a bit while I am trying to understand how to add
>> Python to the PATH.
>>
>> I see no reason for b.p.o bureaucracy. Mercurial-style workflow [1] is
>> more beneficial to development as it doesn't require switching from
>> console to browser for submitting changes. This way tiny changes can be
>> integrated/updated more rapidly.
>
> The tracker is not bureaucracy, it's how our development process works.

Don't you want to improve this process? Code review system is a much
better place to review patches than mailing list or bug tracker.
Especially patches that are not related to actual bugs.

> I know that Mercurial uses a different process, with patches always going
> to the mailing list and being reviewed there, but that would be way too
> much volume for python-dev considering our number of patches.

Seems reasonable. Do you have any stats on how many patches are sent
weekly and how many of them are actually integrated?

> BTW, you should be able to send emails to report at bugs.python.org in order
> to create new issues, and attachments will automatically become attached
> to the bug reports.

Thanks. I'll keep this in mind.
-- 
anatoly t.

From brian.curtin at gmail.com  Mon Jan 31 23:09:57 2011
From: brian.curtin at gmail.com (Brian Curtin)
Date: Mon, 31 Jan 2011 16:09:57 -0600
Subject: [Python-Dev] Mercurial style patch submission (Was: MSI: Remove
 dependency from win32com.client module (issue4080047))
In-Reply-To: 
References: 
Message-ID: 

On Mon, Jan 31, 2011 at 15:50, anatoly techtonik wrote:

> On Mon, Jan 31, 2011 at 10:54 PM, Antoine Pitrou 
> wrote:
> > On Mon, 31 Jan 2011 20:45:45 +0000
> > techtonik at gmail.com wrote:
> >> I see no reason for b.p.o bureaucracy. Mercurial-style workflow [1] is
> >> more beneficial to development as it doesn't require switching from
> >> console to browser for submitting changes.
> >
> > Ok, why don't you contribute to Mercurial instead?
>
> If you don't want to receive a stupid answer, why don't you read the
> link and say what you don't like in this approach in a constructive
> manner?
>
>
> http://mercurial.selenic.com/wiki/ContributingChanges#The_basics:_patches_by_email
> --
> anatoly t.


>>> Don't send your patch to the BugTracker -
it can't be reviewed there, so it won't go anywhere!

We do fine with reviews on the tracker, and there has been some on and off
work on integrating Rietveld. For the people actually doing the work here,
accepting patches on the tracker and dealing with them there has been a
reasonably effective workflow, enough that we don't see a need to change it.

>>> Patches go to mercurial-devel at selenic.com - no
subscription necessary!

As you were directed to in an earlier email by Georg, there is now a way to
report bugs via email without requiring any subscription.
report at bugs.python.org is the address.

>>> Because this is a community project and our developers are very busy,
patches will sometimes fall through the cracks. If you've gotten no
response to your patch after a few days, feel free to resend it.

This is true of any workflow on just about any open source project. Whether
it's email or a bug tracker, not everything is going to be acknowledged,
reviewed, fixed, or rejected immediately. We feel that the tracker allows us
to, well, keep track of things. It works for us.


What they do works for them, and I'm sure it works great. Could it work for
python-dev? Maybe. Is it worth changing anything when no one who is doing
the actual work has voiced a need for change? Absolutely not.

From solipsis at pitrou.net  Mon Jan 31 23:17:52 2011
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Mon, 31 Jan 2011 23:17:52 +0100
Subject: [Python-Dev] Mercurial style patch submission (Was: MSI: Remove
 dependency from win32com.client module (issue4080047))
In-Reply-To: 
References: 
Message-ID: <20110131231752.24887e1e@pitrou.net>

On Mon, 31 Jan 2011 23:50:18 +0200
anatoly techtonik  wrote:
> On Mon, Jan 31, 2011 at 10:54 PM, Antoine Pitrou  wrote:
> > On Mon, 31 Jan 2011 20:45:45 +0000
> > techtonik at gmail.com wrote:
> >> I see no reason for b.p.o bureaucracy. Mercurial-style workflow [1] is
> >> more beneficial to development as it doesn't require switching from
> >> console to browser for submitting changes.
> >
> > Ok, why don't you contribute to Mercurial instead?
> 
> If you don't want to receive a stupid answer, why don't you read the
> link and say what you don't like in this approach in a constructive
> manner?

Very simple: I don't want to be spammed with tons of patches, patch
reviews, and issue comments. Also, I want the history of issue
discussions to be easily accessible from permanent, issue-specific
URLs, rather than search through mailing-list archives to understand
why a change was made.

I appreciate that you refrained from giving a stupid answer, however.

From martin at v.loewis.de  Mon Jan 31 23:45:12 2011
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 31 Jan 2011 23:45:12 +0100
Subject: [Python-Dev] Issue #11051: system calls per import
In-Reply-To: 
References: <1296377778.24415.4.camel@marge> <4D452E71.6070401@python.org>
	
	<1296405345.24507.9.camel@marge>
	<4D45D6E1.6030906@canterbury.ac.nz> <4D4665B9.9000108@v.loewis.de>
	
Message-ID: <4D473B78.8080408@v.loewis.de>

> Another thing to consider: on App Engine (which despite of all its
> architectural weirdness uses a -- mostly -- standard Linux filesystem
> for the Python code of the app) someone measured that importing from a
> zipfile is much faster than importing from the filesystem. I would
> imagine this extends to other contexts too, and it makes sense because
> the zipfile directory gets cached in memory so no stat() calls are
> necessary.

Of course, you can't know until you measure, and then you only know
about the specific case.

However, I think you can't really compare zip reading with directory
reading - I'd expect that reading a zip directory is significantly faster
than reading the directory contents of the zip file unpacked, just
because there are so many fewer layers of indirection.
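
To illustrate the caching point quoted above (a minimal sketch, not a
benchmark and not how zipimport is actually implemented): once the zip's
central directory has been read, a "does this file exist?" probe is a
lookup in memory rather than a stat() call. The names build_archive and
make_lookup are made up for this example.

```python
import io
import zipfile


def build_archive(names):
    """Return the bytes of an in-memory zip holding one empty file per name."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name in names:
            zf.writestr(name, "")
    return buf.getvalue()


def make_lookup(archive_bytes):
    """Read the zip's central directory once and return a membership test."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as zf:
        listing = frozenset(zf.namelist())
    # Every later probe is a set lookup; the archive is not touched again.
    return lambda path: path in listing


data = build_archive(["pkg/__init__.py", "pkg/mod.py"])
exists = make_lookup(data)
print(exists("pkg/mod.py"), exists("pkg/mod.pyc"))
```

Whether this translates into faster imports in a given setup is, as said,
something only measurement can tell.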

Regards,
Martin

From benjamin at python.org  Mon Jan 31 23:58:30 2011
From: benjamin at python.org (Benjamin Peterson)
Date: Mon, 31 Jan 2011 16:58:30 -0600
Subject: [Python-Dev] Mercurial style patch submission (Was: MSI: Remove
 dependency from win32com.client module (issue4080047))
In-Reply-To: 
References: 
Message-ID: 

2011/1/31 anatoly techtonik :
> On Mon, Jan 31, 2011 at 10:54 PM, Antoine Pitrou  wrote:
>> On Mon, 31 Jan 2011 20:45:45 +0000
>> techtonik at gmail.com wrote:
>>> I see no reason for b.p.o bureaucracy. Mercurial-style workflow [1] is
>>> more beneficial to development as it doesn't require switching from
>>> console to browser for submitting changes.
>>
>> Ok, why don't you contribute to Mercurial instead?
>
> If you don't want to receive a stupid answer, why don't you read the
> link and say what you don't like in this approach in a constructive
> manner?

As I understand it, there used to be patches at python.org. I'm not sure
why this was discontinued, so perhaps someone more senior should chime
in. :)



-- 
Regards,
Benjamin

From benjamin at python.org  Mon Jan 31 23:59:14 2011
From: benjamin at python.org (Benjamin Peterson)
Date: Mon, 31 Jan 2011 16:59:14 -0600
Subject: [Python-Dev] MSI: Remove dependency from win32com.client module
	(issue4080047)
In-Reply-To: 
References: <20cf30434772fe60a6049b2a7f9a@google.com>
	<4D4724FC.1040705@stoneleaf.us>
	
Message-ID: 

2011/1/31 anatoly techtonik :
> On Mon, Jan 31, 2011 at 11:09 PM, Ethan Furman  wrote:
>> techtonik at gmail.com wrote:
>>>
>>> I see no reason for b.p.o bureaucracy.
>>
>> It provides a place for discussion, and makes it easier to coordinate
>> multiple efforts.
>
> Code review system provides a better space for discussion if we are
> speaking about simple code cleanup. To me polluting tracker with the
> issues that are neither bugs nor feature requests only makes bug
> triaging process and search more cumbersome.

If it's not a bug or a feature request, why does it need to change?



-- 
Regards,
Benjamin